European Journal of Operational Research 000 (2017) 1-13


Decision Support

Robust and Pareto optimality of insurance contracts

Alexandru V. Asimit a,∗, Valeria Bignozzi b, Ka Chun Cheung c, Junlei Hu a, Eun-Seok Kim d

a Cass Business School, City University London, London EC1Y 8TZ, United Kingdom
b Department of Statistics and Quantitative Methods, University of Milano-Bicocca, Italy
c Department of Statistics and Actuarial Science, The University of Hong Kong, Pokfulam Road, Hong Kong
d Department of Management, Leadership and Organisations, Middlesex University, London NW4 4BT, United Kingdom


The optimal insurance problem is a fast growing topic that seeks the most efficient contract that an insurance player may get. The classical problem investigates the ideal contract under the assumption that the underlying risk distribution is known, i.e. by ignoring the parameter and model risks. Taking these sources of risk into account, the decision-maker aims to identify a robust optimal contract that is not sensitive to the chosen risk distribution. We focus on Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR)-based decisions, but further extensions to other risk measures are easily possible. The worst-case scenario and worst-case regret robust models, which have already been used in the robust optimisation literature on the investment portfolio problem, are discussed in this paper. Closed-form solutions are obtained for the VaR worst-case scenario, while Linear Programming (LP) formulations are provided for all other cases. A caveat of robust optimisation is that the optimal solution may not be unique, and therefore, it may not be economically acceptable, i.e. Pareto optimal. This issue is numerically addressed and simple numerical methods are found for constructing insurance contracts that are Pareto and robust optimal. Our numerical illustrations show weak evidence in favour of our robust solutions for VaR-based decisions, while our robust methods are clearly preferred for CVaR-based decisions.

© 2017 The Author(s). Published by Elsevier B.V.

This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Article history: Received 2 September 2016 Accepted 12 April 2017 Available online xxx

Keywords:

Uncertainty modelling; Linear programming; Robust/Pareto optimal insurance; Risk measure; Robust optimisation

1. Introduction

Finding the optimal insurance contract has been a topic of interest in the actuarial science and insurance literature for more than 50 years. The seminal papers of Borch (1960) and Arrow (1963) opened this field of research and since then, many papers have discussed this problem under various assumptions on the risk preferences of the insurance players involved in the contract and on how the cost of insurance (known as the premium) is quantified. Specifically, the optimal contracts in the context of Expected Utility Theory are investigated, amongst others, in Kaluszka (2005), Kaluszka and Okolewski (2008) and Cai and Wei (2012). Extensive research has been carried out when the preferences are ordered via coherent risk measures (as defined in Artzner, Delbaen, Eber, and Heath, 1999; recall that CVaR is an element of this class) and VaR; for example, see Cai and Tan (2007), Balbas, Balbas, and Heras (2009,

∗ Corresponding author. E-mail addresses: asimit@city.ac.uk (A.V. Asimit), valeria.bignozzi@unimib.it (V. Bignozzi), kccg@hku.hk (K.C. Cheung), junlei.Hu.1@city.ac.uk (J. Hu), E.Kim@mdx.ac.uk (E.-S. Kim).

2011), Asimit, Badescu, and Verdonck (2013b), Cheung, Sung, Yam, and Yung (2014) and Cai and Weng (2016) among others.

The choice of a risk measure is usually subjective, but VaR and CVaR are the best-known risk measures used in the insurance industry. Solvency II and the Swiss Solvency Test are the regulatory regimes for all (re)insurance companies that operate within the European Union and Switzerland, respectively, and their capital requirements are solely based on VaR and CVaR. For these reasons, among others, these standard risk measures have received special attention from academics, practitioners and regulators, and therefore, vivid discussions have arisen among them. VaR is criticised for its lack of sub-additivity and it may create regulatory arbitrage in an insurance group (see Asimit, Badescu, & Tsanakas, 2013a). A detailed discussion on possible regulatory arbitrages in a CVaR-based regime is provided in Koch-Medina and Munari (2016). A desirable property for a risk measure is elicitability, which allows one to compare competing forecasting methods, a property that VaR does have (see Gneiting, 2011). The lack of elicitability for CVaR has been addressed via joint elicitability, a concept formalised in Fissler and Ziegel (2016), but flagged earlier by Acerbi and Szekely (2014). Robustness properties of a risk measure are also of great interest since they imply that the estimate is insensitive to

http://dx.doi.org/10.1016/j.ejor.2017.04.029



data contamination. Parameter risk (uncertainty in parameter estimation) and model risk (uncertainty in model selection) are the two main sources of uncertainty in modelling. Robust statistics has its roots in the papers of Huber (1964) and Hampel (1968), an approach which has been shown to be less appropriate in the context of risk management (see for example, Cont, Deguest, & Scandolo, 2010). A more detailed discussion is deferred to the next section due to its length. Finally, a summary of all properties exhibited by the two risk measures is detailed in the comprehensive work of Emmer, Kratz, and Tasche (2015), but the general conclusion is that there is no evidence for a global advantage of one risk measure over the other.

Whenever the model and parameter risks are present, it is prudent to consider insurance contracts that are optimal under a set of plausible models and this is precisely what robust optimisation does. It is a vast area of research with applications in various fields and a standard reference is Ben-Tal, El Ghaoui, and Nemirovski (2009), while comprehensive surveys can be found in Ben-Tal and Nemirovski (2008), Bertsimas, Brown, and Caramanis (2011) and Gabrel, Murat, and Thiele (2014).

The aim of the paper is to identify the optimal insurance contract under model/parameter risk in the robust optimisation sense and to understand how robust these solutions are from the practical point of view. That is, we aim to explain how large the uncertainty set should be for relatively small or medium sized historical data sets, as is expected in insurance practice. At the same time, since the insurance contract is in fact a risk allocation, it is of great interest to find whether or not our robust contracts are Pareto optimal. Robust optimisation may lead to inefficient risk allocations, i.e. not Pareto optimal, which are clearly not acceptable, and special attention is given to this issue by providing a simple methodology to overcome such caveats of robust optimisation. Our numerical illustrations have shown weak evidence in favour of our robust solutions for VaR-based decisions, which is not surprising due to the erratic behaviour of VaR. On the contrary, CVaR-based decisions are made more robust via robust optimisation than by using statistical methods, which can be explained by the fact that CVaR takes into account some part of the tail risk, as opposed to VaR. Either worst-case scenario or worst-case regret robust optimisation is preferred (compared to the classical statistical methods) for less (statistically) robust risk measures that are purely tail risk measures, where the estimation is based on a small portion of the sample that explains only the tail risk. We also find that the worst-case optimisation is once again advantageous even for risk measures that are sensitive to the entire sample, i.e. are not based only on the tail risk.

The structure of the paper is as follows: the next section contains the necessary background and the mathematical formulation of our problems, while Sections 3 and 4 investigate the VaR and CVaR-based optimal insurance contracts, but also discuss simple extensions for distortion risk measures when the moral hazard is removed; these robust solutions are further investigated in Section 5 so that they also become Pareto optimal; extensive numerical examples are elaborated in Section 6, which help in justifying our conclusions summarised in Section 7.

2. Background and problem definition

2.1. Optimal insurance

An insurance contract represents a risk transfer between two parties, insurance buyer (or simply buyer) and insurance seller (or simply seller). When the buyer is also an insurance company, then the transfer becomes a reinsurance contract and the seller is called reinsurer. Let X > 0 be the total amount that the buyer is liable to pay in the absence of any risk transfer. In addition, the

seller agrees to pay R[X], the amount by which the entire loss exceeds the buyer's retained amount, I[X], and clearly we have I[X] + R[X] = X. The most common risk transfers are the Proportional and Stop-loss contracts, for which I[X] = cX (with 0 ≤ c ≤ 1) and I[X] = min{X, M}, respectively. Note that in order to avoid moral hazard issues (both players are incentivised to reduce the overall risk, i.e. I and R are non-decreasing functions), I, R ∈ C^co, where

C^co = { f non-decreasing | 0 ≤ f(x) ≤ x, |f(x) − f(y)| ≤ |x − y| for all x, y ∈ ℝ }.

The comonotone risk transfers (as defined above) are omnipresent in practice, but this is not always the case, and the mathematical formulation of the feasibility set then becomes

C = { f | 0 ≤ f(x) ≤ x for all x ∈ ℝ }.

Let P be the insurance premium, and it is further assumed that any feasible contract satisfies 0 ≤ P ≤ P̄, where P̄ represents the maximal amount of premium that the buyer would accept to pay. If the loss distribution is known, then premium calculations are possible via certain rules, known as premium principles. A concise review of premium principles can be found in Young (2004). Specifically, if P is the probability measure for X, then P ≥ ω0 + (1 + θ)H_P(R[X]), where ω0 ≥ 0 represents some fixed/administrative costs, θ ≥ 0 is the risk loading parameter/factor, and H is a monotone functional on the space of non-negative random variables that depends on the seller's choice of premium principle. The monotonicity property is of practical importance and it means that if two random losses satisfy Y ≤ Z, then H_P(Y) ≤ H_P(Z). A commonly encountered premium principle is the distortion premium principle (see Wang, Young, & Panjer, 1997),

H_P(Y) = ∫_0^∞ g(P(Y > y)) dy (2.1)

for any non-negative loss random variable Y, where g: [0, 1] → [0, 1] is non-decreasing with g(0) = 0 and g(1) = 1, known as the distortion function. When the distortion function is taken to be the identity function, we obtain the expected value premium principle, which is standard in the insurance industry. The mathematical formulation of the optimal insurance problem becomes

min_{(R,P) ∈ C×ℝ₊} ρ_P(X − R[X] + P)   s.t.  ω0 + (1 + θ)H_P(R[X]) ≤ P ≤ P̄, (2.2)

where ρ_P is a risk measure chosen by the buyer to order its preferences to risk. As explained in Section 1, it is first assumed in this paper that ρ_P ∈ {VaR, CVaR}. Recall that the lower script P indicates the probability measure under which the risk measurement is made. The VaR of a loss variable Y at a confidence level α ∈ (0, 1) is given by VaR_α(Y; P) = inf_{y∈ℝ} {P(Y ≤ y) ≥ α}. Note that VaR_α is representable as in (2.1) with g(t) = 1_{t > 1−α}, where 1_A represents the indicator operator that assigns the value 1 if A is true and 0 otherwise. The CVaR risk measure is defined in Rockafellar and Uryasev (2000) as follows:

CVaR_α(Y; P) = inf_{t∈ℝ} { t + (1/(1−α)) E_P(Y − t)₊ },  where (y)₊ = max(y, 0).

Alternative representations are known in the literature (see for example, Acerbi & Tasche, 2002) and one of them is as in (2.1) with g(t) = min{t/(1 − α), 1}.
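The two CVaR representations above can be cross-checked numerically for a discrete loss. The sketch below (the four-point distribution and the level α are illustrative, not taken from the paper) evaluates the Rockafellar-Uryasev infimum and the distortion integral (2.1) with g(t) = min{t/(1−α), 1}, and confirms that they agree.

```python
import numpy as np

# Numerical cross-check of the two CVaR representations for a discrete loss.
# The four-point distribution and the level alpha below are illustrative.

def distortion_premium(values, probs, g):
    """H_g(Y) = integral_0^inf g(P(Y > y)) dy for a discrete non-negative loss."""
    order = np.argsort(values)
    v = np.asarray(values, float)[order]
    p = np.asarray(probs, float)[order]
    surv = 1.0 - np.cumsum(p)                  # P(Y > v_i)
    pts = np.concatenate(([0.0], v))           # survival is constant on (pts[i], pts[i+1]]
    args = np.concatenate(([1.0], surv))[:-1]  # its value on each such interval
    return float(np.sum(g(args) * np.diff(pts)))

def cvar(values, probs, alpha):
    """Rockafellar-Uryasev: inf_t { t + E(Y - t)_+ / (1 - alpha) }; for a
    discrete law the objective is piecewise linear and convex in t, so the
    infimum is attained at one of the atoms."""
    v = np.asarray(values, float)
    p = np.asarray(probs, float)
    return float(min(t + np.sum(p * np.maximum(v - t, 0.0)) / (1 - alpha) for t in v))

alpha = 0.75
vals, probs = [1.0, 2.0, 3.0, 4.0], [0.1, 0.4, 0.3, 0.2]
g_cvar = lambda t: np.minimum(t / (1 - alpha), 1.0)
print(cvar(vals, probs, alpha))                 # ≈ 3.8
print(distortion_premium(vals, probs, g_cvar))  # ≈ 3.8
```

Both routines return the same value, as the distortion representation predicts.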

Due to the monotonicity property of VaR, CVaR and the functional H, (2.2) becomes much simpler when the economic constraint P ≤ P̄ is removed, and it has been investigated under various sets of assumptions. Recently, Cheung and Lo (2017) included the latter constraint and analytically solved (2.2) for a large class


of premium principles and risk measures, including the class from (2.1).

The existing literature assumes that the loss distribution is known with certainty, and as a result, the parameter and model risks are removed. Small and medium sized samples (present in non-high-frequency data, as is usually the case in insurance) raise many questions when estimating any parameter, even if the model risk is completely removed, i.e. the chosen model is correct. Large samples are more concerned with the model risk, which can be reduced if the model is carefully selected. Thus, if we know what we need to estimate, for example the optimal objective function value from (2.2), for which its closed-form solution is required, the elicitability (see Gneiting, 2011) of this functional (induced by the optimal objective function value) is the next step in order to compare various models and reduce the model risk. While VaR is elicitable, and VaR and CVaR are jointly elicitable, our functional may not be elicitable, or it may be impossible to assess the presence of this property, since one has to find a scoring function to measure the estimation error under the plausible models. Therefore, model selection for VaR/CVaR does not apply to our problem, even though these two simple risk measures are well-accepted as "good" risk measures. Now, even if we can select the "best" possible model that reduces the uncertainty about the optimal objective function value, we in fact solve a secondary problem, since the main purpose of this exercise is to obtain a robust decision (with respect to the insurance contract, i.e. R).

Therefore, it would be interesting to identify a more robust optimal insurance contract that would take into account the parameter and/or model error. Thus, we assume that the reference probability measure P is unknown and could be one of the m possible probability measures {P1 , P2 , . . . , Pm}. Consequently, the premium feasibility constraint becomes

ω0 + (1 + θ)·H_{P_k}(R[X]) ≤ P ≤ P̄, for any k ∈ M = {1, 2, …, m}. (2.3)

A prudent and hopefully robust decision is obtained when investigating the worst-case scenario optimisation problem

min_{(R,P) ∈ C×ℝ₊} max_{k∈M} ρ_{P_k}(X − R[X] + P)
s.t. ω0 + (1 + θ)H_{P_k}(R[X]) ≤ P ≤ P̄, for any k ∈ M. (2.4)

An alternative prudent decision can be achieved via the worst-case regret optimisation problem

min_{(R,P) ∈ C×ℝ₊} max_{k∈M} { ρ_{P_k}(X − R[X] + P) − ρ*_k }
s.t. ω0 + (1 + θ)H_{P_k}(R[X]) ≤ P ≤ P̄, for any k ∈ M, (2.5)

where the buyer's "regret" is measured with respect to some m benchmark values ρ*_1, …, ρ*_m. Naturally, these values are the optimal objective values for the individual models and are variants of (2.2). Specifically,

ρ*_k = min_{(R,P) ∈ C×ℝ₊} ρ_{P_k}(X − R[X] + P)
s.t. ω0 + (1 + θ)H_{P_k}(R[X]) ≤ P ≤ P̄, for any k ∈ M. (2.6)

These robust representations have been seen before in various guises. The worst-case type decisions were axiomatically investigated by Gilboa and Schmeidler (1989) in the expected utility context. Not surprisingly, the robust optimisation within Portfolio Theory has its counterpart; among others, see El Ghaoui, Oks, and Oustry (2003), Zhu and Fukushima (2009), Polak, Rogers, and Sweeney (2010), Zymler, Kuhn, and Rustem (2013) and Kakouris and Rustem (2014). The worst-case and worst-case regret CVaR-based decisions in portfolio optimisation are discussed in Huang, Zhu, Fabozzi, and Fukushima (2010). To the best of our knowledge, the optimal insurance contract problem under parameter/model uncertainty has been investigated only by Balbas, Balbas, Balbas, and Heras (2015), where only the worst-case is investigated for a large class of risk measures that includes CVaR, but not VaR, and for a particular choice of the uncertainty set of probability measures.

We now discuss the choice of the feasibility set, i.e. C or C^co. Note that whenever the risk transfer is made between two large insurance companies, moral hazard may not be an issue, due to the presence of rating agencies; a rating downgrade has a huge negative commercial impact for such insurance companies and thus, moral hazard is less likely to occur. One may also argue that a risk transfer within an insurance group does not necessarily have to exclude moral hazard, due to the common ownership of the buyer and seller. Nevertheless, the insurance regulator requires the insurance buyer to justify the commercial purpose of such a risk transfer. In the absence of distributional uncertainty, there is a huge literature that discusses whether or not the indemnity of an insurance contract should be comonotone, but in general, the conclusion depends on the nature of the underwritten risk. On the other hand, the classical Pareto optimality problem explains the shape of an "optimal" contract and the extensive existing literature discusses how viable the comonotonic property is; an interesting discussion appears in Huberman, Mayers, and Smith (1983). Optimal transfers are shown to be comonotone (for a large class of risk preferences) in Landsberger and Meilijson (1994) if the total risk is finite, while Ludkovski and Rüschendorf (2008) extend this result to unbounded risks. In summary, choosing between the sets of feasible contracts given by C or C^co is related to the specific nature of the total risk that is shared and the insurance players' risk preferences whenever the total risk distribution is known. In the presence of distributional uncertainty, the choice of feasibility set is sensitive to the nature of the total risk. Therefore, solutions to Problems (2.4)-(2.6) are given over the non-comonotone contract set C whenever possible; otherwise, the comonotone contract set C^co is chosen.
Recall that we do not intend to characterise the optimal contract, but instead we examine when our proposed robust methods reduce the effect of distributional uncertainty.

Note that the feasible sets of Problems (2.4)-(2.6) are empty if ω0 > P̄. We now gather the set of assumptions, stated as Assumption 2.1, under which the results of the paper hold.

Assumption 2.1. We consider m possible probability models {P_1, …, P_m} and the reference probability model P may or may not belong to this set. Denote M = {1, …, m}. Let X ≥ 0 be a loss random variable and denote by F_k(·) = P_k(X ≤ ·), k ∈ M, its cumulative distribution function (cdf) under P_k, in which case we write X ~ P_k, and by F̄_k(·) = 1 − F_k(·) its corresponding survival function. Moreover, ω0 ≤ P̄. The premium principle is based on a monotone functional H.

2.2. Robustness of risk measures

In the last few years there has been a wide and open debate on the robustness properties of VaR and CVaR, with relevant contributions from regulators, practitioners and academics. These risk measures, which we denote for brevity by ρ, depend on the probability model P used. The key question is whether a small perturbation of the probability model P results in a small perturbation of ρ_P, which is detailed in the next definition.

Definition 2.1. Let X_n, n ≥ 1 be a sequence of random variables with distributions P_n, n ≥ 1, and X a random variable with distribution P. A risk measure ρ_P(X) is (statistically) robust at P if lim_{n→∞} d(P_n, P) = 0 implies lim_{n→∞} |ρ_{P_n}(X_n) − ρ_P(X)| = 0 for some distance d between probability measures.


Different specifications of the metric d correspond to different notions of robustness. For instance, Kiesel, Rühlicke, Stahl, and Zheng (2016) consider the Wasserstein distance

d_W(P, Q) := inf{ E[|X − Y|] : X ~ P, Y ~ Q },

under which both VaR and CVaR are robust. Cont et al. (2010) use the Lévy distance and show that there is a partial conflict between coherent risk measures (including CVaR) and Hampel's classical notion of robustness, while Krätschmer, Schied, and Zähle (2014) and Delbaen, Bellini, Bignozzi, and Ziegel (2016) consider continuity with respect to the ψ-weak topology. We refer the interested reader to Emmer et al. (2015) for a brief summary on the topic. Statistical robustness is particularly relevant when the probability measure is estimated from available data; indeed, if the estimated probability measure P_n is sufficiently close to the real one (that is, d(P_n, P) → 0) and the risk measure is robust, then ρ_{P_n} can be considered a good approximation of ρ_P.

Due to data scarcity, as is often the case in practice, estimates based on the empirical measure exhibit weak statistical evidence, and alternative methods need to be considered. For example, a more conservative approach is to consider a robustified risk measure ρ̄ defined as follows:

ρ̄(X) = sup_{P∈S} ρ_P(X), (2.7)

where ρ_P(X) represents the risk measure for the random loss X with probability distribution P and S is a set of candidate models. This approach is not new in the literature; it is at the basis of decision making under ambiguity (that is, when there is uncertainty about the probability distribution). The simple idea of this approach is that when there is ambiguity between different models, a conservative and therefore robust approach is to select the one that represents the worst scenario. In Assumption 2.1, we assume that the real probability model P may not belong to the set S. Indeed, since P is unknown, we cannot guarantee that it belongs to the set of models considered. Note that taking the supremum over a set of models reduces the impact of model risk, but it cannot eliminate it completely.

The specification of the set S plays a crucial role in the worst-case approach and, in general, is a difficult task. Clearly, selecting a wide set increases the chances of including the real model P and makes the risk measure more conservative; on the other hand, if S is too large, ρ̄ might become unrealistic. Several choices have been considered in the literature, with different interpretations. In this contribution we assume S = {P_1, …, P_m}, that is, we consider a finite set of probability measures. This choice is rather frequent in a context of model ambiguity and it also has the benefit of making the problem mathematically tractable. A finite set of models is typical in situations where there is not enough evidence from data to select a model and the specification of S is left to expert opinion. In the context of measuring market risk, the Standard Portfolio Analysis of Risk (SPAN, 1995) methodology proposed by the Chicago Mercantile Exchange provides an example of a finite S consisting of 16 probability measures, obtained by combining up and down movements of the volatility with up, down or no movement of the futures prices (see Artzner et al., 1999, Section 3.2 for a detailed description of how these scenarios are built). To reduce the impact of model risk in option pricing, Cont (2006) presents a worst-case approach over two probability measures: one providing a jump-diffusion model and the other a simpler diffusion model, see Example 4.4 in his paper. In the insurance framework, an example of a finite set S is obtained by considering the set of different catastrophe models provided by Cat modelling agencies; the insurer then has to take a robust decision with respect to these models, see for instance Calder, Couper, and Lo (2012).

A valid alternative for S is to consider the convex hull of {P_1, …, P_m},

S′ := { P : λ ≥ 0, 1ᵀλ = 1 and P(·) = Σ_{k∈M} λ_k P_k(·) },

which is precisely what Zhu and Fukushima (2009) consider when ρ = CVaR. It is shown that

WCVaR_α(X) := sup_{P∈S′} CVaR_α(X; P) = min_{t∈ℝ} max_{k∈M} { t + (1/(1−α)) E_{P_k}(X − t)₊ }, (2.8)

where E_{P_k}(·) is the expectation with respect to P_k. Clearly,

max_{k∈M} ρ_{P_k}(X) ≤ sup_{P∈S′} ρ(X; P) (2.9)

holds for any risk measure. Proposition 2.1 shows that the two "worst-case" definitions are identical if ρ = VaR, and it is followed by an example showing that the above inequality may hold strictly if ρ = CVaR.
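For discrete models, the min-max representation (2.8) can be recast as a small linear programme, which is one way the LP formulations mentioned in the introduction arise. The sketch below (a minimal implementation; the two model probability vectors are hypothetical) solves it with SciPy; by (2.9), the reported value can only match or exceed the largest single-model CVaR.

```python
import numpy as np
from scipy.optimize import linprog

def wcvar_lp(values, prob_models, alpha):
    """Worst-case CVaR over the convex hull of finitely many discrete models,
    via (2.8) recast as an LP in the variables (t, u_1, ..., u_n, z):
        min z   s.t.  t + sum_i p_{k,i} u_i / (1 - alpha) <= z   for each model k,
                      u_i >= values_i - t,  u_i >= 0.
    """
    v = np.asarray(values, float)
    n = len(v)
    c = np.zeros(n + 2)
    c[-1] = 1.0                                   # minimise z
    A_ub, b_ub = [], []
    for p in prob_models:                         # t + p.u/(1-alpha) - z <= 0
        A_ub.append(np.concatenate(([1.0], np.asarray(p, float) / (1 - alpha), [-1.0])))
        b_ub.append(0.0)
    for i in range(n):                            # v_i - t <= u_i
        row = np.zeros(n + 2)
        row[0], row[1 + i] = -1.0, -1.0
        A_ub.append(row)
        b_ub.append(-v[i])
    bounds = [(None, None)] + [(0.0, None)] * n + [(None, None)]
    res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub, bounds=bounds)
    assert res.status == 0
    return float(res.fun)

# hypothetical two-model illustration
vals = [1.0, 2.0, 3.0, 4.0]
models = [[0.1, 0.4, 0.3, 0.2], [0.4, 0.1, 0.2, 0.3]]
print(wcvar_lp(vals, models, alpha=2/3))   # ≈ 3.9, the larger of the two single-model CVaRs
```

For these particular models the bound (2.9) happens to be tight; Example 2.1 below exhibits a strict gap.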

Proposition 2.1. Let {P_1, …, P_m} be a set of candidate models. Then,

max_{k∈M} VaR_α(X; P_k) = sup_{P∈S′} VaR_α(X; P).

Proof. Without loss of generality, we may assume that m = 2. It is well-known that VaR has convex level sets, i.e. if two probability models P_1, P_2 are such that VaR_α(X; P_1) = VaR_α(X; P_2), then VaR_α(X; λP_1 + (1 − λ)P_2) = VaR_α(X; P_1) for any λ ∈ [0, 1] (see Gneiting, 2011). Further, VaR is monotone and translation invariant (see properties (a) and (b) from Section 2.3) and therefore, we can apply Lemma 2.2 in Bellini and Bignozzi (2015) to obtain that VaR is quasilinear. That is,

min_{k∈{1,2}} VaR_α(X; P_k) ≤ VaR_α(X; λP_1 + (1 − λ)P_2) ≤ max_{k∈{1,2}} VaR_α(X; P_k),

which in turn implies that

sup_{P∈S′} VaR_α(X; P) ≤ max_{k∈{1,2}} VaR_α(X; P_k).

The latter and (2.9) conclude the proof. □

As has been anticipated, the same result does not hold for CVaR. Indeed, the following example shows that max_{k∈M} CVaR_α(X; P_k) < sup_{P∈S′} CVaR_α(X; P) may hold.

Example 2.1. Consider a discrete random variable X which takes only four values, i.e. {1, 2, 3, 4}. We only consider two possible probability models, P_1 and P_2, such that

P_1(X = 1) = 0, P_1(X = 2) = 1/2, P_1(X = 3) = 1/6, P_1(X = 4) = 1/3,
P_2(X = 1) = 1/2, P_2(X = 2) = 0, P_2(X = 3) = 0, P_2(X = 4) = 1/2.

It is not difficult to find that CVaR_{1/3}(X; P_1) = CVaR_{1/3}(X; P_2) = 13/4. Let P_0 = ½P_1 + ½P_2, i.e.

P_0(X = 1) = 1/4, P_0(X = 2) = 1/4, P_0(X = 3) = 1/12, P_0(X = 4) = 5/12,

which is an element of S′. Clearly, CVaR_{1/3}(X; P_0) = 27/8, which justifies our claim as follows:

13/4 = max_{k∈{1,2}} CVaR_{1/3}(X; P_k) < CVaR_{1/3}(X; P_0) ≤ sup_{P∈S′} CVaR_{1/3}(X; P).
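The figures in Example 2.1 are easy to verify numerically; the sketch below restates the example's probabilities and computes each CVaR directly as the average of the worst 1 − α probability mass.

```python
import numpy as np

# Verifies the CVaR values quoted in Example 2.1 (level alpha = 1/3).
def cvar_discrete(values, probs, alpha):
    """CVaR_alpha computed as the average of the worst (1 - alpha) probability mass."""
    v = np.asarray(values, float)
    p = np.asarray(probs, float)
    order = np.argsort(-v)                     # largest losses first
    v, p = v[order], p[order]
    tail = 1.0 - alpha
    prev = np.concatenate(([0.0], np.cumsum(p)[:-1]))
    take = np.minimum(p, np.maximum(tail - prev, 0.0))  # mass drawn into the tail
    return float(np.sum(take * v) / tail)

vals = [1.0, 2.0, 3.0, 4.0]
P1 = [0.0, 1 / 2, 1 / 6, 1 / 3]
P2 = [1 / 2, 0.0, 0.0, 1 / 2]
P0 = [(a + b) / 2 for a, b in zip(P1, P2)]
alpha = 1 / 3
print(cvar_discrete(vals, P1, alpha))   # ≈ 3.25  (= 13/4)
print(cvar_discrete(vals, P2, alpha))   # ≈ 3.25
print(cvar_discrete(vals, P0, alpha))   # ≈ 3.375 (= 27/8)
```

The mixture P_0 strictly exceeds both individual models, confirming that the inequality in (2.9) can be strict for CVaR.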


2.3. Properties of the worst-case risk measure

Some properties of worst-case risk measures have been briefly discussed in Zhu and Fukushima (2009) and therefore, we further outline the main traits of this class. We now restate some of the properties often satisfied by a risk measure and examine whether its worst-case counterpart preserves them. Thus,

(a) Monotonicity: ρ_P(X) ≤ ρ_P(Y) holds if P(X ≤ Y) = 1.
(b) Translation invariance: ρ_P(X − m) = ρ_P(X) − m holds for any m ∈ ℝ.
(c) Positive homogeneity: ρ_P(λX) = λρ_P(X) holds for any λ ≥ 0.
(d) Subadditivity: ρ_P(X + Y) ≤ ρ_P(X) + ρ_P(Y).
(e) Convexity: ρ_P(βX + (1 − β)Y) ≤ βρ_P(X) + (1 − β)ρ_P(Y) holds for any β ∈ (0, 1).
(f) Comonotonic additivity: If X, Y are comonotone, then ρ_P(X + Y) = ρ_P(X) + ρ_P(Y).
(g) Comonotonic subadditivity: If X, Y are comonotone, then ρ_P(X + Y) ≤ ρ_P(X) + ρ_P(Y),

where X, Y ∈ 𝒳. By definition, X and Y are comonotone if

(X(ω) − X(ω′))(Y(ω) − Y(ω′)) ≥ 0 for any (ω, ω′) ∈ Ω × Ω.

It is well-known that VaR satisfies properties (a)-(c) and (f)-(g), while CVaR fulfils all properties (a)-(g). Proposition 1 in Zhu and Fukushima (2009) also shows that ρ̄(X) = sup_{k∈M} ρ_{P_k}(X) satisfies (a)-(e) if ρ satisfies (a)-(e). Properties (f) and (g) are detailed in Proposition 2.2.

Proposition 2.2. Let {P_1, …, P_m} be a set of candidate models and ρ be a risk measure that satisfies property (f) (or (g)). Then, ρ̄(X) = sup_{k∈M} ρ_{P_k}(X) satisfies (g).

Proof. Let X and Y be comonotone and assume ρ is comonotonic additive; then

ρ̄(X + Y) = sup_{k∈M} ρ(X + Y; P_k) = sup_{k∈M} { ρ(X; P_k) + ρ(Y; P_k) } ≤ ρ̄(X) + ρ̄(Y).

If ρ is comonotonic subadditive, then it is sufficient to replace the second equality from above with an inequality (≤). □

Relaxing the assumption of properties (f)-(g) is rather common in a model uncertainty setting, where no pre-specified reference probability measure is available (for example, see Song & Yan, 2009). The next example illustrates that C̄VaR_α(X) := sup_{k∈M} CVaR_α(X; P_k) may be strictly comonotonic subadditive, i.e.

C̄VaR_α(X + Y) < C̄VaR_α(X) + C̄VaR_α(Y).

Example 2.2. Consider a discrete random variable X which takes only three values, i.e. {2, 3, 4}, and note that X and X² are comonotone random variables. We only consider two possible probability models {P_1, P_2} such that

P_1(X = 2) = 3/4, P_1(X = 3) = 1/6, P_1(X = 4) = 1/12,
P_2(X = 2) = 0.80, P_2(X = 3) = 0.08, P_2(X = 4) = 0.12.

We compute CVaR at level α = 2/3. It is not difficult to find that

CVaR_{2/3}(X; P_1) = 3, CVaR_{2/3}(X²; P_1) = 9.5, CVaR_{2/3}(X + X²; P_1) = 12.5,
CVaR_{2/3}(X; P_2) = 2.96, CVaR_{2/3}(X²; P_2) = 9.52, CVaR_{2/3}(X + X²; P_2) = 12.48,

and thus, C̄VaR_{2/3}(X + X²) = 12.5 < 12.52 = C̄VaR_{2/3}(X) + C̄VaR_{2/3}(X²), as previously claimed.
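The CVaR figures of Example 2.2 can be reproduced with the same discrete tail-average computation; the sketch below restates the model probabilities and confirms both the model-by-model comonotonic additivity and the strict subadditivity of the worst-case CVaR.

```python
import numpy as np

# Reproduces the CVaR figures of Example 2.2 at level alpha = 2/3.
def cvar_discrete(values, probs, alpha):
    """CVaR_alpha computed as the average of the worst (1 - alpha) probability mass."""
    v = np.asarray(values, float)
    p = np.asarray(probs, float)
    order = np.argsort(-v)                     # largest losses first
    v, p = v[order], p[order]
    tail = 1.0 - alpha
    prev = np.concatenate(([0.0], np.cumsum(p)[:-1]))
    take = np.minimum(p, np.maximum(tail - prev, 0.0))  # mass drawn into the tail
    return float(np.sum(take * v) / tail)

a = 2 / 3
x = np.array([2.0, 3.0, 4.0])
P1 = [3 / 4, 1 / 6, 1 / 12]
P2 = [0.80, 0.08, 0.12]
for P in (P1, P2):
    print([cvar_discrete(v, P, a) for v in (x, x**2, x + x**2)])
# model by model, comonotonic additivity holds: 3 + 9.5 = 12.5 and 2.96 + 9.52 = 12.48
wc = lambda v: max(cvar_discrete(v, P, a) for P in (P1, P2))
print(wc(x + x**2), wc(x) + wc(x**2))   # the worst-case CVaR is strictly subadditive here
```

The worst case picks different models for X + X² and for the two summands, which is exactly what breaks additivity.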

3. VaR robust optimisation

In this section, we solve the worst-case scenario optimisation problem (2.4) and the worst-case regret optimisation problem (2.5) under the C^co × ℝ₊ feasibility set, when the risk measure ρ is VaR. Note that

VaR_α(X − R[X]; P) = VaR_α(X; P) − R(VaR_α(X; P)), (3.1)

for any R ∈ C^co. For brevity, we denote a_k = VaR_α(X; P_k).

3.1. Worst-case scenario VaR optimisation problem

We first observe that the objective function is increasing and continuous in P, and any feasible premium P is bounded below by ω0 + (1 + θ) max_{k∈M} H_{P_k}(R[X]) for any fixed R ∈ C^co. The latter leads us to define the following subset of C^co, which essentially puts an upper bound on the set of feasible contracts:

C′ = { R ∈ C^co | ω0 + (1 + θ) max_{k∈M} H_{P_k}(R[X]) ≤ P̄ }.

Eq. (3.1) helps in justifying the next lemma.

Lemma 3.1. If Assumption 2.1 holds with ρ = VaR_α, then any contract R ∈ C^co is feasible for Problem (2.4) with the feasibility set C^co × ℝ₊ if and only if R ∈ C′. Further, for any fixed R ∈ C′, the optimal premium is given by P*_R = ω0 + (1 + θ) max_{k∈M} H_{P_k}(R[X]) and the optimisation problem from (2.4) is equivalent to

min_{R∈C′} { P*_R + max_{k∈M} (a_k − R[a_k]) }. (3.2)


Define now a* = max_{k∈M} a_k. Since R ∈ C^co, the map x ↦ x − R[x] is non-decreasing, and thus

max_{k∈M} (a_k − R[a_k]) = a* − R[a*] for all R ∈ C^co.

Hence, Problem (3.2) becomes min_{R∈C′} { P*_R + a* − R[a*] }. By stratifying the set C′ of feasible contracts according to the values ξ = R[a*], this optimisation problem can be decomposed into a two-step minimisation problem:

min_{0≤ξ≤a*} ( a* − ξ + min_{R∈C′_ξ} P*_R ), where C′_ξ = { R ∈ C′ | R[a*] = ξ } for any 0 ≤ ξ ≤ a*. (3.3)

Due to the presence of the premium constraint P*_R ≤ P̄, the set C′_ξ could be empty if ξ is too large. The next result gives the effective range of ξ for the outer minimisation of Problem (3.3). The proof relies on the simple observation that the insurance layer contract

R_ξ[X] = (X − a* + ξ)₊ − (X − a*)₊

belongs to C^co with R_ξ[a*] = ξ, and on the fact that this contract is minimal in the following sense:

R_ξ[X] ≤ R[X] for all R ∈ C^co with R[a*] = ξ.

Lemma 3.2. If Assumption 2.1 holds with ρ = VaR_α, then for any ξ ∈ [0, a*], the set C′_ξ is non-empty if and only if

0 ≤ ξ ≤ Δ := max { ξ ≤ a* | ω0 + (1 + θ) max_{k∈M} H_{P_k}(R_ξ[X]) ≤ P̄ }.

Proof. If 0 ≤ ξ ≤ Δ, the contract R_ξ belongs to C′_ξ by construction. To prove the converse, suppose that there exists a contract R ∈ C′_ξ with ξ > Δ. Since R[X] ≥ R_ξ[X], we have

ω0 + (1 + θ) max_{k∈M} H_{P_k}(R[X]) ≥ ω0 + (1 + θ) max_{k∈M} H_{P_k}(R_ξ[X]) > P̄,

which contradicts the definition of C′. □


Recall that the premium principle H is monotone. By Lemma 3.2 and the minimality of the insurance layer contract R_ξ in the set C′_ξ for 0 ≤ ξ ≤ Δ, the inner minimisation of Problem (3.3) is solved by the contract R_ξ whenever 0 ≤ ξ ≤ Δ. Therefore, it remains to obtain the optimal value of ξ for the outer minimisation, which is essentially a one-dimensional problem. We summarise our findings for the worst-case scenario VaR optimisation problem in the next theorem.

Theorem 3.1. If Assumption 2.1 holds with ρ = VaR_α, then the solution (R*, P*) of Problem (2.4), assumed to be solved over the set C^co × ℜ, is given by

R*[X] = R_{ξ*}[X] = (X − a* + ξ*)_+ − (X − a*)_+ and P* = ω_0 + (1+θ) max_{k∈M} H_{p_k}(R_{ξ*}[X]),

where ξ* is a solution of

min_{0≤ξ≤Λ} { a* − ξ + ω_0 + (1+θ) max_{k∈M} H_{p_k}(R_ξ[X]) }.   (3.4)

Moreover, the optimal objective value is a* − ξ* + P*.
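Since the outer minimisation in (3.4) is one-dimensional, it can be approximated numerically by a simple grid search over ξ once the candidate models are discretised. The sketch below assumes an expected-value premium principle (so each H_{p_k} is a plain expectation under model k) and uses invented loss supports and probability vectors; none of the numbers come from the paper.

```python
import numpy as np

def layer(x, a_star, xi):
    """Layer contract R_xi[X] = (X - a* + xi)_+ - (X - a*)_+."""
    return np.clip(x - (a_star - xi), 0.0, None) - np.clip(x - a_star, 0.0, None)

def worst_case_var_layer(x, probs, a_star, theta=0.25, omega0=0.0,
                         P_bar=np.inf, n_grid=200):
    """Grid search for the optimal retention level xi* in (3.4).

    x     : sorted support points of the discretised loss
    probs : list of probability vectors, one per candidate model
    Returns (xi*, objective value, premium)."""
    best = (None, np.inf, None)
    for xi in np.linspace(0.0, a_star, n_grid):
        r = layer(x, a_star, xi)
        # worst-case expected-value premium over all candidate models
        prem = omega0 + (1.0 + theta) * max(p @ r for p in probs)
        if prem > P_bar:            # premium is non-decreasing in xi, so
            break                   # xi is beyond the effective range Lambda
        obj = a_star - xi + prem    # objective of (3.4)
        if obj < best[1]:
            best = (xi, obj, prem)
    return best

x = np.linspace(0.0, 10.0, 101)              # hypothetical loss support
p1 = np.full(101, 1.0 / 101)                 # candidate model 1: uniform
p2 = np.exp(-0.5 * x); p2 /= p2.sum()        # candidate model 2: light-tailed
xi_star, obj, prem = worst_case_var_layer(x, [p1, p2], a_star=8.0)
```

A grid search is adequate here because G(ξ) is convex on [0, Λ], as shown next; a bisection on the directional derivative of Lemma 3.3 would also work.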

Remark 3.1. The solution of (3.4) is unique as long as g and all F_k are strictly increasing functions.

In the rest of this section, we demonstrate how Problem (3.4) can be solved rather explicitly whenever H is a distortion premium principle as given in (2.1). Note that g is non-decreasing, and thus H satisfies the comonotonic additivity property (for details, see Dhaene, Kukush, Linders, & Tang, 2012). Consequently,

H_{p_k}(R_ξ[X]) = H_{p_k}((X − a* + ξ)_+) − H_{p_k}((X − a*)_+) = ∫_{a*−ξ}^{a*} g(F_k(t)) dt.

Also, the above function is convex in ξ ∈ [0, a*] for any k ∈ M, since g is non-decreasing. Thus,

G(ξ) := a* − ξ + ω_0 + (1+θ) max_{k∈M} H_{p_k}(R_ξ[X])

is convex in ξ ∈ [0, Λ]. Therefore, Problem (3.4) can be solved by examining the directional derivatives of G. To this end, we define the directional derivative of an arbitrary convex function H at ξ along the direction d ∈ ℜ, if it exists, as

H′(ξ; d) = lim_{t↓0} [H(ξ + td) − H(ξ)] / t,

which is positively homogeneous in d ∈ ℜ. The right-hand and left-hand derivatives of H can be expressed as H′_+(ξ) = H′(ξ; 1) and H′_−(ξ) = −H′(ξ; −1), respectively. A point ξ* ∈ [0, Λ] is optimal for Problem (3.4) if and only if it satisfies the first-order condition G′(ξ*; ξ − ξ*) ≥ 0 for all ξ ∈ [0, Λ]. The next lemma provides a simplified expression for the directional derivative of G.

Lemma 3.3. Assume that H satisfies (2.1) and denote

𝒜(ξ) = { i ∈ M | H_{p_i}(R_ξ[X]) = max_{k∈M} H_{p_k}(R_ξ[X]) },

for any 0 ≤ ξ ≤ Λ. The directional derivative of G at ξ along the direction d ∈ ℜ is given by

G′(ξ; d) = d { −1 + (1+θ) max_{i∈𝒜(ξ)} g(F_i((a* − ξ)−)) },  if d ≥ 0,
G′(ξ; d) = d { −1 + (1+θ) min_{i∈𝒜(ξ)} g(F_i(a* − ξ)) },   if d < 0.

Proof. For each k ∈ M, let

g_k(ξ) = ∫_{a*−ξ}^{∞} g(F_k(t)) dt − H_{p_k}((X − a*)_+),  ξ ∈ [0, Λ].

Its right-hand and left-hand derivatives at ξ are given by

g_k′(ξ; 1) = g(F_k((a* − ξ)−)) and −g_k′(ξ; −1) = g(F_k(a* − ξ)),

respectively. Therefore, the directional derivative of g_k at ξ ∈ [0, Λ] along the direction d ∈ ℜ equals

g_k′(ξ; d) = d g(F_k((a* − ξ)−)),  if d ≥ 0, and g_k′(ξ; d) = d g(F_k(a* − ξ)),  if d < 0.

Our claim follows from the classical Danskin's Theorem (see, for example, Corollary 1.30 of Güler, 2010), which asserts that

( max_{k∈M} g_k )′(ξ; d) = max_{i∈𝒜(ξ)} g_i′(ξ; d),  d ∈ ℜ.

The proof is now complete. □

3.2. Worst-case regret VaR optimisation problem

We turn our attention to the worst-case regret VaR optimisation from (2.5). Since we are no longer able to use a similar argumentation as in the previous subsection, the usual approach in the existing literature is to assume a discretely distributed X. That is, X ∈ {x_1, ..., x_n}, where without loss of generality it can be assumed that x_1 < ··· < x_n. Let us denote p_ik = P_k(X = x_i). Clearly, p_k ≥ 0 and 1^T p_k = 1 for all k ∈ M, where 0 and 1 are the n-dimensional column vectors of zeroes and ones, respectively. By convention, inequality and equality between two vectors are understood componentwise. Denote R[x_i] = y_i; if R ∈ C^co, then we should have

0 ≤ y_i ≤ x_i and 0 ≤ y_i − y_{i−1} ≤ x_i − x_{i−1}, for all 1 ≤ i ≤ n,

where by convention y_0 = x_0 = 0. The above can be rewritten as 0 ≤ y ≤ x and 0 ≤ Ay ≤ Ax, with the (n−1) × n first-difference matrix

A = [ −1  1  0  ···  0  0
       0 −1  1  ···  0  0
       ⋮                ⋮
       0  0  0  ··· −1  1 ].

Since x − y is increasingly ordered, then

VaR_α(X − R[X]) = x_{π(k)} − y_{π(k)}, where π(k) = min{ 1 ≤ i ≤ n | Σ_{j=1}^{i} p_jk ≥ α }.

In order to make our optimisation problems tractable, we assume that H satisfies (2.1). Thus,

H_{p_k}(R[X]) = π_k^T y,

as a result of Dhaene et al. (2012). Specifically, if g is a left-continuous function, then

π_ik = g( Σ_{j=i}^{n} p_jk ) − g( Σ_{j=i+1}^{n} p_jk ),  1 ≤ i ≤ n, k ∈ M,   (3.5)

where the empty sum (for i = n) is zero by convention.

Consequently, Problem (2.5) is an LP and we state this result as Proposition 3.1.
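The distortion weights defined above are easy to compute from a probability vector and a distortion function. The sketch below uses the CVaR distortion g(u) = min(u/(1−α), 1) as an illustrative choice (any non-decreasing g with g(0)=0, g(1)=1 would do) and checks the comonotonic-additive representation H = π^T y for an increasingly ordered payoff vector; all data are invented.

```python
import numpy as np

def distortion_weights(p, g):
    """pi_i = g(sum_{j>=i} p_j) - g(sum_{j>=i+1} p_j), for x_1 < ... < x_n."""
    # tail[i] = sum_{j>=i} p_j, with tail[n] = 0 by convention
    tail = np.concatenate([np.cumsum(p[::-1])[::-1], [0.0]])
    return g(tail[:-1]) - g(tail[1:])

alpha = 0.75
g = lambda u: np.minimum(u / (1.0 - alpha), 1.0)   # CVaR distortion (assumption)
p = np.array([0.4, 0.3, 0.2, 0.1])                 # invented model probabilities
y = np.array([0.0, 1.0, 2.5, 4.0])                 # increasingly ordered payoffs
pi = distortion_weights(p, g)
H = pi @ y                                         # distortion premium of y: 3.1
```

As a sanity check, for this g the value π^T y coincides with CVaR_{0.75} of the payoff computed directly from its distribution, here 3.1.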

Proposition 3.1. Let Assumption 2.1 hold with ρ = VaR_α. If X is a discrete random variable that takes the values {x_1, ..., x_n} such that x_1 < ··· < x_n and H satisfies (2.1), then solving Problem (2.5) over the set C^co × ℜ is equivalent to solving

  min_{(y,P,r) ∈ ℜ^n × ℜ × ℜ}  r,
  s.t.  x_{π(k)} − y_{π(k)} + P − ρ_k* ≤ r,  k ∈ M,
        ω_0 + (1+θ) π_k^T y ≤ P ≤ P̄,  k ∈ M,
        0 ≤ y ≤ x,  0 ≤ Ay ≤ Ax,

where ρ_k* is the optimal objective value of Problem (2.6). That is,

  ρ_k* = min_{(y,P) ∈ ℜ^n × ℜ}  { x_{π(k)} − y_{π(k)} + P },
  s.t.  ω_0 + (1+θ) π_k^T y ≤ P ≤ P̄,
        0 ≤ y ≤ x,  0 ≤ Ay ≤ Ax.
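The ρ_k* linear programme above can be assembled with a generic LP solver. A minimal sketch using `scipy.optimize.linprog` follows; the loss support, premium weights, loading θ and VaR index are our own illustrative assumptions, and the comonotonicity constraints 0 ≤ Ay ≤ Ax are encoded with a first-difference matrix.

```python
import numpy as np
from scipy.optimize import linprog

def rho_k_star(x, pi_k, var_idx, theta=0.25, omega0=0.0, P_bar=None):
    """Solve rho_k* = min x_{pi(k)} - y_{pi(k)} + P over comonotone contracts.

    x       : increasingly ordered loss support, shape (n,)
    pi_k    : premium weights for model k, shape (n,)
    var_idx : the VaR index pi(k), 0-based here."""
    n = len(x)
    # decision vector z = (y_1, ..., y_n, P)
    c = np.zeros(n + 1)
    c[var_idx], c[n] = -1.0, 1.0
    D = np.diff(np.eye(n), axis=0)               # (n-1) x n difference matrix A
    A_ub = np.vstack([
        np.append((1.0 + theta) * pi_k, -1.0),   # premium principle <= P
        np.hstack([D, np.zeros((n - 1, 1))]),    # Ay <= Ax
        np.hstack([-D, np.zeros((n - 1, 1))]),   # 0 <= Ay
    ])
    b_ub = np.concatenate([[-omega0], D @ x, np.zeros(n - 1)])
    bounds = [(0.0, float(v)) for v in x] + [(0.0, P_bar)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return x[var_idx] + res.fun, res.x[:n], res.x[n]

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])          # invented support
p = np.full(5, 0.2)                              # expected-value premium weights
rho, y_opt, P_opt = rho_k_star(x, p, var_idx=3, P_bar=10.0)
```

The full regret LP of Proposition 3.1 stacks one such regret constraint per candidate model on top of the same feasibility system, with the extra scalar variable r.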

Remark 3.2. Keeping the same set of assumptions as given in Proposition 3.1, solving Problem (2.4) over the set C^co × ℜ is equivalent to solving

  min_{(y,P,r) ∈ ℜ^n × ℜ × ℜ}  r,
  s.t.  x_{π(k)} − y_{π(k)} + P ≤ r,  k ∈ M,
        ω_0 + (1+θ) π_k^T y ≤ P ≤ P̄,  k ∈ M,
        0 ≤ y ≤ x,  0 ≤ Ay ≤ Ax.

Remark 3.3. Due to relation (3.5), a variant of the LP reformulations from Proposition 3.1 and Remark 3.2 can be written for any case in which the risk measure ρ is a distortion risk measure. The key assumption is that R ∈ C^co and the fact that distortion risk measures are comonotone additive; thus, Problems (2.4)-(2.6) can be reformulated as LPs for any comonotone additive risk measure ρ. For example, any risk measure that satisfies (2.1) is comonotone additive (see Dhaene et al., 2012). Note that y and x − y are increasingly ordered (as x is increasingly ordered), which makes the optimisation problems over the set C^co × ℜ tractable. The lack of ordering can be overcome only if ρ = CVaR_α, as seen in Section 4, where the comonotonicity assumption is removed. Finally, if the cost of insurance follows a different premium calculation, i.e. does not satisfy (2.1), then the corresponding constraints may not be linear, but are Second-Order Cone Programming (SOCP) representable for all well-known premium calculations (for details, see Asimit, Gao, Hu, & Kim, 2017), in which case we only require I ∈ C^co, i.e. that x − y is increasingly ordered, in order to preserve the linearity of the objective functions. Thus, if H does not satisfy (2.1), the optimisation problems are of SOCP-type.

4. CVaR robust optimisation

The current section provides numerical solutions to the CVaR-type of Problems (2.4) and (2.5) under similar assumptions to the ones made in Section 3.2. The crucial change is that the set of feasible solutions, namely C × ℜ, is larger, and moral hazard is permitted. Recall that if moral hazard is excluded, then the optimisation problems could have been solved as in Section 3.2 (for details, see Remark 3.3). Moreover, the rationality constraints 0 ≤ R[X] ≤ X are still required. With the help of Eq. (2.3), Problem (2.4) can be rewritten as follows:

  min_{(y,P) ∈ ℜ^n × ℜ}  max_{k∈M}  min_{t_k ∈ ℜ}  { t_k + (1/(1−α)) p_k^T (x − y − t_k 1)_+ + P },   (4.1)
  s.t.  ω_0 + (1+θ) p_k^T y ≤ P ≤ P̄,  k ∈ M,
        0 ≤ y ≤ x,

where, without loss of generality, H_{p_k} = E_{p_k} with k ∈ M could be assumed (see Remark 3.3). In addition, Problem (2.5) is given by

  min_{(y,P) ∈ ℜ^n × ℜ}  max_{k∈M}  min_{t_k ∈ ℜ}  { t_k + (1/(1−α)) p_k^T (x − y − t_k 1)_+ + P − ρ_k* },   (4.2)
  s.t.  ω_0 + (1+θ) p_k^T y ≤ P ≤ P̄,  k ∈ M,
        0 ≤ y ≤ x.

Recall that ρ_k* represents the optimal value of the objective function from Problem (2.6) with ρ = CVaR_α and is given by

  ρ_k* = min_{(t,y,P) ∈ ℜ × ℜ^n × ℜ}  { t + (1/(1−α)) p_k^T (x − y − t 1)_+ + P },   (4.3)
  s.t.  ω_0 + (1+θ) p_k^T y ≤ P ≤ P̄,
        0 ≤ y ≤ x.

In the remaining part of this section, we show that Problems (4.1)-(4.3) can be reduced to LP reformulations. The next theorem deals with Problem (4.1).

Theorem 4.1. Let Assumption 2.1 hold with ρ = CVaR_α. If X is a discrete random variable that takes the values {x_1, ..., x_n} and H_{p_k} = E_{p_k} for all k ∈ M, then solving Problem (4.1) over the set C × ℜ is equivalent to

  min_{(t,y,ξ_1,...,ξ_m,P,z) ∈ ℜ^m × ℜ^n × ℜ^{n×m} × ℜ × ℜ}  z,   (4.4)
  s.t.  t_k + (1/(1−α)) p_k^T ξ_k + P ≤ z,  k ∈ M,
        x − y − t_k 1 ≤ ξ_k,  k ∈ M,
        0 ≤ ξ_k,  k ∈ M,
        ω_0 + (1+θ) p_k^T y ≤ P ≤ P̄,  k ∈ M,
        0 ≤ y ≤ x.

Proof. Let ξ_k = (ξ_1k, ..., ξ_nk)^T for all k ∈ M. Then, (4.1) may be equivalently formulated as:

  min_{(y,ξ_1,...,ξ_m,P)}  max_{k∈M}  min_{t_k ∈ ℜ}  { t_k + (1/(1−α)) p_k^T ξ_k + P },   (4.5)
  s.t.  x − y − t_k 1 ≤ ξ_k,  k ∈ M,
        0 ≤ ξ_k,  k ∈ M,
        ω_0 + (1+θ) p_k^T y ≤ P ≤ P̄,  k ∈ M,
        0 ≤ y ≤ x.

Note that the objective function in (4.5) is increasing in ξ_ik for all 1 ≤ i ≤ n and k ∈ M. Thus, the first two constraints from the latter optimisation problem ensure that ξ_k = (x − y − t_k 1)_+ for k ∈ M. Thus, (4.5) can be rewritten as follows:

  min_{(y,ξ_1,...,ξ_m,P,z)}  z,   (4.6)
  s.t.  min_{t_k ∈ ℜ} { t_k + (1/(1−α)) p_k^T ξ_k + P } ≤ z,  k ∈ M,
        x − y − t_k 1 ≤ ξ_k,  k ∈ M,
        0 ≤ ξ_k,  k ∈ M,
        ω_0 + (1+θ) p_k^T y ≤ P ≤ P̄,  k ∈ M,
        0 ≤ y ≤ x.

We now show that (y*, ξ*, P*, z*) solves Problem (4.6) if and only if (t*, y*, ξ*, P*, z*) solves Problem (4.4), where

  t_k* = argmin_{t_k ∈ ℜ} { t_k + (1/(1−α)) p_k^T ξ_k* + P* } for all k ∈ M.   (4.7)

Suppose that (y*, ξ*, P*, z*) solves Problem (4.6), which implies that (t*, y*, ξ*, P*, z*) is a feasible solution to Problem (4.4). If (t*, y*, ξ*, P*, z*) does not solve Problem (4.4), then there exists a feasible solution (t′, y′, ξ′, P′, z′) such that z′ < z*. Now, for all k ∈ M we have that

  min_{t_k ∈ ℜ} { t_k + (1/(1−α)) p_k^T ξ_k′ + P′ } ≤ t_k′ + (1/(1−α)) p_k^T ξ_k′ + P′ ≤ z′ < z*.   (4.8)

Thus, (y′, ξ′, P′, z′) is feasible to Problem (4.6), which contradicts the fact that (y*, ξ*, P*, z*) is an optimal solution of Problem (4.6).

Conversely, suppose that (t′, y′, ξ′, P′, z′) solves Problem (4.4). Eq. (4.8) implies that (y′, ξ′, P′, z′) is feasible to Problem (4.6). If (y′, ξ′, P′, z′) does not solve Problem (4.6), then there exists a feasible solution (y*, ξ*, P*, z*) such that z* < z′. Then, (t*, y*, ξ*, P*, z*) is feasible to Problem (4.4), where t* is defined as in (4.7), and attains the objective value z* < z′. The latter contradicts our initial assumption that (t′, y′, ξ′, P′, z′) is an optimal solution of Problem (4.4). The proof is now complete. □
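The epigraph LP (4.4) can be assembled mechanically for a discretised instance. The sketch below builds the constraint matrix for a toy problem with an expected-value premium under each candidate model and solves it with `scipy.optimize.linprog`; the support, probabilities, α, θ and premium budget are all our own assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def worst_case_cvar_lp(x, models, alpha=0.75, theta=0.25, omega0=0.0, P_bar=None):
    """Epigraph LP (4.4) for the worst-case CVaR problem, with H_{p_k} = E_{p_k}.

    models : list of probability vectors p_k over the support x."""
    n, m = len(x), len(models)
    nv = m + n + m * n + 2                     # t (m), y (n), xi (m*n), P, z
    iy, ixi, iP, iz = m, m + n, m + n + m * n, m + n + m * n + 1
    c = np.zeros(nv); c[iz] = 1.0
    rows, rhs = [], []
    for k, p in enumerate(models):
        r = np.zeros(nv)                       # t_k + p_k^T xi_k/(1-a) + P <= z
        r[k] = 1.0
        r[ixi + k * n: ixi + (k + 1) * n] = p / (1.0 - alpha)
        r[iP], r[iz] = 1.0, -1.0
        rows.append(r); rhs.append(0.0)
        for i in range(n):                     # x_i - y_i - t_k <= xi_ik
            r = np.zeros(nv)
            r[iy + i], r[k], r[ixi + k * n + i] = -1.0, -1.0, -1.0
            rows.append(r); rhs.append(-x[i])
        r = np.zeros(nv)                       # omega0 + (1+theta) p_k^T y <= P
        r[iy: iy + n] = (1.0 + theta) * p
        r[iP] = -1.0
        rows.append(r); rhs.append(-omega0)
    bounds = ([(None, None)] * m + [(0.0, float(v)) for v in x]
              + [(0.0, None)] * (m * n) + [(0.0, P_bar), (None, None)])
    res = linprog(c, A_ub=np.asarray(rows), b_ub=np.asarray(rhs), bounds=bounds)
    return res.fun, res.x[iy: iy + n], res.x[iP]

x = np.array([0.0, 1.0, 2.0, 3.0])                     # invented support
models = [np.full(4, 0.25), np.array([0.1, 0.2, 0.3, 0.4])]
val, y_opt, P_opt = worst_case_cvar_lp(x, models, P_bar=10.0)
```

The regret variant of Proposition 4.1 only changes the first constraint block to t_k + p_k^T ξ_k/(1−α) + P − ρ_k* ≤ z, with ρ_k* computed beforehand by the single-model version of the same LP.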

Finally, we solve Problems (4.2) and (4.3). By following the same arguments as provided in the proof of Theorem 4.1, one may show our claims from Proposition 4.1, and therefore, the proofs are left to the reader.

Proposition 4.1. Let Assumption 2.1 hold with ρ = CVaR_α. If X is a discrete random variable that takes the values {x_1, ..., x_n} and H_{p_k} = E_{p_k} for all k ∈ M, then solving Problem (4.3) over the set C × ℜ is equivalent to

  ρ_k* = min_{(t,y,ξ,P) ∈ ℜ × ℜ^n × ℜ^n × ℜ}  { t + (1/(1−α)) p_k^T ξ + P },
  s.t.  x − y − t 1 ≤ ξ,
        0 ≤ ξ,
        ω_0 + (1+θ) p_k^T y ≤ P ≤ P̄,
        0 ≤ y ≤ x.

Moreover, Problem (4.2) is equivalent to

  min_{(t,y,ξ_1,...,ξ_m,P,z) ∈ ℜ^m × ℜ^n × ℜ^{n×m} × ℜ × ℜ}  z,
  s.t.  t_k + (1/(1−α)) p_k^T ξ_k + P − ρ_k* ≤ z,  k ∈ M,
        x − y − t_k 1 ≤ ξ_k,  k ∈ M,
        0 ≤ ξ_k,  k ∈ M,
        ω_0 + (1+θ) p_k^T y ≤ P ≤ P̄,  k ∈ M,
        0 ≤ y ≤ x.

5. Pareto robust optimisation

Robust optimal contracts have been found in Sections 3 and 4 without discussing the drawbacks and possible remedies of our proposed robust solutions. One major issue arises when there are multiple robust solutions; we explain this point by considering the following general worst-case optimisation problem:

min_{x∈X} max_{k∈M} f_k(x), with f_k : ℜ^n → ℜ,   (5.1)

where X ⊆ ℜ^n is a non-empty feasibility set. Denote by X_Ro = argmin_{x∈X} max_{k∈M} f_k(x) the robust solution set corresponding to (5.1). Identifying the Pareto solutions is a classical problem in economics, since those solutions make the allocation amongst the various players as fair as possible, in the sense that no improvement could be made for one or more players without affecting the allocation of at least one other player. The mathematical formulation of the Pareto solution set corresponding to (5.1) is given by:

X_Pa = { x ∈ X | ∄ x̄ ∈ X s.t. f_k(x) ≥ f_k(x̄) for all k ∈ M and at least one inequality is strict }.

It is not surprising that a Pareto solution may not be an element of X_Ro, since worst-case type solutions are concerned only with extreme scenarios. Further, x* ∈ X_Ro does not always imply that x* ∈ X_Pa when (5.1) admits multiple solutions. It is not difficult to show that if x* is the unique solution of (5.1), then x* ∈ X_Pa. Therefore, it is possible to solve (5.1) and produce a robust solution that is suboptimal for all concurrent objectives, which plays havoc with the entire decision process. Recall that Remark 3.1 explains when the closed-form solution is unique, and one may show that in this case the unique solution is Pareto optimal as well.

Appa (2002) and Mangasarian (1979) provide methodologies to check the uniqueness property of an LP solution, and therefore there is no issue with linear-type (5.1) optimisation problems with a unique solution. It is still not clear how to verify whether a solution of (5.1) is an element of X_Pa. In addition, it would be interesting to provide a constructive method to generate solutions from X_Ro ∩ X_Pa. These are the aims of this section. Specifically, we first note that the discrete versions of (2.4) and (2.5) have the following linear representation:

min_{x∈ℜ^n} max_{k∈M} c_k^T x + d_k, s.t. A_k x ≤ b_k, k ∈ M,   (5.2)

with known matrices A_k, column vectors b_k and c_k of appropriate dimensions, and known scalars d_k. The main result of this section, stated as Theorem 5.1, shows how to always find a Pareto and robust optimal solution of (5.2) by solving at most one additional LP. These results are inspired by Theorem 1 of Iancu and Trichakis (2014), which solves a similar linear problem in which the decision-maker perceives the uncertainty in a very different way.

Theorem 5.1. Let x* be any optimal solution of (5.2), where the latter problem is assumed to be non-trivial, i.e. the c_k with k ∈ M are not all null vectors. Consider the following optimisation problem:

min_y Σ_{k∈M} c_k^T y, s.t. A_k(x* + y) ≤ b_k, c_k^T y ≤ 0, k ∈ M.   (5.3)

If the optimal value in (5.3) is zero, then x* ∈ X_Ro ∩ X_Pa in (5.2). If the optimal value in (5.3) is negative, then x* + y* ∈ X_Ro ∩ X_Pa in (5.2), where y* is an optimal solution of (5.3).

Proof. It is not difficult to find that the optimal value of (5.3) is always non-positive. Assume now that the optimal value is zero and that x* is a robust optimal solution of (5.2) but not a Pareto solution. Then there exists a feasible solution x̄ of (5.2) such that

c_k^T x* + d_k ≥ c_k^T x̄ + d_k, for all k ∈ M,   (5.4)

and at least one inequality holds strictly. Denote ȳ = x̄ − x*; since x̄ is a feasible solution of (5.2), we get that

A_k(x* + ȳ) = A_k x̄ ≤ b_k, for all k ∈ M.

Recall that Eq. (5.4) tells us that c_k^T x̄ ≤ c_k^T x* for all k ∈ M, and in turn we get that

c_k^T ȳ = c_k^T x̄ − c_k^T x* ≤ 0, for all k ∈ M.

Thus, ȳ is a feasible solution of (5.3). Moreover, Eq. (5.4) implies that c_k^T x̄ < c_k^T x* for some k ∈ M, and therefore c_k^T ȳ is negative for some k ∈ M. Consequently, the optimal objective value in (5.3) is negative, which contradicts our assumption of a null optimal objective value.

Assume now that the optimal objective value in (5.3) is negative. Note first that x* + y* is feasible in (5.2), since it is feasible in (5.3). Assume that x* + y* is a robust optimal solution of (5.2) but not a Pareto solution. Thus, there exists a feasible solution x̄ of (5.2) such that x* + y* is Pareto dominated by x̄. The mathematical formulation of the latter is that

c_k^T (x* + y*) + d_k ≥ c_k^T x̄ + d_k, for all k ∈ M,   (5.5)

and at least one inequality holds strictly. Denote ȳ = x̄ − x*; since x̄ is a feasible solution of (5.2), one may find that x* + ȳ is feasible as follows:

A_k(x* + ȳ) = A_k x̄ ≤ b_k, for all k ∈ M.

Now, Eq. (5.5) and the fact that y* is a feasible solution of (5.3) imply that

c_k^T ȳ = c_k^T x̄ − c_k^T x* ≤ c_k^T y* ≤ 0, for all k ∈ M,

which shows that ȳ is feasible in (5.3). We also know that at least one inequality in Eq. (5.5) is strict, and as a result one of the above inequalities holds strictly, which yields Σ_{k∈M} c_k^T ȳ < Σ_{k∈M} c_k^T y*. The latter contradicts the optimality of y* in (5.3). The proof is now complete. □
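Theorem 5.1 suggests a cheap post-processing step: solve (5.3) once and shift the robust solution along y*. A sketch with `scipy.optimize.linprog` follows, on a toy instance of our own making (two linear objectives over a hand-made feasible set, chosen so that the robust solution x* = (1, 1) is Pareto dominated).

```python
import numpy as np
from scipy.optimize import linprog

def pareto_refine(x_star, A_list, b_list, c_list):
    """Post-processing LP (5.3): given a robust solution x*, return a point
    in X_Ro ∩ X_Pa by minimising sum_k c_k^T y subject to
    A_k (x* + y) <= b_k and c_k^T y <= 0 for all k."""
    n = len(x_star)
    c = np.sum(c_list, axis=0)
    A_ub = np.vstack([np.vstack(A_list), np.vstack(c_list)])
    b_ub = np.concatenate(
        [np.concatenate([b - A @ x_star for A, b in zip(A_list, b_list)]),
         np.zeros(len(c_list))])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * n)
    return x_star + res.x, res.fun

# toy instance: minimise max(x1, x2) s.t. x1 >= 1, x2 >= 0, x2 <= x1
A = np.array([[-1.0, 0.0], [0.0, -1.0], [-1.0, 1.0]])
b = np.array([-1.0, 0.0, 0.0])
c1, c2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
# x* = (1, 1) is robust (worst-case value 1) but dominated by (1, 0)
x_new, total = pareto_refine(np.array([1.0, 1.0]), [A, A], [b, b], [c1, c2])
# x_new ~ (1, 0): same worst-case value, second objective strictly improved
```

The refinement never degrades the worst-case value because the constraints c_k^T y ≤ 0 keep every objective at or below its value at x*.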


6. Numerical analysis

This section provides numerical illustrations of our worst-case scenario and regret optimisation problems from (2.4) and (2.5), respectively. Recall that, in order to empirically solve these problems, a sample x = (x_1, x_2, ..., x_n)^T is drawn from the underlying distribution of X, and in turn we find the optimal insurance contract y* = (y_1*, y_2*, ..., y_n*)^T and the optimal premium P*. Let (y_wc*, P_wc*) and (y_wr*, P_wr*) denote the empirical optimal solutions of our robust models (2.4) and (2.5), respectively. Our main aim is to give a quality comparison between (y_w*, P_w*), w ∈ {wc, wr}, and a best possible choice (y_c*, P_c*). Essentially, the latter is the "best solution" based on estimating a particular model chosen via two well-known standard statistical goodness-of-fit methods, namely the Akaike Information Criterion (AIC) and the corrected Akaike Information Criterion (AICc), whose solutions are denoted as (y_AIC*, P_AIC*) and (y_AICc*, P_AICc*), respectively. We believe that those comparisons are fair and explain the advantages and disadvantages of robust optimisation over a standard optimisation performed after choosing the most significant model (in the statistical sense). Finally, recall that all optimisations are implemented on a desktop with a 6-core Intel i7-5820K at 3.30 GHz and 16 GB RAM, running Linux x64, MATLAB R2014b and CVX 2.1.

The parameterisation employed in our empirical optimisation assumes that the loss variable X is LogNormal distributed with mean E(X) = 5,000 and standard deviation √3 × E(X). An expected value premium principle with a risk loading factor θ = 0.25 and no fixed/administrative costs, i.e. ω_0 = 0, is assumed. In addition, P̄ = (1+θ)2E(X). Since the underlying loss distribution of X is unknown, five candidate models are further assumed by the decision-maker:

(i) Model 1: Exponential distribution with mean 1/ν.

(ii) Model 2: LogNormal distribution with parameters (μ, σ²).

(iii) Model 3: Pareto distribution with parameters (α, λ) and cdf F(z) = 1 − (λ/(z + λ))^α, z > 0.

(iv) Model 4: Weibull distribution with parameters (c, γ) and cdf F(z) = 1 − e^{−cz^γ}, z > 0.

(v) Model 5: Empirical distribution.

Recall that p_5 = (1/n)1. For all other models, the p_k's are obtained by discretising the Maximum Likelihood fitted model. For example,

p_ik = F_k( (x_i + x_{i+1})/2 ; ν̂ ) − F_k( (x_{i−1} + x_i)/2 ; ν̂ ), for all i = 1, ..., n, k ∈ {1, 2, 3, 4},

where by convention x_0 = −∞ and x_{n+1} = ∞. Moreover, ν̂ is the Maximum Likelihood estimate of the unknown parameters.
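The midpoint discretisation above is straightforward to implement. The sketch below discretises a hypothetical fitted Exponential model over a small sorted sample; the rate ν̂ and the sample values are invented for illustration.

```python
import numpy as np

def discretise(cdf, x):
    """Midpoint discretisation of a fitted model:
    p_i = F((x_i + x_{i+1})/2) - F((x_{i-1} + x_i)/2),
    with x_0 = -inf and x_{n+1} = +inf by convention."""
    mid = (x[:-1] + x[1:]) / 2.0
    cuts = np.concatenate([[-np.inf], mid, [np.inf]])
    F = cdf(cuts)
    return F[1:] - F[:-1]

nu = 1.0 / 5000.0                                # hypothetical fitted rate
exp_cdf = lambda z: np.where(z > 0, 1.0 - np.exp(-nu * np.maximum(z, 0.0)), 0.0)
x = np.sort(np.array([1200.0, 3400.0, 5100.0, 9000.0]))  # invented sample
p = discretise(exp_cdf, x)                       # probability vector p_k
```

By construction the resulting vector sums to one, so it is a valid candidate model for the LPs of Sections 3 and 4.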

The next step is to understand whether or not robust optimisation reduces the variability of the optimal decision. Thus, (y_wc*, P_wc*), (y_wr*, P_wr*), (y_AIC*, P_AIC*) and (y_AICc*, P_AICc*) are compared under three collections of candidate models denoted as M_j, j ∈ {2, 4, 5}. In particular, M2 := {1, 5}, M4 := {1, 3, 4, 5} and M5 := {1, 2, 3, 4, 5}. Recall that our sample is drawn from a LogNormal distribution, and for this reason the "true model", i.e. Model 2, is purposely ruled out from M2 and M4. Therefore, it is interesting to understand the effect of reducing the model risk and to analyse the robust optimal solutions under M2* := {2, 5} and M4* := {2, 3, 4, 5}. That is, Model 1 (which exhibits the lightest tail amongst all considered parametric models) is replaced by the "true model" (with a moderately light tail). For each of the model collections M_j and M_l*, j ∈ {2, 4, 5} and l ∈ {2, 4}, (y_wc*, P_wc*) and (y_wr*, P_wr*) are obtained by empirically solving the robust models (2.4) and (2.5) with M = M_j and M = M_l*.

Before explaining the procedure for finding (y_AIC*, P_AIC*) and (y_AICc*, P_AICc*), we briefly recall AIC and AICc model selection. Given a sample x and a set of candidate probability models, the AIC value of Model k is calculated as AIC_k = 2q_k − 2 ln(L̂_k), where q_k is the number of estimated parameters and L̂_k is the maximum value of the likelihood function of Model k. Under the AIC model selection criterion, the preferred Model k* is the one that gives the smallest AIC value, i.e. k* = argmin_{k∈M} AIC_k. On the other hand, the AICc value of Model k further penalises model complexity when the sample size n is not large, i.e. AICc_k = 2q_k · n/(n − q_k − 1) − 2 ln(L̂_k). Similarly, the preferred Model k** under the AICc criterion is chosen to have the smallest AICc value, i.e. k** = argmin_{k∈M} AICc_k. Finally, (y_AIC*, P_AIC*) and (y_AICc*, P_AICc*) are obtained by solving (2.4) with M = {k*} and M = {k**}, respectively.
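The selection step can be sketched in a few lines; the maximised log-likelihoods and parameter counts below are invented purely for illustration.

```python
def aic(loglik, q):
    """Akaike Information Criterion."""
    return 2 * q - 2 * loglik

def aicc(loglik, q, n):
    """Corrected AIC: small-sample complexity penalty 2q n / (n - q - 1)."""
    return 2 * q * n / (n - q - 1) - 2 * loglik

# hypothetical maximised log-likelihoods and parameter counts for Models 1-4
loglik = {1: -412.3, 2: -401.8, 3: -403.0, 4: -405.6}
q = {1: 1, 2: 2, 3: 2, 4: 2}
n = 50
k_star = min(loglik, key=lambda k: aic(loglik[k], q[k]))        # AIC choice
k_2star = min(loglik, key=lambda k: aicc(loglik[k], q[k], n))   # AICc choice
```

With these invented numbers both criteria pick Model 2; with small n and unequal q the two criteria can disagree.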

Denote the underlying distribution of X as Model 0, equipped with its discretised probability vector p_0, obtained as before:

p_i0 = F_0( (x_i + x_{i+1})/2 ) − F_0( (x_{i−1} + x_i)/2 ), for all i = 1, ..., n,

where F_0 is the cdf of X, i.e. a LogNormal distribution with the parameters defined earlier. Let (y_T*, P_T*) be the optimal solution obtained by solving a non-robust version of Problem (2.4) with M = {0} as given by p_0. This optimal solution mimics the ideal optimal decision, since the "true" distribution is assumed to be known, and thus all possible robust methods are compared with the decision under Model 0. Clearly, the model risk induces uncertainty in the model choice, and this issue is numerically investigated in the remaining part of this section. In order to compare the various decisions, we need to measure the distance between the robust methods and the one obtained via Model 0. That is, each optimal contract y_ℓ*, where ℓ ∈ {wc, wr, AIC, AICc}, is compared to the benchmark optimal contract y_T* as follows:

Δ_ℓ = Σ_{i=1}^{n} | y_iℓ* − y_iT* | · p_i0, for all ℓ ∈ {wc, wr, AIC, AICc}.
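The simple criterion is a probability-weighted L1 distance and amounts to one line of code; the vectors below are invented for illustration.

```python
import numpy as np

def delta(y_method, y_benchmark, p0):
    """Weighted L1 distance between a contract and the Model-0 benchmark."""
    return np.sum(np.abs(y_method - y_benchmark) * p0)

p0 = np.array([0.5, 0.3, 0.2])          # illustrative discretised true model
y_T = np.array([0.0, 1.0, 2.0])         # benchmark contract under Model 0
y_wc = np.array([0.0, 0.8, 2.5])        # hypothetical worst-case contract
d = delta(y_wc, y_T, p0)                # 0.2*0.3 + 0.5*0.2 = 0.16
```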

Clearly, the smaller the value of Δ_ℓ, the more robust the decision. This criterion, further called the simple criterion, assesses only the choice of the optimal contract; it is also interesting to understand the possible drawbacks of those robust contracts, which may require increased premiums. A composite criterion is therefore needed when comparing (y_w*, P_w*) to (y_c*, P_c*):

a) (y_w*, P_w*) is preferred and called a "good scenario" if Δ_w < Δ_c and P_w* − P_c* ≤ 10^{−2};

b) (y_c*, P_c*) is preferred and called a "bad scenario" if Δ_w > Δ_c and P_w* − P_c* ≥ −10^{−2};

for any given w ∈ {wc, wr} and c ∈ {AIC, AICc}. Our numerical illustrations generate samples of size n ∈ {25, 50, 100, 250} for N = 500 times and compare the robust optimal decisions to the AIC non-robust optimal decisions under the two criteria (simple and composite). Extensive numerical experiments (for various parametric models and sample sizes) have shown that the AIC and AICc-based optimal decisions lead to similar results; for this reason, only AIC results are reported. That is, we display the number of "good" and "bad" scenarios, namely G_{w,AIC} and B_{w,AIC}, respectively, where w ∈ {wc, wr}.

We first examine the VaR_{0.75}-based optimal solutions for the simple test, where only the robustness of the risk transfer is analysed. Table 6.1 shows those results when the "true model" is removed from two of the three candidate model collections, while Table 6.2 displays similar results when the LogNormal is always present amongst all potential parametric distributions.

Clearly, when the "true model" is introduced amongst the potential models, our robust methods are more efficient, but not sufficiently so for various sample sizes; there is a marginal incentive to use our methods for small and medium sized samples. The results


Table 6.1
Number of good and bad scenarios for VaR_{0.75}-based scenarios within 500 samples of various sample sizes n and collection of candidate models {M5, M4, M2} under the simple criterion.

              n = 25           n = 50           n = 100          n = 250
              M5   M4   M2     M5   M4   M2     M5   M4   M2     M5   M4   M2
G_{wc,AIC}    222  217  225    196  195  195    143  143  143     67   67   68
B_{wc,AIC}    278  283  275    304  305  305    357  357  357    433  433  432
G_{wr,AIC}    258  257  250    255  246  236    200  185  182    125  110  109
B_{wr,AIC}    242  243  250    245  254  264    300  315  318    375  390  391

Table 6.2
Number of good and bad scenarios for VaR_{0.75}-based scenarios within 500 samples of various sample sizes n and collection of candidate models {M5, M4*, M2*} under the simple criterion.

              n = 25           n = 50           n = 100          n = 250
              M5   M4*  M2*    M5   M4*  M2*    M5   M4*  M2*    M5   M4*  M2*
G_{wc,AIC}    222  225  279    196  179  291    143  134  310     67   47  278
B_{wc,AIC}    278  275  221    304  321  209    357  366  189    433  453  222
G_{wr,AIC}    258  260  304    255  247  318    200  210  345    125   95  345
B_{wr,AIC}    242  240  196    245  253  182    300  290  155    375  405  155

Table 6.3
Number of good and bad scenarios for non-comonotonic CVaR_{0.75}-based scenarios within 500 samples of various sample sizes n and collection of candidate models {M5, M4, M2} under the simple criterion.

                 n = 25           n = 50           n = 100          n = 250
                 M5   M4   M2     M5   M4   M2     M5   M4   M2     M5   M4   M2
G_{wc,AIC}       337  349  353    307  331  338    328  331  329    307  308  291
B_{wc,AIC}       163  151  147    193  169  162    172  169  171    193  192  209
G_{wr,AIC}       336  339  339    292  304  305    287  302  268    222  240  204
B_{wr,AIC}       164  161  161    208  196  195    213  198  232    278  260  296
G_{wcvar,AIC}    322  323  323    305  327  336    305  326  324    305  304  313
B_{wcvar,AIC}    178  177  177    195  172  164    195  174  176    195  196  187

Table 6.4
Number of good and bad scenarios for non-comonotonic CVaR_{0.75}-based scenarios within 500 samples of various sample sizes n and collection of candidate models {M5, M4*, M2*} under the simple criterion.

                 n = 25           n = 50           n = 100          n = 250
                 M5   M4*  M2*    M5   M4*  M2*    M5   M4*  M2*    M5   M4*  M2*
G_{wc,AIC}       337  340  354    307  308  306    328  328  315    307  309  305
B_{wc,AIC}       163  160  146    193  192  194    172  172  185    193  191  195
G_{wr,AIC}       336  340  330    292  299  270    287  295  292    222  234  265
B_{wr,AIC}       164  160  170    208  201  230    213  205  208    278  266  235
G_{wcvar,AIC}    322  312  306    305  301  289    305  324  302    305  300  321
B_{wcvar,AIC}    178  187  194    195  199  211    195  176  198    195  200  179

Table 6.5
Number of good and bad scenarios for non-comonotonic CVaR_{0.75}-based scenarios within 500 samples of various sample sizes n and collection of candidate models {M5, M4, M2} under the composite criterion.

                 n = 25           n = 50           n = 100          n = 250
                 M5   M4   M2     M5   M4   M2     M5   M4   M2     M5   M4   M2
G_{wc,AIC}       333  345  351    307  331  338    328  331  329    307  308  291
B_{wc,AIC}       158  146  140    193  169  162    172  169  171    193  192  209
G_{wr,AIC}       332  335  336    292  304  305    287  302  268    222  240  204
B_{wr,AIC}       157  154  152    208  196  194    213  198  231    278  259  296
G_{wcvar,AIC}    316  317  316    305  327  336    305  326  324    305  304  313
B_{wcvar,AIC}    176  174  177    195  172  164    195  174  176    195  196  187

under the composite criterion (not reported) lead to a similar conclusion. This is not surprising, since the VaR risk measure is quite robust (see Cont et al., 2010), in the sense that a sizeable part of the sample could be contaminated and the VaR estimate would remain unchanged. This peculiar behaviour of this tail risk measure explains why our methods are not recommended for VaR-based decisions.

Next, we turn our attention to another tail risk measure, namely CVaR_{0.75}. Our results for various sample sizes n and model collections are presented in Tables 6.3 and 6.4 for the simple criterion, while Tables 6.5 and 6.6 replicate similar results for the composite criterion. The first two rows of Tables 6.3-6.6 are computed via the LP formulation from Theorem 4.1, while the results of the third and fourth rows are based on the LP reformulation from Proposition 4.1. The last two rows are obtained by optimising the WCVaR risk measure, as defined in (2.8).

There is overwhelming empirical evidence that our worst-case scenario method performs uniformly better than the WCVaR robust method from Zhu and Fukushima (2009) under both criteria, simple and composite, for any sample size. This could be explained by the fact that WCVaR is a more conservative risk measure than our proposed robust risk measures. It is interesting to note


Table 6.6
Number of good and bad scenarios for non-comonotonic CVaR_{0.75}-based scenarios within 500 samples of various sample sizes n and collection of candidate models {M5, M4*, M2*} under the composite criterion.

                 n = 25           n = 50           n = 100          n = 250
                 M5   M4*  M2*    M5   M4*  M2*    M5   M4*  M2*    M5   M4*  M2*
G_{wc,AIC}       333  336  354    307  308  306    328  328  315    307  309  305
B_{wc,AIC}       158  155  137    193  192  194    172  172  185    193  191  195
G_{wr,AIC}       332  336  330    292  299  270    287  295  292    222  234  265
B_{wr,AIC}       157  152  155    208  201  227    213  205  208    278  266  235
G_{wcvar,AIC}    316  306  299    305  301  289    305  324  302    305  300  321
B_{wcvar,AIC}    176  185  192    195  199  211    195  176  198    195  200  179

Table 6.7
Number of good and bad scenarios for comonotonic CVaR_{0.75}-based scenarios within 500 samples of various sample sizes n and collection of candidate models {M5, M4, M2} under the simple criterion.

                 n = 25           n = 50           n = 100          n = 250
                 M5   M4   M2     M5   M4   M2     M5   M4   M2     M5   M4   M2
G_{wc,AIC}       259  247  264    277  282  270    315  333  307    301  313  284
B_{wc,AIC}       241  253  236    223  217  230    185  167  193    199  187  216
G_{wr,AIC}       307  307  318    304  305  302    283  278  263    221  223  206
B_{wr,AIC}       193  193  182    196  195  198    217  222  237    279  277  294
G_{wcvar,AIC}    259  271  234    264  277  256    277  296  265    282  297  267
B_{wcvar,AIC}    241  229  265    236  223  244    223  204  235    218  203  233

Table 6.8
Number of good and bad scenarios for comonotonic CVaR_{0.75}-based scenarios within 500 samples of various sample sizes n and collection of candidate models {M5, M4*, M2*} under the simple criterion.

                 n = 25           n = 50           n = 100          n = 250
                 M5   M4*  M2*    M5   M4*  M2*    M5   M4*  M2*    M5   M4*  M2*
G_{wc,AIC}       259  250  230    277  274  252    315  311  290    301  306  298
B_{wc,AIC}       241  250  270    223  226  248    185  189  210    199  194  202
G_{wr,AIC}       307  311  308    304  313  305    283  286  319    221  231  299
B_{wr,AIC}       193  189  192    196  187  195    217  214  181    279  269  201
G_{wcvar,AIC}    259  249  215    264  250  202    277  273  239    282  285  233
B_{wcvar,AIC}    241  251  285    236  250  298    223  227  261    218  215  267

Table 6.9
Number of good and bad scenarios for comonotonic CVaR_{0.75}-based scenarios within 500 samples of various sample sizes n and collection of candidate models {M5, M4, M2} under the composite criterion.

                 n = 25           n = 50           n = 100          n = 250
                 M5   M4   M2     M5   M4   M2     M5   M4   M2     M5   M4   M2
G_{wc,AIC}       253  240  258    276  281  269    315  333  307    301  313  284
B_{wc,AIC}       239  252  235    223  217  230    185  167  193    199  187  216
G_{wr,AIC}       299  299  312    302  303  301    283  278  263    221  223  206
B_{wr,AIC}       191  191  180    196  195  198    217  222  237    279  277  294
G_{wcvar,AIC}    255  267  233    263  276  255    277  296  265    282  297  267
B_{wcvar,AIC}    237  225  259    236  223  244    223  204  235    218  203  233

Table 6.10
Number of good and bad scenarios for comonotonic CVaR_{0.75}-based scenarios within 500 samples of various sample sizes n and collection of candidate models {M5, M4*, M2*} under the composite criterion.

                 n = 25           n = 50           n = 100          n = 250
                 M5   M4*  M2*    M5   M4*  M2*    M5   M4*  M2*    M5   M4*  M2*
G_{wc,AIC}       253  243  227    276  273  252    315  311  290    301  306  298
B_{wc,AIC}       239  249  266    223  226  248    185  189  210    199  194  202
G_{wr,AIC}       299  304  304    302  311  305    283  286  319    221  231  299
B_{wr,AIC}       191  188  189    196  187  195    217  214  181    279  269  201
G_{wcvar,AIC}    255  245  215    263  249  202    277  273  239    282  285  233
B_{wcvar,AIC}    237  247  278    236  250  298    223  227  261    218  215  267

that the worst-case regret method tends to under-perform both worst-case methods, but we believe that this is due to the fact that CVaR is a tail risk measure.

Tables 6.7-6.10 are the replicas of Tables 6.3-6.6, where the set of feasible solutions is reduced such that the insurance contracts are assumed to be comonotone. That is, the first two rows of Tables 6.7-6.10 are computed as explained in Remark 3.2, while the results of the third and fourth rows are based on an LP formulation similar to the one from Proposition 3.1. As before, the last two rows are obtained by optimising the WCVaR risk measure, as defined in (2.8), but adding the comonotonicity constraint. Restricting our optimisation to comonotone contracts does not


change our results, but we observe a loss of power amongst all three robust methods, which could be explained by the fact that an additional constraint increases the complexity of the problem. The general conclusions do not change, and there is clear evidence to recommend our worst-case scenario method, which outperforms the WCVaR robust method from Zhu and Fukushima (2009) and our worst-case regret method under both criteria.

As a final remark, it is worth mentioning that, when applying Theorem 5.1 to our robust methods, all numerical results remain unchanged. Therefore, the power of the results is similar to that displayed in this section, which suggests that one should use our worst-case method in conjunction with Theorem 5.1 in order to obtain a robust insurance contract that is economically viable to both insurance players.

7. Conclusions

The VaR- and CVaR-based optimal insurance contract has been investigated under uncertainty, where model risk is taken into account. This source of uncertainty is considered by incorporating multiple plausible models that the decision-maker may have available via estimation, proxy models or expert opinion consultation. Model risk always represents an important source of uncertainty in risk modelling, and it is more pronounced when data are scarce. Our aim has been to provide a robust decision, not to produce a distributionally robust model of the underlying insurance risk. Two robust methods are proposed, namely the Worst-case and Worst-case regret methods. Our numerical results have shown that our Worst-case method outperforms the Worst-case regret method for CVaR-based decisions. Moreover, our Worst-case method proved to be more robust than the Worst-case CVaR method proposed by Zhu and Fukushima (2009). Unfortunately, the VaR-based decisions are not efficiently robustified for all sample sizes by either of the methods proposed in this paper, though encouraging results are obtained for small samples. Another achievement of this paper relates to the well-known caveat in robust optimisation that the optimal decision may be economically unacceptable; that is, the optimal contract may not be efficient in the Pareto optimality sense. We resolve this issue by providing a simple numerical method that identifies a Pareto and robust optimal decision that is (numerically) shown to be efficient in reducing model risk.

Acknowledgments

K. C. Cheung acknowledges the financial support from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. 17324516).

References

Acerbi, C., & Szekely, B. (2014). Backtesting expected shortfall. Risk Magazine.

Acerbi, C., & Tasche, D. (2002). On the coherence of expected shortfall. Journal of Banking & Finance, 26(7), 1487-1503.

Appa, G. (2002). On the uniqueness of solutions to linear programs. Journal of the Operational Research Society, 53(10), 1127-1132.

Arrow, K. J. (1963). Uncertainty and the welfare economics of medical care. American Economic Review, 53(3), 941-973.

Artzner, P., Delbaen, F., Eber, J., & Heath, D. (1999). Coherent measures of risk. Mathematical Finance, 9(3), 203-228.

Asimit, A. V., Badescu, A. M., & Tsanakas, A. (2013a). Optimal risk transfers in insurance groups. European Actuarial Journal, 3, 159-190.

Asimit, A. V., Badescu, A. M., & Verdonck, T. (2013b). Optimal risk transfer under quantile-based risk measures. Insurance: Mathematics and Economics, 53, 252-265.

Asimit, V., Gao, T., Hu, J., & Kim, E. S. (2017). Optimal risk transfer: A numerical optimisation approach. Available at: http://papers.ssrn.com/abstract_id=2797562.

Balbás, A., Balbás, B., Balbás, R., & Heras, A. (2015). Optimal reinsurance under risk and uncertainty. Insurance: Mathematics and Economics, 60, 61-74.

Balbás, A., Balbás, B., & Heras, A. (2009). Optimal reinsurance with general risk measures. Insurance: Mathematics and Economics, 44, 374-384.

Balbás, A., Balbás, B., & Heras, A. (2011). Stable solutions for optimal reinsurance problems involving risk measures. European Journal of Operational Research, 214, 796-804.

Bellini, F., & Bignozzi, V. (2015). On elicitable risk measures. Quantitative Finance, 15(5), 725-733.

Ben-Tal, A., El Ghaoui, L., & Nemirovski, A. (2009). Robust optimization. New Jersey: Princeton University Press.

Ben-Tal, A., & Nemirovski, A. (2008). Selected topics in robust convex optimization. Mathematical Programming, 112(1), 125-158.

Bertsimas, D., Brown, D. B., & Caramanis, C. (2011). Theory and applications of robust optimization. SIAM Review, 53(3), 464-501.

Borch, K. (1960). An attempt to determine the optimum amount of stop loss reinsurance. In Transactions of the 16th International Congress of Actuaries I (pp. 597-610).

Cai, J., & Tan, K. S. (2007). Optimal retention for a stop-loss reinsurance under the VaR and CTE risk measures. ASTIN Bulletin, 37(1), 93-112.

Cai, J., & Wei, W. (2012). Optimal reinsurance with positively dependent risks. Insurance: Mathematics and Economics, 50, 57-63.

Cai, J., & Weng, C. G. (2016). Optimal reinsurance with expectile. Scandinavian Actuarial Journal, 7, 624-645.

Calder, A., Couper, A., & Lo, J. (2012). Catastrophe model blending techniques and governance. The Actuarial Profession. URL https://www.actuaries.org.uk/documents/brian-hey-prize-catastrophe-model-blending.

Cheung, K. C., & Lo, A. (2017). Characterizations of optimal reinsurance treaties: A cost-benefit approach. Scandinavian Actuarial Journal, 2017(1), 1-28.

Cheung, K. C., Sung, K. C. J., Yam, S. C. P., & Yung, S. P. (2014). Optimal reinsurance under general law-invariant risk measures. Scandinavian Actuarial Journal, 1, 72-91.

Cont, R. (2006). Model uncertainty and its impact on the pricing of derivative instruments. Mathematical Finance, 16(3), 519-547.

Cont, R., Deguest, R., & Scandolo, G. (2010). Robustness and sensitivity analysis of risk measurement procedures. Quantitative Finance, 10(6), 593-606.

Delbaen, F., Bellini, F., Bignozzi, V., & Ziegel, J. F. (2016). On risk measures with the CxLS property. Finance and Stochastics, 20(2), 433-453.

Dhaene, J., Kukush, A., Linders, D., & Tang, Q. (2012). Remarks on quantiles and distortion risk measures. European Actuarial Journal, 2(2), 319-328.

El Ghaoui, L., Oks, M., & Oustry, F. (2003). Worst-case value-at-risk and robust portfolio optimization: A conic programming approach. Operations Research, 51(4), 543-556.

Emmer, S., Kratz, M., & Tasche, D. (2015). What is the best risk measure in practice? A comparison of standard measures. Journal of Risk, 18(2), 31-60.

Fissler, T., & Ziegel, J. F. (2016). Higher order elicitability and Osband's principle. The Annals of Statistics, 44(4), 1680-1707.

Gabrel, V., Murat, C., & Thiele, A. (2014). Recent advances in robust optimization: An overview. European Journal of Operational Research, 235(3), 471-483.

Gilboa, I., & Schmeidler, D. (1989). Maxmin expected utility with non-unique prior. Journal of Mathematical Economics, 18(2), 141-153.

Gneiting, T. (2011). Making and evaluating point forecasts. Journal of the American Statistical Association, 106(494), 746-762.

Güler, O. (2010). Foundations of optimization. New York: Springer.

Hampel, F. R. (1968). Contribution to the theory of robust estimation. Ph.D. Thesis, University of California, Berkeley.

Huang, D., Zhu, S., Fabozzi, F. J., & Fukushima, M. (2010). Portfolio selection under distributional uncertainty: A relative robust CVaR approach. European Journal of Operational Research, 203(1), 185-194.

Huber, P. J. (1964). Robust estimation of a location parameter. Annals of Mathematical Statistics, 35(1), 73-101.

Huberman, G., Mayers, D., & Smith, C. W., Jr. (1983). Optimal insurance policy indemnity schedules. The Bell Journal of Economics, 14(2), 415-426.

Iancu, D. A., & Trichakis, N. (2014). Pareto efficiency in robust optimisation. Management Science, 60(1), 130-147.

Kakouris, I., & Rustem, B. (2014). Robust portfolio optimization with copulas. European Journal of Operational Research, 235(1), 28-37.

Kaluszka, M. (2005). Optimal reinsurance under convex principles of premium calculation. Insurance: Mathematics and Economics, 36(3), 375-398.

Kaluszka, M., & Okolewski, A. (2008). An extension of Arrow's result on optimal reinsurance contract. Journal of Risk and Insurance, 75(2), 275-288.

Kiesel, R., Rühlicke, R., Stahl, G., & Zheng, J. (2016). The Wasserstein metric and robustness in risk management. Risks, 4(3), 32.

Koch-Medina, P., & Munari, C. (2016). Unexpected shortfalls of expected shortfall: Extreme default profiles and regulatory arbitrage. Journal of Banking & Finance, 62, 141-151.

Krätschmer, V., Schied, A., & Zähle, H. (2014). Comparative and qualitative robustness for law-invariant risk measures. Finance and Stochastics, 18(2), 271-295.

Landsberger, M., & Meilijson, I. (1994). Co-monotone allocations, Bickel-Lehmann dispersion and the Arrow-Pratt measure of risk aversion. Annals of Operations Research, 52(2), 97-106.

Ludkovski, M., & Rüschendorf, L. (2008). On comonotonicity of Pareto optimal allocations. Statistics and Probability Letters, 78(10), 1181-1188.

Mangasarian, O. L. (1979). Uniqueness of solution in linear programming. Linear Algebra and its Applications, 25, 151-162.

Polak, G. G., Rogers, D. F., & Sweeney, D. J. (2010). Risk management strategies via minimax portfolio optimization. European Journal of Operational Research, 207(1), 409-419.

Rockafellar, R. T., & Uryasev, S. (2000). Optimization of conditional value-at-risk. Journal of Risk, 2, 21-41.

Song, Y., & Yan, J. A. (2009). Risk measures with comonotonic subadditivity or convexity and respecting stochastic orders. Insurance: Mathematics and Economics, 45(3), 459-465.

SPAN (1995). Standard portfolio analysis of risk. Chicago: Chicago Mercantile Exchange.

Wang, S., Young, V. R., & Panjer, H. H. (1997). Axiomatic characterization of insurance prices. Insurance: Mathematics and Economics, 21(2), 173-183.

Young, V. R. (2004). Premium principles. Encyclopedia of Actuarial Science. New York: Wiley.

Zhu, S., & Fukushima, M. (2009). Worst-case conditional value-at-risk with application to robust portfolio management. Operations Research, 57(5), 1155-1168.

Zymler, S., Kuhn, D., & Rustem, B. (2013). Worst-case value-at-risk of nonlinear portfolios. Management Science, 59(1), 172-188.