Asymptotic theory for over-dispersed chain-ladder models
Jonas Harnau*, University of Oxford; Bent Nielsen, University of Oxford
The chain-ladder technique is ubiquitous in non-life insurance claim reserving. In a Poisson model, the chain-ladder technique is maximum likelihood estimation. The equality of mean and variance implied by the Poisson model is usually refuted by the data. Often, an over-dispersed Poisson structure in which mean and variance are proportional is then assumed. Then, the chain-ladder technique is maximum quasi-likelihood estimation. An asymptotic theory is provided for this situation. This leads to closed-form distribution forecasts involving the t distribution. Further, an asymptotically F distributed test statistic is proposed to test for adequacy of the chain-ladder technique compared to a more general model with a calendar effect. A simulation study suggests that both distribution forecasts and test statistic give reasonable approximations in finite samples. The proposed distribution forecasts are compared with the standard bootstrap approach. The results generalise to age-period-cohort models used in other fields.
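For readers unfamiliar with the technique named in this abstract, the following is a minimal sketch of the classical chain-ladder point forecast on a cumulative run-off triangle. It is not the authors' code and does not cover their distribution forecasts or test statistic; the function name `chain_ladder` and the three-by-three triangle are hypothetical and chosen only for illustration.

```python
def chain_ladder(triangle):
    """Classical chain-ladder on a cumulative run-off triangle.

    triangle: list of rows (accident periods); row i holds the observed
    cumulative claims, so later rows are shorter than earlier ones.
    Returns (development factors, completed triangle, reserves per row).
    """
    n = len(triangle)
    factors = []
    for j in range(n - 1):
        # Use only rows where both development period j and j + 1 are observed.
        rows = [r for r in triangle if len(r) > j + 1]
        factors.append(sum(r[j + 1] for r in rows) / sum(r[j] for r in rows))

    completed = []
    for r in triangle:
        full = list(r)
        # Roll the latest observed value forward with the remaining factors.
        for j in range(len(r) - 1, n - 1):
            full.append(full[-1] * factors[j])
        completed.append(full)

    # Reserve = forecast ultimate minus latest observed cumulative amount.
    reserves = [full[-1] - r[-1] for full, r in zip(completed, triangle)]
    return factors, completed, reserves


# Hypothetical cumulative triangle (illustrative numbers only).
triangle = [
    [100.0, 150.0, 165.0],
    [110.0, 168.0],
    [120.0],
]
factors, completed, reserves = chain_ladder(triangle)
```

The development factors are ratios of column sums, which is the form in which the chain-ladder estimates are usually computed in practice.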
Self-assembling insurance claim models
Greg Taylor*, UNSW Australia; Hugh Miller, Taylor Fry Analytics & Actuarial Consulting; Grainne McGuire, Taylor Fry Analytics & Actuarial Consulting
The paper considers claim data sets containing complex features, e.g. simultaneous irregular trends across accident periods, development periods and calendar periods. The literature contains contributions on the modelling of such data sets by various forms of multivariate model, such as the Generalized Linear Model. Such modelling is time-consuming and expensive. The present paper investigates the automation of the modelling process, so that the model assembles itself in the presence of a given data set. This is achieved by means of regularized regression (particularly the lasso) of the claim data with a specified set of spline basis functions as regressors. This form of modelling is applied first to a number of simulated data sets whose properties are fully known. The extent to which the model, applied in an unsupervised fashion, captures the known features embedded in the data is investigated. Subsequently, the unsupervised modelling is applied to a real-world data set. Although this set’s properties are, therefore, strictly unknown, the authors have some 15 years’ experience with it, and are therefore familiar with many of its features. It has been modelled for many years with a Generalized Linear Model, the results of which are compared with those from the self-assembled model. The use of regularized regression in this context requires careful consideration of the tuning parameter(s). This is discussed in some detail. Throughout the exposition, emphasis is also placed on the investigation of forecast efficiency of the self-assembled models, and on comparison between candidate models.
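The idea of regressing on a set of spline basis functions under a lasso penalty can be illustrated with a small sketch. Everything here is an assumption made for illustration rather than the authors' setup: a squared-error loss, a single time dimension, linear ramp functions max(0, t - k) as the spline basis, a hand-written coordinate-descent solver, and the hypothetical names `ramp_basis` and `lasso_cd`.

```python
def soft_threshold(z, g):
    """Soft-thresholding operator used in lasso coordinate descent."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def ramp_basis(t, knots):
    """Linear spline (ramp) basis: one regressor max(0, t - k) per knot k."""
    return [max(0.0, t - k) for k in knots]


def lasso_cd(X, y, lam, n_iter=500):
    """Minimise (1/2n) * sum (y - b0 - X b)^2 + lam * sum |b_j|
    by cyclic coordinate descent; the intercept b0 is not penalised."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    resid = [yi - intercept for yi in y]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        # Re-centre the residuals to update the unpenalised intercept.
        shift = sum(resid) / n
        intercept += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [resid[i] - delta * X[i][j] for i in range(n)]
                beta[j] = new
    return intercept, beta


# Hypothetical data: a linear trend whose slope changes at t = 5.
ts = list(range(10))
y = [t + 2.0 * max(0.0, t - 5.0) for t in ts]
knots = list(range(9))
X = [ramp_basis(t, knots) for t in ts]

b0_fit, beta_fit = lasso_cd(X, y, lam=0.01)     # light penalty
b0_null, beta_null = lasso_cd(X, y, lam=1e6)    # penalty so large all slopes vanish


def rss(b0, beta):
    """Residual sum of squares of a fitted model on the sample."""
    return sum((yi - b0 - sum(b * x for b, x in zip(beta, xi))) ** 2
               for xi, yi in zip(X, y))
```

A non-zero coefficient on a ramp marks a fitted change of slope at that knot, and the penalty `lam` is the tuning parameter that controls how many such changes survive.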
Robust Paradigm Applied to Parameter Reduction in Actuarial Triangle Models
Gary Venter*, The Universe
Robust statistics addresses the impact of outliers on parameter estimation, done from the viewpoint of the model as an over-simplified representation of the process generating the data. This perspective of models being imperfectly specified is what I am calling the robust paradigm. It renders problematic much of classical statistical inference, which assumes that the data is generated from the model. In particular, goodness-of-fit measures are no longer sufficient for comparing models. A somewhat standard response is to base model selection on out-of-sample testing. This is reasonable intuitively anyway and has become common practice even without thinking too much about the robust paradigm, but it is essential under this paradigm. If the data is not coming from the model, how well the model works on other data becomes the key issue. A popular way of standardizing out-of-sample testing is LOO (leave one out), which fits the model to every subset of the data that has one fewer observation than the total sample. Then prediction errors on the omitted points become the basis of model comparison. This can be quite burdensome computationally, but recently a fast approximation to LOO has been developed that makes this approach feasible for almost every model. An approach that has looked towards robust testing and LOO in particular is Lasso estimation. Lasso can be formulated as finding the parameters that minimize the negative log-likelihood plus a selected percentage of the sum of the absolute values of the parameters. This effectively reduces the number of parameters, or at least the degrees of freedom used by the model. However, absent an out-of-sample testing methodology, the selected percentage is left to modeler judgment. Actuarial triangle models, like those for casualty loss development, historically had parameters for every row and column of the triangle. Then Taylor (1977) popularized using diagonal parameters as well.
Zehnwirth and associates introduced methods to reduce the resulting surfeit of parameters by using linear trends across the parameters, with occasional trend changes as needed. This talk looks at parameter shrinkage methods like Lasso and random effects for estimating the trend changes, then using LOO for determining the optimal degree of shrinkage. This is also applied to mortality triangles, for which actuaries have used very similar models, but generalized by interaction among the row, column and diagonal terms, as in Renshaw-Haberman (2006). These models can also be used for loss development. References: Taylor, Greg C. 1977. Separation of Inflation and Other Effects from the Distribution of Non-Life Insurance Claim Delays. ASTIN Bulletin 9: 219–230. Renshaw, A.E., and Haberman, S. 2006. A Cohort-Based Extension to the Lee-Carter Model for Mortality Reduction Factors. Insurance: Mathematics and Economics 38: 556–570.
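The Lasso formulation and the leave-one-out criterion described in this abstract can be written out as follows. The notation is an assumption introduced here, not the speaker's: $\ell$ is the log-likelihood, $\lambda \ge 0$ plays the role of the "selected percentage", $L$ is a generic prediction-error measure, and $\hat{y}_i^{(-i)}(\lambda)$ is the prediction for observation $i$ from the model fitted without it.

```latex
% Lasso: penalised negative log-likelihood
\hat{\beta}(\lambda)
  = \arg\min_{\beta}
    \Bigl\{ -\ell(\beta;\, y) + \lambda \sum_{j} \lvert \beta_j \rvert \Bigr\}

% Leave-one-out: prediction error on each omitted point, summed
\mathrm{LOO}(\lambda)
  = \sum_{i=1}^{n} L\bigl( y_i,\; \hat{y}_i^{(-i)}(\lambda) \bigr),
\qquad
\hat{\lambda} = \arg\min_{\lambda \ge 0} \mathrm{LOO}(\lambda)
```

Read this way, LOO supplies the out-of-sample criterion for choosing the degree of shrinkage $\lambda$ that would otherwise be left to judgment.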
Sarmanov Family of Bivariate Distributions for Multivariate Loss Reserving Analysis
Jean-Philippe Boucher*, UQAM; Anas Abdallah, UQAM; Hélène Cossette, Université Laval; Julien Trufin, Université Libre de Bruxelles
The correlation among multiple lines of business plays a critical role in aggregating claims and thus determining loss reserves for an insurance portfolio. We show that the Sarmanov family of bivariate distributions is a convenient choice to capture the dependencies introduced by various sources, including the common calendar year, accident year and development period effects. The density of the bivariate Sarmanov distributions with different marginals can be expressed as a linear combination of products of independent marginal densities. This pseudo-conjugate property greatly reduces the complexity of posterior computations. In a case study, we analyze an insurance portfolio of personal and commercial auto lines from a major US property-casualty insurer.
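For reference, the general form of a bivariate Sarmanov density is sketched below in conventional notation, which is assumed here rather than taken from the abstract: $f_1, f_2$ are the marginal densities, $\phi_1, \phi_2$ are bounded non-constant kernel functions, and $\omega$ is the dependence parameter.

```latex
h(x_1, x_2)
  = f_1(x_1)\, f_2(x_2)\,
    \bigl[ 1 + \omega\, \phi_1(x_1)\, \phi_2(x_2) \bigr],
\qquad
\int \phi_i(x)\, f_i(x)\, dx = 0, \quad i = 1, 2,

% with \omega restricted so that
1 + \omega\, \phi_1(x_1)\, \phi_2(x_2) \ge 0
\quad \text{for all } (x_1, x_2).

% Expanding the bracket gives a sum of product terms:
h(x_1, x_2)
  = f_1(x_1)\, f_2(x_2)
  + \omega\, \bigl[ \phi_1(x_1) f_1(x_1) \bigr]
            \bigl[ \phi_2(x_2) f_2(x_2) \bigr]
```

The expansion shows why the joint density separates into products of functions of $x_1$ alone and $x_2$ alone, which is the structure behind the pseudo-conjugate property mentioned in the abstract; setting $\omega = 0$ recovers independence.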