Sains Malaysiana 51(7)(2022): 2237-2247

http://doi.org/10.17576/jsm-2022-5107-24

 

Extending the GLM Framework of the Lee-Carter Model with Random Forest Recursive Feature Elimination Based Determinants of Mortality

 (Perluasan Model Kerangka GLM Lee-Carter dengan Faktor Penyebab Kematian Berdasarkan Eliminasi Ciri Rekursif Hutan Rawak)

 

NURUL AITYQAH YAACOB1,2, DHARINI PATHMANATHAN1* & IBRAHIM MOHAMED1

 

1Institute of Mathematical Sciences, Faculty of Science, Universiti Malaya, 50603 Kuala Lumpur, Federal Territory, Malaysia

2Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Cawangan Negeri Sembilan, Kampus Kuala Pilah, 72000 Kuala Pilah, Negeri Sembilan Darul Khusus, Malaysia

 

Received: 23 June 2021/Accepted: 1 January 2022

 

Abstract

The Lee-Carter (LC) model led to the development of many prominent mortality models. This study modifies the generalised linear model (GLM) framework of the LC model (Poisson, negative binomial, and binomial) by incorporating factors that affect mortality. For each of the 14 countries studied, the top three factors affecting mortality were selected using the random forest recursive feature elimination (RF-RFE) method, which eliminates the least important factors based on the correlation of the predictors with the log-mortality rate. The selected factors were added to the GLM models as additional bilinear variates, and the extended models were compared with their original counterparts. The RF-RFE method is effective in selecting the best determinants of mortality because it avoids multicollinearity among the predictor variables. Including the time-factor modulation based on the selected factors improved model adequacy significantly, with the largest improvements in the Poisson and binomial settings. Furthermore, the modified GLM version fits short-base-period data well. This study shows that including exogenous determinants of mortality significantly improves the performance of the model.
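The model structure described in the abstract can be sketched in equations. The first two lines are the standard Poisson log-bilinear form of the LC model with its usual identifiability constraints. The third line is only an illustrative guess at how "additional bilinear variates" for three selected factors might enter the predictor. The symbols $\gamma_x^{(j)}$ and $z_t^{(j)}$ are hypothetical notation introduced here and are not taken from the paper.

```latex
% Standard Poisson LC model: D_{x,t} deaths, E_{x,t} exposure, m_{x,t} central death rate
\begin{align}
  D_{x,t} &\sim \text{Poisson}\left(E_{x,t}\, m_{x,t}\right), \\
  \log m_{x,t} &= \alpha_x + \beta_x \kappa_t,
  \qquad \sum_x \beta_x = 1, \quad \sum_t \kappa_t = 0, \\
% Assumed extension (hypothetical notation): z_t^{(j)} is the j-th selected
% exogenous factor in year t and \gamma_x^{(j)} its age-specific loading
  \log m_{x,t} &= \alpha_x + \beta_x \kappa_t + \sum_{j=1}^{3} \gamma_x^{(j)} z_t^{(j)}.
\end{align}
```

Under this reading, each added term has the same age-by-time product form as $\beta_x \kappa_t$, with the time index supplied by an observed factor rather than an estimated one. The negative binomial and binomial versions would change the distribution in the first line while keeping a predictor of this kind.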

 

Keywords: GLM; Lee-Carter model; mortality; random forest; recursive feature elimination

 


 


*Corresponding author; email: dharini@um.edu.my

 

 

     
