Sains Malaysiana 51(7)(2022): 2237-2247
http://doi.org/10.17576/jsm-2022-5107-24
Extending the GLM Framework of the Lee-Carter Model with Random Forest Recursive Feature Elimination Based Determinants of Mortality
NURUL AITYQAH YAACOB1,2, DHARINI PATHMANATHAN1* & IBRAHIM MOHAMED1
1Institute of Mathematical Sciences, Faculty of Science, Universiti Malaya, 50603 Kuala Lumpur, Federal Territory, Malaysia
2Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Cawangan Negeri Sembilan, Kampus Kuala Pilah, 72000 Kuala Pilah, Negeri Sembilan Darul Khusus, Malaysia
Received: 23 June 2021/Accepted: 1 January 2022
Abstract
The Lee-Carter (LC) model has led to the development of many prominent mortality models. This study modifies the generalised linear model (GLM) framework of the LC model (Poisson, negative binomial, and binomial) by incorporating factors that affect mortality. For each of the 14 countries studied, the top three factors affecting mortality were selected using the random forest recursive feature elimination (RF-RFE) method, which eliminates the least important factors based on the correlation of the predictors with the log-mortality rate. The selected factors were integrated into the GLM models as additional bilinear variates, and the modified models were compared with their original counterparts. The RF-RFE method is effective in selecting the best determinants of mortality while avoiding multicollinearity among the predictor variables. Including the time-factor modulation based on the selected factors improved model adequacy significantly, with substantial improvement evident in the Poisson and binomial settings. Furthermore, the modified GLM version fits short-base-period data well. Overall, the study shows that including exogenous determinants of mortality significantly improves model performance.
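The RF-RFE selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes scikit-learn's `RFE` wrapper around a `RandomForestRegressor`, which ranks features by the forest's impurity-based importances, and it runs on synthetic data. The factor names, the sample size, and the coefficients used to generate the response are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE


def select_top_factors(X, y, names, n_keep=3, seed=0):
    """Return the n_keep factor names that survive RF-RFE."""
    forest = RandomForestRegressor(n_estimators=100, random_state=seed)
    # step=1: refit the forest and drop the single least important
    # factor each round until only n_keep factors remain.
    rfe = RFE(estimator=forest, n_features_to_select=n_keep, step=1)
    rfe.fit(X, y)
    return [name for name, kept in zip(names, rfe.support_) if kept]


# Synthetic stand-in for one country's yearly data (hypothetical names).
rng = np.random.default_rng(0)
names = ["gdp", "health_exp", "co2", "factor_4", "factor_5", "factor_6"]
n_years = 60
X = rng.normal(size=(n_years, len(names)))

# Hypothetical response playing the role of a log-mortality rate:
# driven by the first three factors plus a little noise.
y = (-1.0 * X[:, 0] - 0.8 * X[:, 1] + 0.6 * X[:, 2]
     + rng.normal(scale=0.1, size=n_years))

selected = select_top_factors(X, y, names, n_keep=3)
print(selected)
```

In a multi-country setting, the same function would be called once per country, so each country can end up with its own three factors.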
Keywords: GLM; Lee-Carter model; mortality; random forest; recursive feature elimination
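For orientation, the bilinear extension mentioned in the abstract can be written out under stated assumptions. The first line is the standard Poisson log-bilinear form of the LC model, with deaths D, exposures E, and force of mortality \mu at age x in year t. The second line is only a plausible reading of "additional bilinear variates": the symbols \gamma_x^{(i)} (age-specific sensitivity) and f_t^{(i)} (value of the i-th selected factor in year t) are assumed notation, not taken from the paper.

```latex
% Standard Poisson Lee-Carter model
D_{x,t} \sim \mathrm{Poisson}\!\left(E_{x,t}\,\mu_{x,t}\right), \qquad
\log \mu_{x,t} = \alpha_x + \beta_x \kappa_t

% Assumed extension with the three RF-RFE-selected factors
\log \mu_{x,t} = \alpha_x + \beta_x \kappa_t
               + \sum_{i=1}^{3} \gamma_x^{(i)} f_t^{(i)}
```

Under this reading, each added term has the same age-by-time product structure as \beta_x \kappa_t, except that the time component is an observed covariate rather than a latent index.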
*Corresponding author; email: dharini@um.edu.my