Sains Malaysiana 46(6)(2017): 989–999

http://dx.doi.org/10.17576/jsm-2017-4606-19

 

Pemodelan Taburan Kebarangkalian Zarah Terampai Melampau di Lembah Klang

(Modelling of Probability Distributions of Extreme Particulate Matter in Klang Valley)

 

MUHAMMAD ASLAM MOHD SAFARI* & WAN ZAWIAH WAN ZIN

 

Pusat Pengajian Sains Matematik, Fakulti Sains dan Teknologi, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor Darul Ehsan, Malaysia

 

Received: 2 October 2015/Accepted: 7 December 2016

 

 

ABSTRAK

Kajian ini bertujuan untuk mengenal pasti model statistik terbaik bagi mewakili set data melampau untuk salah satu bahan pencemaran udara iaitu zarah terampai (PM10). Data dari enam buah stesen pengawasan kualiti udara di sekitar Lembah Klang dari tahun 2009 hingga 2011 digunakan dalam kajian ini. Dalam penentuan taburan terbaik, taburan parametrik dan taburan tak berparameter telah diuji. Dua siri data melampau yang digunakan ialah siri data maksimum bulanan dan siri data melangkaui ambang bagi PM10. Seterusnya, dua taburan parametrik iaitu Taburan Melampau Teritlak (GEV) dan Taburan Pareto Teritlak (GPD) masing-masing dipadankan kepada siri data maksimum bulanan dan siri data melangkaui ambang. Kaedah penganggaran parameter L-momen dan ujian kebagusan penyuaian Anderson Darling digunakan dalam pemilihan taburan parametrik terbaik yang juga menentukan kaedah pemilihan data melampau yang mana lebih baik. Bagi kaedah tak berparameter, penganggaran fungsi ketumpatan kernel (KDE) digunakan untuk menentukan taburan terbaik PM10 melampau. Hasil pengiraan ralat min kuasa dua (MSE) mendapati taburan tak berparameter merupakan taburan terbaik bagi data melampau PM10 di kebanyakan stesen kajian. Taburan terbaik bagi setiap stesen kajian seterusnya digunakan bagi menghitung tempoh ulangan PM10 yang sangat berguna bagi pihak yang terbabit.

 

Kata kunci: Fungsi ketumpatan kernel; L-momen; PM10; taburan Nilai Melampau Teritlak; taburan pareto teritlak; taburan tak berparameter; ujian penyuaian Anderson Darling

 

ABSTRACT

This study aims to identify the best statistical model to represent extreme data for one air pollutant, particulate matter with a diameter smaller than 10 micrometres (PM10). Data from six air quality monitoring stations in the Klang Valley from 2009 to 2011 were used. To determine the most appropriate probability distribution, both parametric and non-parametric approaches were tested. Two series of extreme PM10 data were used: the monthly maximum series and the peak-over-threshold series. Two parametric distributions, the Generalized Extreme Value (GEV) distribution and the Generalized Pareto distribution (GPD), were fitted to the monthly maximum and peak-over-threshold series, respectively. The L-moment method of parameter estimation and the Anderson-Darling goodness-of-fit test were used to identify the best parametric distribution, which also determined the more suitable method of selecting extreme data. For the non-parametric approach, kernel density estimation (KDE) was used to determine the best distribution for extreme PM10. Based on the mean squared error (MSE), the non-parametric distribution was found to be the best distribution for extreme PM10 data at most of the monitoring stations. The best distribution for each station was then used to estimate return periods for extreme PM10, which are useful to the relevant authorities.

 

Keywords: Anderson-Darling goodness-of-fit test; generalized extreme value; generalized Pareto; kernel density estimation; L-moments; non-parametric distribution; PM10
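
To illustrate the parametric step named in the abstract, the Python sketch below fits a GEV distribution to a series of block maxima by L-moments and converts the fit into a return level. It is a minimal sketch under stated assumptions, not the authors' code: the function names are hypothetical, the shape parameter is obtained from Hosking's published approximation rather than solved exactly, the shape k follows Hosking's sign convention (k < 0 gives a heavy upper tail), and the demonstration data are synthetic rather than measured PM10 values.

```python
import math
import numpy as np


def sample_l_moments(x):
    """Return (l1, l2, t3): first two sample L-moments and L-skewness.

    Uses the unbiased probability-weighted moments b0, b1, b2.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    j = np.arange(1, n + 1)
    b0 = x.mean()
    b1 = np.sum((j - 1) / (n - 1) * x) / n
    b2 = np.sum((j - 1) * (j - 2) / ((n - 1) * (n - 2)) * x) / n
    l1 = b0
    l2 = 2.0 * b1 - b0
    l3 = 6.0 * b2 - 6.0 * b1 + b0
    return l1, l2, l3 / l2


def fit_gev_lmom(x):
    """Estimate GEV (location xi, scale alpha, shape k) by L-moments.

    The shape uses Hosking's approximation; the formulas break down
    when k is extremely close to zero (the Gumbel limit), which this
    sketch does not handle.
    """
    l1, l2, t3 = sample_l_moments(x)
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c ** 2
    g = math.gamma(1.0 + k)
    alpha = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    xi = l1 - alpha * (1.0 - g) / k
    return xi, alpha, k


def gev_return_level(xi, alpha, k, period):
    """Level exceeded on average once every `period` blocks.

    With monthly maxima, `period` is counted in months.
    """
    prob = 1.0 - 1.0 / period  # non-exceedance probability
    return xi + alpha * (1.0 - (-math.log(prob)) ** k) / k


if __name__ == "__main__":
    # Synthetic monthly maxima (NOT real PM10 data), drawn from a GEV
    # with assumed parameters via the inverse distribution function.
    rng = np.random.default_rng(1)
    u = rng.uniform(size=36)  # 36 months, matching a 3-year record
    demo = 80.0 + 20.0 * (1.0 - (-np.log(u)) ** (-0.1)) / (-0.1)
    xi, alpha, k = fit_gev_lmom(demo)
    print("xi, alpha, k:", xi, alpha, k)
    print("12-month return level:", gev_return_level(xi, alpha, k, 12))
```

With only 36 monthly maxima per station, estimates of the shape parameter are likely to be noisy, which is one reason to compare the fit against alternatives such as the GPD or a kernel-based estimate.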

 


 

*Corresponding author; email: aslammohdsafari@gmail.com

 

 

 
