Sains Malaysiana 46(6)(2017): 989–999
http://dx.doi.org/10.17576/jsm-2017-4606-19
Pemodelan Taburan Kebarangkalian Zarah Terampai Melampau di Lembah Klang
(Modelling of Probability Distributions of Extreme Particulate Matter in Klang Valley)
MUHAMMAD ASLAM MOHD SAFARI* & WAN ZAWIAH WAN ZIN
Pusat Pengajian Sains Matematik, Fakulti Sains dan Teknologi, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor Darul Ehsan, Malaysia
Received: 2 October 2015/Accepted: 7 December 2016
ABSTRACT (translated from the Malay ABSTRAK)
This study aims to identify the best statistical model for representing extreme data sets of one air pollutant, namely particulate matter (PM10). Data from six air quality monitoring stations around the Klang Valley from 2009 to 2011 were used. To determine the best distribution, both parametric and non-parametric distributions were tested. The two extreme data series used were the monthly maximum series and the threshold-exceedance series for PM10. Two parametric distributions, the Generalized Extreme Value (GEV) distribution and the Generalized Pareto distribution (GPD), were then fitted to the monthly maximum series and the threshold-exceedance series, respectively. The L-moment parameter estimation method and the Anderson-Darling goodness-of-fit test were used to select the best parametric distribution, which also determined which method of selecting extreme data was better. For the non-parametric approach, kernel density estimation (KDE) was used to determine the best distribution of extreme PM10. The mean squared error (MSE) calculations showed that the non-parametric distribution was the best distribution for extreme PM10 data at most of the study stations. The best distribution for each study station was then used to compute return periods for PM10, which are very useful to the parties concerned.
Keywords: Kernel density function; L-moments; PM10; Generalized Extreme Value distribution; generalized Pareto distribution; non-parametric distribution; Anderson-Darling goodness-of-fit test
ABSTRACT
This study aims to identify the best statistical model for extreme values of one air pollutant, particulate matter with a diameter smaller than 10 micrometers (PM10). Data from six air quality monitoring stations in the Klang Valley from 2009 to 2011 were used. Both parametric and non-parametric approaches were tested to determine the most appropriate probability distribution. Two series of extreme PM10 data were used: the monthly maximum series and the peaks-over-threshold series. Two parametric distributions, the Generalized Extreme Value (GEV) distribution and the Generalized Pareto distribution (GPD), were fitted to the monthly maximum and the peaks-over-threshold series, respectively. The L-moment parameter estimation method and the Anderson-Darling goodness-of-fit test were used to identify the best parametric distribution as well as the more suitable data series for representing extreme data. For the non-parametric approach, kernel density estimation (KDE) was used to determine the best distribution for extreme PM10. Based on the mean squared error (MSE), the non-parametric distribution was the best distribution for extreme PM10 data at most of the air quality monitoring stations. The best distribution for each station was then used to estimate return periods for extreme PM10, which are useful to the relevant authorities.
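For orientation, the two parametric families named in the abstract and the return level they imply can be written in their common textbook form. The notation below is an assumption and is not taken from the paper: \mu, \sigma and \xi denote location, scale and shape, u is the threshold, \tilde\sigma is the GPD scale, and T is the return period measured in months because the block maxima are monthly. The expressions hold for \xi \neq 0.

```latex
\begin{align}
G(x) &= \exp\left\{-\left[1+\xi\,\frac{x-\mu}{\sigma}\right]^{-1/\xi}\right\},
  & 1+\xi\,\frac{x-\mu}{\sigma} &> 0, \\
H(y) &= 1-\left(1+\frac{\xi\,y}{\tilde\sigma}\right)^{-1/\xi},
  & y &= x-u > 0, \\
x_T &= \mu+\frac{\sigma}{\xi}\left[\left(-\ln\left(1-\frac{1}{T}\right)\right)^{-\xi}-1\right],
  & G(x_T) &= 1-\frac{1}{T}.
\end{align}
```

Here G is the GEV distribution function for monthly maxima, H is the GPD distribution function for exceedances over u, and x_T is the level that a monthly maximum exceeds on average once every T months.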
Keywords: Anderson-Darling goodness-of-fit test; generalized extreme value; generalized Pareto; kernel density estimation; L-moments; non-parametric distribution; PM10
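The parametric route summarised in the abstract (fit a GEV to monthly maxima by L-moments, then read off return levels) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, the data are synthetic, and the shape parameter follows Hosking's convention k = -xi together with his published rational approximation for k. The Anderson-Darling test, the GPD fit and the kernel estimator are not covered.

```python
import math
import random


def sample_l_moments(data):
    """Return the first three sample L-moments (l1, l2, l3).

    Uses the unbiased probability-weighted moments b0, b1, b2 of the
    ordered sample.
    """
    x = sorted(data)
    n = len(x)
    if n < 3:
        raise ValueError("need at least three observations")
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    return b0, 2.0 * b1 - b0, 6.0 * b2 - 6.0 * b1 + b0


def fit_gev_lmom(data):
    """Estimate GEV (loc, scale, k) by L-moments, with k = -xi (Hosking)."""
    l1, l2, l3 = sample_l_moments(data)
    t3 = l3 / l2  # sample L-skewness
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c  # Hosking's approximation for the shape
    if abs(k) < 1e-8:
        # Gumbel limit of the GEV as k -> 0
        scale = l2 / math.log(2.0)
        return l1 - 0.5772156649015329 * scale, scale, 0.0
    g = math.gamma(1.0 + k)
    scale = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    loc = l1 - scale * (1.0 - g) / k
    return loc, scale, k


def gev_quantile(p, loc, scale, k):
    """Quantile of the GEV at non-exceedance probability p (0 < p < 1)."""
    y = -math.log(p)
    if abs(k) < 1e-8:
        return loc - scale * math.log(y)
    return loc + scale / k * (1.0 - y ** k)


def return_level(period, loc, scale, k):
    """Level exceeded on average once every `period` blocks (here: months)."""
    return gev_quantile(1.0 - 1.0 / period, loc, scale, k)


if __name__ == "__main__":
    # Synthetic stand-in for 36 monthly PM10 maxima (three years of data);
    # the "true" parameters below are invented for illustration only.
    rng = random.Random(42)
    maxima = [
        gev_quantile(min(max(rng.random(), 1e-12), 1.0 - 1e-12), 80.0, 15.0, -0.1)
        for _ in range(36)
    ]
    loc, scale, k = fit_gev_lmom(maxima)
    print("loc, scale, k:", loc, scale, k)
    for months in (12, 60, 120):
        print(months, "month return level:", return_level(months, loc, scale, k))
```

With only 36 monthly maxima per station, as three years of data would give, the shape estimate is noisy, so return levels far beyond the record length should be read with caution.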
*Corresponding author; email: aslammohdsafari@gmail.com