Sains Malaysiana 48(12)(2019): 2831–2839
http://dx.doi.org/10.17576/jsm-2019-4812-24
A Relative Tolerance Relation of Rough Set in Incomplete Information
(Perhubungan Toleransi Relatif Set Kasar dalam Maklumat tak Lengkap)
RD ROHMAT SAEDUDIN1*, SHAHREEN KASIM2, HAIRULNIZAM MAHDIN2, MOHD FARHAN MD FUDZEE2, EDI SUTOYO1, IWAN TRI RIYADI YANTO3, ROHAYANTI HASSAN4
1School of Industrial Engineering, Telkom University, 40257, Bandung, West Java, Indonesia
2Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, 86400 Batu Pahat, Johor Darul Takzim, Malaysia
3Department of Information Systems, Universitas Ahmad Dahlan, 55161, Yogyakarta, Indonesia
4Faculty of Computing, Universiti Teknologi Malaysia, 81310 Skudai, Johor Darul Takzim, Malaysia
Received: 21 February 2019/Accepted: 25 December 2019
ABSTRACT
Universities are educational institutions that aim to increase student retention and to ensure that students graduate on time. To achieve these objectives, a university can predict student learning performance using data mining techniques, e.g. by finding essential association rules between student learning and demographic data. However, a key requirement of any mining technique is complete data, i.e. a dataset without missing values, from which interesting rules for the detection system can be generated. Furthermore, given the nature of student data, capturing complete information is problematic because of the high computational time needed to scan the datasets. To overcome these problems, this paper introduces a relative tolerance relation of rough set (RTRS). The novelty of RTRS is that, unlike previous rough set approaches that use the tolerance relation, the non-symmetric similarity relation, and the limited tolerance relation, it builds on the limited tolerance relation by taking into consideration the relative precision between two objects; this is therefore the first work that uses relative precision. Moreover, this paper presents the mathematical properties of the RTRS approach and compares its performance with that of the existing approaches using a real-world student dataset for classifying university students' performance. The results show that the proposed approach outperforms the existing approaches in terms of computational time and accuracy.
Keywords: Classification; educational data mining; incomplete information systems; rough set theory
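To make the baseline relations named in the abstract concrete, the following is a minimal Python sketch of the tolerance relation and the limited tolerance relation on an incomplete information table. It assumes the usual formulations from the rough set literature: two objects are tolerant if they agree on every attribute where both values are known, and limited-tolerant if they are either both entirely unknown or share at least one known attribute and agree on all attributes known in both. The `*` missing-value marker, the function names, and the toy student records are hypothetical illustrations, not data or code from the paper. RTRS itself is not sketched, because the abstract does not define how relative precision is computed.

```python
MISSING = "*"  # assumed marker for an unknown attribute value


def tolerant(x, y):
    """Tolerance relation: values agree wherever both are known."""
    return all(a == b or a == MISSING or b == MISSING for a, b in zip(x, y))


def limited_tolerant(x, y):
    """Limited tolerance relation (usual formulation, assumed here).

    Holds if both objects are unknown on every attribute, or if they share
    at least one known attribute and agree on all attributes known in both.
    """
    if all(a == MISSING and b == MISSING for a, b in zip(x, y)):
        return True
    shared = [(a, b) for a, b in zip(x, y) if a != MISSING and b != MISSING]
    return bool(shared) and all(a == b for a, b in shared)


def relation_classes(table, relation):
    """Map each object to the set of objects related to it."""
    return {
        name: {other for other, row in table.items() if relation(table[name], row)}
        for name in table
    }


# Hypothetical toy records: (gender, entry route, first-year grade band)
students = {
    "s1": ("F", "regular", "high"),
    "s2": ("F", "*", "high"),
    "s3": ("*", "*", "*"),
    "s4": ("M", "regular", "*"),
}

tolerance = relation_classes(students, tolerant)
limited = relation_classes(students, limited_tolerant)

for name in students:
    print(name, sorted(tolerance[name]), sorted(limited[name]))
```

In this toy table the fully unknown record `s3` is tolerant with every other record, while under the limited tolerance relation it is related only to itself. This illustrates why stricter relations are of interest when many values are missing.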
ABSTRAK (English translation)
A university is an educational institution whose objectives include increasing student retention and ensuring that students graduate within the prescribed period. To achieve these objectives, students need to keep their learning performance consistent. Data mining techniques can be used to predict students' learning performance. However, missing or incomplete data limits the effectiveness of data mining techniques, particularly in identifying the relationship between students' learning attributes and their demographic attributes. The issue becomes more difficult when large volumes of student data are involved. This paper therefore proposes a relative tolerance relation of rough set (RTRS) technique to address this issue. The distinctive feature of RTRS in this paper is its use of the relative precision between two attribute objects. In addition, this paper presents the mathematical formulas used in RTRS. The performance of the proposed RTRS technique is then compared with the original techniques using a university student dataset to classify the performance of those students. The results show that the proposed RTRS technique outperforms the existing techniques in terms of computational time and accuracy.
Keywords: Classification; educational data mining; incomplete information systems; rough set theory
*Corresponding author; email: rdrohmat@telkomuniversity.ac.id