CLASSIFICATION OF DENGUE FEVER DISEASE USING A MACHINE LEARNING-BASED RANDOM FOREST ALGORITHM

Arif Fitra Setyawan, Amelia Devi Putri Ariyanto, Fari Katul Fikriah

Abstract


Dengue Hemorrhagic Fever (DHF) is a tropical disease associated with high morbidity and mortality. Early diagnosis is crucial to limiting its adverse effects, but manual diagnostic processes are often inefficient and error-prone. This study develops a DHF classification model using the Random Forest algorithm to support early diagnosis of the disease. The research follows CRISP-DM (Cross-Industry Standard Process for Data Mining), which comprises the stages of Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment. The data were obtained from kaggle.com. During the Data Preparation stage, missing values were removed, categorical features were encoded, the data were normalized, and the dataset was split into training and testing sets. The Random Forest model achieved an accuracy of 88.5%, precision of 88.2%, recall of 65.2%, F1-score of 74.9%, and ROC AUC of 0.810. Feature importance analysis showed that the Gender_Male and Body_Pain features contributed most to DHF classification. Although accuracy and precision are high, the recall of 65.2% means that about a third of actual positive cases were missed, so the model requires further improvement. Random Forest can be used as a tool for early DHF diagnosis, but further adjustments are necessary to raise its performance, particularly its recall. This research provides insight into the factors that contribute to DHF diagnosis and into the practical potential of the model in medical decision support systems.
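The relationship between the reported precision, recall, and F1-score can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' evaluation code: the function names are hypothetical, and the confusion-matrix counts in the second half are invented purely for illustration and are not taken from the study's data.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Values reported in the abstract (already rounded to one decimal place).
reported_precision = 0.882
reported_recall = 0.652

# F1 recomputed from the rounded precision and recall.
f1_check = f1_score(reported_precision, reported_recall)

# Share of actual positive cases the model failed to flag.
missed_share = 1 - reported_recall

print(f"F1 from reported precision/recall: {f1_check:.4f}")
print(f"Share of positive cases missed:    {missed_share:.3f}")

# Hypothetical counts (NOT the study's confusion matrix), shown only to
# illustrate how the four metrics follow from raw counts.
example = classification_metrics(tp=40, fp=10, fn=20, tn=130)
print(example)
```

The recomputed F1 agrees with the reported 74.9% to within the rounding of the inputs, and the complement of recall makes the practical cost explicit: with a recall of 65.2%, roughly 35% of true DHF cases would go undetected, which is why recall is the metric to improve for a screening tool.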




DOI: https://doi.org/10.33387/jiko.v7i2.8496
