COMPARISON OF DECISION TREE AND RANDOM FOREST METHODS IN THE CLASSIFICATION OF DIABETES MELLITUS

Nova Auliyatul Maulidiyyah, Trimono Trimono, Aviolla Terza Damaliana, Dwi Arman Prasetya

Abstract


Diabetes mellitus is a potentially fatal disease caused by the failure of the pancreas to produce enough insulin. In 2021 Indonesia ranked fifth in the world for the number of people with diabetes, at about 19.47 million, and the number continues to rise. A central challenge in diabetes management is distinguishing type 1 from type 2 diabetes correctly, because misdiagnosis can lead to inappropriate treatment and worsen the patient's condition. This study uses a machine learning approach to compare the Decision Tree and Random Forest methods for classifying type 1 and type 2 diabetes mellitus, with the goal of identifying the more effective model for predicting diabetes type from medical record data. The models were compared using k-fold cross-validation and a confusion matrix. In cross-validation, Random Forest achieved an average accuracy of 94% and Decision Tree 93%; both models classified well, but Random Forest was more stable and slightly more accurate. In the confusion matrix evaluation, however, Decision Tree reached 93% accuracy against 91% for Random Forest and made fewer prediction errors (7 versus 9). The most influential variables also differed between the two models, reflecting the distinct characteristics of each approach.
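The two evaluation steps named in the abstract can be sketched in plain Python. This is a minimal illustration, not the authors' code: the helper names (`confusion_matrix`, `accuracy`, `k_fold_indices`) are hypothetical, the test set size of 100 records is only inferred from the reported figures (7 errors at 93% and 9 errors at 91%), and the individual cell counts are invented so that the error totals match.

```python
from typing import Iterator, List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int = 2) -> List[List[int]]:
    """Count predictions; rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix: List[List[int]]) -> float:
    """Share of correct predictions: diagonal sum divided by the total count."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds, without shuffling."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical cell counts (class 0 = type 1, class 1 = type 2).
# Only the error totals (7 and 9) and the accuracies come from the abstract.
dt_matrix = [[48, 3], [4, 45]]  # 3 + 4 = 7 errors out of an assumed 100
rf_matrix = [[47, 4], [5, 44]]  # 4 + 5 = 9 errors out of an assumed 100

print(accuracy(dt_matrix))  # 0.93
print(accuracy(rf_matrix))  # 0.91
```

In a cross-validation run, each `(train, test)` pair from `k_fold_indices` would be used to fit a model on the training indices and score it on the test indices, and the per-fold accuracies would then be averaged.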




DOI: https://doi.org/10.33387/jiko.v7i2.8316
