COMPARISON OF DECISION TREE ALGORITHM AND K-NEAREST NEIGHBOR (KNN) ALGORITHM PERFORMANCE IN DIABETES CASE STUDY

DEO, SILVANO PRATAMA JUBILATE (2023) COMPARISON OF DECISION TREE ALGORITHM AND K-NEAREST NEIGHBOR (KNN) ALGORITHM PERFORMANCE IN DIABETES CASE STUDY. Other thesis, UNIVERSITAS KATOLIK SOEGIJAPRANATA.

Files:
19.K1.0016-SILVANO PRATAMA JUBILATE DEO-COVER_a.pdf (528kB)
19.K1.0016-SILVANO PRATAMA JUBILATE DEO-BAB I_a.pdf (86kB, restricted to registered users only)
19.K1.0016-SILVANO PRATAMA JUBILATE DEO-BAB II_a.pdf (91kB, restricted to registered users only)
19.K1.0016-SILVANO PRATAMA JUBILATE DEO-BAB III_a.pdf (106kB, restricted to registered users only)
19.K1.0016-SILVANO PRATAMA JUBILATE DEO-BAB IV_a.pdf (615kB, restricted to registered users only)
19.K1.0016-SILVANO PRATAMA JUBILATE DEO-BAB V_a.pdf (82kB, restricted to registered users only)
19.K1.0016-SILVANO PRATAMA JUBILATE DEO-DAPUS_a.pdf (207kB)
19.K1.0016-SILVANO PRATAMA JUBILATE DEO-LAMP_a.pdf (279kB, restricted to registered users only)

Abstract

Diabetes is a chronic metabolic disease characterized by elevated blood sugar levels, which can damage the eyes and other vital organs. Type 2 diabetes, the variant that most often affects adults over 18 years old, produces symptoms that are not very noticeable, and identifying it requires a lengthy testing process. Using classification algorithms to predict diabetes can help minimize risk in the early stages of the disease and help health practitioners control its impact. In this study, the author compares the performance of the Decision Tree and K-Nearest Neighbor (KNN) algorithms in predicting diabetes on the Pima Indian Diabetes dataset. Both algorithm models were trained with three dataset split ratios: 80:20, 70:30 and 65:35. In addition, the author applied GridSearchCV hyperparameter tuning to find the best parameters for both models. The accuracy, precision, recall and F1 score of the two models were used to determine which model performs best. The results show that the Decision Tree algorithm without hyperparameter tuning performs best at the 70:30 ratio, with an accuracy of 83.33%. K=7 is the most optimal K value for the KNN algorithm, yielding an accuracy of 77.65%. GridSearchCV hyperparameter tuning works optimally at the 80:20 and 65:35 ratios in finding the best parameters for the Decision Tree algorithm, but the Decision Tree model still exhibits overfitting.
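As a rough illustration of the workflow described above, the following Python sketch trains and evaluates both classifiers with scikit-learn. It is not the thesis code: the file name "diabetes.csv", the "Outcome" label column, and the hyperparameter grids are assumptions made for the example.

# Minimal sketch of the comparison described in the abstract.
# Assumes the Pima Indian Diabetes dataset is available locally as
# "diabetes.csv" with an "Outcome" class column (both are assumptions).
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report

df = pd.read_csv("diabetes.csv")
X, y = df.drop(columns="Outcome"), df["Outcome"]

# One of the three split ratios used in the study (80:20, 70:30, 65:35).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42, stratify=y
)

# Candidate models with illustrative hyperparameter grids (grid values are assumptions).
models = {
    "Decision Tree": (
        DecisionTreeClassifier(random_state=42),
        {"max_depth": [3, 5, 7, None], "criterion": ["gini", "entropy"]},
    ),
    "KNN": (
        KNeighborsClassifier(),
        {"n_neighbors": [3, 5, 7, 9, 11]},
    ),
}

for name, (estimator, grid) in models.items():
    # GridSearchCV selects the best parameters by cross-validated accuracy.
    search = GridSearchCV(estimator, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    y_pred = search.predict(X_test)
    print(name, "best params:", search.best_params_)
    # Accuracy, precision, recall and F1 score on the held-out test split.
    print(classification_report(y_test, y_pred, digits=4))

Comparing the training-set and test-set scores of the tuned Decision Tree is one simple way to observe the overfitting noted in the abstract.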

Item Type: Thesis (Other)
Subjects: 000 Computer Science, Information and General Works > 004 Data processing & computer science
Divisions: Faculty of Computer Science > Department of Informatics Engineering
Depositing User: Mr Yosua Norman Rumondor
Date Deposited: 05 Oct 2023 06:28
Last Modified: 05 Oct 2023 06:28
URI: http://repository.unika.ac.id/id/eprint/32966
