COMPARISON OF EXTREME GRADIENT BOOSTING ALGORITHM AND ARTIFICIAL NEURAL NETWORK ON DIABETES PREDICTION

CARLA, JEVON (2023) COMPARISON OF EXTREME GRADIENT BOOSTING ALGORITHM AND ARTIFICIAL NEURAL NETWORK ON DIABETES PREDICTION. Other thesis, Universitas Katholik Soegijapranata Semarang.

Files:
19.K1.0017-JEVON CARLA-COVER_a.pdf (615kB)
19.K1.0017-JEVON CARLA-BAB I_a.pdf (255kB)
19.K1.0017-JEVON CARLA-BAB II_a.pdf (269kB, restricted to registered users only)
19.K1.0017-JEVON CARLA-BAB III_a.pdf (347kB)
19.K1.0017-JEVON CARLA-BAB IV_a.pdf (308kB)
19.K1.0017-JEVON CARLA-BAB V_a.pdf (541kB)
19.K1.0017-JEVON CARLA-BAB VI_a.pdf (251kB)
19.K1.0017-JEVON CARLA-DAPUS_a.pdf (265kB)
19.K1.0017-JEVON CARLA-LAMP_a.pdf (284kB)

Abstract

Diabetes is a serious disease that causes high blood sugar because the body is unable to produce the amount of insulin required to regulate glucose. It may cause complications or increase the risk of developing other diseases such as heart disease, kidney disease, and blindness. One of the best ways to fight this disease is early diagnosis. When many patient records are available, machine learning classification algorithms can play a major role in predicting whether a person has diabetes. The dataset used is the Diabetes UCI Dataset from Kaggle, which was collected using direct questionnaires from patients of Sylhet Diabetes Hospital in Sylhet, Bangladesh, and approved by a doctor. The dataset has 520 records and 17 attributes. Several studies over the last few decades show that Artificial Neural Networks (ANN) are among the best algorithms for diabetes prediction. Extreme Gradient Boosting (XGBoost) is a popular machine learning algorithm for classification, so the writer wants to find out whether XGBoost can be used for diabetes prediction and to compare it with ANN. The models of both algorithms were trained with the same train/test ratios: 80:20, 75:25, 70:30, 60:40, and 50:50. There are four ANN models, with 3, 4, 5, and 6 hidden layers. There are two XGBoost models: the first uses default parameters and the second uses hyperparameter tuning. The accuracy, precision, recall, and F1 score of the models are compared to find out which one performs better.
Overall, XGBoost achieved better performance, although the results vary by ratio. With the 80:20 ratio, the third ANN model achieved the highest score. With the 75:25 ratio, XGBoost with hyperparameter tuning achieved the highest score, while XGBoost with default parameters had the same score as the third ANN model. With the 70:30 ratio, the third ANN model and both XGBoost models had the same score, which was the highest among all ratios. With the 60:40 ratio, the first three ANN models and XGBoost with default parameters had the same accuracy, but the third ANN model had the highest recall and lower precision than the XGBoost models. With the 50:50 ratio, the second XGBoost model (with hyperparameter tuning) had the best overall performance among the models.
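
As an illustration of the evaluation setup described in the abstract, the following is a minimal Python sketch, not the thesis code. It shows how the 520 records divide under each of the five train/test ratios and how accuracy, precision, recall, and F1 score can be computed for a binary classifier. The function names `split_sizes` and `classification_metrics` are hypothetical, the label vectors are made up for the example, and the integer split arithmetic is an assumption about how the ratios were applied.

```python
def split_sizes(n_records, test_percent):
    """Return (train, test) counts for a given test percentage.

    Assumption: the test share is taken as a whole-number percentage.
    """
    test = n_records * test_percent // 100
    return n_records - test, test


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Train/test sizes for the 520-record dataset at each ratio in the abstract.
sizes = {pct: split_sizes(520, pct) for pct in (20, 25, 30, 40, 50)}
for pct, (train, test) in sizes.items():
    print(f"{100 - pct}:{pct} -> train={train}, test={test}")

# Made-up labels for illustration only; these are not results from the thesis.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters for the comparison above: at the 60:40 ratio, several models tie on accuracy but differ in recall and precision.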

Item Type: Thesis (Other)
Subjects: 000 Computer Science, Information and General Works > 004 Data processing & computer science
Divisions: Faculty of Computer Science > Department of Informatics Engineering
Depositing User: mr AM. Pudja Adjie Sudoso
Date Deposited: 05 Apr 2023 01:39
Last Modified: 05 Apr 2023 01:39
URI: http://repository.unika.ac.id/id/eprint/31409
