ADINARTA, FERREY (2024) PREDICTING FLIGHT DELAY USING RANDOM FOREST ALGORITHM, XGBOOST ALGORITHM, AND STACKING ENSEMBLE METHOD. S1 thesis, UNIVERSITAS KATOLIK SOEGIJAPRANATA.
Abstract
Flight delays are problematic for both passengers and airlines. As flight traffic volume increases, punctuality becomes more important because it significantly influences passenger satisfaction and airlines' financial performance. Many studies have used machine learning algorithms to predict these delays, and some have found that combining more than one algorithm can improve prediction results. This research therefore compares three machine learning ensemble methods (bagging, boosting, and stacking) for predicting flight delays. The objective is to find the best-performing ensemble method for flight delay prediction. The dataset is ‘Flight Status Prediction’ from Kaggle, which is cleaned and modified in the preprocessing steps. The dataset is then fitted to each ensemble model: the Random Forest algorithm as the bagging method, the Extreme Gradient Boosting (XGBoost) algorithm as the boosting method, and a stacking method that combines both algorithms with Random Forest as the first learner. The results are evaluated on accuracy, recall, and precision, and are obtained under two different dimensionality reduction methods: feature selection and principal component analysis (PCA). The XGBoost model performs best at predicting flight delays, with a mean accuracy above 95% under both dimensionality reduction methods, while the Stacking Ensemble method performs worst, with a mean accuracy below 92% under both methods.
Keywords: Flight delay prediction, Ensemble method comparison, Random Forest, Extreme
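The three evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes a binary labeling (1 = delayed, 0 = on time), which the abstract does not state explicitly. The names `BinaryMetrics` and `evaluate` and the sample labels are hypothetical, chosen only for illustration, and are not taken from the thesis.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Compute accuracy, precision, and recall, treating label 1 as 'delayed'."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual delays).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return BinaryMetrics(accuracy, precision, recall)


# Made-up labels for illustration only: 1 = delayed, 0 = on time.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters when delayed flights are a minority of the data, since a model that always predicts "on time" can still score a high accuracy.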
| Item Type: | Thesis (S1) |
|---|---|
| Subjects: | 000 Computer Science, Information and General Works; 000 Computer Science, Information and General Works > 004 Data processing & computer science |
| Divisions: | Faculty of Computer Science > Department of Informatics Engineering |
| Depositing User: | ms. Wiwien Vieragustin |
| Date Deposited: | 10 Jul 2025 07:59 |
| Last Modified: | 10 Jul 2025 07:59 |
| URI: | http://repository.unika.ac.id/id/eprint/37186 |
| Keywords: | UNSPECIFIED |