CHI - SQUARE AND INFORMATION GAIN FEATURE SELECTION FOR HOTEL REVIEW SENTIMENT ANALYSIS USING SUPPORT VECTOR MACHINE

KARUNIA, NATHANAEL (2022) CHI - SQUARE AND INFORMATION GAIN FEATURE SELECTION FOR HOTEL REVIEW SENTIMENT ANALYSIS USING SUPPORT VECTOR MACHINE. Other thesis, Universitas Katholik Soegijapranata Semarang.

Abstract

Booking tickets online through websites and mobile applications has become common, whether for transportation such as flights, vacations such as tours, or lodging such as hotels. Choosing a good hotel depends on reviews from people who have already booked it. Reviews written by visitors to a site or mobile application can be analyzed to produce useful output, and one such analysis is sentiment analysis. The purpose of this study is to find the best method for sentiment analysis with respect to data preprocessing, and to contribute knowledge about which sentiment classification method works well at the preprocessing stage. The classification algorithm is the Support Vector Machine, evaluated under three feature selection settings: no feature selection, chi-square feature selection, and information gain feature selection. The process consists of six steps: data collection, preprocessing, feature extraction, feature selection, classification, and calculating accuracy. To calculate accuracy, I used a confusion matrix and compared the three settings by the accuracy obtained. Chi-square feature selection gave the highest average accuracy at 86.68%, followed by information gain feature selection at 85.78% and no feature selection at 85.24%.
From these results, it can be concluded that chi-square feature selection is the best of the three methods for sentiment analysis of hotel reviews.
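The abstract names chi-square and information gain as feature scoring methods and a confusion matrix as the basis for accuracy, but does not give formulas or code. The sketch below is one common way to compute these quantities for a single term in a two-class (positive/negative) setting. It is an illustration under stated assumptions, not the thesis implementation: the 2x2 term/class contingency table, the function names, and the toy counts are all hypothetical.

```python
import math


def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class table (assumed formulation).

    n11: positive reviews containing the term
    n10: negative reviews containing the term
    n01: positive reviews without the term
    n00: negative reviews without the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    if denom == 0:
        return 0.0  # a row or column is empty, so the term carries no signal
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def entropy(counts):
    """Shannon entropy in bits of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def information_gain(n11, n10, n01, n00):
    """Reduction in class entropy after splitting reviews on term presence."""
    n = n11 + n10 + n01 + n00
    if n == 0:
        return 0.0
    p_with = (n11 + n10) / n  # share of reviews that contain the term
    h_class = entropy([n11 + n01, n10 + n00])
    h_with = entropy([n11, n10])
    h_without = entropy([n01, n00])
    return h_class - p_with * h_with - (1 - p_with) * h_without


def accuracy(tp, fp, fn, tn):
    """Accuracy from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical counts for two terms, not data from the thesis.
    toy_terms = {"clean": (40, 5, 10, 45), "hotel": (25, 25, 25, 25)}
    for term, table in toy_terms.items():
        print(term, chi_square(*table), information_gain(*table))
```

Under either score, a feature selection step would keep the highest-scoring terms and discard the rest before training the classifier. A term spread evenly across both classes, like the hypothetical "hotel" row, scores zero under both measures.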

Item Type: Thesis (Other)
Subjects: 000 Computer Science, Information and General Works > 005 Computer programming, programs & data
Divisions: Faculty of Computer Science > Department of Informatics Engineering
Depositing User: mr AM. Pudja Adjie Sudoso
Date Deposited: 26 Oct 2022 09:27
Last Modified: 26 Oct 2022 09:27
URI: http://repository.unika.ac.id/id/eprint/30028
