Analysis Fake News Using Random Forest And Support Vector Machine

Angkasa, Julius Warih (2020) Analysis Fake News Using Random Forest And Support Vector Machine. Other thesis, Universitas Katolik Soegijapranata Semarang.

Abstract

Information spreads very quickly through online media, and online media can shape how people think about the information they look for and consume. Because the internet makes sharing information so easy, irresponsible individuals and groups frequently use it to spread false news. Developing technology to counter deceptive news therefore requires fake news analysis. This study computes TF-IDF weight values and applies two algorithms, Support Vector Machine and Random Forest. The text is first pre-processed, and the resulting words are used to calculate TF-IDF values. Each TF-IDF value is then assigned a TF-IDF category, which determines grouping based on that value. The TF-IDF results are used to train both algorithms. Testing covers four scenarios, each with a different proportion of training and testing data. The training data is used to train the models, while the testing data is used to evaluate prediction results, accuracy, precision, recall, and F1-score. In the analysis of news titles, using a dataset of 1100 items, the Random Forest algorithm reached its highest accuracy of 72.04% with 1007 training items and its lowest accuracy of 64.62% with 760 training items. The Support Vector Machine algorithm reached its highest accuracy of 65.59% with 1007 training items and its lowest accuracy of 49.67% with 644 training items.
In the analysis of news content, the Random Forest algorithm reached its highest accuracy of 63.44% with 1007 training items and its lowest accuracy of 58.24% with 644 training items. The Support Vector Machine algorithm reached its highest accuracy of 57.69% with 760 training items and its lowest accuracy of 40.19% with 886 training items.
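
The TF-IDF weighting and the evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names `preprocess`, `tf_idf`, and `evaluate` are hypothetical, the pre-processing is reduced to lowercasing and tokenizing, and the TF-IDF variant (term count divided by document length, multiplied by the natural log of N/df) is an assumption, since the abstract does not state the exact formula. The classifiers themselves are not shown.

```python
import math
import re
from collections import Counter


def preprocess(text):
    """Lowercase the text and split it into alphabetic tokens.

    Simplified stand-in for the pre-processing stage; the actual steps
    used in the thesis are not described in the abstract.
    """
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    tokenized = [preprocess(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy data invented for illustration only; not taken from the thesis dataset.
titles = [
    "Vaccine cures all diseases overnight",
    "Government announces new vaccine schedule",
]
weights = tf_idf(titles)
metrics = evaluate([1, 0, 1, 1], [1, 0, 0, 1])
```

In this variant a term that occurs in every document, such as "vaccine" in the toy titles, receives a weight of zero, so only terms that distinguish one document from another contribute to the features passed to a classifier.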

Item Type: Thesis (Other)
Subjects: 000 Computer Science, Information and General Works
Divisions: Faculty of Computer Science > Department of Informatics Engineering
Depositing User: ms F. Dewi Retnowati
Date Deposited: 26 May 2021 02:51
Last Modified: 26 May 2021 02:51
URI: http://repository.unika.ac.id/id/eprint/25288
