IMPROVING ABUSIVE WORD DETECTION ALGORITHMS IN SOCIAL MEDIA WITH CBOW

APHRODITA, TAN TAMARINE MYRNA (2024) IMPROVING ABUSIVE WORD DETECTION ALGORITHMS IN SOCIAL MEDIA WITH CBOW. Skripsi thesis, UNIVERSITAS KATOLIK SOEGIJAPRANATA.

Abstract

The recent rise in abusive language on social media is a serious concern. Users direct abusive words at targets, whether individuals or groups. Abusive words can take the form of sexism, attacks on flaws or disabilities, and other kinds of insult. Activity on social media has become so negative that it does more harm than good. We combine Word2vec with several classification algorithms to detect abusive words in hate speech on social media and to determine which algorithm works best together with Word2vec. The dataset is taken from Kaggle.com. For the implementation, the dataset goes through data preprocessing, including steps such as word embedding, so that the best possible results can be obtained. The final results of this project are presented as confusion matrix tables. In this research, the calculated average F1 value is 86% and the accuracy is also 86%. Based on these results, the most suitable algorithm for this dataset is XGBoost, while the algorithm that pairs best with Word2vec is K-Nearest Neighbor.
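
As an illustration of how accuracy and an average F1 value can be derived from a confusion matrix, the following minimal sketch may help. It is not the thesis code: the labels `abusive` and `clean`, the toy predictions, and all function names are hypothetical, and it assumes that "average F1" means the unweighted (macro) mean of the per-class F1 scores, which the abstract does not specify.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, FN, TN, treating `positive` as the positive class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def f1_from_counts(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall (0.0 if undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def accuracy_and_macro_f1(y_true, y_pred, labels):
    """Return (accuracy, macro-averaged F1) over the given class labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    per_class_f1 = []
    for label in labels:
        c = confusion_counts(y_true, y_pred, label)
        per_class_f1.append(f1_from_counts(c["tp"], c["fp"], c["fn"]))
    return accuracy, sum(per_class_f1) / len(per_class_f1)


# Hypothetical toy data, not taken from the thesis dataset.
y_true = ["abusive"] * 5 + ["clean"] * 5
y_pred = ["abusive"] * 4 + ["clean"] + ["abusive"] + ["clean"] * 4

accuracy, macro_f1 = accuracy_and_macro_f1(y_true, y_pred, ["abusive", "clean"])
print(f"accuracy={accuracy:.2f}, macro F1={macro_f1:.2f}")
```

Accuracy and macro F1 can coincide, as they do in the reported results, when errors are spread evenly across balanced classes. On imbalanced data the two values generally diverge.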

Item Type: Thesis (Skripsi)
Subjects: 000 Computer Science, Information and General Works > 004 Data processing & computer science
Divisions: Faculty of Computer Science > Department of Informatics Engineering
Depositing User: Mr Yosua Norman Rumondor
Date Deposited: 05 May 2024 12:19
Last Modified: 05 May 2024 12:19
URI: http://repository.unika.ac.id/id/eprint/35296
