TOXIC COMMENT CLASSIFICATION COMPARISON BETWEEN LSTM, BILSTM, GRU, AND BIGRU

GIUSTINO, JONATHAN KEVIN (2024) TOXIC COMMENT CLASSIFICATION COMPARISON BETWEEN LSTM, BILSTM, GRU, AND BIGRU. Skripsi thesis, UNIVERSITAS KATOLIK SOEGIJAPRANATA.

Abstract

Toxicity is one of the biggest problems of the modern internet age. The aim of this study is to find an effective method for classifying toxicity in text comments, in this case specifically Wikipedia comments. Older rule-based models and classical machine learning methods depend on hand-crafted rules and are therefore unable to detect toxicity accurately while maintaining precision. To overcome this limitation, recurrent neural networks (RNNs), in particular long short-term memory (LSTM) networks and gated recurrent units (GRU), have been proposed. In this paper, the author compares LSTM, BiLSTM, GRU, and BiGRU on multi-label classification of the Jigsaw toxicity challenge dataset to determine which model is best suited for classifying toxicity. The study also examines the difference between types of pre-processing (with and without text cleaning), and additionally uses the League of Legends Tribunal dataset from Kaggle as a base. The results show that the highest test accuracy was attained by the BiLSTM model without cleaning, with 87.208% accuracy, 55.205% precision, 68.540% recall, and 60.623% F1-score. The results also show that although cleaning during pre-processing improves the metrics by a marginal amount for the regular LSTM, GRU, and BiGRU, it does not always improve the results.
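As a rough illustration of the kind of pipeline the abstract describes, the sketch below builds a BiLSTM multi-label toxicity classifier in Keras. The vocabulary size, sequence length, layer widths, and the cleaning rule are illustrative assumptions, not the configuration used in the thesis.

import re
import numpy as np
from tensorflow.keras import layers, models

MAX_WORDS = 20000   # assumed vocabulary size
MAX_LEN = 150       # assumed padded sequence length
NUM_LABELS = 6      # the Jigsaw challenge defines six toxicity labels

def clean_text(text: str) -> str:
    # Assumed "cleaning" pre-processing step: lowercase and strip non-letters.
    return re.sub(r"[^a-z\s]", " ", text.lower())

def build_bilstm() -> models.Model:
    # Bidirectional LSTM with a sigmoid head so each label is predicted independently.
    model = models.Sequential([
        layers.Input(shape=(MAX_LEN,)),
        layers.Embedding(MAX_WORDS, 128),
        layers.Bidirectional(layers.LSTM(64)),
        layers.Dropout(0.3),
        layers.Dense(NUM_LABELS, activation="sigmoid"),
    ])
    # Binary cross-entropy treats each toxicity label as its own binary decision,
    # which is the standard formulation for multi-label classification.
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

if __name__ == "__main__":
    model = build_bilstm()
    # x: padded integer token sequences; y: 0/1 label matrix of shape (n, NUM_LABELS)
    x = np.random.randint(0, MAX_WORDS, size=(8, MAX_LEN))
    y = np.random.randint(0, 2, size=(8, NUM_LABELS)).astype("float32")
    model.fit(x, y, epochs=1, verbose=0)

Swapping layers.LSTM for layers.GRU gives the BiGRU variant, and dropping the Bidirectional wrapper gives the unidirectional LSTM and GRU models compared in the thesis.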

Item Type: Thesis (Skripsi)
Subjects: 000 Computer Science, Information and General Works
Divisions: Faculty of Computer Science > Department of Informatics Engineering
Depositing User: Mr Yosua Norman Rumondor
Date Deposited: 23 Apr 2024 02:26
Last Modified: 23 Apr 2024 02:26
URI: http://repository.unika.ac.id/id/eprint/35208
