SANTOSO, YOSEFINA OKTAVIANI (2019) TITLE BASED SCIENTIFIC JOURNAL CLUSTERING. Other thesis, UNIKA SOEGIJAPRANATA SEMARANG.
Abstract
Scientific journals are growing rapidly alongside the development of science. According to labs.semanticscholar.org/corpus, the number of scientific journals has exceeded 39 million. This volume makes grouping scientific journals challenging, and the task becomes harder because each scientific journal can cover more than one topic. Specialized methods are therefore needed to group them. One well-known topic modeling method is Latent Dirichlet Allocation (LDA). This research implements the LDA algorithm to perform topic modeling on scientific journals, using the titles as the corpus. During pre-processing, the titles are converted into a bag of words so that they can be used in the distribution stage. The results of the distribution stage are then used for sampling with the Gibbs Sampling method. During the sampling process, testing can also be carried out to determine the optimal parameters. The testing in this study used perplexity to find the optimal number of iterations and topics. The results show that the LDA algorithm successfully performs topic modeling on scientific journals by generating a list of keywords for each topic and grouping documents under each topic. Based on the perplexity comparison, the optimal parameters are 3 topics and 500 iterations. Keywords: Topic Modeling, LDA, perplexity, scientific journal
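The pipeline outlined in the abstract (bag of words, an initial topic distribution, Gibbs sampling, and a perplexity comparison) can be illustrated with a minimal sketch. This is not the thesis implementation: the toy titles, the stopword list, the hyperparameters `alpha` and `beta`, and all function names are assumptions made for illustration, and perplexity is computed here on the same titles used for training, which may differ from the evaluation protocol in the thesis.

```python
import math
import random

# Hypothetical stopword list; the thesis's pre-processing rules are not given here.
STOPWORDS = frozenset({"a", "an", "the", "of", "for", "in", "on", "and", "with"})

def tokenize(title):
    """Lowercase a title and keep alphabetic tokens that are not stopwords."""
    words = "".join(c if c.isalpha() else " " for c in title.lower()).split()
    return [w for w in words if w not in STOPWORDS]

def lda_gibbs(docs, n_topics, n_iter, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assign = []
    # Distribution stage: give every token a random initial topic.
    for d, doc in enumerate(docs):
        assign.append([rng.randrange(n_topics) for _ in doc])
        for w, k in zip(doc, assign[d]):
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
    # Sampling stage: resample each token's topic given all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    # Smoothed estimates of document-topic (theta) and topic-word (phi) distributions.
    theta = [[(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
             for row, doc in zip(doc_topic, docs)]
    phi = [[(c + beta) / (topic_total[k] + n_words * beta) for c in topic_word[k]]
           for k in range(n_topics)]
    return vocab, theta, phi

def perplexity(docs, vocab, theta, phi):
    """exp of the negative mean per-token log-likelihood (lower is better)."""
    word_id = {w: i for i, w in enumerate(vocab)}
    log_lik, n_tokens = 0.0, 0
    for d, doc in enumerate(docs):
        for w in doc:
            p = sum(theta[d][k] * phi[k][word_id[w]] for k in range(len(phi)))
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)

# Hypothetical titles, used only to exercise the sketch.
titles = [
    "Topic Modeling of Scientific Journals",
    "Clustering Journal Titles with Topic Models",
    "Deep Learning for Image Classification",
    "Image Segmentation with Neural Networks",
    "Gene Expression Analysis in Cancer Cells",
    "Protein Structure and Gene Regulation",
]
docs = [tokenize(t) for t in titles]

# Compare candidate topic counts by perplexity, as the abstract describes.
for n_topics in (2, 3, 4):
    vocab, theta, phi = lda_gibbs(docs, n_topics=n_topics, n_iter=50)
    print(n_topics, round(perplexity(docs, vocab, theta, phi), 2))
```

Each row of `theta` gives the topic mixture of one title, which is what allows a title to belong to more than one topic, and the highest-probability entries in each row of `phi` serve as that topic's keyword list. The same loop can be repeated over `n_iter` to compare iteration counts.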
Item Type: | Thesis (Other) |
---|---|
Subjects: | 000 Computer Science, Information and General Works > 050 Magazines, journals & serials |
Divisions: | Faculty of Computer Science > Department of Informatics Engineering |
Depositing User: | Mr Lucius Oentoeng |
Date Deposited: | 10 Jul 2019 08:22 |
Last Modified: | 10 Nov 2020 05:39 |
URI: | http://repository.unika.ac.id/id/eprint/19651 |