Basic Article Information

Title: Stemming Influence on Similarity Detection of Abstract Written in Indonesia
Authors: Tari Mardiana; Teguh Bharata Adji; Indriana Hidayah; et al.
Journal: TELKOMNIKA (Telecommunication Computing Electronics and Control)
Print ISSN: 2302-9293
Publication year: 2016
Volume: 14
Issue: 1
Pages: 219-227
DOI: 10.12928/telkomnika.v14i1.1926
Language: English
Publisher: Universitas Ahmad Dahlan
Abstract: This paper discusses the effect of stemming with the Nazief and Adriani algorithm on similarity detection results for abstracts written in Indonesian. Detecting similarity between the contents of publication abstracts can serve as an early indication of whether plagiarism has occurred in a piece of writing. Text processing usually includes a pre-processing stage, one step of which is stemming: reducing each word to its root word in order to maximize the search process. The stemmed output is converted into a set of word n-grams, and Fingerprint Matching is then applied to measure similarity between texts. Based on the F1-score, which balances precision and recall, detection that applies both stemming and stopword removal gives the best result, with an average of 42%. This is higher than detection using stemming only (31%) or detection without any text pre-processing (34%) when bigrams are used.
Keywords: abstract; Indonesian; similarity; stemming; word
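
The pipeline described in the abstract (pre-processing, word n-grams, fingerprint comparison, F1-score) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made for illustration: the stemmer is a placeholder identity function rather than the Nazief and Adriani algorithm, the stopword list is a tiny hypothetical sample, fingerprints are taken to be MD5 hashes of word bigrams, and similarity is computed as a Dice coefficient over the two fingerprint sets. All function names are hypothetical.

```python
import hashlib
import re

# Hypothetical, tiny stopword sample for illustration only.
STOPWORDS = frozenset({"yang", "dan", "di", "ke", "dari"})


def preprocess(text, stem=lambda w: w, stopwords=frozenset()):
    """Lowercase, tokenize on letters, drop stopwords, then stem.

    `stem` defaults to the identity function; a real Nazief and Adriani
    stemmer would be passed in here (assumption, not provided).
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stem(w) for w in tokens if w not in stopwords]


def word_ngrams(tokens, n=2):
    """Return the list of word n-grams (bigrams by default)."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def fingerprints(tokens, n=2):
    """Hash each word n-gram; the set of hashes is the document fingerprint."""
    return {
        hashlib.md5(gram.encode("utf-8")).hexdigest()
        for gram in word_ngrams(tokens, n)
    }


def similarity(fp_a, fp_b):
    """Dice coefficient between two fingerprint sets (assumed measure)."""
    if not fp_a and not fp_b:
        return 0.0
    return 2 * len(fp_a & fp_b) / (len(fp_a) + len(fp_b))


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


doc_a = "sistem deteksi kemiripan teks"
doc_b = "sistem deteksi kemiripan dokumen"
fp_a = fingerprints(preprocess(doc_a, stopwords=STOPWORDS))
fp_b = fingerprints(preprocess(doc_b, stopwords=STOPWORDS))
print(similarity(fp_a, fp_b))
```

Passing a different `stem` function or an empty `stopwords` set to `preprocess` gives the three configurations the abstract compares: no pre-processing, stemming only, and stemming with stopword removal.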