Basic Article Information

  • Title: DEVELOPING A BILINGUAL MODEL OF WORD EMBEDDING FOR DETECTING INDONESIAN ENGLISH PLAGIARISM
  • Authors: YULYANI ARIFIN ; SANI M ; ISA
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Publication year: 2021
  • Volume: 99
  • Issue: 17
  • Language: English
  • Publisher: Journal of Theoretical and Applied
  • Abstract: The growing number of Internet users makes it easier to access information, even in different languages. Translation applications also let users translate an idea or document and reuse it without proper citation or acknowledgment, so plagiarism is increasing not only in academia but also in industry. Many researchers have proposed methods to detect plagiarism, but mostly for European languages. Previous research on Indonesian-English plagiarism has proposed some methods, but they still depend on machine translation. In this research, we propose a model that can detect cross-language plagiarism without depending on machine translation. The proposed model combines canonical correlation analysis with paragraph-to-vector embeddings. The model is evaluated on a monolingual task and on cross-language plagiarism detection. It achieves good results on monolingual word similarity and on detecting cross-language plagiarism without machine translation. Compared with a benchmark that uses the Fingerprint Method with machine translation, the proposed method detects paraphrased plagiarism more accurately. Although the improvement over the benchmark is not large, the proposed method can detect Indonesian-English cross-language plagiarism without depending on machine translation. Future work should enlarge the Indonesian-English parallel corpus to improve the accuracy of the proposed method.
  • Keywords: Word Embeddings; Plagiarism; Bilingual Model; Cross-Lingual; Canonical Correlation Analysis