Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Online ISSN: 2156-5570
Year: 2019
Volume: 10
Issue: 11
Pages: 646-651
Publisher: Science and Information Society (SAI)
Abstract: Stylometry plays an important role in intrinsic
plagiarism detection, where the goal is to identify potential
plagiarism by analyzing a document for undeclared changes
in writing style. The purpose of this paper is to study the
interaction between syntactic structures, attention mechanisms,
and contextualized word embeddings, as well as their effectiveness
for plagiarism detection. Accordingly, we propose a new style
embedding that combines syntactic trees and the pre-trained
Multi-Task Deep Neural Network (MT-DNN). Additionally, we
use attention mechanisms to sum the embeddings, experimenting
with both a Bidirectional Long Short-Term Memory (BiLSTM)
network and a Convolutional Neural Network (CNN) with
max-pooling for sentence encoding. Our model is evaluated on two
sub-tasks, style change detection and style breach detection, and
compared with two baseline detectors based on classic stylometric
features.
Keywords: Plagiarism detection; style embedding; deep neural
network; stylometry; syntactic trees