Journal: International Journal on Computer Science and Engineering
Print ISSN: 2229-5631
Online ISSN: 0975-3397
Year of publication: 2011
Volume: 3
Issue: 03
Pages: 1118-1123
Publisher: Engg Journals Publications
Abstract: Authorship attribution, the science of inferring characteristics of an author from the characteristics of documents written by that author, has become an urgent need for finding the original author of an anonymous text. This paper proposes a novel approach that attempts to measure an author's style variation using character n-gram profiles. The proposed method identifies the author using initial character n-grams, whereas prior research has based identification on all character n-grams. We expect the approach to be quite stable and attempt to demonstrate this with a small experiment. The results obtained with this technique are quite accurate, with accuracy reaching as high as 100% in identifying the author of an anonymous text. N-gram frequency profiles provide a simple and reliable way to categorize documents in a wide range of classification tasks.
Keywords: Author identification; Character n-gram; Dis-similarity measure; Natural Language Processing.
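
Illustration: the abstract does not spell out how the profiles are built or compared, so the following Python sketch is only one plausible reading, not the paper's method. It assumes that an "initial character n-gram" is the first n characters of each word, and it swaps in a relative-difference dissimilarity of the kind commonly used with n-gram profiles. The function names and the parameters `n` and `size` are hypothetical.

```python
from collections import Counter
import re


def initial_ngram_profile(text, n=3, size=500):
    """Relative frequencies of the `size` most common word-initial n-grams."""
    words = re.findall(r"[a-z]+", text.lower())
    # Assumption: an "initial" n-gram is the first n characters of a word.
    grams = Counter(w[:n] for w in words if len(w) >= n)
    total = sum(grams.values())
    if total == 0:
        return {}
    return {g: c / total for g, c in grams.most_common(size)}


def dissimilarity(p, q):
    """Relative-difference dissimilarity between two profiles (0 = identical).

    Assumption: this measure is a stand-in; the abstract names a
    dissimilarity measure but does not define it.
    """
    score = 0.0
    for g in set(p) | set(q):
        a, b = p.get(g, 0.0), q.get(g, 0.0)
        # a + b > 0 because g occurs in at least one of the two profiles.
        score += ((a - b) / ((a + b) / 2)) ** 2
    return score


def attribute(anonymous_text, known_texts, n=3, size=500):
    """Return the candidate author whose profile is closest to the anonymous text."""
    target = initial_ngram_profile(anonymous_text, n, size)
    scores = {
        author: dissimilarity(target, initial_ngram_profile(sample, n, size))
        for author, sample in known_texts.items()
    }
    return min(scores, key=scores.get)
```

In this sketch, `attribute` takes the anonymous text and a dictionary mapping each candidate author to a sample of known writing, and returns the author with the lowest dissimilarity score. Restricting the profile to word-initial n-grams makes it much smaller than a profile over all character n-grams, which is the contrast the abstract draws with prior work.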