Journal title: Advanced Computing: An International Journal
Print ISSN: 2229-726X
Electronic ISSN: 2229-6727
Publication year: 2013
Volume: 4
Issue: 5
DOI: 10.5121/acij.2013.4501
Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: Authorship attribution mainly deals with literary texts of undecided authorship. It is useful for resolving uncertain authorship, recognizing the authors of unknown texts, spotting plagiarism, and so on. The basic measures used in computational stylometry are word length, sentence length, vocabulary richness, frequencies, etc. Each author has an inherent style of writing that is particular to that author, and statistical quantitative techniques can be used to distinguish that style numerically. The problem can be broken down into three subproblems: author identification, author characterization, and similarity detection. The steps involved are pre-processing, feature extraction, classification, and author identification. Different classifiers can be used for this; here a fuzzy learning classifier and an SVM are used. In author identification, the SVM was found to be more accurate than the fuzzy classifier. The two classifiers were then combined, which gave better accuracy than either the SVM or the fuzzy classifier alone.
Keywords: Authorship attribution; Text pre-processing; Stemming; Feature extraction; Machine learning classifier
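
The stylometric measures named in the abstract (word length, sentence length, vocabulary richness) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function name `stylometric_features`, the regex-based sentence and word splitting, and the use of type-token ratio as the vocabulary richness measure are all assumptions made for illustration.

```python
import re


def stylometric_features(text: str) -> dict:
    """Compute three simple stylometric measures for one text (illustrative only)."""
    # Assumption: sentences end at '.', '!' or '?'; the paper's own splitter is not specified.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Assumption: a word is a run of lowercase letters or apostrophes after lowercasing.
    words = re.findall(r"[a-z']+", text.lower())
    if not words or not sentences:
        return {
            "avg_word_length": 0.0,
            "avg_sentence_length": 0.0,
            "type_token_ratio": 0.0,
        }
    return {
        # Mean number of characters per word.
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Mean number of words per sentence.
        "avg_sentence_length": len(words) / len(sentences),
        # Distinct words divided by total words, a common proxy for vocabulary richness.
        "type_token_ratio": len(set(words)) / len(words),
    }


features = stylometric_features("The cat sat. The dog ran fast!")
print(features)
```

Feature vectors of this kind, computed per text, are what a classifier such as the SVM mentioned in the abstract would typically be trained on; the actual feature set used in the paper may differ.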