Journal: International Journal of Soft Computing & Engineering
Electronic ISSN: 2231-2307
Publication year: 2012
Volume: 2
Issue: 5
Pages: 24-28
Publisher: International Journal of Soft Computing & Engineering
Abstract: Plagiarism detection is a challenging problem. Thousands of documents are available on the Internet today, yet there are no adequate tools to guarantee their uniqueness across so large a domain. PDF documents form a significant portion of this vast collection. Copy detection in digital document databases may give publishers and newsfeed services the guarantees they need to offer their valuable work for others' perusal. We consider the case of comparing a Query Document with a Registered Document. Plagiarism detection techniques are applied by distinguishing between natural and programming languages. In this paper we have implemented SCAM (Standard Copy Analysis Mechanism), a relative measure for detecting copies that compares the word and line frequencies of a new document against those of registered documents. Tests comparing various articles show that, in general, this scheme performs well in detecting documents that have Exact, Partial and Trivial overlap.
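The abstract describes the measure only in outline, so the following is a minimal Python sketch of a frequency-based relative overlap score in that spirit, not the paper's implementation. The function names, the tokenizer, the uniform word weights, and the closeness threshold `epsilon=2.5` are all assumptions made for illustration. Line-level frequencies, which the abstract also mentions, are not covered here.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Lowercase the text and count how often each word occurs."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def subset_measure(freq_a, freq_b, epsilon=2.5):
    """Asymmetric overlap of document A with document B.

    Only words whose counts are "close" in both documents contribute.
    epsilon is an assumed tuning value, not taken from the paper.
    """
    denom = sum(count * count for count in freq_a.values())
    if denom == 0:
        return 0.0
    total = 0
    for word, count_a in freq_a.items():
        count_b = freq_b.get(word, 0)
        if count_b == 0:
            continue
        # The ratio sum equals 2 for identical counts and grows as the
        # two counts diverge; words that diverge too much are skipped.
        if epsilon - (count_a / count_b + count_b / count_a) > 0:
            total += count_a * count_b
    return total / denom


def relative_similarity(query_text, registered_text, epsilon=2.5):
    """Score in [0, 1]: the larger of the two asymmetric overlaps."""
    fq = word_frequencies(query_text)
    fr = word_frequencies(registered_text)
    score = max(subset_measure(fq, fr, epsilon),
                subset_measure(fr, fq, epsilon))
    return min(score, 1.0)
```

Taking the maximum of the two asymmetric scores lets a short query that is wholly contained in a longer registered document still score high, which is the kind of partial overlap the abstract refers to.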