Journal: International Journal of Computer Science & Technology
Print ISSN: 2229-4333
Online ISSN: 0976-8491
Year: 2013
Volume: 4
Issue: 3
Pages: 379-383
Language: English
Publisher: Ayushmaan Technologies
Abstract: Data leakage is defined as the accidental or unintentional distribution of private or sensitive data to an unauthorized entity. It is a silent type of threat. For example, an employee can intentionally or accidentally leak certain sensitive information. The sensitive data of companies and organizations includes intellectual property (IP), financial information, patient information, personal credit card data, and other information depending on the business and the industry. Hence, sensitive data that has already been leaked from the enterprise and is publicly available, for example on the Internet, should be detected. This strategy is post-facto leakage detection. Traditionally, leakage detection is handled by watermarking, in which a unique code is embedded in each distributed copy. We go beyond watermarking and facilitate post-facto detection by identifying a unique embedded signature from within the contents of the original document containing the sensitive data. In this paper, we present an automated, tamper-proof, low-complexity algorithm for detecting data leakage. We extract embedded signatures from sensitive documents and use them in conjunction with search engines to determine whether near-duplicate versions of the document (or portions of it) are available on the Web [3]. The embedded signature is tamper-proof: even if an adversary partially modifies a document, our mechanism can detect duplicate copies. Moreover, if a duplicate copy is present on the Web, our system can detect it with a small number of queries.
Keywords: Data Leakage; Watermarking; Signature; Generic Linguistic Data Consortium (GLDC)
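
The abstract does not say how signatures are extracted or how candidate copies are matched, so the following Python sketch is only an illustration of the general idea under stated assumptions, not the paper's algorithm. It assumes a signature is a handful of three-word phrases built from the rarest words in the document, where rarity comes from a hypothetical word-frequency table (`background_freq`). Each phrase becomes an exact-phrase search query, and a page returned by a search engine is scored by the fraction of the original's phrases it still contains. All function names here are invented for the example, and the search engine call itself is left out.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def shingles(tokens, k=3):
    """Return the set of all k-word phrases (shingles) in the token list."""
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def extract_signature(text, background_freq, n_phrases=3, k=3):
    """Pick the n_phrases shingles whose words are rarest overall.

    background_freq is an assumed word -> frequency mapping; words that are
    missing from it count as frequency 0, i.e. as the rarest possible.
    """
    def rarity(phrase):
        return sum(background_freq.get(word, 0) for word in phrase.split())

    # Sort by summed frequency, then alphabetically so ties are deterministic.
    ranked = sorted(shingles(tokenize(text), k), key=lambda p: (rarity(p), p))
    return ranked[:n_phrases]


def build_queries(signature):
    """Wrap each signature phrase in quotes to form an exact-phrase query."""
    return ['"{}"'.format(phrase) for phrase in signature]


def containment(original, candidate, k=3):
    """Fraction of the original's shingles that also occur in the candidate."""
    a = shingles(tokenize(original), k)
    b = shingles(tokenize(candidate), k)
    return len(a & b) / len(a) if a else 0.0
```

Because only a few rare phrases are queried, the number of search requests stays small, and because the final comparison uses overlapping phrases rather than an exact match, a copy with a few altered words still scores close to 1.0. The threshold that separates a leak from an unrelated page would have to be tuned on real data.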