Journal: International Journal of Data Mining & Knowledge Management Process
Print ISSN: 2231-007X
Online ISSN: 2230-9608
Publication year: 2015
Volume: 5
Issue: 5
Page: 65
DOI: 10.5121/ijdkp.2015.5505
Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: Data leakage is the transmission of confidential data to an unauthorized party. Identifying confidential data is a major challenge for organizations. We developed a system that uses data mining techniques to identify an organization's confidential data. First, we create clusters from the training data set. Next, we identify confidential terms and context terms for each cluster. Finally, based on the confidential terms and context terms, the confidentiality level of the document under inspection is calculated as a score. If the score exceeds a predefined threshold, the document is blocked and marked as confidential.
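Illustrative sketch (not from the paper): the abstract does not give the scoring formula, so the Python below is only one plausible reading of the final scoring and threshold step. It assumes the per-cluster confidential terms (with weights) and context terms have already been derived from the training clusters. The names `score_document`, `is_confidential`, `window`, the weights, and the threshold value are all hypothetical.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score_document(text, confidential_terms, context_terms, window=5):
    """Sum the weights of confidential terms that have a context term nearby.

    Assumption: a confidential term counts only when at least one context
    term occurs within `window` tokens on either side of it.
    """
    tokens = tokenize(text)
    score = 0.0
    for i, tok in enumerate(tokens):
        weight = confidential_terms.get(tok)
        if weight is None:
            continue
        # Neighbouring tokens, excluding the confidential term itself.
        nearby = tokens[max(0, i - window):i] + tokens[i + 1:i + window + 1]
        if any(t in context_terms for t in nearby):
            score += weight
    return score

def is_confidential(text, clusters, threshold):
    """Flag the document if its best per-cluster score exceeds the threshold."""
    best = max(
        score_document(text, c["confidential"], c["context"]) for c in clusters
    )
    return best > threshold

# Hypothetical cluster data, for illustration only.
clusters = [
    {"confidential": {"merger": 2.0, "salary": 1.5},
     "context": {"board", "payroll"}},
]

if is_confidential("The board approved the merger yesterday.", clusters, 1.0):
    print("blocked: marked as confidential")
```

Requiring a nearby context term is one way to keep a term such as "merger" from triggering a block when it appears in an unrelated setting; the paper itself may combine the two term sets differently.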