Basic Article Information

Title: Big Data Knowledge Mining
Authors: Huda Umar Banuqitah; Fathy Eassa; Kamal Jambi et al.
Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Online ISSN: 2156-5570
Publication year: 2016
Volume: 7
Issue: 11
DOI: 10.14569/IJACSA.2016.071123
Publisher: Science and Information Society (SAI)
Abstract: The Big Data (BD) era has arrived. Big data applications have emerged in which accumulated information has grown beyond the ability of current software tools to capture, manage, and process it within a tolerably short time. Volume is not the only characteristic that defines big data; velocity, variety, and value define it as well. Many resources contain BD that should be processed. The biomedical research literature is one of many domains that hide rich knowledge. MEDLINE is a huge biomedical research database that remains a significantly underutilized source of biological information. Discovering useful knowledge in such a huge corpus raises many problems related to the type of information, such as the related concepts of the domain of the texts and the semantic relationships associated with them. This paper proposes a two-level agent-based system for self-supervised relation extraction from MEDLINE using the Unified Medical Language System (UMLS) knowledge base. The model uses a self-supervised approach to relation extraction (RE), constructing enhanced training examples from UMLS information combined with hybrid text features. The model combines the Apache Spark and HBase BD technologies and multiple data mining and machine learning techniques with a Multi-Agent System (MAS). The system achieves better results than the current state of the art and a naïve approach in terms of accuracy, precision, recall, and F-score.
Keywords: thesai; IJACSA Volume 7 Issue 11; Knowledge Mining; Relation Extraction; Self-supervised; Big Data; Agent