
Article information

  • Title: Development of a multi-agent system for solving domain dictionary construction problem
  • Authors: Vadym Yaremenko; Oleksandr Syrotiuk
  • Journal: Technology Audit and Production Reserves
  • Electronic ISSN: 2706-5448
  • Year of publication: 2020
  • Volume: 4
  • Issue: 2
  • Pages: 27-30
  • DOI: 10.15587/2706-5448.2020.208400
  • Language: English
  • Publisher: PC Technology Center
  • Abstract: The object of research is the use of multi-agent systems for text data mining. The need for this study arose from the growing amount of textual information generated in the world. Accordingly, it is necessary to develop and study methods for processing it, as well as ways to use the results of that processing, because such methods cannot exist in isolation from practice. At the same time, multi-agent systems (MAS) are being developed in which agents are endowed with some kind of intelligence; these systems can be easily scaled. The use of MAS for text analysis is a promising area. The following methods of text data analysis were used in this study: TF-IDF, RAKE, Word2Vec neural network models, and TextRank. The algorithms were run and their results compared. A corpus of documents (10–12 texts, 5732–12331 words) from the subject areas of physics and biology was used as a test set. Based on the results of the study, one method was chosen, on the basis of which the MAS was built to solve the problem. Additionally, Schulze methods (with one and with several winners) were used for voting. Additional studies were carried out on the resulting system concerning its accuracy and speed, as well as the influence of system parameters on its operation. It was found that TF-IDF-based analysis is useful for finding terms in documents with a weak context. The resulting system shows an accuracy of 75% (3 of every 4 words proposed by the system are terms). The maximum operating time on test cases is 2–3 seconds, which is achieved through the use of parallel calculations and a modification of the Schulze method. The results obtained in this paper are heuristic (ontology is a rather vague concept) and require additional elaboration by experts in the relevant fields. However, the results are positive within this experiment.
  • Keywords: TF-IDF; RAKE; TextRank; Word2Vec; Schulze method; text data; frequency analysis; parallel computing; multi-agent system
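
The abstract names TF-IDF scoring and Schulze voting but does not say how the agents are wired together, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes one hypothetical agent per document, each casting a ballot of its highest-scoring TF-IDF terms, with a standard single-ranking Schulze count (winning votes, strongest paths) merging the ballots. The function names `tf_idf`, `schulze_ranking`, and `propose_terms`, the `top_k` parameter, and the toy documents are all illustrative.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return scores


def schulze_ranking(candidates, ballots):
    """Rank candidates with the Schulze method.

    Each ballot lists candidates, most preferred first. Candidates
    missing from a ballot are treated as tied for last place.
    """
    # d[a][b]: number of ballots that prefer a over b.
    d = {a: {b: 0 for b in candidates} for a in candidates}
    for ballot in ballots:
        pos = {c: i for i, c in enumerate(ballot)}
        last = len(ballot)
        for a in candidates:
            for b in candidates:
                if a != b and pos.get(a, last) < pos.get(b, last):
                    d[a][b] += 1
    # p[a][b]: strength of the strongest path from a to b.
    p = {
        a: {b: (d[a][b] if d[a][b] > d[b][a] else 0) for b in candidates}
        for a in candidates
    }
    for k in candidates:
        for i in candidates:
            if i == k:
                continue
            for j in candidates:
                if j == i or j == k:
                    continue
                p[i][j] = max(p[i][j], min(p[i][k], p[k][j]))
    # Order by number of pairwise path victories; ties broken by name.
    wins = {
        a: sum(p[a][b] > p[b][a] for b in candidates if b != a)
        for a in candidates
    }
    return sorted(candidates, key=lambda c: (-wins[c], c))


def propose_terms(docs, top_k=2):
    """Assumed wiring: each document agent votes for its top_k terms."""
    ballots = []
    for s in tf_idf(docs):
        ranked = sorted((t for t in s if s[t] > 0), key=lambda t: (-s[t], t))
        ballots.append(ranked[:top_k])
    candidates = sorted({t for b in ballots for t in b})
    return schulze_ranking(candidates, ballots)


# Toy corpus, invented for illustration only.
toy_docs = [
    ["quark", "quark", "the"],
    ["quark", "gluon", "the"],
    ["cell", "gene", "the"],
]
print(propose_terms(toy_docs))
```

In this sketch a word that occurs in every document, such as "the", gets a TF-IDF score of zero and never reaches a ballot. The multi-winner Schulze variant, the parallel execution, and the modification of the Schulze method mentioned in the abstract are not reproduced here.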