Article Information

Title: Machine Learning-Based Keywords Extraction for Scientific Literature
Authors: Chunguo Wu; Maurizio Marchese; Jingqing Jiang et al.
Journal: Journal of Universal Computer Science
Print ISSN: 0948-6968
Year of publication: 2007
Volume: 13
Issue: 10
Pages: 1471-1483
Publisher: Graz University of Technology and Know-Center
Abstract: With the currently growing interest in the Semantic Web, keywords/metadata extraction is coming to play an increasingly important role. Keywords extraction from documents is a complex task in natural language processing. Ideally this task involves sophisticated semantic analysis. However, the complexity of the problem makes current semantic analysis techniques insufficient. Machine learning methods can support the initial phases of keywords extraction and can thus improve the input to further semantic analysis phases. In this paper we propose a machine learning-based keywords extraction method for a given document domain, namely scientific literature. More specifically, the least squares support vector machine is used as the machine learning method. The proposed method takes advantage of machine learning techniques and moves the complexity of the task to the process of learning from appropriate samples obtained within a domain. Preliminary experiments show that the proposed method is capable of extracting keywords from the domain of scientific literature with promising results.
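
Illustrative note: the abstract names the least squares support vector machine (LS-SVM) as the learning method but gives no implementation details. The sketch below shows the standard LS-SVM binary classifier, which is trained by solving one linear system rather than a quadratic program, applied to a toy "keyword vs. non-keyword" task. Everything specific here is an assumption made for illustration and is not taken from the paper: the two candidate features (a normalized term frequency and a relative first-occurrence position), the RBF kernel, the values of `gamma` and `sigma`, and all function names.

```python
import math

def rbf_kernel(u, v, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def train_lssvm(X, y, gamma, sigma):
    """Train an LS-SVM classifier; labels in y must be +1 or -1.

    Solves the linear system
        [ 0   y^T           ] [ b     ]   [ 0 ]
        [ y   Omega + I/gamma ] [ alpha ] = [ 1 ]
    with Omega[i][j] = y[i] * y[j] * K(X[i], X[j]).
    Returns the bias b and the list of multipliers alpha.
    """
    n = len(X)
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        A[0][i + 1] = y[i]
        A[i + 1][0] = y[i]
        for j in range(n):
            A[i + 1][j + 1] = y[i] * y[j] * rbf_kernel(X[i], X[j], sigma)
        A[i + 1][i + 1] += 1.0 / gamma  # regularization on the diagonal
    solution = solve_linear(A, [0.0] + [1.0] * n)
    return solution[0], solution[1:]

def predict(x, X, y, b, alpha, sigma):
    """Return +1 (keyword) or -1 (non-keyword) for feature vector x."""
    score = b + sum(a * yi * rbf_kernel(x, xi, sigma)
                    for a, yi, xi in zip(alpha, y, X))
    return 1 if score >= 0 else -1

# Hypothetical toy data: (normalized term frequency, relative first position).
X_train = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.05),   # labeled as keywords
           (0.1, 0.9), (0.2, 0.8), (0.05, 0.7)]   # labeled as non-keywords
y_train = [1, 1, 1, -1, -1, -1]
SIGMA = 0.5   # assumed kernel width
GAMMA = 10.0  # assumed regularization constant

b, alpha = train_lssvm(X_train, y_train, GAMMA, SIGMA)
label = predict((0.85, 0.1), X_train, y_train, b, alpha, SIGMA)
```

In a real pipeline the feature vectors would come from candidate phrases extracted from documents, and `gamma` and `sigma` would be tuned on held-out data; the paper itself should be consulted for the features and settings the authors actually used.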