Article Information

  • Title: Hashing and Enriching Short Texts Query Search Engine through Semantic Signals
  • Authors: Panditi Santhi; M. Venkatesh Naik
  • Journal: International Journal of Innovative Research in Computer and Communication Engineering
  • Print ISSN: 2320-9798
  • Electronic ISSN: 2320-9801
  • Year: 2017
  • Volume: 5
  • Issue: 9
  • Page: 15242
  • DOI: 10.15680/IJIRCCE.2017.0509070
  • Publisher: S&S Publications
  • Abstract: Short texts differ from long documents: they have unique characteristics that make them difficult to understand and handle. Every day, billions of short texts are generated in the form of search queries, news titles, tags, chatbot messages, social media posts, etc. Most of these short texts contain fewer than 5 words, and they do not always observe the syntax of written language. Hence, traditional NLP methods do not always apply to short texts. Many applications, including search engines, question answering systems, and online advertising, rely on short texts. Because they lack context, short texts usually suffer from data sparsity and ambiguity in their representations, so retrieving, classifying, and processing them becomes a very difficult task. In this paper, we propose a neural network based approach for understanding short texts, in which we represent texts as vectors using Recurrent Neural Networks (RNNs) and use a semantic network to determine intent for clustering and understanding short texts. The task of short text understanding, or conceptualization, can be divided into three steps: text segmentation, type detection, and concept labeling. In text segmentation, the input text is first pre-processed to remove any stop words and is then divided into a sequence of terms. Type detection is incorporated into the framework for short text understanding and helps to perform disambiguation based on the various types of contextual information present in the text. Finally, concept labeling is performed to discover the hidden semantics of the natural language text.
  • Keywords: Short text understanding; conceptualization; semantic labeling; text segmentation; Recurrent Neural Networks.
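
The text segmentation step described in the abstract (remove stop words, then divide the text into a sequence of terms) can be illustrated with a minimal sketch. This is not the authors' implementation: the stop-word list, the vocabulary of known terms, the `MAX_TERM_LEN` limit, and the greedy longest-match strategy are all assumptions made here for illustration, since the record does not say how terms are matched.

```python
import re

# Hypothetical stop-word list; the paper does not specify which one it uses.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "for", "to", "and", "is"}

# Hypothetical vocabulary of known (possibly multi-word) terms.
VOCABULARY = {"new york", "new york times", "harry potter", "watch", "ipad"}

# Assumed upper bound on the number of tokens in a single term.
MAX_TERM_LEN = 3


def segment(text):
    """Remove stop words, then split the rest into terms by greedy longest match."""
    tokens = [
        tok for tok in re.findall(r"[a-z0-9]+", text.lower())
        if tok not in STOP_WORDS
    ]
    terms = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first; fall back to the single token.
        for n in range(min(MAX_TERM_LEN, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if n == 1 or candidate in VOCABULARY:
                terms.append(candidate)
                i += n
                break
    return terms


print(segment("watch Harry Potter on the iPad"))
```

Under these assumptions the query is split into `watch`, `harry potter`, and `ipad`. Type detection and concept labeling would then operate on these terms. Note that removing stop words before matching, as the abstract describes, means terms can be formed across a removed word.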