
Article information

  • Title: Text pre-processing of multilingual for sentiment analysis based on social network data
  • Authors: Neha Garg; Kamlesh Sharma
  • Journal: International Journal of Electrical and Computer Engineering
  • Electronic ISSN: 2088-8708
  • Publication year: 2022
  • Volume: 12
  • Issue: 1
  • Pages: 776-784
  • DOI: 10.11591/ijece.v12i1.pp776-784
  • Language: English
  • Publisher: Institute of Advanced Engineering and Science (IAES)
  • Abstract: Sentiment analysis (SA) is an enduring area of research, especially in the field of text analysis. Text pre-processing is an important step in performing SA accurately. This paper presents a text processing model for SA that applies natural language processing techniques to Twitter data. The basic phases for machine learning are text collection, text cleaning, pre-processing, feature extraction, and categorization of the data according to the SA techniques. Keeping the focus on Twitter data, the data is extracted in a domain-specific manner. In the data cleaning phase, noisy data, missing data, punctuation, tags and emoticons are considered. For pre-processing, tokenization is performed, followed by stop word removal (SWR). The article provides insight into the techniques used for text pre-processing and the impact of their presence on the dataset. After text pre-processing was applied, the accuracy of the classification techniques improved and dimensionality was reduced. The proposed corpus can be utilized in market analysis, customer behaviour, polling analysis, and brand monitoring. The text pre-processing process can serve as a baseline for applying predictive analysis, machine learning and deep learning algorithms, and can be extended according to the problem definition.
  • Keywords: code-switch; linguistic-switching; machine learning; multilingual; pre-processing; sentiment analysis
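
The cleaning, tokenization and stop word removal phases named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `clean_tweet` and `preprocess`, the regular expressions, and the tiny English stop-word list are all assumptions made for illustration, since this record does not specify them.

```python
import re
import string

# Tiny illustrative stop-word list (an assumption; the paper's list is not given here).
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "to", "of",
              "and", "in", "on", "for", "it", "this", "i"}

HTML_TAG_RE = re.compile(r"<[^>]+>")           # markup tags such as <b>...</b>
URL_RE = re.compile(r"https?://\S+|www\.\S+")  # links
MENTION_RE = re.compile(r"@\w+")               # @user tags
# A few common ASCII emoticons such as :) ;-( :D (hypothetical pattern, not exhaustive).
EMOTICON_RE = re.compile(r"[:;=][\-']?[\)\(\]\[DPp/\\|]")
# Rough emoji ranges (pictographs, misc symbols, dingbats); also not exhaustive.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
# Map every ASCII punctuation character (including '#') to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_tweet(text):
    """Cleaning phase: handle missing data, then strip tags, links, emoticons, punctuation."""
    if not text:  # missing data (None or empty string)
        return ""
    # URLs are removed before emoticons so that "://" is not mistaken for an emoticon.
    for pattern in (HTML_TAG_RE, URL_RE, MENTION_RE, EMOTICON_RE, EMOJI_RE):
        text = pattern.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    return " ".join(text.lower().split())  # lowercase and collapse whitespace


def preprocess(text, stop_words=STOP_WORDS):
    """Pre-processing phase: whitespace tokenization followed by stop word removal."""
    tokens = clean_tweet(text).split()
    return [tok for tok in tokens if tok not in stop_words]


if __name__ == "__main__":
    tweet = "@user I love the new phone :) <b>great</b> camera!!! https://t.co/abc #happy"
    print(preprocess(tweet))
    # ['love', 'new', 'phone', 'great', 'camera', 'happy']
```

Whitespace tokenization keeps the sketch dependency-free and leaves non-Latin scripts untouched, but for code-switched or multilingual tweets a language-aware tokenizer and per-language stop-word lists would likely be needed.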