首页    期刊浏览 2025年02月20日 星期四
登录注册

文章基本信息

  • 标题:A Guide to Text Analysis with Latent Semantic Analysis in R with Annotated Code: Studying Online Reviews and the Stack Exchange Community
  • 作者:Gefen, David ; Endicott, James E. ; Fresneda, Jorge E.
  • 期刊名称:Communications of the Association for Information Systems
  • 印刷版ISSN:1529-3181
  • 出版年度:2017
  • 卷号:41
  • 期号:1
  • 页码:21
  • 出版社:Association for Information Systems
  • 摘要:In this guide, we introduce researchers in the behavioral sciences in general and MIS in particular to text analysis as done with latent semantic analysis (LSA). The guide contains hands-on annotated code samples in R that walk the reader through a typical process of acquiring relevant texts, creating a semantic space out of them, and then projecting words, phrase, or documents onto that semantic space to calculate their lexical similarities. R is an open source, popular programming language with extensive statistical libraries. We introduce LSA as a concept, discuss the process of preparing the data, and note its potential and limitations. We demonstrate this process through a sequence of annotated code examples: we start with a study of online reviews that extracts lexical insight about trust. That R code applies singular value decomposition (SVD). The guide next demonstrates a realistically large data analysis of Stack Exchange, a popular Q&A site for programmers. That R code applies an alternative sparse SVD method. All the code and data are available on github.com.
Loading...
联系我们|关于我们|网站声明
国家哲学社会科学文献中心版权所有