Journal title: International Journal of Signal Processing, Image Processing and Pattern Recognition
Print ISSN: 2005-4254
Publication year: 2014
Volume: 7
Issue: 6
Pages: 75-84
DOI: 10.14257/ijsip.2014.7.6.07
Publisher: SERSC
Abstract: This survey paper explains how personal name aliases are extracted and retrieved from the web using various techniques, with the help of web crawls. Reviewing the existing methods deepens the understanding of the alias extraction and retrieval process. The paper also describes how aliases are ranked using page counts on the web and word co-occurrence in anchor text, and how techniques such as term frequency (tf), inverse document frequency (idf), the log-likelihood ratio, and chi-squared tests are used to measure association and similarity between words. Existing methods rely on a pattern extraction algorithm or a string matching algorithm to extract patterns from snippets. As an alternative to these algorithms, the survey points to graph mining as a proposed method for extracting personal name aliases from the web.
Keywords: Text mining; Information extraction; Web text analysis; Sentiment analysis
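
Illustrative note: the abstract names chi-squared tests and the log-likelihood ratio as measures of association between words. The following is a minimal sketch of how both could be computed from a 2x2 contingency table of co-occurrence counts for a name and a candidate alias. It is not taken from the paper; the function names, the table layout (k11 = both occur, k12 = alias only, k21 = name only, k22 = neither), and the example counts are assumptions made for illustration.

```python
import math


def chi_squared(k11, k12, k21, k22):
    """Pearson chi-squared statistic for a 2x2 contingency table.

    Hypothetical layout: k11 = name and alias co-occur, k12 = alias only,
    k21 = name only, k22 = neither.
    """
    n = k11 + k12 + k21 + k22
    row1, row2 = k11 + k12, k21 + k22
    col1, col2 = k11 + k21, k12 + k22
    denom = row1 * row2 * col1 * col2
    if denom == 0:
        # A zero marginal means the statistic is undefined; report no association.
        return 0.0
    return n * (k11 * k22 - k12 * k21) ** 2 / denom


def log_likelihood_ratio(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for the same 2x2 table."""
    n = k11 + k12 + k21 + k22
    if n == 0:
        return 0.0
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    cells = ((k11, 0, 0), (k12, 0, 1), (k21, 1, 0), (k22, 1, 1))
    g2 = 0.0
    for observed, i, j in cells:
        if observed > 0:  # empty cells contribute nothing to the sum
            expected = rows[i] * cols[j] / n
            g2 += observed * math.log(observed / expected)
    return 2.0 * g2


if __name__ == "__main__":
    # Made-up counts: the pair co-occurs far more often than chance predicts.
    print(chi_squared(30, 5, 10, 955))
    print(log_likelihood_ratio(30, 5, 10, 955))
```

Under these assumptions, higher scores indicate that the name and the candidate alias co-occur more often than independence would predict, so candidates could be ranked by either score. Both statistics are zero when the counts match the independence expectation exactly.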