Article Information

  • Title: Front Index Extraction from Research Documents Using Meta-Content Framework
  • Authors: Tripti Sharma; Sarang Pitale
  • Journal: Indian Journal of Education and Information Management
  • Print ISSN: 2277-5367
  • Online ISSN: 2277-5374
  • Publication year: 2012
  • Volume: 1
  • Issue: 7
  • Pages: 301-305
  • Language: English
  • Publisher: Indian Society for Education and Environment
  • Abstract: Text mining continues to open new areas of research, and front index extraction is one of them. The front index of a book is a tabular arrangement of topics and subtopics with their page numbers. Ongoing research on front index extraction from e-books uses various techniques, such as image processing. The present scheme extracts the front index from research documents using a string matching algorithm. The paper also describes the working of the Meta-Content Framework for E-books (MCFE), which runs the front index extraction process and uses the extracted front index as meta-information. The framework takes an e-book in PDF form and extracts the front index by converting the PDF to text. The framework is developed using Java and the iText library.
  • Keywords: Text Mining, Front Index, e-book, Meta-information, PDF, Java, iText
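
The abstract names a string matching step over text converted from PDF but does not spell out the matching rule. The Java sketch below illustrates one plausible form of that step and should not be read as the authors' algorithm. It assumes the page text has already been extracted (for example with iText) into one string per page, and that headings look like "1. Introduction" or "2.1 Text Conversion". The class name `FrontIndexSketch`, the `Entry` type, and the heading pattern are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FrontIndexSketch {

    // Assumed heading shape: a section number such as "1" or "2.3",
    // an optional trailing dot, then a title starting with a capital letter.
    private static final Pattern HEADING =
            Pattern.compile("^(\\d+(?:\\.\\d+)*)\\.?\\s+([A-Z].{0,80})$");

    /** One front index row: section number, title, and 1-based page number. */
    public static final class Entry {
        public final String number;
        public final String title;
        public final int page;

        Entry(String number, String title, int page) {
            this.number = number;
            this.title = title;
            this.page = page;
        }
    }

    /**
     * Scans already-extracted text, one string per page, and collects
     * every line that matches the assumed heading pattern.
     */
    public static List<Entry> extract(List<String> pages) {
        List<Entry> index = new ArrayList<>();
        for (int p = 0; p < pages.size(); p++) {
            for (String line : pages.get(p).split("\\R")) {
                Matcher m = HEADING.matcher(line.trim());
                if (m.matches()) {
                    index.add(new Entry(m.group(1), m.group(2).trim(), p + 1));
                }
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Made-up sample pages, used only to exercise the sketch.
        List<String> pages = Arrays.asList(
                "1. Introduction\nbody text of the first page",
                "2. Proposed Framework\n2.1 Text Conversion\nbody text of the second page");
        for (Entry e : extract(pages)) {
            System.out.println(e.number + "  " + e.title + "  p." + e.page);
        }
    }
}
```

A pattern this simple also matches any body line that happens to begin with a number followed by a capitalized word, so a practical version would need extra checks, such as line length or sequential section numbering.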