Journal title: International Journal of Computer Technology and Applications
Electronic ISSN: 2229-6093
Publication year: 2011
Volume: 2
Issue: 6
Pages: 2047-2051
Publisher: Technopark Publications
Abstract: Interest in the new publishing phenomenon known as the e-book has grown enormously in the last few years. There are now at least 150 companies involved in various ways in the development of e-books. Despite this involvement, the spread of e-books has not yet proved useful in the implementation of digital libraries. Using e-books in PDF format to implement a digital library requires a robust information extraction system. In this paper we survey ten extraction tools for extracting content such as text, images, tables, fonts, etc. from e-books in PDF format. We also compare the information extraction tools on the basis of various factors.