Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Year of publication: 2012
Volume: 40
Issue: 2
Pages: 135-142
Publisher: Journal of Theoretical and Applied
Abstract: SAHARA is a framework proposed by the authors of this paper to integrate Semantic Web and Natural Language Processing tools in order to collect disaster information and disseminate it to stakeholders in a timely manner, in support of disaster management. This paper concerns the information extraction component of SAHARA and presents a set of rules developed in GATE to extract disaster-related information from online text resources. The pattern-action rules extract disaster entities including disaster location, type, magnitude, and date, as well as the number of dead, injured, lost, homeless, and affected people. A corpus was developed covering various types of disasters, such as earthquakes, hurricanes, floods, tsunamis, forest fires, suicide bombings, and military operations, and the rule set was tested against this corpus. Overall precision, recall, and F-measure varied across the extracted entities: the best results were achieved for disaster magnitude and the worst for date and time.
Keywords: Information Extraction; NLP (Natural Language Processing); GATE (General Architecture for Text Engineering); Semantic Disaster Management System