期刊名称:International Journal of Electrical and Computer Engineering
电子版ISSN:2088-8708
出版年度:2022
卷号:12
期号:1
页码:541-551
DOI:10.11591/ijece.v12i1.pp541-551
语种:English
出版社:Institute of Advanced Engineering and Science (IAES)
摘要:Arabic is a Semitic language and one of the most natural languages distinguished by the richness in morphological enunciation and derivation. This special and complex nature makes extracting information from the Arabic language difficult and always needs improvement. Open information extraction systems (OIE) have been emerged and used in different languages, especially in English. However, it has almost not been used for the Arabic language. Accordingly, this paper aims to introduce an OIE system that extracts the relation tuple from Arabic web text, exploiting Arabic dependency parsing and thinking carefully about all possible text relations. Based on clause types' propositions as extractable relations and constituents' grammatical functions, the identities of corresponding clause types are established. The proposed system named Arabic open information extraction(AOIE) can extract highly scalable Arabic text relations while being domain independent. Implementing the proposed system handles the problem using supervised strategies while the system relies on unsupervised extraction strategies. Also, the system has been implemented in several domains to avoid information extraction in a specific field. The results prove that the system achieves high efficiency in extracting clauses from large amounts of text.
关键词:Arabic;Clause tuples;Dependency parsing;Open information extraction