Abstract: In this paper we present CroNER, a named entity recognition and classification (NERC) system for the Croatian language based on supervised sequence labeling with conditional random fields (CRF). We use a rich set of lexical and gazetteer-based features and different methods for enforcing document-level label consistency. Extensive evaluation shows that our method achieves state-of-the-art results (MUC F1 90.73%, Exact F1 87.42%) when compared to existing NERC systems for Croatian and other Slavic languages.
Keywords: named entity recognition; conditional random fields; natural language processing; information extraction; Croatian language