Abstract: Part-of-speech (POS) tagging, also called grammatical tagging, is the
commonest form of corpus annotation, and was the first form of annotation
to be developed by UCREL (University Centre for Computer Corpus
Research on Language) at Lancaster. Our POS tagging software for
English text, CLAWS (the Constituent Likelihood Automatic Word-tagging
System), has been continuously developed since the early 1980s. The
latest version of the tagger, CLAWS4, was used to POS-tag 100 million
words of the British National Corpus (BNC); see Garside (1996).