Article information

Title: Multi-word lexemes in syntactic context
Alternative title: Víceslovné lexémy v syntaktickém kontextu
Authors: Alexandr Rosen; Hana Skoumalová; Jiří Znamenáček et al.
Journal: Studie z Aplikované Lingvistiky
Print ISSN: 1804-3240
Electronic ISSN: 2336-6702
Year of publication: 2020
Volume: 11
Issue: 2
Pages: 63-84
Language: English
Publisher: Univerzita Karlova, Filozofická fakulta
Abstract: We start with the assumption that (i) a corpus represents the use of language, i.e. linguistic performance, (ii) a rule-based grammar represents language as a system, i.e. linguistic competence, and (iii) corpus annotation represents the interface between the two. To detect and diagnose mismatches between the language use and the language system, we use a constraint-based grammar run as a constraint solver on texts tagged and dependency-parsed by stochastic tools. The texts also have MWEs (multi-word expressions) identified and transformed into a constituency-based format before the grammar is applied. We describe the role and results of the grammar, and its use to check texts annotated with morphosyntactic categories, syntactic structure and information about the status of relevant expressions as MWEs. The grammar also employs lexical resources such as a valency lexicon and a database of MWEs to make the checking more accurate and the annotation more informative. The results are represented as typed feature structures where MWE-related information can be shared by lexical and phrasal nodes. This allows for the annotation of MWEs as lexical units, independently of their analysis in terms of syntactic structure. Focusing on the interplay of MWEs with their syntactic context, we analyse a number of representative examples, pointing out the pros and cons of specific solutions and the whole approach.
Keywords: Czech; HPSG; syntax; treebank; multi-word expressions
Other keywords: čeština; HPSG; syntax; treebank; víceslovné lexikální jednotky