
Basic article information

  • Title: AUTOMATIC WORD ORDER ANALYSIS OF ESTONIAN AS A SECOND LANGUAGE: THE NUCLEAR SENTENCE
  • Other title: KESKSETE LAUSEKOMPONENTIDE JÄRJESTUS ÕPPIJAKEELES: ARVUTIANALÜÜSI KATSE
  • Authors: Helena Metslang; Erika Matsak
  • Journal: Eesti Rakenduslingvistika Ühingu Aastaraamat
  • Print ISSN: 1736-2563
  • Electronic ISSN: 2228-0677
  • Year of publication: 2010
  • Volume: 6
  • Pages: 175-193
  • DOI: 10.5128/ERYa6.11
  • Language: English
  • Publisher: Eesti Rakenduslingvistika Ühing (Estonian Association for Applied Linguistics)
  • Abstract: This article gives an overview of our work on the automatic analysis of second language word order. For this purpose, an error analyzer and a set of correct word order patterns extracted from the fiction sub-corpus of Tartu University's Corpus of Written Estonian were created. It is important to be able to form the nuclear sentence of the target language well (including the subject, finite verb, obligatory modifiers of the verb and other elements influencing the sentence word order), because a well-formed core clause conveys the integral meaning of the whole sentence and helps to avoid errors that even very advanced learners make. The article describes learners' difficulties in choosing, inflecting and ordering the core elements of the sentence (in the data of EVKK, the Estonian Interlanguage Corpus). It gives an overview of the first steps in the automatic analysis of learner language word order, introduces the set of correct word order patterns and the prototype of the word order error analyzer, and analyzes the factors influencing its performance. The task of the error analyzer is to detect the core arguments, estimate their syntactic roles and assess the correctness of their order. In the test described in the article, 100 sentences were analyzed with the error analyzer and the output was assessed. The program made the correct choice 62 times: in 26 cases it matched a correct sentence to the right rule and judged the word order in the clause to be correct, and in the other 36 cases it found that the input clause did not correspond to any rule in the pre-defined set. Of these 36 clauses, 16 contained a word order error. A number of problems still need to be solved, including erroneously unmarked clause boundaries, more complex cases of coordination and imprecise adverbial analysis. The article suggests using the valency dictionary of Estonian verbs (Pool 1999) and the Database of Estonian verbal multi-word expressions to improve the program's ability to distinguish free and bound adverbials. In the future it would be useful to integrate Remmel's (1963) word order laws and the typical position preferences of parts of sentences into the system. To check learners' word order more effectively, a module of typical errors could be added to the program. Identifying the more frequent errors in the clause before searching within the set of correct patterns would raise the precision and speed of the program. This would likely be less time-consuming than supplementing the error analyzer with pragmatic information (which is in fact one of the most important factors influencing Estonian word order).
  • Other abstract: The article discusses an experiment in the computational analysis of word order in Estonian simple sentences, the aim of which is to create a word order error detector for learner language. In the course of the experiment, patterns of frequent Estonian word order types were compiled. They describe the order of the verb, the core arguments, and the constituents or words that influence their order in simple sentences and in some simpler types of complex sentence (mainly the subject, object, predicate, an adverbial at the beginning of the sentence or as a bound modifier, and the sentence-level modifier). The patterns were derived from Tartu University's Corpus of Written Estonian. The coverage of the resulting patterns was evaluated on the written language and learner language corpora with a specially created program. The program described in the article, which is used together with the pattern set, analyzes learner language and flags as questionable the sentences that match no pattern. The article introduces the process of creating the pattern set and the program that assesses the word order of text sentences. It also gives an overview of the program's effectiveness in analyzing learner language texts and of what is needed to develop the error detector further. The learner language analysis used Tallinn University's Estonian Interlanguage Corpus, which brings together creative writing and exercises by learners of Estonian, amounting to nearly 740,000 tokens.
  • Keywords: word order; corpus linguistics; second language acquisition; Estonian
  • Other keywords: sõnajärg; korpuslingvistika; teise keele omandamine; eesti keel
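
The abstract describes an analyzer that assigns syntactic roles to the core elements of a clause and then checks their order against a set of correct word order patterns, flagging clauses that match no pattern. A minimal sketch of that matching step follows. The role labels (`S`, `V`, `O`, `ADV`), the pattern set and the function name `assess_clause` are hypothetical illustrations, not the article's actual pattern inventory or program.

```python
# Hypothetical pattern set: each pattern is a sequence of syntactic role labels.
# These labels and patterns are assumptions for illustration only; the
# article's real patterns were derived from the Corpus of Written Estonian.
CORRECT_PATTERNS = {
    ("S", "V", "O"),
    ("O", "V", "S"),
    ("ADV", "V", "S"),
    ("ADV", "V", "S", "O"),
}


def assess_clause(roles):
    """Compare a clause's role sequence with the pattern set.

    Returns "correct" when the sequence matches a known pattern and
    "questionable" when it matches none. A non-matching clause is only
    flagged for review; it is not necessarily a word order error.
    """
    return "correct" if tuple(roles) in CORRECT_PATTERNS else "questionable"


if __name__ == "__main__":
    # Adverbial-initial clause with the verb in second position: matches a pattern.
    print(assess_clause(["ADV", "V", "S"]))
    # Adverbial-initial clause with the verb in third position: matches no pattern.
    print(assess_clause(["ADV", "S", "V"]))
```

Treating a non-match as "questionable" rather than "wrong" mirrors the reported test, where only 16 of the 36 clauses that matched no rule actually contained a word order error.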