Basic article information

  • Title: Identification of translational equivalents in Croatian-English parallel corpus
  • Authors: Tadić, Marko; Šojat, Krešimir
  • Journal: Filologija
  • Print ISSN: 0449-363X
  • Year of publication: 2002
  • Issue: 38-39
  • Pages: 0-0
  • Language: English
  • Publisher: Croatian Academy of Sciences and Arts
  • Abstract: This contribution investigates the possibilities for identifying translational equivalents (TEs) in a Croatian-English parallel corpus aligned at the sentence level and collected at the Institute of Linguistics, Faculty of Philosophy, University of Zagreb. First, TEs between single words are identified by generating all possible word pairs in which the first word comes from the source language and the second from the target language. Only sentences with 1:1 alignment were included in the processing. The statistical measure of Mutual Information (MI) was applied to the generated word pairs, yielding the statistically relevant co-occurrences. Pairs with high MI values are considered good TE candidates. In the second part of the paper, multi-word units (MWUs; in this case only MWUs with two elements) are identified by applying the same statistical measure in both the source (Croatian) and the target (English) language. MI was then computed for pairs of word pairs, giving possible candidates for translational patterns. High MI values revealed pairs of words in the source language that were regularly translated by a fixed pair of words in the target language, even though the MI values for the monolingual pairs in each language were extremely low. The contribution aims to show how statistical methods in parallel corpus processing can facilitate the detection of collocations (possible multi-word terms) and their TEs. At the same time, corresponding co-textual examples of word usage are provided in both the source and the target language. This is relevant for multilingual lexicographers as dictionary writers and for translators as the most important group of dictionary users.
  • Keywords: Croatian-English parallel corpus; multi-word units; translational equivalents; word alignment; mutual information
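
The procedure summarized in the abstract (generate every source/target word pair within 1:1 aligned sentences, then rank the pairs by MI) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not give the exact MI formula, so the sketch assumes pointwise mutual information estimated from sentence-level co-occurrence counts. The names `mi_scores`, `bigrams`, and `toy_corpus`, as well as the toy sentences themselves, are hypothetical and exist only for illustration.

```python
import math
from collections import Counter
from itertools import product


def mi_scores(aligned_pairs):
    """Score (source unit, target unit) pairs by pointwise mutual information.

    aligned_pairs: list of (source_units, target_units), one entry per
    1:1 aligned sentence pair. Assumption: probabilities are estimated as
    the fraction of aligned sentence pairs in which a unit (or pair) occurs.
    """
    n = len(aligned_pairs)
    src_counts, tgt_counts, pair_counts = Counter(), Counter(), Counter()
    for src_units, tgt_units in aligned_pairs:
        # Count each unit at most once per sentence pair.
        src_set, tgt_set = set(src_units), set(tgt_units)
        src_counts.update(src_set)
        tgt_counts.update(tgt_set)
        # All possible pairs: first element from source, second from target.
        pair_counts.update(product(src_set, tgt_set))
    # PMI = log2( P(s, t) / (P(s) * P(t)) ) = log2( c * n / (c_s * c_t) )
    return {
        (s, t): math.log2(c * n / (src_counts[s] * tgt_counts[t]))
        for (s, t), c in pair_counts.items()
    }


def bigrams(tokens):
    """Adjacent two-element units, standing in for MWUs with 2 elements."""
    return list(zip(tokens, tokens[1:]))


# Hypothetical toy data, not taken from the corpus described in the abstract.
toy_corpus = [
    (["pas", "spava"], ["the", "dog", "sleeps"]),
    (["pas", "jede"], ["the", "dog", "eats"]),
    (["mačka", "spava"], ["the", "cat", "sleeps"]),
    (["mačka", "jede"], ["the", "cat", "eats"]),
]

# Single-word TE candidates, highest MI first.
word_scores = mi_scores(toy_corpus)
for pair, score in sorted(word_scores.items(), key=lambda kv: -kv[1])[:4]:
    print(pair, round(score, 2))

# "Pairs of pairs": the same measure applied to two-word units on both sides.
mwu_scores = mi_scores([(bigrams(s), bigrams(t)) for s, t in toy_corpus])
```

On this toy data the pair (pas, dog) receives a positive score, while (pas, the) scores zero because "the" occurs in every sentence and so carries no information about "pas". On a real corpus, MI computed from raw counts tends to overrate rare words, so a minimum-frequency threshold would likely be needed; the abstract does not say whether one was used.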