Journal name: Conference of the European Chapter of the Association for Computational Linguistics (EACL)
Publication year: 2017
Volume: 2017
Pages: 1074-1084
Language: English
Publisher: ACL Anthology
Abstract: The language that we produce reflects our personality, and various personal and demographic characteristics can be detected in natural language texts. We focus on one particular personal trait of the author, gender, and study how it is manifested in original texts and in translations. We show that the author’s gender has a powerful, clear signal in original texts, but this signal is obfuscated in human and machine translation. We then propose simple domain-adaptation techniques that help retain the original gender traits in the translation without harming translation quality, thereby creating more personalized machine translation systems.