
Article information

  • Title: An Overview of Fairness in Data – Illuminating the Bias in Data Pipeline
  • Authors: Senthil Kumar B; Aravindan Chandrabose; Bharathi Raja Chakravarthi
  • Journal: Conference on European Chapter of the Association for Computational Linguistics (EACL)
  • Publication year: 2021
  • Volume: 2021
  • Pages: 34-45
  • Language: English
  • Publisher: ACL Anthology
  • Abstract: Data in general encodes human biases by default; being aware of this is a good start, and research on how to handle it is ongoing. The term ‘bias’ is used extensively in various contexts in NLP systems. Our research focuses specifically on biases such as gender, racism, religion, demographic and other intersectional views on biases that prevail in text processing systems and are responsible for systematically discriminating against specific populations, which is not ethical in NLP. These biases exacerbate the lack of equality, diversity and inclusion of specific populations in the use of NLP applications. The tools and technology at the intermediate level utilize biased data, and transfer or amplify this bias to the downstream applications. However, it is not enough to be colourblind or gender-neutral alone when designing an unbiased technology; instead, we should make a conscious effort by designing a unified framework to measure and benchmark the bias. In this paper, we recommend six measures and one augment measure based on observations of the bias in data, annotations, text representations and debiasing techniques.