Abstract: Background: Age misreporting is pervasive in most low- and middle-income countries (LMICs). It may bias estimates of key demographic indicators, such as those required to track progress towards the Sustainable Development Goals. Existing methods to improve age data are often ineffective, cannot be adopted on a large scale, and/or do not permit estimating age over the entire life course. Objective: We tested a computer vision approach, which produces an age estimate by analyzing a photograph of an individual’s face. Methods: We assembled a small training dataset in a population of Senegal that has been covered by a health and demographic surveillance system (HDSS) since 1962. We collected facial images of 353 women aged 18 and above, whose ages could be ascertained precisely using HDSS data. We developed automatic age estimation (AAE) systems through machine learning and cross-validation. Results: AAE was highly accurate in distinguishing women of reproductive age from women aged 50 and older (area under the curve > 0.95). It allowed estimating age in completed years, with a level of precision comparable to that obtained in European or East Asian populations with training datasets of similar size (mean absolute error = 4.62 years). Conclusions: Computer vision might help improve age ascertainment in demographic datasets collected in LMICs. Further improving the accuracy of this approach will require assembling larger and more complete training datasets in additional LMIC populations. Contribution: Our work highlights the potential benefits of widely used computer science tools for improving demographic measurement in LMIC settings with deficient data.
Keywords: age; age measurement; age misreporting; census; Senegal; survey data