Article information

  • Title: Record Linking Techniques in the Utah Population Database to Improve Linking Rates of Hispanics
  • Authors: Alison Fraser; Stevie Kinnear; Ken Smith
  • Journal: International Journal of Population Data Science
  • Electronic ISSN: 2399-4908
  • Publication year: 2018
  • Volume: 3
  • Issue: 4
  • Pages: 1-1
  • DOI: 10.23889/ijpds.v3i4.764
  • Publisher: Swansea University
  • Abstract:

    Introduction: Hispanic naming conventions frequently follow historical traditions. A person’s name consists of a given name or names followed by the father’s first surname and the mother’s first surname, or the reverse order if the parents wish. The challenge lies in keying and linking these non-standard names, which can introduce a linking bias.

    Objectives and Approach: Historically, the Utah Population Database (UPDB) has combined multiple surnames into a single surname to standardize names such as VAN WINKLE. However, this combined Hispanic surnames into nonsensical names, for example ‘MARTINEZCRUZ’, that were difficult to match to surnames stored in separate fields in assorted combinations. The objective of this study was to determine whether name-specific frequencies and name arrays, built from the second and third given names, maiden name, and surname, allow flexible matching to records that do not adhere to any standardized keying convention, and whether they produce better linking results.

    Results: A “Gold Standard” set of Hispanic individuals with multiple record sources in UPDB and two names in the surname field was evaluated. Two linking approaches were compared: one using the UPDB standard methodology and the other using name arrays. Both methodologies resulted in high linking rates into complete or partial sets of records per individual. Overall, the array methodology linked more records into complete sets than the standard methodology (94.7% vs. 82.6%). Using arrays, males linked at a higher rate than females, and persons from Spanish-speaking countries linked at the highest rate compared with those born in the USA or in other countries. However, there was an increase in incorrect links using arrays. Name frequency distributions specific to Hispanics also proved important.

    Conclusion/Implications: This study found that weights based on frequencies specific to the population being linked are critical to complete linking. Using name arrays for Hispanics was most effective in males with indicators of strong ethnic ties. However, the cost of using arrays was an increase in incorrect links, and further refinement is needed.
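
    To make the contrast between the two approaches concrete, the following Python sketch compares a concatenated surname with a name array scored by population-specific name frequencies. It is only an illustration under stated assumptions: the abstract does not describe UPDB's actual implementation, so the function names (concatenated_surname, name_array, agreement_weight, array_score), the record layout, the toy frequency counts, and the log-frequency weight are all hypothetical stand-ins rather than the authors' method.

```python
import math
from collections import Counter


def concatenated_surname(surname):
    """Collapse a multi-part surname into one token, e.g. 'MARTINEZCRUZ'."""
    return "".join(surname.upper().split())


def name_array(*fields):
    """Pool the tokens of several name fields into one unordered set.

    Fields stand for second given name, third given name, maiden name and
    surname; empty or missing fields are skipped.
    """
    tokens = set()
    for field in fields:
        if field:
            tokens.update(field.upper().replace("-", " ").split())
    return tokens


def agreement_weight(name, freq, total):
    """Assumed weight: rarer names count for more, log2(total / count)."""
    count = freq.get(name, 1)  # assumption: unseen names are treated as count 1
    return math.log2(total / count)


def array_score(rec_a, rec_b, freq, total):
    """Sum the weights of every name token the two records share."""
    shared = name_array(*rec_a) & name_array(*rec_b)
    return sum(agreement_weight(name, freq, total) for name in shared)


# Toy frequency table standing in for a Hispanic-specific name distribution.
freq = Counter({"MARTINEZ": 50, "CRUZ": 30, "SMITH": 20})
total = sum(freq.values())

# Record A keys both surnames in the surname field; record B splits them
# between maiden name and surname. Layout: (second, third, maiden, surname).
rec_a = (None, None, None, "Martinez Cruz")
rec_b = (None, None, "Cruz", "Martinez")

# Concatenation fails to match, while the arrays share both tokens.
exact_match = concatenated_surname(rec_a[3]) == concatenated_surname(rec_b[3])
score = array_score(rec_a, rec_b, freq, total)
```

    In this toy case exact_match is False because 'MARTINEZCRUZ' differs from 'MARTINEZ', while the array comparison finds both surnames regardless of which field they were keyed into. The same flexibility also lets unrelated records agree on a shared token, which is consistent with the increase in incorrect links reported above.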