Journal: International Journal of Innovative Research in Science, Engineering and Technology
Print ISSN: 2347-6710
Online ISSN: 2319-8753
Publication year: 2015
Issue: NCET
Page: 33
Publisher: S&S Publications
Abstract: Data quality is a critical problem in modern databases. Data-entry forms present the first and arguably best opportunity for detecting and mitigating errors, but there has been little research into automatic methods for improving data quality at entry time. In this paper, we propose USHER, an end-to-end system for form design, entry, and data quality assurance. Using previous form submissions, USHER learns a probabilistic model over the questions of the form. USHER then applies this model at every step of the data-entry process to improve data quality. Before entry, it induces a form layout that captures the most important data values of a form instance as quickly as possible and reduces the complexity of error-prone questions. During entry, it dynamically adapts the form to the values being entered by providing real-time interface feedback, re-asking questions with dubious responses, and simplifying questions by reformulating them. After entry, it revisits question responses that it deems likely to have been entered incorrectly by re-asking the question or a reformulation thereof. We evaluate these components of USHER using two real-world data sets. Our results demonstrate that USHER can improve data quality considerably at a reduced cost when compared to current practice.
Keywords: Data quality; data entry; form design; adaptive form
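
The re-asking idea described in the abstract can be illustrated with a minimal sketch. This is not USHER's actual model: the abstract says only that a probabilistic model is learned from previous submissions, so the sketch assumes the simplest possible stand-in, an independent Laplace-smoothed frequency estimate per question. The function names, the `alpha` smoothing constant, and the `threshold` value are all hypothetical choices made for illustration.

```python
from collections import Counter


def learn_frequencies(submissions):
    """Count how often each answer was given to each question.

    `submissions` is a list of dicts mapping question name -> answer.
    """
    counts = {}
    for submission in submissions:
        for question, answer in submission.items():
            counts.setdefault(question, Counter())[answer] += 1
    return counts


def answer_probability(counts, question, answer, alpha=1.0):
    """Laplace-smoothed estimate of P(answer | question).

    Assumption: questions are treated as independent, which is a
    simplification chosen for this sketch only.
    """
    seen = counts.get(question, Counter())
    total = sum(seen.values())
    # Reserve one extra category for answers never seen before.
    categories = len(seen) + 1
    return (seen[answer] + alpha) / (total + alpha * categories)


def questions_to_reask(counts, submission, threshold=0.1):
    """Return the questions whose entered answer looks improbable."""
    return [
        question
        for question, answer in submission.items()
        if answer_probability(counts, question, answer) < threshold
    ]


if __name__ == "__main__":
    previous = [{"district": "north"}] * 8 + [{"district": "south"}]
    model = learn_frequencies(previous)
    print(questions_to_reask(model, {"district": "nrth"}))
```

In this toy setup a misspelled value such as "nrth" falls below the threshold and is flagged for re-asking, while a commonly seen value is accepted. A model that captures dependencies between questions would be needed to flag answers that are individually plausible but inconsistent with the rest of the form.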