Journal: Bulletin of the Technical Committee on Data Engineering
Year: 2015
Volume: 38
Issue: 3
Publisher: IEEE Computer Society
Abstract: One of the foremost challenges for information technology over the last few years has been to explore, understand, and extract useful information from large amounts of data. Some particular tasks, such as annotating data or matching entities, have been outsourced to human workers for many years. But the last few years have seen the rise of a new research field called crowdsourcing that aims at delegating a wide range of tasks to human workers, building formal frameworks, and improving the efficiency of these processes. The database community has thus been suggesting algorithms to process traditional data manipulation operators with the crowd, such as joins or filtering. This is even more useful when comparing the underlying “tuples” is a subjective decision (e.g., when they are photos, text, or simply noisy data with different variations and interpretations) and can presumably be done better and faster by humans than by machines. The problems considered in this article aim to retrieve a subset of preferred items from a set of items by delegating pairwise comparison operations to the crowd. The most obvious example is finding the maximum of a set of items (called max). We also consider two natural generalizations of the max problem: