Arun Kumar, Feng Niu, and Christopher Ré, Department of Computer Sciences, University of Wisconsin-Madison

Abstract: The rise of big data presents both big opportunities and big challenges in domains ranging from enterprise to the sciences. The opportunities include better-informed business decisions, more efficient supply-chain management and resource allocation, more effective targeting of products and advertisements, better ways to "organize the world's information," and faster turnaround of scientific discoveries. The challenges are just as large. For one, more and more data comes in diverse forms: text, audio, video, OCR (optical character recognition) output, sensor data, and so on. Existing data management systems predominantly assume that data has rigid, precise semantics, yet a growing share of data, valuable as it is, contains imprecision or inconsistency. For another, the proliferation of ever-evolving algorithms for gaining insight from data (under names including machine learning, data mining, and statistical analysis) can be daunting to a developer with a particular data set and specific goals: the developer must not only keep up with the state of the art but also expend significant development effort experimenting with different algorithms.