Journal: Bulletin of the Technical Committee on Data Engineering
Year: 2018
Volume: 41
Issue: 2
Pages: 10-22
Publisher: IEEE Computer Society
Abstract: We present BIGGORILLA, an open-source resource for data scientists who need data preparation and integration tools, and the vision underlying the project. We then describe four packages that we contributed to BIGGORILLA: KOKO (an information extraction tool), FLEXMATCHER (a schema matching tool), and MAGELLAN and DEEPMATCHER (two entity matching tools). We hope that as more software packages are added to BIGGORILLA, it will become a one-stop resource for both researchers and industry practitioners, and will enable our community to advance the state of the art at a faster pace.