Basic article information

Title: Which Clustering Do You Want? Inducing Your Ideal Clustering with Minimal Feedback
Authors: S. Dasgupta; V. Ng
Journal: Journal of Artificial Intelligence Research
Print ISSN: 1076-9757
Publication year: 2010
Volume: 39
Pages: 581-632
Publisher: American Association of Artificial
Abstract: While traditional research on text clustering has largely focused on grouping documents by topic, it is conceivable that a user may want to cluster documents along other dimensions, such as the author's mood, gender, age, or sentiment. Without knowing the user's intention, a clustering algorithm will only group documents along the most prominent dimension, which may not be the one the user desires. To address the problem of clustering documents along the user-desired dimension, previous work has focused on learning a similarity metric from data manually annotated with the user's intention or having a human construct a feature space in an interactive manner during the clustering process. With the goal of reducing reliance on human knowledge for fine-tuning the similarity function or selecting the relevant features required by these approaches, we propose a novel active clustering algorithm, which allows a user to easily select the dimension along which she wants to cluster the documents by inspecting only a small number of words. We demonstrate the viability of our algorithm on a variety of commonly-used sentiment datasets.