Venue: Conference of the European Chapter of the Association for Computational Linguistics (EACL)
Publication year: 2012
Volume: 2012
Publisher: ACL Anthology
Abstract: One of the key tasks for analyzing conversational
data is segmenting it into coherent topic
segments. However, most models of topic
segmentation ignore the social aspect of conversations,
focusing only on the words used.
We introduce a hierarchical Bayesian nonparametric
model, Speaker Identity for Topic Segmentation
(SITS), that discovers (1) the topics
used in a conversation, (2) how these topics
are shared across conversations, (3) when
these topics shift, and (4) a person-specific
tendency to introduce new topics. We evaluate
against current unsupervised segmentation
models to show that including person-specific
information improves segmentation
performance on meeting corpora and on political
debates. Moreover, we provide evidence
that SITS captures an individual's tendency to
introduce new topics in political contexts, via
analysis of the 2008 US presidential debates
and the television program Crossfire.