Abstract: This study uses a large-scale corpus annotated with five-tier break indices according to C-TOBI. On this corpus, several approaches, namely N-gram, Markov model, and decision tree learning, are applied to predict break indices automatically for unrestricted Mandarin text. The approaches differ not only in the model but also in the features used and even in the size of the part-of-speech tag set. The study presents a detailed comparison and analysis of these approaches.
Keywords: Markov models; speech synthesis; break indices; n-gram; decision tree