Basic Article Information

  • Title: Prosodically Rich Speech Synthesis Interface Using Limited Data of Celebrity Voice
  • Authors: Takashi Nose; Taiki Kamei
  • Journal: Journal of Computer and Communications
  • Print ISSN: 2327-5219
  • Online ISSN: 2327-5227
  • Publication year: 2016
  • Volume: 04
  • Issue: 16
  • Pages: 79-94
  • DOI: 10.4236/jcc.2016.416006
  • Language: English
  • Publisher: Scientific Research Publishing
  • Abstract: To enhance communication between humans and robots in the home in the future, speech synthesis interfaces that can generate expressive speech are indispensable. In addition, synthesizing celebrity voices is commercially important. To address these issues, this paper proposes techniques for synthesizing natural-sounding speech that has a rich prosodic personality using a limited amount of data in a text-to-speech (TTS) system. As a target speaker, we chose a well-known prime minister of Japan, Shinzo Abe, who has a good prosodic personality in his speeches. To synthesize natural-sounding and prosodically rich speech, accurate phrasing, robust duration prediction, and rich intonation modeling are important. For these purposes, we propose pause position prediction based on conditional random fields (CRFs), phone-duration prediction using random forests, and mora-based emphasis context labeling. We examine the effectiveness of the above techniques through objective and subjective evaluations.
  • Keywords: Parametric Speech Synthesis; Hidden Markov Model (HMM); Prosodic Personality; Prosody Modeling; Conditional Random Field (CRF); Random Forest; Emphasis Context