Article information

  • Title: Non-parallel dictionary learning for voice conversion using non-negative Tucker decomposition
  • Authors: Yuki Takashima; Toru Nakashika; Tetsuya Takiguchi
  • Journal: EURASIP Journal on Audio, Speech, and Music Processing
  • Print ISSN: 1687-4714
  • Electronic ISSN: 1687-4722
  • Publication year: 2019
  • Volume: 2019
  • Issue: 1
  • Pages: 1-11
  • DOI: 10.1186/s13636-019-0160-1
  • Publisher: Hindawi Publishing Corporation
  • Abstract: Voice conversion (VC) is a technique of exclusively converting speaker-specific information in the source speech while preserving the associated phonemic information. Non-negative matrix factorization (NMF)-based VC has been widely researched because of the natural-sounding voice it achieves when compared with conventional Gaussian mixture model-based VC. In conventional NMF-VC, models are trained using parallel data, which means the speech data require elaborate pre-processing to generate parallel data. NMF-VC also tends to be an extensive model, as this method has several parallel exemplars for the dictionary matrix, leading to a high computational cost. In this study, an innovative non-parallel dictionary-learning method using non-negative Tucker decomposition (NTD) is proposed. The proposed method uses tensor decomposition and decomposes an input observation into a set of mode matrices and one core tensor. The proposed NTD-based dictionary-learning method estimates the dictionary matrix for NMF-VC without using parallel data. The experimental results show that the proposed method outperforms other methods in both parallel and non-parallel settings.
  • Keywords: Voice conversion; Non-negative Tucker decomposition; Non-negative matrix factorization; Non-parallel training
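
The abstract describes decomposing an input observation into a set of mode matrices and one core tensor. As a rough illustration of that idea only, the sketch below fits a generic three-way non-negative Tucker model with multiplicative updates under a squared-error loss. The loss function, the update rule, the tensor layout, and the names `reconstruct` and `ntd` are all assumptions made for this example; the record does not state which cost function or algorithm the authors use, and this is not their dictionary-learning procedure.

```python
import numpy as np

def reconstruct(core, factors):
    """Tucker model: a core tensor multiplied by one mode matrix per mode."""
    A, B, C = factors
    return np.einsum('pqr,ip,jq,kr->ijk', core, A, B, C)

def ntd(X, ranks, n_iter=200, eps=1e-9, seed=0):
    """Generic non-negative Tucker decomposition (assumed squared-error loss).

    X      : non-negative 3-way array
    ranks  : tuple with the number of columns of each mode matrix
    Returns the core tensor and the list of three mode matrices.
    """
    rng = np.random.default_rng(seed)
    # Strictly positive initial values keep every later update non-negative.
    factors = [rng.random((dim, r)) + 0.1 for dim, r in zip(X.shape, ranks)]
    core = rng.random(tuple(ranks)) + 0.1
    # einsum signatures that project a data-shaped tensor onto each mode matrix
    subs = ['ijk,pqr,jq,kr->ip', 'ijk,pqr,ip,kr->jq', 'ijk,pqr,ip,jq->kr']
    for _ in range(n_iter):
        for n in range(3):
            others = [f for m, f in enumerate(factors) if m != n]
            X_hat = reconstruct(core, factors)
            num = np.einsum(subs[n], X, core, *others)
            den = np.einsum(subs[n], X_hat, core, *others) + eps
            factors[n] = factors[n] * num / den  # multiplicative update
        X_hat = reconstruct(core, factors)
        num = np.einsum('ijk,ip,jq,kr->pqr', X, *factors)
        den = np.einsum('ijk,ip,jq,kr->pqr', X_hat, *factors) + eps
        core = core * num / den
    return core, factors

# Usage on synthetic data (a stand-in for real spectral features).
demo_rng = np.random.default_rng(1)
X_demo = reconstruct(
    demo_rng.random((2, 2, 2)),
    [demo_rng.random((6, 2)), demo_rng.random((5, 2)), demo_rng.random((4, 2))],
)
core_demo, factors_demo = ntd(X_demo, (2, 2, 2))
rel_err = np.linalg.norm(X_demo - reconstruct(core_demo, factors_demo)) / np.linalg.norm(X_demo)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Multiplicative updates are chosen here because they preserve non-negativity without any projection step; how the resulting mode matrices would map onto a speaker dictionary for NMF-VC is specific to the paper and is not reproduced in this sketch.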