
Article Information

  • Title: Multi-task Knowledge Distillation with Rhythm Features for Speaker Verification
  • Authors: Ruyun Li; Peng Ouyang; Dandan Song
  • Journal: Computer Science & Information Technology
  • Electronic ISSN: 2231-5403
  • Publication year: 2020
  • Volume: 10
  • Issue: 5
  • Pages: 249-262
  • DOI: 10.5121/csit.2020.100523
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: Speaker embeddings extracted by deep neural networks (DNNs) have recently performed well in speaker verification (SV). However, they are sensitive to changes in scenario, and they are too computationally intensive to deploy on portable devices. In this paper, we first combine rhythm and MFCC features to improve the robustness of speaker verification. The rhythm feature reflects the distribution of phonemes and helps reduce the equal error rate (EER) in speaker verification, especially in intra-speaker verification. In addition, we propose a multi-task knowledge distillation architecture that transfers the embedding-level and label-level knowledge of a well-trained large teacher to a highly compact student network. The results show that rhythm features and multi-task knowledge distillation significantly improve the performance of the student network. In the ultra-short duration scenario, using only 14.9% of the parameters in the teacher network, the student network can even achieve a relative EER reduction of 32%.
  • Keywords: Multi-task learning; Knowledge distillation; Rhythm variation; Angular softmax; Speaker verification
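
The abstract describes transferring both embedding-level and label-level knowledge from a teacher to a student. The sketch below illustrates one common way such a multi-task objective can be written; it is not the authors' implementation. The cosine distance for the embedding term, the temperature-softened KL divergence for the label term, the plain softmax (standing in for the angular softmax named in the keywords), the weights `w_emb` and `w_label`, and every function name are assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_distance(u, v):
    """1 - cosine similarity; close to 0 when the embeddings align."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def distillation_loss(student_emb, teacher_emb, student_logits, teacher_logits,
                      speaker_id, temperature=2.0, w_emb=1.0, w_label=1.0):
    """Hypothetical multi-task loss: hard-label + embedding-level + label-level.

    All weights and the temperature are illustrative defaults, not values
    taken from the paper.
    """
    # Hard-label task: cross-entropy of the student against the true speaker.
    hard = -math.log(softmax(student_logits)[speaker_id])
    # Embedding-level task: pull the student embedding toward the teacher's.
    emb = cosine_distance(student_emb, teacher_emb)
    # Label-level task: match the teacher's temperature-softened posteriors.
    label = kl_divergence(softmax(teacher_logits, temperature),
                          softmax(student_logits, temperature))
    return hard + w_emb * emb + w_label * label
```

When the student reproduces the teacher's embedding and logits exactly, both distillation terms vanish and only the hard-label cross-entropy remains, which gives a quick sanity check for a formulation like this one.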