Abstract: Emotion mismatch between training and
testing is one of the important factors causing performance degradation in
speaker recognition systems. In our previous work, a bi-model emotion speaker
recognition (BESR) method based on synthesizing virtual HD (VHD; High Different
from neutral, with large pitch offset) speech was proposed to deal with this
problem. It improved system performance under mismatched emotional states on
MASC, but it still suffered from the risk introduced by fusing the scores
from the unreliable VHD model and the neutral model with equal weights. In this
paper, we propose a new BESR method based on score reliability fusion. Two strategies,
one based on identification rate and the other on the average relative loss difference
of the scores, are presented to estimate the weights for the two groups of scores. The results on
both MASC and EPST show that, with the weights generated by either strategy,
the BESR method performs better than with equal weights,
and the better of the two strategies even achieves a result comparable to that
obtained with the best weights selected by exhaustive search.