Abstract: We present a quantitative model of Standard Yorùbá (SY) intonation; it is designed to have
parameters that are linguistically interpretable. The model is built and trained on speech data from a
native speaker of SY. The resulting model reproduces the data well: its Root Mean Square prediction
error (RMSE) is 14.00 Hz on a test set. We find that intonation is used to mark sentence and phrase
boundaries: beginning syllables are systematically stronger, while ending syllables are systematically
weaker than the medial syllables. The M tone is the strongest and the H tone is the weakest, though
the differences are modest. We see comparable amounts of carry-over and anticipatory co-articulation.
The resulting model for SY shows characteristics similar to those of Mandarin and Cantonese
intonation models.
Keywords: Intonation modelling, Speech synthesis, Quantitative model