Abstract: Speech recognition is one of the most essential requirements for making machines intelligent and enabling them to work as humans do. Human speech conveys several kinds of information: acoustic properties of the sound such as energy, pitch, loudness, and rhythm, and contextual attributes of the speaker such as gender, age, and emotion. Identifying emotion from speech patterns is a challenging task, and a particularly useful one in an era of rapidly developing speech recognition systems and digital assistants. Digital assistants such as Bixby and BlackBerry Assistant are being built into products that identify the user's emotion and respond in line with the user's point of view. The objective of this work is to improve the accuracy of speech emotion prediction using deep learning models. We experiment with MLP and CNN classification models on three benchmark datasets comprising 5700 speech files across 7 emotion categories. The proposed model showed improved accuracy.