Aiming at the problem of facial expression recognition under unconstrained conditions, a facial expression recognition method based on an improved capsule network model is proposed. Firstly, the expression image is normalized by illumination based on the improved Weber face, and the key points of the face are detected by the Gaussian process regression tree. Then, the 3dmms model is introduced. The 3D face shape, which is consistent with the face in the image, is provided by iterative estimation so as to further improve the image quality of face pose standardization. In this paper, we consider that the convolution features used in facial expression recognition need to be trained from the beginning and add as many different samples as possible in the training process. Finally, this paper attempts to combine the traditional deep learning technology with capsule configuration, adds an attention layer after the primary capsule layer in the capsule network, and proposes an improved capsule structure model suitable for expression recognition. The experimental results on JAFFE and BU-3DFE datasets show that the recognition rate can reach 96.66% and 80.64%, respectively.