Abstract: To address the lack of an accurate and efficient off-topic detection model in current automated English scoring systems in China, an unsupervised off-topic essay detection model based on a hybrid semantic space is proposed. First, the essay and its prompt are each represented as a set of noun phrases using a neural-network dependency parser. Second, a method for constructing a hybrid semantic space is introduced. Third, the noun phrases of the essay and its prompt are represented as vectors in the hybrid semantic space, and the similarity between the essay and its prompt is calculated from these noun-phrase vectors. Finally, a sorting-based method is proposed to set the off-topic threshold so that off-topic essays can be identified efficiently. Experimental results on four datasets totaling 5000 essays show that the proposed model detects off-topic essays with higher accuracy than previous off-topic essay detection models, reaching an overall accuracy of 89.8% across all datasets.
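
The last two steps of the pipeline summarized above (essay-prompt similarity from noun-phrase vectors, then a sort-based threshold) can be illustrated with a minimal sketch. The abstract does not specify how phrase vectors are aggregated or how the sorting procedure picks the threshold, so everything here is an assumption made for illustration: the phrase vectors are toy 2-dimensional values, each text is summarized by the mean of its phrase vectors, similarity is cosine similarity, and the hypothetical `flag_off_topic` helper simply flags the lowest-scoring fraction of essays. It should not be read as the proposed model itself.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mean_vector(vectors):
    """Component-wise mean; assumes at least one vector (illustrative aggregation)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def similarity(essay_phrases, prompt_phrases):
    """Assumed scoring: cosine between the mean noun-phrase vectors of essay and prompt."""
    return cosine(mean_vector(essay_phrases), mean_vector(prompt_phrases))

def flag_off_topic(scores, fraction):
    """Hypothetical sort-based rule: sort essays by similarity and flag the lowest fraction."""
    k = int(len(scores) * fraction)
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return set(order[:k])

# Toy noun-phrase vectors (made up for this sketch, not taken from any dataset).
prompt = [[1.0, 0.0], [0.8, 0.2]]
essays = [
    [[0.9, 0.1], [1.0, 0.0]],  # phrases close to the prompt
    [[0.0, 1.0], [0.1, 0.9]],  # phrases far from the prompt
]

scores = [similarity(essay, prompt) for essay in essays]
flagged = flag_off_topic(scores, fraction=0.5)
print(scores, flagged)
```

Because the threshold in this sketch comes from the ranking of the scores rather than from labeled examples, it needs no supervision, which matches the unsupervised setting described in the abstract; the fraction parameter is an illustrative stand-in for whatever criterion the sorting method actually uses.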