Simultaneous localization and mapping (SLAM) is a key technology in robotics. However, the environmental maps built by traditional visual SLAM methods contain little semantic information and therefore cannot satisfy the needs of complex applications. Semantic maps address this problem effectively and have become a research hotspot. This paper proposes an improved deep residual network (ResNet) based semantic SLAM method for monocular vision robots. In the proposed approach, an improved feature-point-based image matching algorithm is presented to enhance the anti-interference ability of the system. A robust feature point extraction method is then adopted in the front-end module of the SLAM system, which effectively reduces the probability of camera tracking loss. In addition, an improved keyframe insertion method is introduced into the visual SLAM system to enhance its stability while the robot turns and moves. Furthermore, an improved ResNet model is proposed to extract semantic information from the environment and thereby construct a semantic map. Finally, experiments demonstrate the effectiveness of the proposed method.
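The abstract does not specify the improved matching algorithm itself, but feature-point matching in visual SLAM front-ends is commonly built on nearest-neighbour descriptor matching filtered by Lowe's ratio test, which discards ambiguous matches and thereby improves robustness to interference. The sketch below illustrates that standard baseline only; the function name and the synthetic descriptors are assumptions for illustration, not the paper's method.

```python
import numpy as np

def ratio_test_match(desc_a, desc_b, ratio=0.75):
    """Match each descriptor in image A to its nearest neighbour in image B,
    keeping a match only if the nearest neighbour is clearly better than the
    second nearest (Lowe's ratio test). Returns (index_a, index_b) pairs."""
    matches = []
    for i, d in enumerate(desc_a):
        # Euclidean distance from this descriptor to every descriptor in B
        dists = np.linalg.norm(desc_b - d, axis=1)
        nn, second = np.argsort(dists)[:2]
        if dists[nn] < ratio * dists[second]:
            matches.append((i, nn))
    return matches

# Synthetic check: descriptors in A are slightly perturbed copies of the
# first ten descriptors in B, so they should all match unambiguously.
rng = np.random.default_rng(0)
desc_b = rng.normal(size=(50, 32))
desc_a = desc_b[:10] + rng.normal(scale=0.01, size=(10, 32))
matches = ratio_test_match(desc_a, desc_b)
```

In a real front-end the descriptors would come from a detector such as ORB, and binary descriptors would be compared with Hamming rather than Euclidean distance; the ratio-test filtering step stays the same.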