Abstract: Computer vision algorithms are widely applied in security, monitoring, and automation, as well as in multimodal human-computer interaction, where they are used for face detection, body tracking, and object recognition. Designing algorithms that perform these tasks reliably with limited computing resources, and that cope with the presence of nearby people and objects in the background, changes in illumination, and changes in camera pose, is a major challenge for the field. Many of these problems are addressed with classification methods. One of the many image classification algorithms is Bag-of-Words (BoW). The classic BoW algorithm was originally used mainly for natural language, so applying it directly to computer vision problems may not be effective enough. The algorithm presented in this article contains a number of modifications that support the use of many types of characteristic features extracted from an image, analysis of the image representation, and an adaptive clustering algorithm for building the dictionary of image features. These modifications affect the classification result, which was confirmed in experimental research.
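For orientation, the classic BoW pipeline that the article modifies can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm presented in the article: plain k-means stands in for the adaptive clustering, random vectors stand in for real image features, and the names `build_dictionary`, `bow_histogram`, and `nearest_word` are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, dictionary):
    """Index of the visual word (dictionary entry) closest to the descriptor."""
    return min(range(len(dictionary)),
               key=lambda j: squared_distance(descriptor, dictionary[j]))


def build_dictionary(descriptors, k, iters=20, seed=0):
    """Cluster descriptors with plain k-means (assumption: NOT the article's
    adaptive clustering); the k centroids act as the visual words."""
    rng = random.Random(seed)
    centroids = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_word(d, centroids)].append(d)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def bow_histogram(descriptors, dictionary):
    """Represent one image as a normalised histogram of visual-word counts."""
    counts = [0] * len(dictionary)
    for d in descriptors:
        counts[nearest_word(d, dictionary)] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Synthetic stand-ins for local image features (hypothetical data).
rng = random.Random(42)
training_descriptors = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(500)]
image_descriptors = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(60)]

dictionary = build_dictionary(training_descriptors, k=8)
histogram = bow_histogram(image_descriptors, dictionary)
print(len(histogram), sum(histogram))
```

The resulting histogram is the fixed-length image representation that a classifier would then consume. In practice, the descriptors would come from a feature extractor applied to real images.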