We propose a new algorithm for object tracking in crowded video scenes that exploits the properties of the undecimated wavelet packet transform (UWPT) together with interframe texture analysis. The user initializes the algorithm by specifying a region around the object of interest in the reference frame. The UWPT coefficients of that region are then used to construct a feature vector (FV) for every pixel in the region. An optimal search for the best match is performed with the generated FVs inside an adaptive search window. The search window is adapted by interframe texture analysis, which estimates the direction and speed of the object motion. This temporal texture analysis also helps to track the object under partial or short-term full occlusion. Moreover, the tracking algorithm is robust to Gaussian and quantization noise. Experimental results show that the proposed algorithm performs well when tracking objects in crowded scenes on stairs, in airports, or at train stations in the presence of object translation, rotation, small scaling, and occlusion.
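The per-pixel feature vectors and the window search can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: the Haar filters, the circular boundary handling, the two decomposition levels, the sum-of-absolute-differences cost, and the fixed search radius are all assumptions chosen for brevity, and the function names `uwpt_features` and `best_match` are hypothetical. The adaptive search window and the occlusion handling are not modeled.

```python
import numpy as np


def _split(x, axis, step):
    """Undecimated Haar split along one axis (assumed filter, circular boundary).

    `step` dilates the filter so that no subsampling is needed.
    """
    shifted = np.roll(x, -step, axis=axis)
    return (x + shifted) / 2.0, (x - shifted) / 2.0


def uwpt_features(image, levels=2):
    """Return an (H, W, 4**levels) array: one feature vector per pixel.

    Every subband is split again at each level (full packet tree), and
    all subbands keep the size of the input image.
    """
    bands = [np.asarray(image, dtype=float)]
    for level in range(levels):
        step = 2 ** level
        next_bands = []
        for band in bands:
            low, high = _split(band, 0, step)          # filter along rows
            for part in (low, high):
                next_bands.extend(_split(part, 1, step))  # then along columns
        bands = next_bands
    return np.stack(bands, axis=-1)


def best_match(ref_fv, cur_fv, top_left, size, radius):
    """Exhaustive search for the displacement (d_row, d_col) that minimizes
    the summed absolute FV difference inside a fixed square search window."""
    r0, c0 = top_left
    h, w = size
    template = ref_fv[r0:r0 + h, c0:c0 + w]
    best_disp, best_cost = None, np.inf
    for d_row in range(-radius, radius + 1):
        for d_col in range(-radius, radius + 1):
            r, c = r0 + d_row, c0 + d_col
            if r < 0 or c < 0 or r + h > cur_fv.shape[0] or c + w > cur_fv.shape[1]:
                continue  # candidate region falls outside the frame
            cost = np.abs(cur_fv[r:r + h, c:c + w] - template).sum()
            if cost < best_cost:
                best_disp, best_cost = (d_row, d_col), cost
    return best_disp


# Synthetic check: the current frame is the reference frame shifted by (3, -2).
rng = np.random.default_rng(0)
reference = rng.random((64, 64))
current = np.roll(reference, (3, -2), axis=(0, 1))
displacement = best_match(uwpt_features(reference), uwpt_features(current),
                          top_left=(20, 20), size=(16, 16), radius=5)
print(displacement)  # (3, -2)
```

Because no subband is subsampled, every pixel has one coefficient in each subband, so the FV at a pixel is simply the stack of those coefficients. A shift of the object therefore shifts its FVs without otherwise changing them, which is what makes pixel-wise matching of FVs meaningful.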