Many practical recommendation systems have been studied, and services based on them have been deployed in the real world. This work addresses two research questions: i) how these recommendation services provide users with useful information, and ii) how much the results of the different systems differ from each other. In this paper, we propose a black-box evaluation framework for practical recommendation services. To this end, we design a user modeling process that generates synthesized user models as inputs to the recommendation services. Each user model (i.e., a set of user ratings) is synthesized so as to discriminate among the recommendation results. Given a set of practical recommendation systems, we apply the proposed black-box testing scheme by comparing their recommendation results. In particular, we investigate whether the services take attribute selection into account.
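
As an illustration only, the black-box comparison described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the framework itself: each service is modeled as a hypothetical callable that maps a user model to a ranked list of item ids, the synthesized user model simply rates items with one chosen attribute value high and all others low, and the Jaccard overlap of the returned lists is an assumed stand-in for whatever comparison measure is actually used. All names (`synthesize_user_model`, `compare_services`, the toy services and items) are invented for this example.

```python
from typing import Callable, Dict, List, Tuple

Item = Dict[str, str]                            # e.g. {"id": "m1", "genre": "comedy"}
UserModel = Dict[str, float]                     # item id -> rating
Recommender = Callable[[UserModel], List[str]]   # black box: ratings in, ranked ids out


def synthesize_user_model(items: List[Item], attribute: str, value: str,
                          high: float = 5.0, low: float = 1.0) -> UserModel:
    """Rate items whose `attribute` equals `value` high and all other items low."""
    return {it["id"]: (high if it.get(attribute) == value else low) for it in items}


def jaccard(a: List[str], b: List[str]) -> float:
    """Set overlap of two recommendation lists (1.0 means identical sets)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def compare_services(services: Dict[str, Recommender],
                     user_model: UserModel) -> Dict[Tuple[str, str], float]:
    """Feed the same user model to every service and compare results pairwise."""
    results = {name: rec(user_model) for name, rec in services.items()}
    names = sorted(results)
    return {(a, b): jaccard(results[a], results[b])
            for i, a in enumerate(names) for b in names[i + 1:]}


# Toy data and toy services (hypothetical stand-ins for real services).
ITEMS = [
    {"id": "m1", "genre": "comedy"},
    {"id": "m2", "genre": "comedy"},
    {"id": "m3", "genre": "drama"},
    {"id": "m4", "genre": "drama"},
]


def attribute_aware(user_model: UserModel) -> List[str]:
    """Toy service that follows the ratings: returns the two highest-rated items."""
    return sorted(user_model, key=lambda item_id: -user_model[item_id])[:2]


def popularity(user_model: UserModel) -> List[str]:
    """Toy service that ignores the user model and returns a fixed list."""
    return ["m3", "m1"]


SERVICES = {"attribute_aware": attribute_aware, "popularity": popularity}
```

Under these assumptions, a service whose output tracks the attribute encoded in the synthesized ratings would diverge from one that ignores it, which is the kind of difference the comparison is meant to expose.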