The paper describes the philosophy of the climate model evaluation scheme being developed within CAWCR, as well as the database of observational datasets that underpins it. It argues that model evaluation should measure 'fitness-of-purpose', that it should be objective, and that it should be based on the largest possible number of observational datasets. Time series plots of smoothed observational data are discussed, along with the relevance of the disparity between data from different sources. The paper calls for active participation of the Australian climate research community in the project.