摘要:Evaluation of causal reasoning models depends on how well the subjects’ causal beliefs are assessed. Elicitation of causal beliefs is determined by the experimental questions put to subjects. We examined the impact of question formats commonly used in causal reasoning research on participant’s responses. The results of our experiment (Study 1) demonstrate that both the mean and homogeneity of the responses can be substantially influenced by the type of question (structure induction vs. strength estimation vs. prediction). Study 2a demonstrates that subjects’ responses to a question requiring them to predict the effect of a candidate cause can be significantly lower and more heterogeneous than their responses to a question asking them to diagnose a cause when given an effect. Study 2b suggests that diagnostic reasoning can strongly benefit from cues relating to temporal precedence of the cause in the question. Finally, we evaluated 16 variations of recent computational models and found the model fitting was substantially influenced by the type of questions. Our results show that future research in causal reasoning should place a high priority on disentangling the effects of question formats from the effects of experimental manipulations, because that will enable comparisons between models of causal reasoning uncontaminated by method artefact.