
Article Information

  • Title: Causal inference and the evolution of opposite neurons
  • Authors: Stephanie Badde; Fangfang Hong; Michael S. Landy
  • Journal: Proceedings of the National Academy of Sciences
  • Print ISSN: 0027-8424
  • Electronic ISSN: 1091-6490
  • Year: 2021
  • Volume: 118
  • Issue: 36
  • DOI: 10.1073/pnas.2112686118
  • Language: English
  • Publisher: The National Academy of Sciences of the United States of America
  • Abstract: A pesky mosquito continues to annoy you, and you are poised to swat it. You see it hovering above your arm and feel a gentle tickle, but in a slightly different spot (Fig. 1A). Where should you strike? The mathematically optimal solution is to average the locations indicated by vision and touch, with greater weight given to the more reliable signal, the one that typically leads to smaller errors. A substantial literature indicates that, for most modality pairings and perceptual tasks, humans behave in accordance with this optimal prescription for sensory integration (1–4). However, if vision and touch indicate very different locations, the tickle might be due to another cause, such as an old mosquito bite (Fig. 1B). In this case, it makes sense to segregate the sensory signals, ignore touch, and swat at the location indicated by vision. This decision requires making a “causal inference,” that is, an inference as to whether two sensory signals derive from a common source or separate sources. Humans (5, 6) and monkeys (7, 8) behave as if they perform causal inference; they do not integrate signals unlikely to come from the same source. The challenging question is, how are sensory cue integration and causal inference implemented in the brain?

Fig. 1. Multisensory integration and causal inference. (A) When a common cause is inferred, sensory signals are integrated; (B) when separate sources are inferred, the segregated visual signal is used. (C) Congruent neurons have similar tuning for heading direction across modalities; (D) opposite neurons’ preferred directions differ across modalities. Both types of neurons contribute to (E) self- and (F) world-motion estimation as well as (G) causal-inference judgments, but to different degrees. (H) In Bayesian estimation, the integrated and segregated estimates are combined with weights equal to the probabilities of each causal scenario.

In PNAS, Rideaux et al. (9) demonstrate how the interplay between different types of neurons could accomplish both optimal integration and causal-inference judgments. They simulate a particularly puzzling but also well-researched case of multisensory perception: visual and vestibular signals of self-motion. These signals converge in brain areas that include the dorsal medial superior temporal area (MSTd) and the ventral intraparietal area (VIP). Neurons in these areas are often tuned for heading direction; that is, they fire the most when sensory cues indicate a particular direction and fire less and less the more the signaled direction differs from their preferred one. Many of the neurons that receive input from both modalities are congruent neurons: They have similar tuning for the two modalities (Fig. 1C). Thus, congruent neurons seem predestined to perform multisensory integration (10, 11). Curiously, many other neurons in MSTd and VIP are opposite neurons (Fig. 1D): They are tuned to visual and vestibular information indicating opposite heading directions, e.g., rightward motion signaled by visual stimuli and leftward motion signaled by vestibular stimuli (10, 12). Opposite neurons appear to be ideally equipped to detect when sensory signals arise from different sources. In turn, the interplay of congruent and opposite neurons might enable the brain to perform causal inference (10, 13).
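The optimal prescription from the opening example is reliability-weighted averaging: each cue is weighted by its reliability, the inverse of its variance. A minimal sketch in Python, with hypothetical numbers:

```python
# Reliability-weighted integration of a visual and a tactile location
# estimate (positions in cm along the arm; all numbers hypothetical).
def integrate(x_vis, sigma_vis, x_touch, sigma_touch):
    # Weights are proportional to each cue's reliability, the inverse
    # of its measurement variance, so the more precise cue dominates.
    r_vis, r_touch = 1.0 / sigma_vis**2, 1.0 / sigma_touch**2
    w_vis = r_vis / (r_vis + r_touch)
    return w_vis * x_vis + (1.0 - w_vis) * x_touch

# Vision (sigma = 0.5 cm) is more reliable than touch (sigma = 2 cm),
# so the combined estimate lands close to the visual location.
print(integrate(x_vis=10.0, sigma_vis=0.5, x_touch=14.0, sigma_touch=2.0))
# -> approximately 10.24
```

This inverse-variance weighting minimizes the variance of the combined estimate, which is the sense in which integration is “optimal” here.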
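The congruent/opposite distinction is easy to state computationally. The sketch below idealizes heading tuning with von Mises tuning curves; the functional form and all parameters are illustrative assumptions, not the tuning measured in MSTd or VIP.

```python
import numpy as np

def tuning_curve(theta, preferred, kappa=2.0, peak_rate=30.0):
    # Idealized heading tuning: the firing rate (spikes/s) peaks at the
    # preferred direction and falls off smoothly as the signaled
    # direction deviates from it (angles in radians).
    return peak_rate * np.exp(kappa * (np.cos(theta - preferred) - 1.0))

headings = np.deg2rad(np.arange(0, 360, 45))
visual_pref = np.deg2rad(90)  # visual tuning peaks at 90 degrees

# Congruent neuron: vestibular tuning peaks at the same heading.
congruent_vestibular = tuning_curve(headings, visual_pref)
# Opposite neuron: vestibular tuning peaks 180 degrees away.
opposite_vestibular = tuning_curve(headings, visual_pref + np.pi)
```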
A direct test of this hypothesis would require simultaneous recordings of congruent and opposite neurons in MSTd and VIP as well as the neurons they project to, which is a near-impossible task. However, artificial neural networks make it easy to inspect the interconnected behavior of neurons across different brain areas. Rideaux et al. (9) used a particularly clever approach to this problem. Rather than constructing an artificial neural network with a layer of hand-tuned congruent and opposite neurons, they trained an unconstrained artificial neural network to perform causal-inference judgments as well as self- and world-motion estimation and afterward inspected the tuning and connectivity of multisensory neurons. This multilayer, feedforward network had two sets of inputs, visual and vestibular. The visual inputs were short sequences of natural images that translated at various velocities in four directions (left–right, up–down, toward–away, and rotations about the line of sight). The vestibular inputs came from units tuned to velocities along each of these four axes, slightly corrupted by noise. Separately for each motion direction, the output neurons of the network estimated self-motion velocity (trained to match the average of the vestibular and visual input velocities; Fig. 1E), world-motion velocity (trained to match the difference between the two input velocities; Fig. 1F), and a common-source judgment (trained to match a binary categorization of whether the difference between the two input velocities was large or small; Fig. 1G); a sketch of these training targets follows this paragraph. Notably, the world-motion estimation task differs from multisensory integration in other domains, where integration is typically contrasted against segregation, i.e., reliance on one modality alone (compare Fig. 1 A and B). Thus, it would be interesting to see how the network generalizes to multisensory perception of spatial, temporal, or other features.

The key contribution of the paper is that, after successful training, the network developed neurons with the same characteristics as congruent and opposite neurons in macaque MSTd and VIP. More specifically, in the “MSTd” layer of the network, neurons had clear tuning for heading direction (where direction was computed from velocity along the left–right and forward–backward axes), and most neurons either had congruent visual and vestibular tuning or showed opposite motion-direction tuning for the two modalities. Both types of neurons provided significant input for causal-inference judgments, confirming the initial hypothesis that the balance between congruent and opposite neurons is crucial for inferring whether two signals originate from the same source. Regarding the network’s motion-velocity percepts, congruent cells provided stronger input for self-motion estimates, and opposite cells provided stronger input for world-motion estimates. Each type of neuron also contributed to the other perceptual estimate, but to a lesser degree.

Previous computational models with hand-tuned congruent and opposite neurons had already demonstrated that such networks are able to perform causal inference (13, 14). However, so are artificial neural networks without these properties (15). In contrast to these top–down approaches, Rideaux et al. (9) show that the requirement to make both perceptual and causal-inference judgments leads to the development of congruent and opposite neurons, suggesting that this neural substrate is the optimal solution for the computation.
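The three training targets referenced above are simple functions of the two input velocities. A minimal sketch for a single motion axis; the discrepancy threshold is a hypothetical stand-in for the paper’s large-versus-small criterion:

```python
# Training targets for one motion axis, given the visual and vestibular
# input velocities (threshold is a hypothetical stand-in for the
# paper's large-vs.-small discrepancy criterion).
def training_targets(v_visual, v_vestibular, threshold=1.0):
    self_motion = (v_visual + v_vestibular) / 2.0  # average of the inputs
    world_motion = v_visual - v_vestibular         # difference between inputs
    common_source = abs(v_visual - v_vestibular) <= threshold
    return self_motion, world_motion, common_source

# A small discrepancy yields a common-source target of True.
print(training_targets(v_visual=2.2, v_vestibular=2.0))
# -> approximately (2.1, 0.2, True)
```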
The network’s performance in the perceptual tasks qualitatively mirrored human and monkey behavior in heading-discrimination tasks (11). After successful training, the network was presented with visual and vestibular motion inputs with a small cross-modal discrepancy. For these tests, the visual stimulus was changed to a collection of moving dots; its reliability was manipulated by varying the proportion of dots moving in the same direction. The network integrated visual and vestibular inputs according to their reliabilities: Self-motion estimates agreed more with the vestibular input when the visual input was less reliable and were biased toward the visual input when it was more reliable. Notably, during training, self-motion estimates were reinforced to match the 50–50 average of the visual and vestibular signals, raising the question of whether the ability to perform reliability-weighted integration emerged from the combined training of several tasks or was due to the architecture of the network.

The influence of visual input on self-motion estimates was lower when the network inferred separate causes than when it inferred a common source of the two signals. Such a difference in cross-modal biases automatically emerges if one input modality is noisy, that is, if the same stimulus results in slightly different internal measurements across trials (6). However, in its current form, the network cannot reproduce the behavioral hallmark of causal inference: the reduction of cross-modal biases when the signals are more discrepant and thus less likely to have emerged from the same source. Given that the model is strictly feedforward, with separate outputs for causal inference and for self- and world-motion estimates, the causal-inference judgment cannot affect the self-motion estimates. In other words, the model is unable to ignore the tickling sensation from the old mosquito bite in our introductory example.

In contrast, Bayesian causal-inference models (6) replicate the nonlinear dependency of cross-modal biases on the discrepancy between the two signals (6, 16). They achieve this by summing the integrated and segregated estimates, weighted by the inferred probabilities of a common source and of separate sources, respectively (Fig. 1H). In this view, the neural network model by Rideaux et al. (9) encompasses the first stage of a two-stage perceptual process. In fact, human brain activity in a multisensory context is consistent with separate representations for the integrated, segregated, and combined final estimates (17, 18). Another key component of the Bayesian approach to causal inference is the assumed prior probability of a common source. In the model of Rideaux et al. (9), this prior might be reflected in the weights of the connections between the multisensory MSTd layer and the output layer. However, this common-cause prior changes with the experimental context (19, 20), suggesting the need for additional input to the causal-inference process. Thus, a more complete model of multisensory integration and causal inference will require room for representations of both same- and separate-source perceptual estimates, along with flexible common-source priors. Nevertheless, Rideaux et al. (9) provide a convincing solution to the puzzle of the role of congruent and opposite neurons in causal inference.
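The Bayesian combination step discussed above (Fig. 1H) is straightforward to write down. In the sketch below, the common-source posterior is simply passed in; a full model would compute it from the cue discrepancy and the common-cause prior via Bayes’ rule:

```python
# Bayesian model averaging (Fig. 1H): the final estimate mixes the
# integrated and segregated estimates, weighted by the posterior
# probability of each causal scenario. In a full model, p_common
# would be computed from the cue discrepancy via Bayes' rule.
def final_estimate(s_integrated, s_segregated, p_common):
    return p_common * s_integrated + (1.0 - p_common) * s_segregated

# Highly discrepant signals -> low p_common -> the segregated
# (single-modality) estimate dominates, shrinking the cross-modal bias.
print(final_estimate(s_integrated=12.0, s_segregated=10.0, p_common=0.2))
# -> 10.4
```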
Training an artificial neural network to concurrently derive estimates of self- and world-motion as well as causal-inference judgments resulted in the development of congruent and opposite neurons. Whether the brain implements these inferences in this simple feedforward manner, and how causal inference is involved in perceptual judgments, are important questions for future work. Similarly, it would be fascinating to explore whether sensory experience is required for the development of opposite neurons during ontogenesis, or whether the processes invoked by training this artificial neural network played out during evolution.