Summary

Deep mutational scanning (DMS) experiments have been performed on the SARS-CoV-2 spike receptor-binding domain (RBD) and on the zinc-binding peptidase domain of human angiotensin-converting enzyme 2 (ACE2), quantifying how mutations affect biochemical phenotypes. Both proteins are central players in viral infection, viral evolution, and antibody evasion. We modeled biochemical phenotypes from these massively parallel assays using neural networks trained on protein sequence mutations in the virus and the human host. The neural networks were significantly predictive of binding affinity, protein expression, and antibody escape, learning complex interactions and higher-order features that are difficult to capture with conventional methods from structural biology. Integrating the physicochemical properties of amino acids, such as hydrophobicity and long-range non-bonded energy per atom, significantly improved prediction (empirical p < 0.01). The neural network predictions were concordant with multiple all-atom molecular dynamics simulations (500 ns or 1 μs) of the spike protein-ACE2 interface, with critical implications for the use of deep learning to dissect molecular mechanisms.

Highlights
• Deep learning models of biochemical phenotypes from deep mutational scanning (DMS) data
• Prediction performance gain from using physicochemical properties of amino acids
• Concordance of neural network predictions with molecular dynamics simulations
• Improved causal inference properties for neural-network-defined phenotypes

Computational intelligence; Computational molecular modelling; Health sciences
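
To make the idea of augmenting sequence-mutation inputs with physicochemical properties concrete, the following is a minimal sketch and not the authors' actual feature pipeline. It assumes single substitutions written in the common `N501Y` style, uses the Kyte-Doolittle hydropathy scale as a stand-in for "hydrophobicity", and leaves out other properties such as long-range non-bonded energy. The function names `parse_mutation`, `one_hot`, and `encode_mutation` are hypothetical.

```python
import re

# The 20 standard amino acids, in a fixed order for one-hot encoding.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index (assumed here as the hydrophobicity scale;
# the source does not say which scale was used).
KD_HYDROPHOBICITY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def parse_mutation(mutation):
    """Split a substitution such as 'N501Y' into (wild type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    wild_type, position, mutant = match.groups()
    if wild_type not in KD_HYDROPHOBICITY or mutant not in KD_HYDROPHOBICITY:
        raise ValueError(f"non-standard residue in mutation: {mutation!r}")
    return wild_type, int(position), mutant


def one_hot(residue):
    """Return a 20-dimensional one-hot vector for a single residue."""
    return [1.0 if aa == residue else 0.0 for aa in AMINO_ACIDS]


def encode_mutation(mutation):
    """Encode a substitution as sequence features plus hydrophobicity features.

    Layout: one-hot(wild type) + one-hot(mutant)
            + [hydropathy(wild type), hydropathy(mutant), change in hydropathy].
    The residue position is parsed but deliberately not used in this sketch.
    """
    wild_type, _position, mutant = parse_mutation(mutation)
    h_wt = KD_HYDROPHOBICITY[wild_type]
    h_mut = KD_HYDROPHOBICITY[mutant]
    return one_hot(wild_type) + one_hot(mutant) + [h_wt, h_mut, h_mut - h_wt]


features = encode_mutation("N501Y")  # illustrative RBD substitution
```

A vector like `features` could be fed to any regression model alongside a measured phenotype; comparing a model trained with and without the last three entries is one simple way to check whether the physicochemical features help.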