Summary

A key task of genomic surveillance of infectious viral diseases is the early detection of dangerous variants. Unexpected help toward this end comes from the analysis of deep sequencing data from viral samples, data that are typically discarded once consensus sequences have been created. Such analysis makes it possible to detect intra-host low-frequency mutations, which are a footprint of the mutational processes underlying the origination of new variants. Their timely identification may improve public-health decision-making compared with traditional approaches that rely on consensus sequences. We present the analysis of 220,788 high-quality deep sequencing SARS-CoV-2 samples, showing that many spike and nucleocapsid mutations of interest associated with the most widely circulating variants, including Beta, Delta, and Omicron, might have been intercepted several months in advance. Furthermore, we show that a refined genomic surveillance system leveraging deep sequencing data might make it possible to pinpoint emerging mutation patterns, providing automated, data-driven support to virologists and epidemiologists.

Highlights

• Early detection of hazardous variants is crucial in genomic surveillance of epidemics
• Most approaches focus on consensus sequences and neglect intra-host minor mutations
• We present the analysis of minor mutation profiles from 220k+ SARS-CoV-2 samples
• Many S and N mutations of interest are detected as minor several months in advance

Microbiology; Virology; Bioinformatics; Genomic analysis