Abstract: Breast cancer is highly heterogeneous. The subtypes defined by immunohistochemistry markers and by gene expression profiling (GEP) are related but not equivalent, and the connections between them remain underinvestigated. Our previous study revealed a set of differentially expressed genes (diff-genes), comprising 1015 mRNAs and 69 miRNAs, that characterize the immunohistochemistry-defined breast tumor subtypes at the GEP level. However, because of the large number of genes included, they may convey redundant information. By reducing the dimension of the diff-genes using hierarchical clustering and the nearest-to-center principle, we identified 119 mRNAs and 20 miRNAs that best explain breast tumor heterogeneity with the fewest genes. The final signature panel contains the 119 mRNAs, whose superiority over the diff-genes was replicated in two independent public datasets. Comparison of our signature with two pioneering signatures, Sorlie’s signature and PAM50, suggests a novel marker, FOXA1, for breast cancer classification. We report subtype-specific feature genes that characterize each immunohistochemistry-defined subgroup. Pathway and network analyses reveal the critical roles of Notch signaling in [ER+|PR+]HER2− tumors and of the cell cycle in [ER+|PR+]HER2+ tumors. Our study reveals the primary differences among the four immunohistochemistry-defined breast tumor subtypes at the mRNA and miRNA levels, and proposes a novel signature for breast tumor subtyping from GEP data.
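The dimension-reduction idea named in the abstract (hierarchical clustering followed by a nearest-to-center selection) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration rather than the procedure used in the study: the function name `select_representative_genes`, the average linkage, the Euclidean distance, the use of the cluster mean as the "center", and the way the tree is cut into a fixed number of clusters are all hypothetical choices.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def select_representative_genes(expr, n_clusters):
    """Pick one representative gene per cluster (illustrative sketch).

    expr: 2-D array of shape (genes, samples).
    n_clusters: upper bound on the number of clusters to cut the tree into.
    Returns a sorted list of row indices, one per cluster found.
    """
    # Assumed settings: average linkage on Euclidean distances between genes.
    tree = linkage(expr, method="average", metric="euclidean")
    # Cut the dendrogram into at most n_clusters flat clusters.
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")

    representatives = []
    for cluster_id in np.unique(labels):
        members = np.where(labels == cluster_id)[0]
        # Assumed "center": the mean expression profile of the cluster.
        center = expr[members].mean(axis=0)
        # Keep the member gene whose profile lies closest to that center.
        distances = np.linalg.norm(expr[members] - center, axis=1)
        representatives.append(int(members[np.argmin(distances)]))
    return sorted(representatives)
```

Under these assumptions, a redundant group of co-expressed genes collapses to a single gene, which is one way a large set of diff-genes could be reduced to a smaller panel; the number of clusters would have to be chosen separately.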