Introduction
As the most common malignant tumor in women, breast cancer has exhibited the fifth highest death rate after stomach cancer worldwide [1]. With the progressive increase of both incidence and mortality, it is crucial to evaluate the prognosis in clinical environment and to propose an appropriate therapeutic regimen for malignant tumor patients. In clinical practice, TNM staging is generally used, but it may lead to an entirely distinct prognosis in the same situation [2]. Therefore, a more precise and valuable method is highly desirable to predict outcomes.
The past decade has emerged several novel technologies to explore cancer's molecular characteristics, especially with the rapid advancement of high-throughput next-generation sequencing (NGS), bioinformatics analyses, machine learning, and gene microarray technique. These techniques extremely contributed to early diagnosis of tumors, prognosis prediction, and individualized treatment. In addition, the bioinformatics-based biomarker discovery renders a deeper understanding on disease-related regulatory pathways and molecular mechanisms. To identify high-risk patients, gene risk models are established via bioinformatics analysis using clinical information and gene expression data. More studies have been addressed on establishing gene risk models, resulting from the measuring capability of mRNA expression by NGS and microarray. Particularly, several risk models have played an excellent role in predicting the prognosis outcomes, including autophagy-related gene models, immune cell infiltration-related gene models, nomograms, and so on [3–6]. However, these models elucidated the prognosis of BRCA from the perspective of a single functional genome, which limited their prediction results.
The occurrence and development of tumors are highly relative to the accumulative changes in tumor suppressor genes and oncogenes [7]. The differentially expressed genes (DEGs) play varying roles during different periods of the occurrence and distinct developmental stages of cancer [8]. Genes' abnormal expression has been previously reported to accelerate the progression of malignancies, and DEGs have been targeted as a novel treatment approach in several antitumor clinical types of research [9, 10].
In this study, high-throughput mRNA expression profiles from distinct regions and races have been investigated, focusing on the differences between breast cancer and adjacent mammary tissue hence to identify potential genetic biomarkers. The selected DEGs profiles were incorporated with TCGA-BRCA clinical data to explore DEGs' role in prognosis. Moreover, the independent prediction of BRCA patients' outcomes was achieved using a risk model based on 12 DEGs-related signatures. As such, the risk prediction model is demonstrated as a reliable prognostic marker in BRCA patients. On the other hand, our functional biomarker-based study also provides a novel alternative to predict prognosis in BRCA patients.
Results
Identification of DEGs in BRCA
To identify DEGs in this study, breast carcinoma samples (479 cases) and normal breast tissues (206 cases) were randomly collected from different regions and races. Using the limma package, we identified 3065 DEGs in GSE29431, 1293 DEGs in GSE32641, 1315 DEGs in GSE61304, 2252 DEGs in GSE70947 and 722 DEGs in GSE86374. DEGs in two representative samples from each of the five expression profiling databases are shown in the volcano plots (Figure 1A–1E). In Figure 1F and 1G, 61 upregulated genes and 90 downregulated genes are in common (|log2Fold Change| >1, adj.p < 0.05), which are displayed in the heatmap in five databases (Figure 1H).

Functional enrichment of DEGs and PPI network construction
The biological roles of 151 DEGs were further investigated using GO and KEGG pathway analysis. There are three functional categories in GO analysis: 1) in terms of biological process (BP), upregulated DEGs were merely enriched in nuclear division and organelle fission, while downregulated ones were enriched in peptide hormone and smooth muscle cell proliferation; 2) in terms of cellular component (CC), upregulated DEGs were involved in spindle, condensed chromosome and spindle pole, while downregulated ones were involved in collagen-containing extracellular matrix, lipid droplet and basement membrane; 3) in terms of molecular function (MF), microtubule motoring and binding activities were remarkably relative to upregulated DEGs, while downregulated ones were merely involved in integrin and growth factor and glycosaminoglycan binding (Figure 2A–2B). Additionally, KEGG pathway analysis demonstrated the participated roles of most upregulated genes in cell cycle and oocyte meiosis. As a contrast, downregulated genes were merely involved in PPAR signaling pathway and tyrosine metabolism (Figure 2C). The heatmap indicated the relationship between DEGs and enriched KEGG pathways (Figure 2D). Lastly, there are 151 nodes and 1169 edges in the PPI network of DEGs (51 upregulated genes and 90 downregulated genes, p = 1.0e-16), and the highest degree is determined at 48 (Supplementary Figure 1).

Exploration of the mechanism in predicted differential risk patients by GSEA
The functional differences among differential risk patients by GSEA were explored by comparing patients in low-risk and high-risk groups. For instance, high-risked patients were positively correlated with cell cycle (NES = 1.85, p = 0.012), TCA cycle (NES = 1.82, p = 0.016), and oxidative phosphorylation (NES = 2.05, p = 0.026). Meanwhile, patients in low-risk groups were negatively correlated with basal cell carcinoma (NES = –1.80, p = 0.004), Hedgehog signaling pathway (NES = –1.69, p = 0.010), and JAK-STAT signaling pathway (NES = –1.65, p = 0.035) (Figure 7). Therefore, the presence of an intensively regulatory role was observed for the development and progression in high-risk BRCA patients, exhibiting significant changes in pathways.
Prediction of targeted treatment in BRCA patients by our risk score
As presented in Figure 8, our DEGs-related risk score was closely associated with CDK4 expression (cor = 0.12, p = 9.9e-05), as well as the expressions of ERBB2 (cor = 0.14, p < 2.2E-16), EGFR (cor = –0.14, p <4.6e-06), and KIT (cor = –0.19, p < 4e-10) by Pearson’s correlation analysis. In conclusion, patients with higher DEGs-related risk score exhibited favorable therapeutic response to CDK4- and ERBB2- targeted treatments. Otherwise, the lower risk score, patients exhibited better response to EGFR and KIT targeted treatments.
Verification of the 12-prognostic mRNA expressions between BRCA specimens and adjacent breast tissues by qRT-PCR
To avoid false-positive results from public database, 12-mRNA expressions were further verified based on qRT-PCR results from 20 frozen tissues from BRCA patients (Figure 9). The experimental results indicated that the mRNA expressions of ANLN, KIF4A, MELK, NDC80, NEK2, TOP2A, TTK and UBE2T were upregulated in BRCA tissue in comparison to the adjacent tissues, while that of FREM1 and SORBS1 were downregulated in BRCA tissues compared to the adjacent tissues. These validation results were generally consistent with TCGA-BRCA database (Supplementary Figure 2), suggesting that these prognostic genes played critical roles in the initiative and developing stages of breast cancer.

Discussion
The aberrant gene expression is a major threat during the progressively developing stages of BRCA, recent intensive studies have indicated that some genes could be potentially targeted for diagnosis, treatment, and prognosis in BRCA. Thus, it is highly desirable to discover effective gene signatures to identify patients' condition, not only to find applicable prognostic targets but also to provide precise therapy for patients at high risk for disease recurrence. Nowadays, full-scale genetic data from BRCA samples can be obtained using DNA microarray and next-generation sequencing, providing comprehensive assessment during diseases progression.
Model in the present study was constructed using five GEO databases and TCGA-BRCA database. Among the total 151 DEGs, 31 DEGs were significantly correlated to the prognosis in BRCA patients, and 12 DEGs (ANLN, CCNE2, KIF4A, MELK, NDC80, NEK2, TOP2A, TTK, UBE2T, FREM1, IGF1 and SORBS1) were recognized and included in the risk model for overall survival prediction. For instance, ANLN was identified as the target for BRCA patients, which correlated with poor survival [11, 12]. CCNE2 promoted G1-S transition in HER2+ BRCA, and hence resulted in trastuzumab resistance [13]. In certain stages of mitosis, KIF4A served as a biomarker for predicting clinical prognosis [14, 15]. MELK promoted the occurrence and progression of colorectal adenocarcinomas [16, 17]. NDC80 was also recognized as a potential target in BRCA [18]. The expression of NEK2 exhibited its importance during the mitotic cell cycle of BRCA [19, 20]. As previously reported, the higher the expression of TOP2A, the better prognosis of BRCA [21]. Also, TTK was another promising therapeutic target due to it was overexpressed in BRCA [22]. UBE2T activated PI3K/Akt signaling pathway and stimulated tumor progression [23]. A recent study proposed that FREM1 isoform was an effective diagnostic and therapeutic marker for BRCA [24]. In certain cancer cells, binding of IGF-1 to IGF-1R could activate some signaling pathway and then promote oncogenic effect [25, 26]. SORBS1 was an adaptor protein involved in cell adhesion, growth factor signaling, and cancer metastasis [27, 28]. Next, our risk model was ascertained to independently predict prognosis in BRCA patients. As the risk score increased, the prognosis outcomes got worsen. The resulting DEGs-related risk model exhibited favorable predictive outcomes in both training (AUC at 5 year = 74%) and validation (AUC at 5 year = 73%) cohorts.
Furthermore, the 12 DEGs-related prognostic model was validated as an effective and excellent indicator of patients' tumor status and prognostic outcomes. Using our risk model, patients with particular clinicopathological features can be stratified into subgroups with varying clinical outcomes. In combination with these results, a nomogram was established by incorporating clinical features and risk score for DEGs signature, which presented excellent performance in survival prediction for BRCA patients. GSEA revealed that inhibited cell cycle is associated with better outcomes, suggesting the critical role of CDK 4/6 inhibitors for the prolonged survival time among BRCA patients. Moreover, the DEGs-related risk score significantly correlated with four targeted therapy genes, providing potential guidance for personalized treatments.
Several prognostic risk models have been previously developed, for example, Zhao et al. recently reported BRCA patients from TCGA cohorts and discussed the immune-related genes and the immune microenvironment of BRCA [5]. Lin et al. identified an autophagy-related genes prognostic model [6]. In our study, a DEGs-related signature was constructed through the different regions and races databases to predict BRCA patients' prognosis, as well as to evaluate the essence in BRCA in a comprehensive manner. The development and progression of BRCA were elaborated from multiple perspectives of differentially expressed genes, and few DEGs have been identified and verified to have potential application in clinics [29, 30]. However, even though external verification was performed in this study, validation of other cohorts is still necessary to verify the prognostic risk model's performance and efficiency. Lastly, the inevitable and inherent bias within the retrospective method should also be addressed.
In conclusion, a DEGs-related risk model was successfully constructed for predicting prognosis in BRCA, which is beneficial for patients and clinical researchers. Our systematic and comprehensive studies suggest that 12-DEGs signature might offer a more accurate evaluation system for BRCA patients' prognosis and provide more personalized therapies. In the future, more extended researches should be carried out to explore the possible mechanisms for the prediction of genes function, as well as the constitutions of the prognostic signature.
Materials and Methods
Data sources
Firstly, a flow chat was used to illustrate the entire process of our studies (Supplementary Figure 5). A total of 5 GEO datasets were used to identify DEGs in this study. GSE29431, GSE32641, GSE61304, GSE70947, and GSE86374 were downloaded from NCBI-GEO, a free and public database of transcriptional expression. GSE29431 data was obtained with the GPL570 platforms and collected from 54 breast tumors and 12 non-tumor breast tissue samples. GPL887 platforms were utilized to obtain GSE32641 data from 95 breast tumors and 7 normal breast samples. GSE61304 data was obtained with the GPL570 platforms and collected from 58 breast tumors and 4 normal breast tissues. GSE70947 data was obtained with the GPL13607 platforms and collected from 148 breast tumors and 148 normal breast samples. GSE86374 data was obtained with the GPL6422 platforms and collected from 124 breast tumors and 35 normal breast tissues. RNA-seq and survival information of TCGA-BRCA cohorts were retrieved from UCSC Xena [31]. GSE20685 and GSE 48390 data, including 408 BRCA samples from GEO datasets, were used for external validation. Details of the GEO datasets was shown in the Supplementary Table 2.
Identification of DEGs
Limma, a package that allows users to compare multiple databases in the GEO series under the R environment [32]. DEGs between BC and non-tumor breast tissue samples are identified. Removing the invalid genes, absolute log2 fold change > 1 and adjusted p < 0.05 were confirmed as threshold criteria for the genes, to further identify significant DEGs. The volcano plots, Venn diagrams, and heatmap were made by TBtools, and the overlapping DEGs were used to delve deeper [33].
Functional enrichment analysis of DEGs
Significant DEGs in BRCA were analysed by the “clusterProfiler” package [34], including Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis [35, 36]. The biological processes, cell components and molecular functions were evaluated. Significant signaling pathways were statistically identified by KEGG analysis, where p < 0.05 and adjusted p < 0.05 were applied.
Construction of the PPI network
In this study, STRING online database (version 11.0) was utilized to construct the protein-protein interaction (PPI) network [37]. Cytoscape software (version 3.7.2) was employed to paint the integrated regulatory networks and analyzed the interaction network of different genes [38]. Using the MCODE plug-in component of the Cytoscape software, sorted and filtered critical modules of the whole network [39].
Breast cancer gene-expression miner
The expression of 12 selected DEGs in different subtypes of BRCA were analyzed using the bcGenExMiner (version 4.5), by correlating that with the co-expressed genes [41]. The correlation of 12 selected genes was generated using the correlation module.
Nomogram construction and validation
The “rms” package was used to establish a nomogram in R, which can predict patients’ prognosis by combining the risk score and multiple clinicopathological factors [42]. As for the assessment of predictive accuracy of the model, Calibration curves were established, the concordance index (C-index) and Akaike information criterion (AIC) were further conducted to evaluate the influence of prognosis factors.
Gene set enrichment analysis (GSEA)
GSEA4.0.3 (https://www.gsea-msigdb.org/gsea/index.jsp) was used to detect which pathways genes are primarily enriched, that is, gene set enrichment analysis differences satisfying the nominal p < 0.05 and the FDR < 0.25 were considered statistically significant [43].
Samples and clinicopathological data
A total of 20 surgically resected breast cancer specimens and adjacent breast tissue were collected from the Second Hospital of Dalian Medical University between January 2010 and January 2018. There are 5 patients involved in each subtypes (luminal A, luminal B, HER2+, and TNBC), identified via pathological examination. No chemotherapeutic or radiotherapeutic treatments have been applied on patients before the surgery. All procedures in this research protocol were approved by the ethics committee in the Second Hospital of Dalian Medical University.
Isolation of RNA and quantitative reverse transcriptase PCR quantification
Extraction of total RNA and synthesis of complementary DNA was performed by referring to the manufacturers’ instructions. TransStart Tip Green qPCR SuperMix (Transgen Biotech) was utilized for real-time qRT-PCR with specific primers against ANLN, CCNE2, KIF4A, MELK, NDC80, NEK2, TOP2A, TTK, UBE2T FREM1, IGF1, SORBS1, and glyceraldehyde-3-phosphate dehydrogenase (GAPDH), using the ABI 7900HT FAST Real-Time PCR System (Applied Biosystems, USA). GAPDH was selected for normalization. The primer sequences were shown in Supplementary Table 3.
Statistical analysis
The Wilcox test was proceeded to identify genes’ expression levels of BRCA tissues against that of normal tissues. DEGs in the prognostic risk signature were screened via Cox regression analyses. The log-rank test was performed to correlate OS related Kaplan–Meier survival curve. Time-dependent ROC curve was analysed using the “timeROC” package. Step-comparison for internal-group and external-group was conducted using Mann–Whitney–Wilcoxon test and Kruskal-Wallis test, respectively. The c-index was applied to represent the prognostic nomogram for 5, and 10 years. Statistical significance was considered when two-sided p-values were smaller than 0.05, using R software (version 3.6.2).
Author Contributions
J Li, G Huang, Z Zhao and M Li designed the study. J Li, G Huang and C Ren collected and analyzed the databases. J Li, N Wang, and S Sui organized and analyzed the statistics. J Li and G Huang wrote and revised the manuscript. All authors read and approved the manuscript and the submitted version.
Conflicts of Interest
The authors declare no conflicts of interest related to this study.
Funding
This study was funded by the Natural Science Foundation of Liaoning Province NO.2019-MS-098 to G. Huang, and National Nature Science Foundation of China (NSFC) No.81872156 to M. Li.
Editorial Note
This corresponding author has a verified history of publications using a personal email address for correspondence
References
- 1. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin. 2021; 71:209–49. https://doi.org/10.3322/caac.21660 [PubMed]
- 2. Park YH, Lee SJ, Cho EY, Choi Y, Lee JE, Nam SJ, Yang JH, Shin JH, Ko EY, Han BK, Ahn JS, Im YH. Clinical relevance of TNM staging system according to breast cancer subtypes. Ann Oncol. 2011; 22:1554–60. https://doi.org/10.1093/annonc/mdq617 [PubMed]
- 3. Liu Y, Wu L, Ao H, Zhao M, Leng X, Liu M, Ma J, Zhu J. Prognostic implications of autophagy-associated gene signatures in non-small cell lung cancer. Aging (Albany NY). 2019; 11:11440–62. https://doi.org/10.18632/aging.102544 [PubMed]
- 4. Wang S, Zhang Q, Yu C, Cao Y, Zuo Y, Yang L. Immune cell infiltration-based signature for prognosis and immunogenomic analysis in breast cancer. Brief Bioinform. 2021; 22:2020–31. https://doi.org/10.1093/bib/bbaa026 [PubMed]
- 5. Zhao Y, Pu C, Liu Z. Exploration the Significance of a Novel Immune-Related Gene Signature in Prognosis and Immune Microenvironment of Breast Cancer. Front Oncol. 2020; 10:1211. https://doi.org/10.3389/fonc.2020.01211 [PubMed]
- 6. Lin QG, Liu W, Mo YZ, Han J, Guo ZX, Zheng W, Wang JW, Zou XB, Li AH, Han F. Development of prognostic index based on autophagy-related genes analysis in breast cancer. Aging (Albany NY). 2020; 12:1366–76. https://doi.org/10.18632/aging.102687 [PubMed]
- 7. Lahouel K, Younes L, Danilova L, Giardiello FM, Hruban RH, Groopman J, Kinzler KW, Vogelstein B, Geman D, Tomasetti C. Revisiting the tumorigenesis timeline with a data-driven generative model. Proc Natl Acad Sci U S A. 2020; 117:857–64. https://doi.org/10.1073/pnas.1914589117 [PubMed]
- 8. Cook DJ, Kallus J, Jörnsten R, Nielsen J. Molecular natural history of breast cancer: Leveraging transcriptomics to predict breast cancer progression and aggressiveness. Cancer Med. 2020; 9:3551–62. https://doi.org/10.1002/cam4.2996 [PubMed]
- 9. Huynh MM, Pambid MR, Jayanthan A, Dorr A, Los G, Dunn SE. The dawn of targeted therapies for triple negative breast cancer (TNBC): a snapshot of investigational drugs in phase I and II trials. Expert Opin Investig Drugs. 2020; 29:1199–208. https://doi.org/10.1080/13543784.2020.1818067 [PubMed]
- 10. Robert M, Frenel JS, Bourbouloux E, Rigaud DB, Patsouris A, Augereau P, Gourmelon C, Campone M. An Update on the Clinical Use of CDK4/6 Inhibitors in Breast Cancer. Drugs. 2018; 78:1353–62. https://doi.org/10.1007/s40265-018-0972-9 [PubMed]
- 11. Dai X, Mei Y, Chen X, Cai D. ANLN and KDR Are Jointly Prognostic of Breast Cancer Survival and Can Be Modulated for Triple Negative Breast Cancer Control. Front Genet. 2019; 10:790. https://doi.org/10.3389/fgene.2019.00790 [PubMed]
- 12. Magnusson K, Gremel G, Rydén L, Pontén V, Uhlén M, Dimberg A, Jirström K, Pontén F. ANLN is a prognostic biomarker independent of Ki-67 and essential for cell cycle progression in primary breast cancer. BMC Cancer. 2016; 16:904. https://doi.org/10.1186/s12885-016-2923-8 [PubMed]
- 13. Lee C, Fernandez KJ, Alexandrou S, Sergio CM, Deng N, Rogers S, Burgess A, Caldon CE. Cyclin E2 Promotes Whole Genome Doubling in Breast Cancer. Cancers (Basel). 2020; 12:2268. https://doi.org/10.3390/cancers12082268 [PubMed]
- 14. Li TF, Zeng HJ, Shan Z, Ye RY, Cheang TY, Zhang YJ, Lu SH, Zhang Q, Shao N, Lin Y. Overexpression of kinesin superfamily members as prognostic biomarkers of breast cancer. Cancer Cell Int. 2020; 20:123. https://doi.org/10.1186/s12935-020-01191-1 [PubMed]
- 15. Zhu C, Jiang W. Cell cycle-dependent translocation of PRC1 on the spindle by Kif4 is essential for midzone formation and cytokinesis. Proc Natl Acad Sci U S A. 2005; 102:343–48. https://doi.org/10.1073/pnas.0408438102 [PubMed]
- 16. Gong X, Chen Z, Han Q, Chen C, Jing L, Liu Y, Zhao L, Yao X, Sun X. Sanguinarine triggers intrinsic apoptosis to suppress colorectal cancer growth through disassociation between STRAP and MELK. BMC Cancer. 2018; 18:578. https://doi.org/10.1186/s12885-018-4463-x [PubMed]
- 17. Ding X, Duan H, Luo H. Identification of Core Gene Expression Signature and Key Pathways in Colorectal Cancer. Front Genet. 2020; 11:45. https://doi.org/10.3389/fgene.2020.00045 [PubMed]
- 18. Bièche I, Vacher S, Lallemand F, Tozlu-Kara S, Bennani H, Beuzelin M, Driouch K, Rouleau E, Lerebours F, Ripoche H, Cizeron-Clairac G, Spyratos F, Lidereau R. Expression analysis of mitotic spindle checkpoint genes in breast carcinoma: role of NDC80/HEC1 in early breast tumorigenicity, and a two-gene signature for aneuploidy. Mol Cancer. 2011; 10:23. https://doi.org/10.1186/1476-4598-10-23 [PubMed]
- 19. Deb B, Sengupta P, Sambath J, Kumar P. Bioinformatics Analysis of Global Proteomic and Phosphoproteomic Data Sets Revealed Activation of NEK2 and AURKA in Cancers. Biomolecules. 2020; 10:237. https://doi.org/10.3390/biom10020237 [PubMed]
- 20. Rellos P, Ivins FJ, Baxter JE, Pike A, Nott TJ, Parkinson DM, Das S, Howell S, Fedorov O, Shen QY, Fry AM, Knapp S, Smerdon SJ. Structure and regulation of the human Nek2 centrosomal kinase. J Biol Chem. 2007; 282:6833–42. https://doi.org/10.1074/jbc.M609721200 [PubMed]
- 21. Fasching PA, Weihbrecht S, Haeberle L, Gasparyan A, Villalobos IE, Ma Y, Ekici AB, Wachter DL, Hartmann A, Beckmann MW, Slamon DJ, Press MF. HER2 and TOP2A amplification in a hospital-based cohort of breast cancer patients: associations with patient and tumor characteristics. Breast Cancer Res Treat. 2014; 145:193–203. https://doi.org/10.1007/s10549-014-2922-x [PubMed]
- 22. King JL, Zhang B, Li Y, Li KP, Ni JJ, Saavedra HI, Dong JT. TTK promotes mesenchymal signaling via multiple mechanisms in triple negative breast cancer. Oncogenesis. 2018; 7:69. https://doi.org/10.1038/s41389-018-0077-z [PubMed]
- 23. Wang Y, Leng H, Chen H, Wang L, Jiang N, Huo X, Yu B. Knockdown of UBE2T Inhibits Osteosarcoma Cell Proliferation, Migration, and Invasion by Suppressing the PI3K/Akt Signaling Pathway. Oncol Res. 2016; 24:361–69. https://doi.org/10.3727/096504016X14685034103310 [PubMed]
- 24. Kashem MA, Li H, Toledo NP, Omange RW, Liang B, Liu LR, Li L, Yang X, Yuan XY, Kindrachuk J, Plummer FA, Luo M. Toll-like Interleukin 1 Receptor Regulator Is an Important Modulator of Inflammation Responsive Genes. Front Immunol. 2019; 10:272. https://doi.org/10.3389/fimmu.2019.00272 [PubMed]
- 25. Zhu Y, Wang T, Wu J, Huang O, Zhu L, He J, Li Y, Chen W, Chen X, Shen K. Associations Between Circulating Insulin-Like Growth Factor 1 and Mortality in Women With Invasive Breast Cancer. Front Oncol. 2020; 10:1384. https://doi.org/10.3389/fonc.2020.01384 [PubMed]
- 26. Christopoulos PF, Msaouel P, Koutsilieris M. The role of the insulin-like growth factor-1 system in breast cancer. Mol Cancer. 2015; 14:43. https://doi.org/10.1186/s12943-015-0291-7 [PubMed]
- 27. Brandsma CA, Guryev V, Timens W, Ciconelle A, Postma DS, Bischoff R, Johansson M, Ovchinnikova ES, Malm J, Marko-Varga G, Fehniger TE, van den Berge M, Horvatovich P. Integrated proteogenomic approach identifying a protein signature of COPD and a new splice variant of SORBS1. Thorax. 2020; 75:180–83. https://doi.org/10.1136/thoraxjnl-2019-213200 [PubMed]
- 28. Lin WH, Chiu KC, Chang HM, Lee KC, Tai TY, Chuang LM. Molecular scanning of the human sorbin and SH3-domain-containing-1 (SORBS1) gene: positive association of the T228A polymorphism with obesity and type 2 diabetes. Hum Mol Genet. 2001; 10:1753–60. https://doi.org/10.1093/hmg/10.17.1753 [PubMed]
- 29. Zhang B, Chen MY, Shen YJ, Zhuo XB, Gao P, Zhou FS, Liang B, Zu J, Zhang Q, Suleman S, Xu YH, Xu MG, Xu JK, et al. A Large-Scale, Exome-Wide Association Study of Han Chinese Women Identifies Three Novel Loci Predisposing to Breast Cancer. Cancer Res. 2018; 78:3087–97. https://doi.org/10.1158/0008-5472.CAN-17-1721 [PubMed]
- 30. Bownes RJ, Turnbull AK, Martinez-Perez C, Cameron DA, Sims AH, Oikonomidou O. On-treatment biomarkers can improve prediction of response to neoadjuvant chemotherapy in breast cancer. Breast Cancer Res. 2019; 21:73. https://doi.org/10.1186/s13058-019-1159-3 [PubMed]
- 31. Goldman MJ, Craft B, Hastie M, Repečka K, McDade F, Kamath A, Banerjee A, Luo Y, Rogers D, Brooks AN, Zhu J, Haussler D. Visualizing and interpreting cancer genomics data via the Xena platform. Nat Biotechnol. 2020; 38:675–78. https://doi.org/10.1038/s41587-020-0546-8 [PubMed]
- 32. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015; 43:e47. https://doi.org/10.1093/nar/gkv007 [PubMed]
- 33. Chen C, Chen H, Zhang Y, Thomas HR, Frank MH, He Y, Xia R. TBtools: An Integrative Toolkit Developed for Interactive Analyses of Big Biological Data. Mol Plant. 2020; 13:1194–202. https://doi.org/10.1016/j.molp.2020.06.009 [PubMed]
- 34. Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012; 16:284–87. https://doi.org/10.1089/omi.2011.0118 [PubMed]
- 35. Lu Y, Rosenfeld R, Simon I, Nau GJ, Bar-Joseph Z. A probabilistic generative model for GO enrichment analysis. Nucleic Acids Res. 2008; 36:e109. https://doi.org/10.1093/nar/gkn434 [PubMed]
- 36. Kanehisa M, Furumichi M, Tanabe M, Sato Y, Morishima K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 2017; 45:D353–61. https://doi.org/10.1093/nar/gkw1092 [PubMed]
- 37. Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, Simonovic M, Doncheva NT, Morris JH, Bork P, Jensen LJ, Mering CV. STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019; 47:D607–13. https://doi.org/10.1093/nar/gky1131 [PubMed]
- 38. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003; 13:2498–504. https://doi.org/10.1101/gr.1239303 [PubMed]
- 39. Bandettini WP, Kellman P, Mancini C, Booker OJ, Vasu S, Leung SW, Wilson JR, Shanbhag SM, Chen MY, Arai AE. MultiContrast Delayed Enhancement (MCODE) improves detection of subendocardial myocardial infarction by late gadolinium enhancement cardiovascular magnetic resonance: a clinical validation study. J Cardiovasc Magn Reson. 2012; 14:83. https://doi.org/10.1186/1532-429X-14-83 [PubMed]
- 40. Heagerty PJ, Lumley T, Pepe MS. Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics. 2000; 56:337–44. https://doi.org/10.1111/j.0006-341x.2000.00337.x [PubMed]
- 41. Jézéquel P, Frénel JS, Campion L, Guérin-Charbonnel C, Gouraud W, Ricolleau G, Campone M. bc-GenExMiner 3.0: new mining module computes breast cancer gene expression correlation analyses. Database (Oxford). 2013; 2013:bas060. https://doi.org/10.1093/database/bas060 [PubMed]
- 42. Balachandran VP, Gonen M, Smith JJ, DeMatteo RP. Nomograms in oncology: more than meets the eye. Lancet Oncol. 2015; 16:e173–80. https://doi.org/10.1016/S1470-2045(14)71116-7 [PubMed]
- 43. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005; 102:15545–50. https://doi.org/10.1073/pnas.0506580102 [PubMed]








