Research Paper | Volume 14, Issue 2 | pp 845–868

Tumor microenvironment-related multigene prognostic prediction model for breast cancer

Kai Hong¹, Yingjue Zhang², Lingli Yao¹, Jiabo Zhang³, Xianneng Sheng³, Yu Guo³

¹Medicine School, Ningbo University, Jiangbei, Ningbo 315211, Zhejiang, China
²Department of Molecular Pathology, Division of Health Sciences, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan
³Department of Thyroid and Breast Surgery, Ningbo City First Hospital, Haishu, Ningbo 315010, Zhejiang, China

Received: October 28, 2021 | Accepted: January 14, 2022 | Published: January 20, 2022

https://doi.org/10.18632/aging.203845

Copyright: © 2022 Hong et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Background: Breast cancer is an invasive disease with complex molecular mechanisms. Prognosis-related biomarkers are still urgently needed to predict outcomes of breast cancer patients.

Methods: Original data were downloaded from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO). The analyses were performed using Perl 5.32 and R 4.1.1 (x64).

Results: In this study, 1086 differentially expressed genes (DEGs) were identified in the TCGA cohort; 523 shared DEGs were identified in the TCGA and GSE10886 cohorts. Eight subtypes were identified using non-negative matrix factorization (NMF) clustering, with significant differences in overall survival (OS) and progression-free survival (PFS) (P < 0.01). Univariate Cox analysis and least absolute shrinkage and selection operator (LASSO) regression analysis were performed to develop a risk score based on 17 DEGs; this score separated breast cancer patients into low- and high-risk groups with significant differences in survival (P < 0.01) and showed good discriminative performance (TCGA all group: 1-year area under the curve [AUC] = 0.729, 3-year AUC = 0.778, 5-year AUC = 0.781). A nomogram prediction model was constructed using the NMF clustering, the risk score, and clinical characteristics. Our model was confirmed to be associated with the tumor microenvironment. Furthermore, DEGs in high-risk breast cancer were enriched in the histidine metabolism (normalized enrichment score [NES] = 1.49, P < 0.05), protein export (NES = 1.58, P < 0.05), and steroid hormone biosynthesis (NES = 1.56, P < 0.05) signaling pathways.
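
For readers unfamiliar with gene-signature risk scores, the following is a minimal sketch of how such a score is commonly computed and used to stratify patients. It assumes the usual linear form (the sum of each gene's expression weighted by its regression coefficient) and a median cutoff; neither the formula nor the cutoff is stated in this abstract. The gene names, coefficients, and expression values are hypothetical placeholders, not the 17 DEGs or fitted values from the study.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Return a linear risk score per patient.

    Assumed form: score = sum over genes of (coefficient * expression).
    """
    return {
        patient: sum(coefficients[gene] * values[gene] for gene in coefficients)
        for patient, values in expression.items()
    }


def split_by_median(scores):
    """Label each patient "high" or "low" risk relative to the median score.

    The median cutoff is an assumption made for this illustration.
    """
    cutoff = median(scores.values())
    groups = {
        patient: ("high" if score > cutoff else "low")
        for patient, score in scores.items()
    }
    return cutoff, groups


# Hypothetical coefficients for two placeholder genes.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}

# Hypothetical expression values for four placeholder patients.
expression = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 0.0},
    "P3": {"GENE_A": 0.0, "GENE_B": 4.0},
    "P4": {"GENE_A": 8.0, "GENE_B": 4.0},
}

scores = risk_scores(expression, coefficients)
cutoff, groups = split_by_median(scores)
```

In practice the coefficients would come from the fitted LASSO-penalized Cox model, and the resulting groups would then be compared with survival analysis.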

Conclusions: We established a comprehensive model that can predict prognosis and guide treatment.