Abstract

Background: Sarcoma is a rare malignant tumor originating from the interstitial or connective tissue and has a poor prognosis. Next-generation sequencing technology offers new opportunities for the accurate diagnosis and treatment of sarcomas. There is an urgent need for new gene signatures to predict prognosis and evaluate treatment outcomes.

Methods: We used transcriptome data from The Cancer Genome Atlas (TCGA) database and single-sample gene set enrichment analysis (ssGSEA) to identify the cancer hallmarks most strongly associated with prognosis in sarcoma patients. Weighted gene co-expression network analysis, univariate Cox regression analysis, and a random forest algorithm were then used to construct a prognostic gene signature. Finally, the prognostic value of the gene signature was validated in the TCGA and Gene Expression Omnibus (GEO) (GSE17118) datasets.

Results: MYC targets V1 and V2 were the main cancer hallmarks affecting the overall survival (OS) of sarcoma patients. A six-gene signature comprising VEGFA, HMGB3, FASN, RCC1, NETO2, and BIRC5 was constructed. Kaplan-Meier analysis suggested that higher risk scores based on the six-gene signature were associated with poorer OS (P < 0.001). The receiver operating characteristic curve showed that the risk score based on the six-gene signature was a good predictor of OS in sarcoma, with an area under the curve (AUC) greater than 0.73. In addition, the prognostic value of the six-gene signature was validated in GSE17118, with an AUC greater than 0.72.

Conclusion: This six-gene signature is an independent prognostic factor in patients with sarcoma, and its component genes may serve as potential therapeutic targets for sarcoma.
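
As a reading aid for the AUC values reported in the Results, the following minimal sketch shows one common way to compute an AUC from per-patient risk scores. It assumes a binary outcome (event observed or not at a fixed time point); the abstract does not state whether a time-dependent ROC was used, so this is a simplification rather than the authors' procedure. The function name `auc_from_scores` and the toy values are hypothetical and are not taken from the study.

```python
def auc_from_scores(risk_scores, events):
    """Rank-based AUC for a binary outcome.

    Returns the fraction of (event, non-event) patient pairs in which the
    patient with the event has the higher risk score; ties count as 0.5.
    This equals the area under the ROC curve.
    """
    with_event = [s for s, e in zip(risk_scores, events) if e == 1]
    without_event = [s for s, e in zip(risk_scores, events) if e == 0]
    if not with_event or not without_event:
        raise ValueError("need at least one patient in each outcome group")

    concordant = 0.0
    for p in with_event:
        for n in without_event:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(with_event) * len(without_event))


# Toy, made-up values for illustration only (not data from the study).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_events = [1, 1, 1, 0, 0, 0]  # 1 = event observed, 0 = no event
toy_auc = auc_from_scores(toy_scores, toy_events)
print(f"AUC on toy data: {toy_auc:.3f}")
```

Under this reading, an AUC of 0.5 corresponds to a risk score that ranks patients no better than chance, and 1.0 to a score that ranks every patient with the event above every patient without it.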