Aging
Research Paper | Volume 15, Issue 10 | pp 4465-4480

Identification and validation of diagnostic signature genes in non-obstructive azoospermia by machine learning

Lingxiang Ran1, Zhixiang Gao1, Qiu Chen2, Fengmei Cui2, Xiaolong Liu1, Boxin Xue1
  • 1Department of Urology, The Second Affiliated Hospital of Soochow University, Suzhou, Jiangsu 215004, China
  • 2School of Radiation Medicine and Protection, Soochow University, Suzhou, Jiangsu 215123, China
Received: February 8, 2023 | Accepted: May 16, 2023 | Published: May 24, 2023

Copyright: © 2023 Ran et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Non-obstructive azoospermia (NOA) is a common cause of male infertility, yet no specific diagnostic indicators exist. In this study, we used the human testis datasets GSE45885, GSE45887, and GSE108886 from the GEO database as training datasets and screened six signature genes, all lowly expressed in the NOA group, using the Boruta algorithm and Lasso regression: C12orf54, TSSK6, OR2H1, FER1L5, C9orf153, and XKR3. The diagnostic efficacy of these genes was examined by constructing models with the LightGBM algorithm: in internal validation, the AUC (area under the curve) of both the ROC and Precision-Recall curves was 1.0 (p < 0.05). For the external validation dataset GSE145467 (human testis), the AUC of the ROC curve was 0.9 and that of the Precision-Recall curve was 0.833 (p < 0.05). Next, we examined the cellular localization of these genes using the human testis single-cell RNA sequencing dataset GSE149512 and found that all of them were localized to spermatids. In addition, the downstream regulatory mechanisms of these genes in spermatids were inferred with the GSEA algorithm: C12orf54 may be involved in the repression of E2F-related and MYC-related pathways, TSSK6 and C9orf153 may be involved in the repression of MYC-related pathways, and FER1L5 may be involved in the repression of the spermatogenesis pathway. Finally, we constructed a mouse model of NOA using X-ray irradiation, and quantitative real-time PCR showed that C12orf54, TSSK6, OR2H1, FER1L5, and C9orf153 were all lowly expressed in the NOA group. In summary, we identified novel signature genes of NOA using machine learning methods and validated them experimentally, which will be helpful for its early diagnosis.
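
For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal Python sketch of how an ROC AUC and a Precision-Recall AUC can be computed from classifier scores. It is not the authors' code. The function names, the toy labels, and the toy scores are hypothetical and are not data from the study. The Precision-Recall area is approximated here by average precision, which is one common convention and may differ from the one used in the paper. The sketch also assumes no tied scores.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs in which
    the positive sample receives the higher score (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the Precision-Recall curve: the mean of the
    precision values measured at the rank of each positive sample."""
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive sample")
    # Rank samples from highest to lowest score (assumes no tied scores).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy example: 1 = NOA, 0 = control; scores are made up.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

print(round(roc_auc(toy_labels, toy_scores), 3))            # 0.889
print(round(average_precision(toy_labels, toy_scores), 3))  # 0.917
```

Under these definitions, both values equal 1.0 only when every NOA sample scores higher than every control sample, which is what an internal-validation AUC of 1.0 corresponds to.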