Aging
Research Paper | Volume 15, Issue 8 | pp 3120–3140

An artificial neural network model to diagnose non-obstructive azoospermia based on RNA-binding protein-related genes

Fan Peng1, Bahaerguli Muhuitijiang2,3, Jiawei Zhou2,3, Haoyu Liang4, Yu Zhang1, Ranran Zhou1,3
  • 1Department of Urology, Baoan Central Hospital of Shen Zhen, Shenzhen 518102, China
  • 2Department of Urology, Nanfang Hospital, Southern Medical University, Guangzhou 510000, China
  • 3The First School of Clinical Medicine, Southern Medical University, Guangzhou 510000, China
  • 4Department of Urology, The Third Affiliated Hospital, Southern Medical University, Guangzhou 510000, China
* Equal contribution
Received: November 16, 2022 | Accepted: April 15, 2023 | Published: April 24, 2023

Copyright: © 2023 Peng et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Non-obstructive azoospermia (NOA) is a severe form of male infertility, but its pathological mechanisms and diagnostic biomarkers remain obscure. Because dysregulation of RNA-binding proteins (RBPs) has non-negligible effects on spermatogenesis, we aimed to investigate the functions and diagnostic value of RBPs in NOA. Fifty-eight testicular samples (control = 11, NOA = 47) from the Gene Expression Omnibus (GEO) served as the training cohort. Three public datasets, GSE45885 (control = 4, NOA = 27), GSE45887 (control = 4, NOA = 16), and GSE145467 (control = 10, NOA = 10), together with 44 clinical samples from the local hospital (control = 27, NOA = 17), were used for validation. Through a series of bioinformatic analyses and machine learning algorithms, including genomic difference detection, protein-protein interaction network analysis, LASSO, SVM-RFE, and Boruta, DDX20 and NCBP2 were identified as significant predictors of NOA. Single-cell RNA sequencing of 432 testicular cell samples from NOA patients indicated that DDX20 and NCBP2 were associated with spermatogenesis (false discovery rate < 0.05). Based on the transcriptomic expression of DDX20 and NCBP2, we constructed multiple diagnostic models using logistic regression, random forest, and an artificial neural network (ANN). The ANN model exhibited the most reliable predictive performance in the training cohort (AUC = 0.840), GSE45885 (AUC = 0.731), GSE45887 (AUC = 0.781), GSE145467 (AUC = 0.850), and the local cohort (AUC = 0.623). In summary, an ANN diagnostic model based on the RBPs DDX20 and NCBP2 was developed and externally validated in NOA, and it may serve as a promising tool in clinical practice.
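For readers less familiar with the AUC figures quoted in the abstract, the following is a minimal sketch of how an area under the ROC curve can be computed from model scores. It is not the authors' code. The function name `roc_auc` and the toy labels and scores are hypothetical and invented purely for illustration; they are not data from the study. The sketch uses the rank-based (Mann-Whitney) interpretation: the AUC is the fraction of (NOA, control) pairs in which the NOA sample receives the higher score, with ties counted as half.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC for binary labels (1 = NOA, 0 = control).

    Computed as the fraction of positive/negative pairs in which the
    positive sample has the higher score; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up numbers, NOT from the paper):
# three NOA samples and two controls with hypothetical model scores.
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.5, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

One consequence of this pairwise view is that the AUC of a small validation set rests on few comparisons. For example, GSE45885 (control = 4, NOA = 27) yields only 4 × 27 = 108 pairs, so each validation AUC should be read with its cohort size in mind.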