Research Paper, Volume 16, Issue 17, pp 12191–12208

Aging-related biomarkers for the diagnosis of Parkinson’s disease based on bioinformatics analysis and machine learning

Weiwei Yang1, Shengli Xu1, Ming Zhou1, Piu Chan1,2,3

  • 1 Department of Neurobiology, Neurology and Geriatrics, Xuanwu Hospital of Capital Medical University, National Clinical Research Center for Geriatric Disorders, Beijing, China
  • 2 Clinical Center for Parkinson's Disease, Capital Medical University, Beijing, China
  • 3 Key Laboratory for Neurodegenerative Disease of the Ministry of Education, Beijing Key Laboratory for Parkinson's Disease, Parkinson Disease Center of Beijing Institute for Brain Disorders, Beijing, China

Received: October 23, 2023       Accepted: April 22, 2024       Published: September 10, 2024      

https://doi.org/10.18632/aging.205954

Copyright: © 2024 Yang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Parkinson’s disease (PD) is a multifactorial disease that lacks reliable diagnostic biomarkers. Aging is the greatest risk factor for developing PD, so identifying novel aging-associated biomarkers for PD is warranted. In this study, we downloaded aging-related genes from the Human Ageing Gene Database. To screen and verify biomarkers for PD, we used whole-blood RNA-Seq data from 11 PD patients and 13 healthy control (HC) subjects as a training dataset and three datasets retrieved from the Gene Expression Omnibus (GEO) database as validation datasets. Using the limma package in R, we identified 1435 differentially expressed genes (DEGs) in the training dataset, 29 of which were also among the 307 aging-related genes. Using machine learning algorithms (LASSO, RF, SVM, and RR), Venn diagrams, and LASSO regression, we identified four of these genes as potential PD biomarkers; these were further validated in the external validation datasets and by qRT-PCR in peripheral blood mononuclear cells (PBMCs) from 10 PD patients and 10 HC subjects. A diagnostic model based on these biomarkers showed reliable predictive ability for PD. Two of the identified biomarkers correlated meaningfully with immune cell infiltration status in the PD patients and HC subjects. In conclusion, four aging-related genes were identified as robust diagnostic biomarkers and may serve as potential targets for PD therapeutics.
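The two set operations named in the abstract (overlapping the DEGs with the aging-related gene list, then keeping only genes chosen by every machine learning algorithm, as in the centre of a Venn diagram) can be illustrated with a minimal Python sketch. This is not the authors' code: the study reports using R, and the function names `overlap` and `consensus` and the gene identifiers `G1`, `G2`, and so on are hypothetical placeholders, not genes or results from the paper.

```python
from functools import reduce


def overlap(deg_genes, aging_genes):
    """Return the genes present in both lists, sorted for stable output."""
    return sorted(set(deg_genes) & set(aging_genes))


def consensus(selections):
    """Return the genes selected by every method (the Venn diagram centre)."""
    sets = [set(genes) for genes in selections.values()]
    return sorted(reduce(set.intersection, sets)) if sets else []


# Toy placeholder identifiers (assumption): not genes from the study.
degs = ["G1", "G2", "G3", "G4", "G5"]
aging = ["G2", "G3", "G4", "G9"]
candidates = overlap(degs, aging)

# Hypothetical per-method feature selections over the candidate genes.
selected = {
    "LASSO": ["G2", "G3"],
    "RF": ["G2", "G3", "G4"],
    "SVM": ["G3", "G2"],
}
shared = consensus(selected)

print("DEG/aging overlap:", candidates)
print("Selected by all methods:", shared)
```

In practice the per-method selections would come from fitted models rather than hard-coded lists; the sketch only shows how the intersections narrow the candidate set.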

Abbreviations

PD: Parkinson’s disease; HC: Healthy Control; GEO: Gene Expression Omnibus; PBMC: Peripheral blood mononuclear cells; CSF: cerebrospinal fluid; SN: substantia nigra; DEGs: differentially expressed genes; lncRNA: long noncoding RNAs; AI: artificial intelligence; SVM: Support Vector Machine; LASSO: Least absolute shrinkage and selection operator; RF: Random Forest; HAGR: Human Ageing Gene Database; PPI: protein-protein interactions; TLR4: toll-like receptor 4; BBB: blood-brain barrier; ROS: Reactive oxygen species; RNA Pol I: RNA polymerase I; NOX: NADPH oxidase; H2O2: hydrogen peroxide; ECM: Extracellular matrix; CAM: cell adhesion molecules; AD: Alzheimer’s disease; MPTP: 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine.