Research Paper Volume 16, Issue 5 pp 4075–4094
Genome-wide transcriptome profiling and development of age prediction models in the human brain
- 1 Department of Health Policy and Management, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- 2 Department of Surgery, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- 3 Shriners Hospitals for Children-Boston, Boston, MA 02114, USA
Received: May 2, 2022 Accepted: March 28, 2023 Published: February 28, 2024
https://doi.org/10.18632/aging.205609
Copyright: © 2024 Zarrella and Tsurumi. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Abstract
Aging-related transcriptome changes in various regions of the healthy human brain have been explored in previous work; however, a study that develops prediction models for age based on the expression levels of specific panels of transcripts is lacking. Moreover, studies that have assessed sexually dimorphic gene activities in the aging brain have reported discrepant results, suggesting that additional studies would be advantageous. In a study that compared different regions of the human brain, the prefrontal cortex (PFC) was shown to have a particularly large number of significant transcriptome alterations during healthy aging. We harmonized neuropathologically normal PFC transcriptome datasets obtained from the Gene Expression Omnibus (GEO) repository, spanning ages 21 to 105 years. We found a large number of differentially regulated transcripts in the old and elderly samples compared to young samples overall, and we compared female-specific and male-specific expression alterations. We assessed the genes that were associated with age by employing ontology, pathway, and network analyses. Furthermore, we applied both established machine learning algorithms, namely least absolute shrinkage and selection operator (Lasso) and Elastic Net (EN), and more recent ones, namely eXtreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM), to develop and validate accurate prediction models for chronological age. Further studies to validate these models in other large populations, and molecular studies to elucidate the potential mechanisms by which the identified transcripts may be related to aging phenotypes, would be advantageous.
Abbreviations
PFC: Prefrontal cortex; GEO: Gene Expression Omnibus; Lasso: Least absolute shrinkage and selection operator; EN: Elastic Net; XGBoost: eXtreme Gradient Boosting; LightGBM: Light Gradient Boosting Machine; SHAP: SHapley Additive exPlanations; GO: Gene Ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes; CV: Cross-validation.