Research Paper Volume 8, Issue 5 pp 1021–1030
Deep biomarkers of human aging: Application of deep neural networks to biomarker development
- 1 Pharma.AI Department, Insilico Medicine, Inc, Baltimore, MD 21218, USA
- 2 Computer Technologies Lab, ITMO University, St. Petersburg 197101, Russia
- 3 The Biogerontology Research Foundation, Oxford, UK
- 4 School of Systems Biology, George Mason University (GMU), Fairfax, VA 22030, USA
- 5 Invitro Laboratory, Ltd, Moscow 125047, Russia
- 6 Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- 7 Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
Received: September 26, 2015 Accepted: May 9, 2016 Published: May 18, 2016
https://doi.org/10.18632/aging.100968
Abstract
One of the major impediments in human aging research is the absence of a comprehensive and actionable set of biomarkers that can be targeted and measured to track the effectiveness of therapeutic interventions. In this study, we designed a modular ensemble of 21 deep neural networks (DNNs) of varying depth, structure and optimization to predict human chronological age from a basic blood test. To train the DNNs, we used over 60,000 samples from common blood biochemistry and cell count tests, taken from routine health exams performed by a single laboratory and linked to chronological age and sex. The best-performing DNN in the ensemble demonstrated 81.5% epsilon-accuracy (r = 0.90, R² = 0.80, MAE = 6.07 years) in predicting chronological age within a 10-year frame, while the entire ensemble achieved 83.5% epsilon-accuracy (r = 0.91, R² = 0.82, MAE = 5.55 years). The ensemble also identified the 5 most important markers for predicting human chronological age: albumin, glucose, alkaline phosphatase, urea and erythrocytes. To allow public testing and to evaluate the real-life performance of the predictor, we developed an online system available at http://www.aging.ai. The ensemble approach may facilitate the integration of multi-modal data linked to chronological age and sex, which may lead to simple, minimally invasive, and affordable methods of tracking integrated biomarkers of aging in humans and of performing cross-species feature importance analysis.
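For readers who want to see how the reported evaluation metrics relate to one another, the following is a minimal sketch in Python, not the authors' code. It assumes that epsilon-accuracy means the fraction of samples whose predicted age falls within epsilon years of the true age (epsilon = 10 is our reading of the "10 year frame"). The function names, the toy ages, and the simple averaging used to combine models are all illustrative assumptions; the abstract does not specify how the 21 DNNs are combined.

```python
import math


def epsilon_accuracy(y_true, y_pred, epsilon=10.0):
    """Fraction of predictions within +/- epsilon years of the true age.

    Assumption: this is how we read "epsilon-accuracy within a 10 year frame".
    """
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(p - t) <= epsilon)
    return hits / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the ages (years)."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between true and predicted ages."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def ensemble_mean(predictions_per_model):
    """Combine models by averaging their predictions per sample.

    Assumption: plain averaging is a stand-in chosen for illustration only.
    """
    return [sum(col) / len(col) for col in zip(*predictions_per_model)]


# Hypothetical toy data: four true ages and two models' predictions.
y_true = [30, 45, 60, 75]
model_a = [34, 40, 75, 70]
model_b = [28, 50, 65, 62]
ensemble = ensemble_mean([model_a, model_b])

for name, pred in [("model_a", model_a), ("model_b", model_b), ("ensemble", ensemble)]:
    print(name, epsilon_accuracy(y_true, pred), mae(y_true, pred),
          round(pearson_r(y_true, pred), 3), round(r_squared(y_true, pred), 3))
```

On this toy data each single model places three of four samples within 10 years (epsilon-accuracy 0.75), while the averaged predictions place all four within 10 years with an MAE of 5.0 years. This only illustrates how averaging can cancel individual errors; it says nothing about the magnitudes reported in the study.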
Abbreviations
ML: Machine Learning; SVM: Support Vector Machine; DNN: Deep Neural Network; PFI: Permutation Feature Importance; RF: Random Forests; GBM: Gradient Boosting Machine; kNN: k-Nearest Neighbors; DT: Decision Trees; LR: Linear Regression.