Research Paper, Volume 8, Issue 5, pp. 1021–1030

Deep biomarkers of human aging: Application of deep neural networks to biomarker development

Evgeny Putin1,2, Polina Mamoshina1,3, Alexander Aliper1, Mikhail Korzinkin1, Alexey Moskalev1,4, Alexey Kolosov5, Alexander Ostrovskiy5, Charles Cantor6, Jan Vijg7, Alex Zhavoronkov1,3

  • 1 Pharma.AI Department, Insilico Medicine, Inc, Baltimore, MD 21218, USA
  • 2 Computer Technologies Lab, ITMO University, St. Petersburg 197101, Russia
  • 3 The Biogerontology Research Foundation, Oxford, UK
  • 4 School of Systems Biology, George Mason University (GMU), Fairfax, VA 22030, USA
  • 5 Invitro Laboratory, Ltd, Moscow 125047, Russia
  • 6 Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
  • 7 Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA

Received: September 26, 2015; Accepted: May 9, 2016; Published: May 18, 2016

https://doi.org/10.18632/aging.100968

Copyright: © 2016 Putin et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

One of the major impediments in human aging research is the absence of a comprehensive and actionable set of biomarkers that can be targeted and measured to track the effectiveness of therapeutic interventions. In this study, we designed a modular ensemble of 21 deep neural networks (DNNs) of varying depth, structure and optimization to predict human chronological age from a basic blood test. To train the DNNs, we used over 60,000 samples from common blood biochemistry and cell count tests, taken from routine health exams performed by a single laboratory and linked to chronological age and sex. The best-performing DNN in the ensemble achieved 81.5% epsilon-accuracy (r = 0.90, R² = 0.80, MAE = 6.07 years) in predicting chronological age within a 10-year frame, while the entire ensemble achieved 83.5% epsilon-accuracy (r = 0.91, R² = 0.82, MAE = 5.55 years). The ensemble also identified the 5 most important markers for predicting human chronological age: albumin, glucose, alkaline phosphatase, urea and erythrocytes. To allow for public testing and to evaluate the real-life performance of the predictor, we developed an online system available at http://www.aging.ai. The ensemble approach may facilitate integration of multi-modal data linked to chronological age and sex, which may lead to simple, minimally invasive, and affordable methods of tracking integrated biomarkers of aging in humans and of performing cross-species feature importance analysis.
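The four metrics quoted above can be made concrete with a short sketch. It assumes that epsilon-accuracy is the fraction of samples whose predicted age falls within ±epsilon years of the true age, with epsilon = 10 matching the "10 year frame", and that r is the Pearson correlation and R² the coefficient of determination. The function names, the toy ages, and the simple averaging used to combine two models are all illustrative assumptions; this section does not state how the 21 networks are combined, and none of the numbers below come from the study.

```python
import math


def epsilon_accuracy(y_true, y_pred, eps=10.0):
    """Fraction of predictions within +/- eps years of the true age (assumed definition)."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= eps)
    return hits / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute difference between true and predicted age, in years."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between true and predicted ages."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    sd_t = math.sqrt(sum((t - mean_t) ** 2 for t in y_true))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in y_pred))
    return cov / (sd_t * sd_p)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def ensemble_mean(per_model_predictions):
    """Combine models by plain averaging (a stand-in, not the paper's stated method)."""
    return [sum(column) / len(column) for column in zip(*per_model_predictions)]


# Made-up toy values for illustration only; not data from the study.
ages = [30, 45, 60, 75]
model_a = [28, 50, 48, 80]
model_b = [34, 44, 66, 70]
combined = ensemble_mean([model_a, model_b])

for name, pred in [("model_a", model_a), ("model_b", model_b), ("ensemble", combined)]:
    print(
        name,
        f"eps-acc={epsilon_accuracy(ages, pred):.2f}",
        f"r={pearson_r(ages, pred):.2f}",
        f"R2={r_squared(ages, pred):.2f}",
        f"MAE={mean_absolute_error(ages, pred):.2f}",
    )
```

Reporting epsilon-accuracy next to MAE is useful because the two answer different questions: MAE summarizes the typical size of the error, while epsilon-accuracy counts how often the error stays inside a fixed tolerance.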

Abbreviations

ML: Machine Learning; SVM: Support Vector Machine; DNN: Deep Neural Network; PFI: Permutation Feature Importance; RF: Random Forests; GBM: Gradient Boosting Machine; kNN: k-Nearest Neighbors; DT: Decision Trees; LR: Linear Regression.