Abstract

Aging has pronounced effects on blood laboratory biomarkers used in the clinic. Prior studies have largely investigated one biomarker or population at a time, limiting a comprehensive view of biomarker variation and aging across different populations. Here we develop a supervised machine learning approach to study aging using 356 blood biomarkers measured in 67,563 individuals across diverse populations. Our model predicts age with a mean absolute error (MAE), or average magnitude of prediction errors, in held-out data of 4.76 years and an R2 value of 0.92. Age prediction was highly accurate for the pediatric cohort (MAE = 0.87, R2 = 0.94) but inaccurate for ages 65+ (MAE = 4.30, R2 = 0.25). Variability was observed in which biomarkers carry predictive power across age groups, genders, and race/ethnicity groups, and novel candidate biomarkers of aging were identified for specific age ranges (e.g. Vitamin E, ages 18-44). We show that predictors for one age group may fail to generalize to other groups and investigate non-linearity in biomarkers near adulthood. As populations worldwide undergo major demographic changes, it is increasingly important to catalogue biomarker variation across age groups and discover new biomarkers to distinguish chronological and biological aging.