Priority Research Paper Volume 15, Issue 17 pp 8552—8575

Fail-tests of DNA methylation clocks, and development of a noise barometer for measuring epigenetic pressure of aging and disease

class="figure-viewer-img"

Figure 2. Fail-tests of EN models that are trained on biological v. non-biological parameters. (A) Standard first generation EN model was trained on 450K DNAme arrays dataset; tests with the original and new datasets, as indicated, are shown (blue – healthy, red – disease). BRCA-1 studies were done on 27K dataset, thus, after training on 27K dataset. EN was tested on BRCA-1 data. EN models were testable with the original and new datasets, except for the BRCA-1, where only the subjects with BRCA1 mutation and cancer but not healthy controls received fairly accurate age predictions. All EN models overlapped predictions for healthy subjects and patients (see Supplementary Figure 2 for the tests of EN models on additional disease v. healthy datasets). (B) PhenoAge scatter plots overlay for the patients with the studied datasets (red dots-Combined Disease, blue dots-Combined Healthy). (C) PhenoAge tests on the six 450K datasets are represented as the bar graphs on Mean and SDs of predicted age minus actual age (left) and as the comparison in absolute residuals (right); patients – red bars n=806, healthy controls – blue bars; n=754, p-values are shown, NS- non-significant. (D) PhenoAge predictions of DNAme age for the indicated age-intervals are shown as bar graphs of Means with SDs for the patients with Down Syndrome and Parkinson’s Disease (red bars), and their healthy controls (blue bars). DS, n=26, Control, n=58; PD, n=289, Control, n=219; p-values are shown. (E) EN model was trained on the US population numbers, at subjects’ birth years, using the same 450K DNAme array, as in A. The test shows excellent linearity and low MAE in predicting the numbers of people living in US from DNAme array, through 151 clock cytosines. (F) The US population EN model was then trained to predict persons’ age with successful tests (near perfect linearity and MAE of 2.6 years) by 149 clock cytosines. (G) This US population clock was also testable as an age predictor on the out-of-sample 450K datasets of indicated diseases (red dots) and their healthy controls (blue dots); health and disease predictions overlapped. (H) The age predictions made by the US population clock are shown as scatter plots overlay for Combined Disease (arthritis, Down Syndrome, HIV and Chron’s – red dots) and their healthy controls (Combined Healthy, blue dots). All GSEs are in Methods.