Figure 3. Computational strategy for building cell-type specific epigenetic clocks. (A) Given a tissue type of interest with DNAm profiles measured over a relatively large number of samples, we first estimate the underlying cell-type proportions in each sample using an existing tissue-specific DNAm reference matrix. Density plots depict the distribution of cell-type fractions across all samples. Next, using a sufficiently large training set of samples encompassing a relatively wide age range, we apply the CellDMC algorithm to infer age-associated DNAm changes in each of the underlying cell-types (age-DMCTs). The barplot depicts the number of age-DMCTs in each cell-type. (B) The construction of cell-type specific clocks then proceeds by restricting to the age-DMCTs of one cell-type: an intrinsic clock is built by adjusting the DNAm training dataset for variations in cell-type fractions (CTFs), which defines a matrix of DNAm residuals. Alternatively, training over the age-DMCTs can be done on the DNAm data matrix without adjusting for CTFs, which results in a “semi-intrinsic” clock. In either case, Elastic Net models are learned for each choice of the penalty parameter L, and the optimal model L* is selected as the one with the best generalization performance (smallest root mean square error, RMSE) in a blinded model selection set. This optimal model then defines the corresponding cell-type specific DNAm-clock. This procedure can be carried out for each cell-type separately, provided that a sufficient number of age-DMCTs can be identified in that cell-type. Once the cell-type specific clocks are built, they are validated in independent DNAm datasets. (C) Top row: boxplots display the RMSE between predicted and true ages, for semi-intrinsic (Sin) and intrinsic (In) clocks, as assessed in 50 simulated validation sets of 200 mixtures for four different scenarios.
Cell-type specific clocks were constructed from 50 simulated training sets of 200 mixtures (mixing together 3 cell-types that we call granulocytes, monocytes and lymphocytes), with age-DMCTs occurring in only one cell-type (lymphocytes). In scenarios (i) and (iii), no cell-type fractions change with age, whereas in scenarios (ii) and (iv) the lymphocyte fraction changes with age. In scenarios (i) and (ii), age-DMCTs do not discriminate immune-cell types from each other, whereas in scenarios (iii) and (iv) age-DMCTs discriminate lymphocytes from granulocytes and monocytes. P-values from a two-tailed paired Wilcoxon test are given. Bottom row: as the top row, but now displaying the Pearson correlation coefficient (PCC) of predicted vs. true age.
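
The model-selection step in panel (B) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the columns of the DNAm matrix are already restricted to the age-DMCTs of one cell-type (the CellDMC step is not reproduced), uses a small hand-written coordinate-descent elastic net with a hypothetical mixing weight `alpha = 0.5`, and runs on toy simulated data. All function names, the penalty grid, and the choice to fit the CTF adjustment on the training set and reuse it on the model selection set are assumptions made for illustration.

```python
import numpy as np

def ctf_design(ctf):
    # Intercept plus all but one fraction (fractions sum to 1, so one is redundant).
    return np.column_stack([np.ones(len(ctf)), ctf[:, :-1]])

def fit_ctf_adjustment(dnam, ctf):
    # Least-squares regression of every CpG on the cell-type fractions.
    coef, *_ = np.linalg.lstsq(ctf_design(ctf), dnam, rcond=None)
    return coef

def adjust_for_ctf(dnam, ctf, coef):
    # DNAm residuals after removing the fitted CTF contribution.
    return dnam - ctf_design(ctf) @ coef

def fit_elastic_net(X, y, lam, alpha=0.5, n_sweeps=200):
    # Coordinate descent for
    # (1/2n)||y - b0 - Xb||^2 + lam * (alpha*||b||_1 + (1 - alpha)/2 * ||b||^2).
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, resid = X - x_mean, y - y_mean
    col_sq = (Xc ** 2).sum(axis=0) / n
    beta = np.zeros(p)
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid = resid + Xc[:, j] * beta[j]          # remove CpG j from the fit
            rho = Xc[:, j] @ resid / n
            shrunk = np.sign(rho) * max(abs(rho) - lam * alpha, 0.0)
            beta[j] = shrunk / (col_sq[j] + lam * (1.0 - alpha))
            resid = resid - Xc[:, j] * beta[j]          # add the updated CpG j back
    return y_mean - x_mean @ beta, beta

def evaluate(intercept, beta, X, age):
    pred = intercept + X @ beta
    rmse = float(np.sqrt(np.mean((pred - age) ** 2)))
    pcc = float(np.corrcoef(pred, age)[0, 1])
    return rmse, pcc

def select_clock(X_train, age_train, X_sel, age_sel, lambdas):
    # Keep the penalty value with the smallest RMSE in the model selection set.
    best = None
    for lam in lambdas:
        intercept, beta = fit_elastic_net(X_train, age_train, lam)
        rmse, pcc = evaluate(intercept, beta, X_sel, age_sel)
        if best is None or rmse < best["rmse"]:
            best = {"lam": lam, "rmse": rmse, "pcc": pcc,
                    "intercept": intercept, "beta": beta}
    return best

# Toy data (all values hypothetical): 200 mixtures, 30 CpGs, 3 cell-types.
rng = np.random.default_rng(0)
n, p = 200, 30
age = rng.uniform(20, 80, size=n)
ctf = rng.dirichlet([5, 2, 3], size=n)
cell_effect = rng.normal(0, 0.1, size=(3, p))
age_effect = np.zeros(p)
age_effect[:10] = 0.002                                  # 10 age-associated CpGs
dnam = (0.5 + ctf @ cell_effect + np.outer(age, age_effect)
        + rng.normal(0, 0.02, size=(n, p)))

train, sel = slice(0, 120), slice(120, n)
lambdas = [1e-4, 1e-3, 1e-2, 1e-1]                       # arbitrary grid for the toy scale

# Intrinsic clock: train on CTF-adjusted residuals.
coef = fit_ctf_adjustment(dnam[train], ctf[train])
intrinsic = select_clock(adjust_for_ctf(dnam[train], ctf[train], coef), age[train],
                         adjust_for_ctf(dnam[sel], ctf[sel], coef), age[sel], lambdas)

# Semi-intrinsic clock: train on the unadjusted DNAm matrix.
semi_intrinsic = select_clock(dnam[train], age[train], dnam[sel], age[sel], lambdas)

for name, clock in [("intrinsic", intrinsic), ("semi-intrinsic", semi_intrinsic)]:
    print(name, "lam* =", clock["lam"], "RMSE =", round(clock["rmse"], 2),
          "PCC =", round(clock["pcc"], 3))
```

In practice one would more likely rely on an established elastic net implementation and a finer penalty grid; the hand-written solver is used here only to keep the sketch dependency-light. The two calls to `select_clock` differ only in whether the input matrix has been adjusted for CTFs, which mirrors the intrinsic versus semi-intrinsic distinction in the caption.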