Figure 2. Prediction of the cellular composition in mouth swab samples. (A) Representative mouth swab smears with different proportions of leukocytes and epithelial cells. Smears of freshly harvested cells were stained with haematoxylin and eosin. (B) Mean β-values of CpGs on the Illumina 27k BeadChip in datasets of buccal swabs (GSE50586) and blood (GSE39981). Red arrows indicate the CpG sites selected for the “Buccal-Cell-Signature”. (C) As an additional criterion for suitable cell type-specific CpGs, we used the sum of variances in both datasets. (D) Mean β-values at cg07380416 (CD6) and cg20837735 (SERPINB5) were compared in whole blood (GSE41037, GSE39981), hematopoietic subsets (GSE39981), saliva (GSE28746, GSE34035, GSE39560), and buccal swabs (GSE25892, GSE50586). Error bars represent standard deviation. (E, F) The percentage of buccal epithelial cells versus leukocytes was determined by cell counting in 11 stained mouth swab smears. DNAm levels at the two cell type-specific CpGs were determined by pyrosequencing and correlated with the cell counts. (G) Linear regressions of both CpGs were combined into the Buccal-Cell-Signature. Predicted percentages of buccal epithelial cells correlated with the cell counts. (H) Percentages of epithelial cells were subsequently estimated using the Buccal-Cell-Signature for 55 samples of the training set and 26 samples of the validation set. Error bars represent standard deviation.
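The combination of two per-CpG linear regressions described for panel (G) can be illustrated with a minimal sketch. The legend does not state how the two regressions were combined, so averaging the two per-CpG predictions is an assumption made here for illustration only. All numbers below are synthetic placeholders, not measured values, and the function names `fit_line` and `predict_epithelial_percent` are hypothetical.

```python
import numpy as np


def fit_line(dnam, counted_percent):
    """Least-squares fit: counted_percent = slope * dnam + intercept."""
    slope, intercept = np.polyfit(dnam, counted_percent, deg=1)
    return slope, intercept


def predict_epithelial_percent(dnam_a, dnam_b, model_a, model_b):
    """Average the predictions of two single-CpG models (assumed combination rule)."""
    pred_a = model_a[0] * np.asarray(dnam_a) + model_a[1]
    pred_b = model_b[0] * np.asarray(dnam_b) + model_b[1]
    combined = (pred_a + pred_b) / 2.0
    # Percentages are only meaningful between 0 and 100.
    return np.clip(combined, 0.0, 100.0)


# Synthetic calibration data (illustrative only; directions of change are arbitrary).
counted = np.array([10.0, 30.0, 50.0, 70.0, 90.0])  # % epithelial cells by counting
dnam_cpg_a = 0.10 + 0.80 * counted / 100.0          # DNAm level (0..1) at first CpG
dnam_cpg_b = 0.90 - 0.80 * counted / 100.0          # DNAm level (0..1) at second CpG

model_a = fit_line(dnam_cpg_a, counted)
model_b = fit_line(dnam_cpg_b, counted)

predicted = predict_epithelial_percent(dnam_cpg_a, dnam_cpg_b, model_a, model_b)
```

With real pyrosequencing data the two single-CpG predictions would generally disagree slightly, and averaging is one simple way to reduce the influence of noise at either site. Other combination rules, such as a joint multiple regression, would be equally plausible readings of the legend.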