Research Paper Volume 16, Issue 4 pp 3856—3879

An assessment system for clinical and biological interpretability in ulcerative colitis


Figure 3. A machine learning-based integrative program generates UCRGs. (A) Schematic overview of the machine learning process based on the UCRGs development integration model. (B) Overall performance of eight types of learners. Box plots depict the distribution of the residuals, with red dots marking the root mean square of residuals (RMSR). Circles show the recall, precision, F1-score, C-index, and accuracy of each learner. (C) Influence of the number of decision trees on the error rate. The x-axis represents the number of decision trees, and the y-axis shows the error rate. (D) Importance of the common DEGs. The bar plot shows the mean decrease in Gini coefficient, while the line chart shows the mean decrease in accuracy. The top 12 genes were identified as UCRGs. (E) Recursive feature elimination of the common DEGs, using 10-fold cross-validation repeated 10 times combined with the decreasing precision method (based on the Gini coefficient of the random forest), to reduce the dimension of the feature space and avoid over-fitting. The error is minimized when the number of variables is set to 14. (F) Principal coordinate analysis of Bray-Curtis dissimilarities obtained for the UCRGs expression profiles in the GSE87466 cohorts. Circles and error bars indicate the mean and standard error of the mean. (G) Differences in UCRScore among the groups of the meta-cohort.
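For readers unfamiliar with the quantities named in panels (B) and (F), the following is a minimal Python sketch of how they are commonly computed. It is illustrative only and is not the authors' pipeline: the function names, the toy labels and probabilities, the 0.5 decision threshold, and the definition of a residual as the observed label minus the predicted probability are all assumptions. The C-index is omitted.

```python
import math


def classification_metrics(y_true, y_pred):
    """Recall, precision, F1-score and accuracy for binary labels (1 = case)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"recall": recall, "precision": precision,
            "f1": f1, "accuracy": accuracy}


def rmsr(y_true, y_prob):
    """Root mean square of residuals.

    Assumption: residual = observed label - predicted probability.
    """
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_prob)]
    return math.sqrt(sum(squared) / len(squared))


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative profiles."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


# Toy data (hypothetical, not taken from the paper).
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
residual_rms = rmsr(y_true, y_prob)
dissimilarity = bray_curtis([1, 2, 3], [2, 2, 1])

print(metrics)
print(residual_rms)
print(dissimilarity)
```

A lower RMSR together with higher recall, precision, F1-score, and accuracy would indicate a better-performing learner. A Bray-Curtis dissimilarity of 0 means two expression profiles are identical, and 1 means they share no abundance.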