Aging
Research Paper | Volume 12, Issue 10 | pp 9840-9854

Development of a machine learning-based multimode diagnosis system for lung cancer

Shuyin Duan (1), Huimin Cao (1), Hong Liu (2), Lijun Miao (2), Jing Wang (2), Xiaolei Zhou (3), Wei Wang (1), Pingzhao Hu (4), Lingbo Qu (1,5), Yongjun Wu (1,6)
  • (1) College of Public Health, Zhengzhou University, Zhengzhou 450001, China
  • (2) The First Affiliated Hospital of Zhengzhou University, Zhengzhou 450001, China
  • (3) Henan Provincial Chest Hospital, Zhengzhou 450001, China
  • (4) Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, MB R3E 3N4, Canada
  • (5) Henan Joint International Research Laboratory of Green Construction of Functional Molecules and Their Bioanalytical Applications, Zhengzhou 450001, China
  • (6) The Key Laboratory of Nanomedicine and Health Inspection of Zhengzhou, Zhengzhou 450001, China
Received: February 10, 2020 | Accepted: April 20, 2020 | Published: May 23, 2020

Copyright © 2020 Duan et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

As an emerging technology, artificial intelligence has been applied to identify various physical disorders. Here, we developed a three-layer diagnosis system for lung cancer involving three machine learning approaches: decision tree C5.0, artificial neural network (ANN) and support vector machine (SVM). The area under the curve (AUC) was used to evaluate their diagnostic performance. In the first layer, the AUCs of C5.0, ANN and SVM were 0.676, 0.736 and 0.640, respectively, so ANN outperformed both C5.0 and SVM. In the second layer, ANN was similar to SVM but superior to C5.0, as supported by AUCs of 0.804, 0.889 and 0.825. Much higher AUCs of 0.908, 0.910 and 0.849 were obtained in the third layer, where C5.0 achieved the highest sensitivity, 94.12%. These data support a three-layer diagnosis system for lung cancer: first, ANN serves as a broad-spectrum screening subsystem that uses 14 items of epidemiological data and clinical symptoms to screen high-risk groups; then, combined with 5 additional tumor biomarkers, ANN serves as an auxiliary diagnosis subsystem to identify suspected lung cancer patients; finally, C5.0 is used to confirm lung cancer patients based on 22 CT nodule-based radiomic features.
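The two ideas the abstract relies on, comparing classifiers by AUC and sensitivity and passing a patient through successive layers, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the dictionary keys (`risk_score`, `biomarker_score`, `radiomic_score`) and the 0.5 thresholds are hypothetical placeholders standing in for the trained ANN and C5.0 models, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation rather than whatever software the study used.

```python
from typing import Callable, Dict, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """True positive rate: TP / (TP + FN)."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("no positive cases in labels")
    return tp / (tp + fn)


# A stage is a (name, predicate) pair; the predicate returns True when the
# layer flags the patient and passes them on to the next layer.
Stage = Tuple[str, Callable[[Dict[str, float]], bool]]

# Hypothetical stand-ins for the three trained models. The keys and the
# 0.5 cut-offs are assumptions for illustration only.
EXAMPLE_STAGES: Sequence[Stage] = (
    ("screening", lambda p: p["risk_score"] >= 0.5),
    ("auxiliary diagnosis", lambda p: p["biomarker_score"] >= 0.5),
    ("confirmation", lambda p: p["radiomic_score"] >= 0.5),
)


def run_cascade(patient: Dict[str, float], stages: Sequence[Stage]) -> str:
    """Pass a patient through the layers in order, stopping at the first
    layer that does not flag them."""
    for name, is_flagged in stages:
        if not is_flagged(patient):
            return "not flagged at " + name
    return "flagged by all layers"
```

Because the cascade stops at the first layer that does not flag a patient, later and more costly inputs (biomarkers, CT radiomic features) are only needed for patients who pass the earlier layers, which matches the screening, auxiliary diagnosis and confirmation order described above.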