Aging
Research Paper | Volume 12, Issue 14 | pp 14506-14527

Identification of lncRNA biomarkers for lung cancer through integrative cross-platform data analyses

Tianying Zhao [1,2], Vedbar Singh Khadka [1], Youping Deng [1]
  • [1] Department of Quantitative Health Sciences, University of Hawaii John A. Burns School of Medicine, The University of Hawaii at Manoa, Honolulu, HI 96813, USA
  • [2] Department of Molecular Biosciences and Bioengineering, The University of Hawaii at Manoa College of Tropical Agriculture and Human Resources, Agricultural Sciences 218, Honolulu, HI 96822, USA
Received: January 9, 2020 | Accepted: June 1, 2020 | Published: July 16, 2020

Copyright: © 2020 Zhao et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

This study was designed to identify candidate lncRNA biomarkers for lung cancer by analyzing data from RNA-Seq and microarray platforms separately.

Lung cancer datasets were obtained from the Gene Expression Omnibus (GEO, n = 287) and The Cancer Genome Atlas (TCGA, n = 216) repositories; only lncRNAs common to the datasets were used. Differentially expressed (DE) lncRNAs in tumor samples relative to normal samples were selected from the Affymetrix and TCGA datasets. A model trained on the top 20 DE lncRNAs from the Affymetrix dataset was validated in the TCGA and Agilent datasets. A second, similar model was trained on the TCGA dataset.
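The selection of top-ranked DE lncRNAs can be illustrated with a minimal sketch. The abstract does not state which statistic was used to rank lncRNAs, so the Welch t-statistic below is an assumption, and the function names (`welch_t`, `top_de`) and the toy expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(tumor, normal):
    """Welch t-statistic for one lncRNA (tumor vs. normal expression).

    Assumes at least two samples per group and non-zero pooled variance.
    """
    se = sqrt(variance(tumor) / len(tumor) + variance(normal) / len(normal))
    return (mean(tumor) - mean(normal)) / se


def top_de(expr, labels, k=20):
    """Return the k lncRNA names with the largest absolute t-statistic.

    expr:   dict mapping lncRNA name -> list of expression values, one per sample
    labels: list of 1 (tumor) or 0 (normal), in the same sample order
    """
    scores = {}
    for name, values in expr.items():
        tumor = [v for v, y in zip(values, labels) if y == 1]
        normal = [v for v, y in zip(values, labels) if y == 0]
        scores[name] = abs(welch_t(tumor, normal))
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data (hypothetical): three tumor and three normal samples.
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "lnc_A": [5.0, 5.2, 4.9, 1.0, 1.1, 0.9],  # strongly up in tumor
    "lnc_B": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # no difference
    "lnc_C": [1.0, 1.5, 0.5, 3.0, 3.5, 2.5],  # down in tumor
}
selected = top_de(expr, labels, k=2)
print(selected)
```

In a cross-platform setting, the ranking would be computed on the training platform only, and the resulting list of names would then be used to subset the validation datasets.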

The first model, trained on the top 20 DE lncRNAs from Affymetrix and validated on TCGA and Agilent, achieved high prediction accuracy in both training (98.5% AUC for Affymetrix) and validation (99.2% AUC for TCGA and 92.8% AUC for Agilent). The second model, trained on the top 20 DE lncRNAs from TCGA and validated on Affymetrix and Agilent, also achieved high prediction accuracy in both training (97.7% AUC for TCGA) and validation (96.5% AUC for Affymetrix and 80.9% AUC for Agilent). Eight lncRNAs were shared between the two top-20 lists.
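The two quantities reported here, AUC and the overlap between ranked lists, can be computed as in the sketch below. It is not the authors' pipeline: the abstract does not name the classifier, so `scores` stands for whatever per-sample prediction score a model produces, and all names and values are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen tumor sample (label 1)
    receives a higher score than a randomly chosen normal sample (label 0);
    ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def overlap(list_a, list_b):
    """Names present in both ranked lists, sorted alphabetically."""
    return sorted(set(list_a) & set(list_b))


# Hypothetical prediction scores for five validation samples.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0]
print(round(auc(scores, labels), 3))

# Hypothetical top lists from two training platforms.
print(overlap(["lnc_A", "lnc_C", "lnc_D"], ["lnc_C", "lnc_E", "lnc_A"]))
```

The pairwise form is quadratic in the number of samples, which is acceptable for datasets of a few hundred samples such as those described here.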