A Comparative Study on Classification Methods for Renal Cell and Lung Cancers Using RNA-Seq Data


HAZNEDAR B., Simsek N. Y.

IEEE ACCESS, vol.10, pp.105412-105420, 2022 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 10
  • Publication Date: 2022
  • DOI: 10.1109/access.2022.3211505
  • Journal Name: IEEE ACCESS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
  • Pages: pp.105412-105420
  • Keywords: Deep learning, Computer architecture, Microprocessors, Lung cancer, Gene expression, Classification algorithms, Support vector machines, Image classification, Machine learning, RNA, Classification, deep learning, gene expression, machine learning, RNA-Seq, GENE-EXPRESSION, DEEP
  • Affiliated with Hacettepe University: Yes

Abstract

Gene expression analysis has attracted significant research interest and plays an important role in the classification and diagnosis of cancer types. In such studies, the main difficulty is the processing time required because of the large number of genes to be classified in a human cell. RNA-Seq is a novel technology that enables researchers to obtain reliable knowledge when analyzing a large number of genes, so it can be used effectively for cancer classification. In this paper, a commonly used deep learning model based on a deep neural network architecture is proposed and used to analyze lung and renal cell cancer RNA-Seq datasets taken from The Cancer Genome Atlas (TCGA). The proposed method is compared with other commonly used classical machine learning algorithms, including decision trees (DT), random forests (RF), support vector machines (SVM), and artificial neural networks (ANN), in terms of performance and accuracy on the same datasets. This study also presents the effects of different optimizers on the performance of deep learning algorithms. The proposed deep learning model yielded the highest accuracy: 96.15% on the renal cell cancer data and 95.54% on the lung cancer data. The proposed deep learning model is found to be very successful in classifying RNA-Seq datasets with a large number of features. Compared with a previous study in the literature that analyzes the same datasets, the proposed deep learning model outperforms all other methods on various metrics.
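
The abstract compares classifiers by accuracy and by "various metrics" without naming them. As a minimal sketch, assuming those metrics include the usual precision, recall, and F1 score, the following shows how such values could be computed from true and predicted class labels. The function name `classification_metrics` and the toy labels "A", "B", and "C" are hypothetical and are not taken from the paper or its datasets.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count true positives, false positives, and false negatives per class.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    def safe_div(num, den):
        # Treat an undefined ratio (zero denominator) as 0.0.
        return num / den if den else 0.0

    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        precision = safe_div(tp[label], tp[label] + fp[label])
        recall = safe_div(tp[label], tp[label] + fn[label])
        f1 = safe_div(2 * precision * recall, precision + recall)
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    n_classes = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "macro_precision": sum(precisions) / n_classes,
        "macro_recall": sum(recalls) / n_classes,
        "macro_f1": sum(f1_scores) / n_classes,
    }


# Toy labels for illustration only (not results from the study).
y_true = ["A", "A", "A", "B", "B", "C", "C", "C"]
y_pred = ["A", "A", "B", "B", "B", "C", "C", "A"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Macro averaging weights every class equally, which can matter when cancer subtypes are unevenly represented. Whether the paper reports macro, micro, or weighted averages is not stated in the abstract.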