Automated classification of chest X-rays: a deep learning approach with attention mechanisms


Creative Commons License

OLTU B., GÜNEY S., Yuksel S. E., DENGİZ B.

BMC MEDICAL IMAGING, no.1, 2025 (SCI-Expanded)

  • Publication Type: Article / Article
  • Publication Date: 2025
  • Doi Number: 10.1186/s12880-025-01604-5
  • Journal Name: BMC MEDICAL IMAGING
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Biotechnology Research Abstracts, EMBASE, MEDLINE, Directory of Open Access Journals
  • Hacettepe University Affiliated: Yes

Abstract

Background: Pulmonary diseases such as COVID-19 and pneumonia are life-threatening conditions that require prompt and accurate diagnosis for effective treatment. Chest X-ray (CXR) imaging has become the most common alternative method for detecting pulmonary diseases such as COVID-19, pneumonia, and lung opacity because of its availability, cost-effectiveness, and ability to facilitate comparative analysis. However, interpreting CXRs is a challenging task.

Methods: This study presents an automated deep learning (DL) model that outperforms multiple state-of-the-art methods in diagnosing COVID-19, Lung Opacity, and Viral Pneumonia. Using a dataset of 21,165 CXRs, the proposed framework combines the Vision Transformer (ViT) for capturing long-range dependencies, DenseNet201 for feature extraction, and global average pooling (GAP) for retaining critical spatial details. The combination yields a robust, high-accuracy classification system.

Results: The proposed methodology achieves 99.4% accuracy and an F1-score of 98.43% for COVID-19, 96.45% accuracy and an F1-score of 93.64% for Lung Opacity, 99.63% accuracy and an F1-score of 97.05% for Viral Pneumonia, and 95.97% accuracy and an F1-score of 95.87% for Normal subjects.

Conclusion: The proposed framework achieves an overall accuracy of 97.87%, surpassing several state-of-the-art methods with reproducible and objective outcomes. To ensure robustness and minimize variability across train-test splits, the study employs five-fold cross-validation, which provides a reliable and consistent performance evaluation. For transparency and to facilitate future comparisons, the specific training and testing splits have been made publicly accessible. Grad-CAM-based visualizations are also integrated to improve the interpretability of the model and offer insight into its decision-making process. The framework not only improves classification accuracy but also sets a new benchmark in CXR-based disease diagnosis.
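
The five-fold cross-validation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say whether the folds were stratified by class or how they were generated, so the stratification, the function name `stratified_k_fold`, the seed, and the toy label counts below are all assumptions made for illustration and are not the authors' published splits.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_k_fold(
    labels: Sequence[str], k: int = 5, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs.

    Each class is shuffled and dealt round-robin across the k folds, so
    every fold holds a near-equal share of each class (assumed scheme).
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class: Dict[str, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds: List[List[int]] = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Fold i serves as the test set once; the other folds form the training set.
    splits = []
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j in range(k) if j != i for idx in folds[j])
        splits.append((train, test))
    return splits


# Toy labels only: hypothetical counts, not the dataset's class distribution.
labels = (
    ["COVID-19"] * 20
    + ["Lung Opacity"] * 30
    + ["Normal"] * 50
    + ["Viral Pneumonia"] * 10
)
splits = stratified_k_fold(labels, k=5, seed=42)
for fold, (train, test) in enumerate(splits):
    print(f"fold {fold}: {len(train)} train, {len(test)} test")
```

Every sample lands in exactly one test fold, so averaging a metric over the five folds uses each image for evaluation once. Fixing the seed keeps the splits reproducible, which is the same goal the abstract pursues by publishing its splits.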