Prediction of gastric cancer by machine learning integrated with mass spectrometry-based N-glycomics


Demirhan D. B., Yılmaz H., Erol H., Kayili H. M., SALİH B.

Analyst, vol. 148, 2023 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 148
  • Publication Date: 2023
  • DOI: 10.1039/d2an02057b
  • Journal Name: Analyst
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Aerospace Database, Aqualine, Aquatic Science & Fisheries Abstracts (ASFA), CAB Abstracts, Chemical Abstracts Core, Chimica, Communication Abstracts, Compendex, EMBASE, Food Science & Technology Abstracts, MEDLINE, Metadex, Pollution Abstracts, Veterinary Science Database, Civil Engineering Abstracts
  • Affiliated with Hacettepe University: Yes

Abstract

Early and accurate diagnosis of gastric cancer is vital for effective and targeted treatment. Glycosylation profiles are known to change as cancer tissue develops. This study aimed to profile the N-glycans in gastric cancer tissues in order to predict gastric cancer using machine learning algorithms. The (glyco-)proteins of formalin-fixed paraffin-embedded (FFPE) gastric cancer and adjacent control tissues were extracted by chloroform/methanol extraction after the conventional deparaffinization step. The N-glycans were released and labeled with a 2-aminobenzoic acid (2-AA) tag. MALDI-MS analysis of the 2-AA-labeled N-glycans was performed in negative ionization mode, and fifty-nine N-glycan structures were determined. The relative and analyte areas of the detected N-glycans were extracted from the obtained data. Statistical analyses identified 14 N-glycans whose expression levels differed significantly in gastric cancer tissues. The data were split into datasets according to the physical characteristics of the N-glycans and used to test machine-learning models. The multilayer perceptron (MLP) was found to be the most appropriate model, with the highest sensitivity, specificity, accuracy, Matthews correlation coefficient, and F1 score for each dataset. The highest accuracy score (96.0 ± 1.3) was obtained from the relative-area dataset containing all N-glycans, with an AUC value of 0.98. It was concluded that gastric cancer tissues could be distinguished from adjacent control tissues with high accuracy using mass spectrometry-based N-glycomic data.
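The abstract refers to relative peak areas and to five classification scores without defining them. The following is a minimal Python sketch of the standard definitions, not the authors' code. It assumes that a relative area is a peak area divided by the summed area of all detected N-glycans in a sample, which is a common convention the abstract does not spell out. The function names, glycan labels, and counts are hypothetical and chosen only for illustration.

```python
import math


def relative_areas(areas):
    """Normalize raw peak areas so they sum to 1 within one sample.

    `areas` maps a glycan label to its raw peak area. Dividing by the
    sample total is an assumed convention, not taken from the paper.
    """
    total = sum(areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {glycan: area / total for glycan, area in areas.items()}


def binary_metrics(tp, fp, tn, fn):
    """Compute the five scores named in the abstract from confusion counts.

    tp/fn count cancer samples classified correctly/incorrectly,
    tn/fp count control samples classified correctly/incorrectly.
    Ratios with a zero denominator are reported as 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # recall on the cancer class
    specificity = ratio(tn, tn + fp)  # recall on the control class
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    mcc = ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical glycan labels and areas, for illustration only.
    print(relative_areas({"glycan_A": 30.0, "glycan_B": 70.0}))
    # Hypothetical confusion counts, not results from the study.
    print(binary_metrics(tp=9, fp=1, tn=8, fn=2))
```

The Matthews correlation coefficient uses all four confusion counts, so it stays informative when the cancer and control groups differ in size, which may be why it is reported alongside accuracy.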