Computational Screening Using a Combination of Ligand-Based Machine Learning and Molecular Docking Methods for the Repurposing of Antivirals Targeting the SARS-CoV-2 Main Protease

Yuda G. P. W. C., Hanif N., Hermawan A.

DARU, Journal of Pharmaceutical Sciences, 2023 (SCI-Expanded)

  • Publication Type: Article
  • Publication Date: 2023
  • Doi Number: 10.1007/s40199-023-00484-w
  • Journal Name: DARU, Journal of Pharmaceutical Sciences
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, BIOSIS, Biotechnology Research Abstracts, CAB Abstracts, CINAHL, EMBASE, International Pharmaceutical Abstracts, MEDLINE, Veterinary Science Database
  • Keywords: COVID-19, Drug repurposing, Ligand-based virtual screening, Mpro, Molecular docking
  • Hacettepe University Affiliated: Yes


Background: COVID-19 is an infectious disease caused by SARS-CoV-2, a close relative of SARS-CoV. Several studies have searched for COVID-19 therapies, with topics ranging from vaccine discovery to natural products targeting the SARS-CoV-2 main protease (Mpro), a potential therapeutic target because of its essential role in replication and its conserved sequence. However, published research on this target is limited, presenting an opportunity for drug discovery and development.

Method: This study aims to repurpose 10,692 drugs in DrugBank by using machine learning (ML)-based ligand-based virtual screening (LBVS) in the Konstanz Information Miner (KNIME) to seek potential therapeutics modeled on known Mpro inhibitors. The top candidate compounds, the native ligand GC-376, and the positive control boceprevir were then subjected to absorption, distribution, metabolism, excretion, and toxicity (ADMET) characterization, drug-likeness prediction, and molecular docking (MD). Protein–protein interaction (PPI) network analysis was added to provide information about the Mpro regulatory network.

Results: This study identified 3,166 candidate compounds predicted to inhibit Mpro. The random forest (RF) ML model using Molecular ACCess System (MACCS) keys provided the highest confidence score, 0.95 (bromo-7-nitroindazole), and identified the top 22 candidate compounds. ADMET characterization and drug-likeness prediction for the 22 candidate compounds, the native ligand GC-376, and boceprevir revealed that one compound had two violations of Lipinski's rule. MD results showed that only five compounds had more negative binding energies than the native ligand (−12.25 kcal/mol). Among these compounds, CCX-140 exhibited the lowest score, −13.64 kcal/mol. Literature analysis identified six compound classes with potential activity against Mpro: benzopyrazole, azole, pyrazolopyrimidine, carboxylic acids and derivatives, benzene and substituted derivatives, and diazine. Four pathologies were also identified on the basis of the Mpro PPI network.

Conclusion: The results demonstrate the efficiency of LBVS combined with MD. This combined strategy provides positive evidence that the top screened drugs, including CCX-140, which had the lowest MD score, can reasonably be advanced to the in vitro phase. The combined method may accelerate the discovery of therapies for novel or orphan diseases from existing drugs.

Graphical abstract: [Figure not available: see fulltext.]
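The two post-screening filters described in the abstract (counting Lipinski's rule violations and comparing docking scores with the native ligand) can be illustrated with a minimal Python sketch. This is not the authors' KNIME workflow: the `Candidate` class, the helper names, the one-violation cutoff, and every compound in the example list are hypothetical placeholders with made-up descriptor values. Only the reference energy of −12.25 kcal/mol for GC-376 comes from the abstract.

```python
from dataclasses import dataclass

# Binding energy of the native ligand GC-376 reported in the abstract (kcal/mol).
NATIVE_LIGAND_ENERGY = -12.25


@dataclass
class Candidate:
    name: str
    mol_weight: float      # molecular weight in Da
    log_p: float           # octanol-water partition coefficient
    h_donors: int          # hydrogen-bond donors
    h_acceptors: int       # hydrogen-bond acceptors
    binding_energy: float  # docking score in kcal/mol (more negative = stronger)


def lipinski_violations(c: Candidate) -> int:
    """Count violations of Lipinski's rule of five."""
    return sum([
        c.mol_weight > 500,
        c.log_p > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def shortlist(candidates, reference=NATIVE_LIGAND_ENERGY, max_violations=1):
    """Keep compounds that dock better than the reference and stay within
    the allowed number of Lipinski violations (the cutoff is an assumption),
    sorted from most to least negative binding energy."""
    kept = [
        c for c in candidates
        if lipinski_violations(c) <= max_violations
        and c.binding_energy < reference
    ]
    return sorted(kept, key=lambda c: c.binding_energy)


# Hypothetical compounds with illustrative values, not data from the study.
candidates = [
    Candidate("example_A", 350.0, 2.1, 2, 5, -13.0),
    Candidate("example_B", 620.0, 5.8, 3, 8, -14.0),  # two violations
    Candidate("example_C", 410.0, 3.3, 1, 6, -11.0),  # weaker than reference
    Candidate("example_D", 480.0, 4.0, 4, 9, -12.8),
]

for c in shortlist(candidates):
    print(c.name, c.binding_energy, lipinski_violations(c))
```

In practice the descriptors would come from a cheminformatics toolkit and the energies from the docking program's output. The abstract does not state whether the compound with two Lipinski violations was excluded, so the `max_violations` cutoff here is purely illustrative.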