Linear-Scaling Systematic Molecular Fragmentation Approach for Perturbation Theory and Coupled-Cluster Methods

Bozkaya U., Ermis B.

JOURNAL OF CHEMICAL THEORY AND COMPUTATION, vol.18, no.9, pp.5349-5359, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 18 Issue: 9
  • Publication Date: 2022
  • Doi Number: 10.1021/acs.jctc.2c00587
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Chemical Abstracts Core, EMBASE, MEDLINE
  • Page Numbers: pp.5349-5359
  • Hacettepe University Affiliated: Yes


The coupled-cluster (CC) singles and doubles with perturbative triples [CCSD(T)] method is frequently referred to as the "gold standard" of modern computational chemistry. However, the high computational cost of CCSD(T), which scales as O(N⁷), where N is the number of basis functions, limits its application to small chemical systems. To address this problem, efficient implementations of linear-scaling coupled-cluster methods, which employ the systematic molecular fragmentation (SMF) approach, are reported. In this study, we aim to do the following: (1) To achieve exact linear scaling and to obtain a pure ab initio approach, we revise the handling of nonbonded interactions in the SMF approach; the revised approach is denoted LSSMF. (2) A new fragmentation algorithm is introduced that yields smaller fragments and is therefore better suited to high-level CC methods. (3) A modified nonbonded fragmentation scheme is proposed to enhance the existing algorithm. The performance of the LSSMF-CC approaches, such as LSSMF-CCSD(T), is compared with that of their canonical versions for a set of 142 alkane molecules, CₙH₂ₙ₊₂ (n = 6-10). Our results demonstrate that the LSSMF approach introduces negligible errors compared with the canonical methods; mean absolute errors (MAEs) are between 0.20 and 0.59 kcal mol⁻¹ for LSSMF(3,1)-CCSD(T). For a set of larger alkanes (L12), CₙH₂ₙ₊₂ (n = 50-70), the performance of LSSMF for second-order perturbation theory (MP2) is investigated. For the L12 set, various bonded and nonbonded levels are considered. Our results demonstrate that the combination of bonded level 6 with nonbonded level 2, LSSMF(6,2), provides very accurate results for the MP2 method, with an MAE of 0.32 kcal mol⁻¹. The LSSMF(6,2) approach yields a more than 26-fold reduction in errors compared with LSSMF(3,1). Hence, we obtain substantial improvements over the original SMF approach.
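The MAE values quoted above compare fragment-based energies with canonical reference energies. As a minimal sketch of how such a metric is computed, the Python snippet below averages the absolute deviations over a set of molecules. The function name and the energy values are hypothetical and chosen only for illustration; they are not data from the paper.

```python
def mean_absolute_error(approx, reference):
    """Return the mean of |approx_i - reference_i| over paired values."""
    if len(approx) != len(reference) or not approx:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - r) for a, r in zip(approx, reference)) / len(approx)


# Hypothetical energies in kcal/mol (illustrative only, NOT from the paper).
canonical = [0.0, 1.5, 3.0, 4.5]    # stand-in for canonical reference values
fragmented = [0.25, 1.0, 3.5, 4.5]  # stand-in for fragment-based values

mae = mean_absolute_error(fragmented, canonical)
print(f"MAE = {mae:.4f} kcal/mol")
```

A lower MAE means that the fragment-based energies track the canonical ones more closely across the whole test set.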
To illustrate the efficiency and applicability of the LSSMF-CCSD(T) approach, we consider an alkane molecule with 10,004 atoms. For this molecule, the LSSMF(3,1)-CCSD(T)/cc-pVTZ energy computation is performed in just ~24 h on a Linux cluster with 100 nodes, with 4 cores and 5 GB of memory provided to each node. As a second test, we consider a biomolecular complex (PDB code: 1GLA), which includes 10,488 atoms, to assess the efficiency of the LSSMF approach. The LSSMF(3,1)-FNO-CCSD(T)/cc-pVTZ energy computation is completed in ~7 days for the biomolecular complex. Hence, our results demonstrate that the LSSMF-CC approaches are very efficient. Overall, we conclude the following: (1) The LSSMF(m,n)-CCSD(T) methods can be reliably used for large-scale chemical systems, where the canonical methods are not computationally affordable. (2) The accuracy of bonded level 3 is not satisfactory for large chemical systems. (3) For high-accuracy studies, bonded level 5 (or higher) and nonbonded level 2 should be employed.
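The efficiency argument can be made concrete with a toy cost model. The sketch below contrasts a canonical O(N⁷) cost with a fragment-based cost under two simplifying assumptions that are not taken from the paper: every fragment has the same fixed number of basis functions, and the number of fragments grows in proportion to the system size. The function names and parameters are hypothetical, and real fragmentation schemes also involve overlapping fragments and nonbonded terms that this model ignores.

```python
def canonical_cost(n_basis, power=7):
    """Relative cost of a canonical method scaling as N**power."""
    return float(n_basis) ** power


def fragmented_cost(n_basis, fragment_size, power=7):
    """Relative cost when the system is split into fixed-size fragments.

    Assumption (toy model): the fragment count is n_basis // fragment_size,
    and each fragment costs fragment_size**power.
    """
    n_fragments = max(1, n_basis // fragment_size)
    return n_fragments * float(fragment_size) ** power


# Doubling the system size under each model (arbitrary illustrative sizes).
canonical_growth = canonical_cost(2000) / canonical_cost(1000)
fragmented_growth = fragmented_cost(2000, 100) / fragmented_cost(1000, 100)

print(f"canonical cost grows by a factor of {canonical_growth:.0f}")
print(f"fragmented cost grows by a factor of {fragmented_growth:.0f}")
```

In this model, doubling the system size multiplies the canonical cost by 2⁷ = 128 but only doubles the fragmented cost, because the per-fragment cost stays constant while the number of fragments doubles. Independent fragments can also be distributed across cluster nodes, which is consistent with the multi-node timings reported above.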