IEEE ACCESS, cilt.12, ss.161688-161696, 2024 (SCI-Expanded)
The widespread use of web applications requires important changes in cybersecurity to protect online services and data. In the process of identifying security vulnerabilities in web applications, a systematic approach is employed to detect and mitigate cybersecurity risks. This approach utilizes web crawlers to identify attack vectors. Traditional web crawling methods are resource-intensive and often need to be more efficient in handling dynamic JavaScript-rich content. Addressing this crucial gap, our study introduces an innovative approach to predict the necessity of JavaScript rendering, thereby enhancing the effectiveness and efficiency of security-oriented web crawlers. This approach seeks to reduce computational requirements and quicken the security evaluation process through the use of machine learning algorithms. By utilizing a dataset containing the source code from the main pages of 17,160 websites, our experimental results demonstrate a 20% reduction in execution time compared to full JavaScript rendering, indicating an improvement in resource usage without any significant reduction in coverage. Our methodology significantly improves the efficiency of security-focused web crawlers and helps security scanners to detect security risks of web applications with fewer resources.