AI-Based Brute-Forcing Attack Outperforming Probabilistic Model

Web Vulnerability Assessment and Penetration Testing (Web VAPT) aims to identify vulnerabilities in web apps.

Directory brute-forcing attacks are one way to find reachable directories, but current wordlist-based methods are inefficient.

Offensive AI is the integration of AI technology to enhance cyber attacks. The following researchers have proposed a new Language Model (LM) framework for improving directory enumeration:

  • Alberto Castagnaro from Delft University of Technology
  • Mauro Conti from University of Padova
  • Luca Pajola from Spritz Matter Srl

The LM-based attack performed 969% better on average than traditional approaches in experiments on 1 million URLs from different domains.

AI-Based Brute-Forcing Attack

Ethical hacking uses the same techniques as hacking done with malicious intentions, but it is carried out with permission and helps to discover weaknesses before they are exploited.

Japanese tech company NEC has developed a way of catching crooks by closely monitoring those using computer networks through the “NEC’s Cyber Attack Alert System.” 

Offensive AI blends AI's flexibility with attack vectors, enabling sophisticated automated threats that analyze data rapidly and evade defenses.

The primary purpose of a directory enumeration brute-force attack is to find hidden files and directories on web servers by sending a large number of requests with URLs taken from a wordlist, according to the report.

The objective is to discover sensitive information, admin interfaces, or similar resources that could allow unauthorized access if the server is misconfigured.

These kinds of attacks rely on tools such as DirBuster, wfuzz, and BurpSuite, which use various types of wordlists, such as general, backup-specific, and CMS, to generate URL payloads. 
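
As a rough illustration of how a wordlist turns into URL payloads, the sketch below combines a base URL with wordlist entries and optional extensions. It only builds strings and sends no requests. The function name `build_candidates`, the host `example.test`, and the comment-skipping rule are assumptions for this example, not the behavior of any specific tool named above.

```python
from urllib.parse import quote


def build_candidates(base_url, words, extensions=("",)):
    """Return candidate URLs built from a wordlist (no requests are sent)."""
    # Make sure the base ends with exactly one slash before appending words.
    base = base_url.rstrip("/") + "/"
    candidates = []
    for word in words:
        word = word.strip().lstrip("/")
        # Assumption: blank lines and '#' comment lines are skipped.
        if not word or word.startswith("#"):
            continue
        for ext in extensions:
            # quote() percent-encodes characters that are unsafe in a path.
            candidates.append(base + quote(word) + ext)
    return candidates


urls = build_candidates(
    "http://example.test/app",
    ["admin", "backup", "# comment", ""],
    extensions=("", ".bak"),
)
for url in urls:
    print(url)
```

A backup-specific wordlist would typically be paired with extensions such as `.bak`, while a general one would use the empty extension only.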

Choosing the right wordlist is a vital factor in the attack’s success.


Automated tools aid in brute-forcing, but selecting a wordlist tailored to the target can determine how many vulnerabilities are uncovered.

Traditional directory brute-forcing using wordlists is inefficient. This work explores probability-based and Language Model approaches that exploit two key aspects:

  • Prior knowledge from similar web apps, used to guide requests.
  • Adaptive decision-making, used to dynamically generate URLs during the attack.

Prediction of the next directory from the LM-based architecture (Source: arXiv)

Two techniques are proposed: a Weighted Training Tree, which combines paths across multiple web apps with node weights indicating frequency, and a Weighted Wordlist Tree, which is pruned from a general wordlist based on the training data.
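
A minimal sketch of what a weighted training tree might look like is shown below. It assumes that a node's weight counts the number of training apps in which that directory path appears; the article only says the weights indicate frequency, so this counting rule, the `Node` class, and `build_weighted_tree` are illustrative assumptions rather than the researchers' implementation.

```python
from collections import defaultdict


class Node:
    """One directory in the tree; weight counts apps containing this path."""

    def __init__(self):
        self.weight = 0
        self.children = defaultdict(Node)


def build_weighted_tree(apps):
    """Merge the known paths of several web apps into one weighted tree."""
    root = Node()
    for paths in apps:
        # Collect each distinct directory prefix once per app.
        prefixes = set()
        for path in paths:
            parts = [p for p in path.split("/") if p]
            for i in range(1, len(parts) + 1):
                prefixes.add(tuple(parts[:i]))
        # Walk to each prefix's node and count this app once.
        for prefix in prefixes:
            node = root
            for part in prefix:
                node = node.children[part]
            node.weight += 1
    return root


# Hypothetical training data: two apps with partially overlapping layouts.
apps = [
    ["/admin/login", "/admin/users", "/static/css"],
    ["/admin/login", "/blog"],
]
root = build_weighted_tree(apps)
for name, child in sorted(root.children.items()):
    print(name, child.weight)
```

Under this assumption, directories shared by many training apps end up with higher weights and would be tried earlier on a new target.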

The probabilistic method uses a max heap ordered by the probabilities of words being valid subdirectories. This heap is computed on the fly by dividing the weight of each subdirectory by the total weights under that directory.
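
The probability computation described above can be sketched as follows. Python's `heapq` module is a min-heap, so the example stores negated probabilities to get max-heap ordering. The function name `ranked_children` and the sample weights are assumptions for illustration only.

```python
import heapq


def ranked_children(weights):
    """Yield (name, probability) pairs, most probable subdirectory first.

    `weights` maps each subdirectory name to its weight under one directory.
    """
    total = sum(weights.values())
    if total == 0:
        return
    # Probability = child weight / total weight under the parent directory.
    # Negate it so the min-heap pops the largest probability first.
    heap = [(-weight / total, name) for name, weight in weights.items()]
    heapq.heapify(heap)
    while heap:
        neg_prob, name = heapq.heappop(heap)
        yield name, -neg_prob


# Hypothetical weights for the children of one directory.
order = list(ranked_children({"users": 3, "login": 6, "backup": 1}))
for name, prob in order:
    print(f"{name}: {prob:.2f}")
```

Because the heap is rebuilt per directory, the ordering can be recomputed on the fly as the attack descends into each newly discovered directory.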

The LM-based approach uses custom embeddings trained on URL corpora to generate candidate paths and to avoid pitfalls of the probabilistic model built on the weighted tree data structure.

So, instead of blindly brute-forcing directories, one can exploit prior knowledge through LM approaches grounded in probability theory.

On average, LM-based attacks outperformed brute force by 969%, while probabilistic models worked better for stealthy attacks when limited request budgets were accounted for.

Cross-lingual transfer may help preserve context more effectively when directory predictions are carried over between different languages.


Tushar is a Cyber security content editor with a passion for creating captivating and informative content. With years of experience under his belt in Cyber Security, he is covering Cyber Security News, technology and other news.