In a recent incident, the Microsoft AI research team inadvertently exposed 38 terabytes of private data through a link published in its GitHub repository.
The exposure resulted from a misconfigured Shared Access Signature (SAS) token, an Azure feature used to share data from Azure Storage accounts.
The misconfiguration allowed access to the entire storage account, including sensitive information like personal computer backups, passwords, secret keys, and over 30,000 internal Microsoft Teams messages from 359 Microsoft employees.
What’s particularly concerning is that the access level was set to “full control,” enabling not just viewing but also deletion and overwriting of files.
This incident underscores the new security challenges organizations face as they leverage AI and work with massive volumes of training data.
The incident was discovered by the Wiz Research Team, which was scanning the internet for misconfigured storage containers.
Wiz is a cybersecurity company that helps organizations find security issues in public cloud infrastructure.
They stumbled upon a GitHub repository owned by Microsoft’s AI research division, where users were instructed to download models from an Azure Storage URL.
Unfortunately, this URL granted access to far more than just the intended open-source models.
In the world of Azure, Shared Access Signature (SAS) tokens play a crucial role in granting access to Azure Storage data.
These tokens are like keys to the kingdom, offering varying levels of access, from read-only to full control, and can be scoped to a single file, container, or even an entire storage account.
Their flexibility sets SAS tokens apart – you can tailor them to expire whenever you choose or make them practically eternal.
However, with great power comes great responsibility, and the potential for overreach is real.
At its most permissive, a SAS token can grant the same access as the account key itself, leaving your storage account wide open indefinitely.
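To make the scope, permission, and expiry settings concrete, here is a minimal Python sketch that inspects the query string of a SAS URL and flags the risky combinations described above. The function name `audit_sas_url`, the seven-day lifetime threshold, and the example URLs are assumptions made for illustration; the query fields it reads (`srt`, `ss`, `sp`, `se`) are the standard SAS parameters for resource types, services, permissions, and expiry.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse, parse_qs

# Permission letters in the "sp" field that allow changing or removing data.
DESTRUCTIVE = {"w", "d"}


def audit_sas_url(url, now=None, max_lifetime=timedelta(days=7)):
    """Return a list of findings for a SAS URL (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    params = {k: v[0] for k, v in parse_qs(urlparse(url).query).items()}
    findings = []

    # Account SAS tokens carry both "ss" (services) and "srt" (resource types).
    if "ss" in params and "srt" in params:
        findings.append("account-level scope")

    # Any write or delete letter means the token is more than read-only.
    if set(params.get("sp", "")) & DESTRUCTIVE:
        findings.append("allows write or delete")

    # "se" is the expiry time in ISO 8601 form, usually ending in "Z".
    expiry_raw = params.get("se")
    if expiry_raw:
        expiry = datetime.fromisoformat(expiry_raw.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)
        if expiry - now > max_lifetime:
            findings.append("long-lived expiry")

    return findings


# Hypothetical URLs with fake signatures, for illustration only.
BROAD_URL = (
    "https://example.blob.core.windows.net/models"
    "?sv=2020-08-04&ss=b&srt=sco&sp=rwdl&se=2099-01-01T00:00:00Z&sig=FAKE"
)
NARROW_URL = (
    "https://example.blob.core.windows.net/models/model.ckpt"
    "?sv=2020-08-04&sr=b&sp=r&se=2023-09-02T00:00:00Z&sig=FAKE"
)
FIXED_NOW = datetime(2023, 9, 1, tzinfo=timezone.utc)

if __name__ == "__main__":
    print(audit_sas_url(BROAD_URL, now=FIXED_NOW))
    print(audit_sas_url(NARROW_URL, now=FIXED_NOW))
```

A check like this could run over URLs found in repositories or documentation before they are published, though the right threshold for an acceptable lifetime depends on the organization.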
Three flavors of SAS tokens exist: Account SAS, Service SAS, and User Delegation SAS. In this piece, we’ll delve into Account SAS tokens, a popular choice and the kind used in Microsoft’s repository.
Generating an Account SAS token is relatively straightforward. Users configure the token’s scope, permissions, and expiration date, and voilà, the token is born.
It’s essential to note that this entire process occurs client-side, not on Azure servers. Consequently, the resulting token isn’t an object that Azure creates or keeps a record of.
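The client-side nature of the process can be sketched in a few lines of Python: a token is essentially an HMAC-SHA256 signature, computed with the account key, over the chosen settings. The function name `sign_account_sas`, the account name, and the key below are made up for illustration, and the field order in the string being signed is an assumption based on the layout documented for older storage service versions (newer versions add fields). Treat this as a simplified illustration of why no server call is involved; real tokens should be generated with the official Azure SDK.

```python
import base64
import hashlib
import hmac
from urllib.parse import urlencode


def sign_account_sas(account_name, account_key_b64, permissions, services,
                     resource_types, expiry, version="2018-11-09",
                     start="", ip="", protocol="https"):
    """Build an Account SAS style query string locally (illustrative only)."""
    # Assumed field order; each field is followed by a newline.
    string_to_sign = "\n".join([
        account_name, permissions, services, resource_types,
        start, expiry, ip, protocol, version,
    ]) + "\n"

    # The account key is stored base64-encoded; decode it before signing.
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")

    # Everything needed to verify the token travels inside the token itself.
    return urlencode({
        "sv": version,
        "ss": services,
        "srt": resource_types,
        "sp": permissions,
        "se": expiry,
        "spr": protocol,
        "sig": signature,
    })


# Fake keys for demonstration; never hard-code a real account key.
FAKE_KEY = base64.b64encode(b"not-a-real-account-key").decode("ascii")
OTHER_KEY = base64.b64encode(b"a-different-fake-key").decode("ascii")

if __name__ == "__main__":
    print(sign_account_sas("exampleaccount", FAKE_KEY, "rl", "b", "co",
                           "2030-01-01T00:00:00Z"))
```

Because the computation needs only the account key and no network request, nothing in this flow leaves a record on the service side, which is why administrators may not know a token exists.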
The ease of creating high-privilege, everlasting SAS tokens also raises security concerns. If a user unwittingly generates an ultra-permissive, never-expiring token, administrators may not even be aware it exists or where it’s being used.
Revoking such a token is far from a walk in the park: it requires rotating the account key that signed it, which also invalidates every other token signed by that key.
Tokens that are hard to track and hard to revoke make exposed SAS URLs an attractive target for attackers seeking exposed data.
Moreover, Azure’s SAS token system lacks robust monitoring capabilities, making these tokens an enticing tool for attackers aiming to maintain a persistent foothold in compromised storage accounts.
To mitigate such risks, organizations are advised to limit the use of Account SAS tokens for external sharing and consider using Service SAS tokens with Stored Access Policies or User Delegation SAS tokens for time-limited sharing.
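As a rough illustration of how this guidance might be checked in practice, the Python sketch below classifies a SAS URL by the query parameters that distinguish the three token types and reports whether it matches the recommended kinds for external sharing. The function names and example URLs are hypothetical; the parameters used are `skoid` (present on User Delegation SAS), `ss` plus `srt` (Account SAS), `si` (a Service SAS tied to a Stored Access Policy), and `sr` (Service SAS).

```python
from urllib.parse import urlparse, parse_qs


def classify_sas(url):
    """Guess the SAS flavor from its query parameters (hypothetical helper)."""
    params = parse_qs(urlparse(url).query)
    if "skoid" in params:
        # Signed with a user delegation key rather than the account key.
        return "user-delegation"
    if "ss" in params and "srt" in params:
        return "account"
    if "si" in params:
        # References a Stored Access Policy, which can be revoked server-side.
        return "service-with-stored-policy"
    if "sr" in params:
        return "service-ad-hoc"
    return "unknown"


def ok_for_external_sharing(url):
    """True when the token is one of the kinds recommended for sharing."""
    return classify_sas(url) in {"user-delegation", "service-with-stored-policy"}


# Hypothetical URLs with fake signatures, for illustration only.
ACCOUNT_URL = ("https://example.blob.core.windows.net/models"
               "?sv=2020-08-04&ss=b&srt=sco&sp=rwdl&se=2099-01-01T00:00:00Z&sig=FAKE")
POLICY_URL = ("https://example.blob.core.windows.net/models"
              "?sv=2020-08-04&sr=c&si=share-policy&sig=FAKE")

if __name__ == "__main__":
    for url in (ACCOUNT_URL, POLICY_URL):
        print(classify_sas(url), ok_for_external_sharing(url))
```

This only inspects the shape of the URL; it does not verify that the referenced policy exists or that its lifetime is short.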
Creating dedicated storage accounts for external sharing can also help contain potential damage.
Security teams should actively participate in AI development processes, addressing security risks associated with data sharing and potential supply chain attacks.
Awareness and collaboration between security, data science, and research teams are essential to establish proper security measures throughout the AI development lifecycle.
Microsoft has taken steps to address the issue, including invalidating the SAS token and replacing it on GitHub.