Microsoft recently suffered a large data exposure originating from a public GitHub repository maintained by its AI research team: a misused Shared Access Signature (SAS) token for Azure Storage leaked 38TB of private data, including passwords, secret keys, and internal messages. Because the token was overly permissive and its use went unmonitored, the data remained exposed for years, highlighting the security challenges inherent in AI model training workflows. The incident underscores the need for stronger safeguards and closer collaboration between security and research teams in AI development, which depends heavily on sharing large-scale data.
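For illustration, here is a minimal sketch of the mitigation usually recommended for this class of mistake: issuing a narrowly scoped, short-lived SAS token for a single blob rather than a broad, long-lived one. It assumes the Python `azure-storage-blob` SDK; the account, container, blob, and key values are hypothetical placeholders, not details from the incident.

```python
from datetime import datetime, timedelta, timezone
from azure.storage.blob import BlobSasPermissions, generate_blob_sas

# Hypothetical names for illustration only.
ACCOUNT_NAME = "exampleaccount"
CONTAINER_NAME = "models"
BLOB_NAME = "checkpoint.bin"
ACCOUNT_KEY = "<storage-account-key>"  # keep out of source control

# Scope the SAS to one blob, read-only, expiring in one hour,
# instead of granting account-wide access with a far-future expiry.
sas_token = generate_blob_sas(
    account_name=ACCOUNT_NAME,
    container_name=CONTAINER_NAME,
    blob_name=BLOB_NAME,
    account_key=ACCOUNT_KEY,
    permission=BlobSasPermissions(read=True),
    expiry=datetime.now(timezone.utc) + timedelta(hours=1),
)

# The resulting URL grants temporary read access to this blob only.
url = (
    f"https://{ACCOUNT_NAME}.blob.core.windows.net/"
    f"{CONTAINER_NAME}/{BLOB_NAME}?{sas_token}"
)
print(url)
```

Tight scoping and short expiry limit the blast radius if a signed URL ends up in a public repository, and pairing SAS issuance with monitoring or stored access policies makes such tokens revocable and auditable.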