An investigation by researcher Naphtali Deutsch has recently garnered significant attention. Through web scanning, he discovered that hundreds of servers running open-source tools for building large language model (LLM) applications, along with dozens of vector databases, are leaking large amounts of sensitive information. The leakage stems from many companies neglecting the security of these tools in their rush to integrate artificial intelligence (AI) into their workflows.


Flowise is a low-code tool for creating a wide range of LLM applications, from customer-service bots to data generation and extraction tools. Although most Flowise servers are password-protected, a password alone does not guarantee security.

Deutsch noted that earlier this year, researchers discovered an authentication bypass vulnerability in Flowise: simply capitalizing a few characters in the path of an API endpoint was enough to trigger it. Using this flaw, Deutsch gained access to 438 Flowise servers and found that they stored sensitive data such as GitHub access tokens, OpenAI API keys, Flowise passwords and API keys, as well as configuration and prompt information for Flowise applications.
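The class of bug described above arises when an authentication guard compares request paths case-sensitively while the router behind it matches them case-insensitively. The sketch below illustrates the pattern in Python; all names and logic are illustrative assumptions, not Flowise's actual code.

```python
# Illustrative sketch of a case-sensitivity authentication bypass.
# Assumed names (PROTECTED_PREFIX, handle_request, etc.) are hypothetical.

PROTECTED_PREFIX = "/api/v1"

# Pretend secret store keyed by lowercase path, as a case-insensitive router might be.
SECRETS = {"/api/v1/credentials": "sk-...redacted..."}

def vulnerable_auth_check(path: str) -> bool:
    # BUG: case-sensitive prefix check -- "/API/v1/..." slips past the guard.
    return path.startswith(PROTECTED_PREFIX)

def route(path: str) -> str:
    # The router normalizes case, so the request still reaches the handler
    # even when the auth check above was skipped.
    return SECRETS.get(path.lower(), "404 not found")

def handle_request(path: str, authenticated: bool) -> str:
    if vulnerable_auth_check(path) and not authenticated:
        return "401 unauthorized"
    return route(path)

def patched_auth_check(path: str) -> bool:
    # Fix: normalize the path the same way the router does before checking.
    return path.lower().startswith(PROTECTED_PREFIX)

if __name__ == "__main__":
    print(handle_request("/api/v1/credentials", authenticated=False))  # blocked: 401 unauthorized
    print(handle_request("/API/v1/credentials", authenticated=False))  # bypass leaks the secret
```

The fix is to normalize the path (here, with `patched_auth_check`) so that the guard and the router agree on what counts as a protected route.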

In addition to Flowise, Deutsch found about 30 vector databases with no authentication checks at all. These databases contained clearly sensitive information, including private emails from an engineering services company, documents from a fashion company, and customer personal and financial data from an industrial equipment company. Leaky vector databases carry even greater risk: hackers can not only steal the data but also delete or corrupt it, or plant malicious content that affects users of the AI tools built on top of them.

To mitigate these risks, Deutsch recommends that companies restrict access to the AI services they rely on, monitor and log activity on those services, protect sensitive data handled by LLM applications, and apply software updates promptly wherever possible.
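The first recommendation, restricting network access, can be as simple as firewall rules that keep the service off the public internet. A minimal sketch for a Linux host using `ufw`, assuming the LLM service listens on port 3000 and only an internal subnet of 10.0.0.0/24 should reach it (both values are assumptions; adapt them to your deployment):

```shell
# Allow only the internal subnet to reach the service port...
sudo ufw allow from 10.0.0.0/24 to any port 3000 proto tcp
# ...and deny the port to everyone else (ufw evaluates rules in order).
sudo ufw deny 3000/tcp
```

Where the tool supports it, binding the service to a loopback or internal interface instead of 0.0.0.0 provides a second layer of the same protection.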

He emphasized that as these tools proliferate, many of the developers deploying them lack the security knowledge to configure them properly, and security often lags behind technological development.

Key Points:

1. 🔒 Hundreds of LLM application servers and dozens of vector databases are leaking sensitive information, a significant security concern for enterprises.

2. 💻 The low-code tool Flowise has an authentication bypass vulnerability that hackers have exploited to breach hundreds of servers.

3. 🛡️ Experts advise companies to strengthen access controls on AI services, monitor activity regularly, and keep security measures up to date.