2024-07-30 09:32:25.AIbase.10.7k
Embarrassing! Meta's AI Security System Easily Bypassed by 'Spaces' Attack
The Prompt-Guard-86M model released by Meta is designed to defend against prompt injection attacks by restricting large language models from processing inappropriate inputs, thereby protecting system security. However, the model itself also exposes risks of being attacked. Research conducted by Aman Priyanshu found that by adding simple character spacing such as spaces or removing punctuation in the input, the model disregards prior security instructions, achieving an almost 100% success rate for attacks. This finding highlights the importance of AI security, despite Prompt