Prompt injection is killing our self-hosted LLM deployment

We moved to self-hosted models specifically to avoid sending customer data to external APIs. Everything was working fine until last week when someone from QA tried injecting prompts during testing and our entire system prompt got dumped in the response. Now I'm realizing we have zero protection against this. Traditional web application firewalls don't understand LLM-specific attacks. The model just treats malicious prompts like normal user input and happily complies. Has anyone actually solved prompt injection for production LLM apps? Not talking about basic input sanitization because adversarial prompts can be crafted to look completely normal.

Prompt injection is killing our self-hosted LLM deployment

More from r/LocalLLaMA

This is where we are right now, LocalLLaMA

Anthropic: "We’ve identified industrial-scale distillation attacks on our models by DeepSeek, Moonshot AI, and MiniMax." 🚨

I feel personally attacked