Posted by u/pigeon57434
Heretic has FINALLY defeated GPT-OSS with a new experimental decensoring method called ARA
The creator of heretic p-e-w opened a pull request #211 with a new method called Arbitrary-Rank Ablation (ARA) the creator of the project explanation For comparison, the previous best was eww 74 refusals even after heretic, which is pretty ridiculous. It still refuses almost all the same things as the base model since OpenAI lobotomized it so heavily, but now with the new method, ARA has finally defeated GPT-OSS (no system messages even needed to get results like this one) rest of output not shown for obvious reasons but go download it yourself if you wanna see This means the future of open source AI is actually open and actually free, not even OpenAI's ultra sophisticated lobotomization can defeat what the open source community can do! https://huggingface.co/p-e-w/gpt-oss-20b-heretic-ara-v3 This is still experimental, so most heretic models you see online for the time being will probably not use this method. It's only in an unreleased version of Heretic for now, make sure you get ones that say they use MPOA+SOMA for now, but if you can once this becomes available in the full Heretic release, there will be more that use ARA, so almost always use those if available.
More from r/LocalLLaMA
Anthropic: "We’ve identified industrial-scale distillation attacks on our models by DeepSeek, Moonshot AI, and MiniMax." 🚨
I feel personally attacked
Distillation when you do it. Training when we do it.