Posted by u/hauhau901
GPT-OSS 120b Uncensored Aggressive Release (MXFP4 GGUF)
Hey everyone, I made an uncensored version of GPT-OSS 120B.

Quick specs: 117B total params, ~5.1B active (MoE with 128 experts, top-4 routing), 128K context. MXFP4 is the model's native precision - this isn't a quantization, it's how the model was trained. No overall quality loss, though you may see the CoT behave differently at times.

This is the aggressive variant - **observed 0 refusals to any query during testing.** **Completely uncensored while keeping full model capabilities intact.**

Link: https://huggingface.co/HauhauCS/GPTOSS-120B-Uncensored-HauhauCS-Aggressive

Sampling settings:

- `--temp 1.0 --top-k 40`
- Disable everything else (top_p, min_p, repeat penalty, etc.) - some clients turn these on by default
- llama.cpp users: `--jinja` is required for the Harmony response format, or the model won't work right
- Example: `llama-server -m model.gguf --jinja -fa -b 2048 -ub 2048`

Single 61GB file. Fits on one H100. For lower VRAM, use `--n-cpu-moe N` in llama.cpp to offload MoE layers to the CPU. Works with llama.cpp, LM Studio, Ollama, etc.

If you want smaller models, I also have GPT-OSS 20B, GLM 4.7 Flash, and Qwen3 8B VL uncensored:

- https://huggingface.co/HauhauCS/models/

As with all my releases, the goal is effectively lossless uncensoring - no dataset changes and no capability loss.
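For anyone who wants the recommendations above in one command, here's a sketch of a full `llama-server` invocation. The model filename and the `--n-cpu-moe` value are placeholders, not from this release - tune the offload count to your VRAM:

```shell
# Hedged example combining the settings from this post.
# --jinja enables the Harmony chat template; --temp/--top-k are the recommended
# sampling values; --top-p 1.0, --min-p 0.0, --repeat-penalty 1.0 neutralize the
# other samplers that some clients enable by default.
llama-server -m ./gptoss-120b-uncensored.gguf --jinja -fa -b 2048 -ub 2048 \
  --temp 1.0 --top-k 40 --top-p 1.0 --min-p 0.0 --repeat-penalty 1.0 \
  --n-cpu-moe 20   # placeholder: offload some MoE layers to CPU if VRAM is tight
```

Drop the `--n-cpu-moe` flag entirely if the whole 61GB file fits in GPU memory.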