GLM 5.1 👀
Top AI discussions from Reddit's best communities. Updated automatically with the hottest posts from r/MachineLearning, r/LocalLLaMA, r/ChatGPT, and more.
Until now, LM Studio has basically been the "go-to" solution for more advanced LLM users in the GGUF ecosystem, but...
Hey r/LocalLlama, we're super excited to launch Unsloth Studio (Beta), a new open-source web UI to train and run LLMs...
Hey, I thought I'd do an update on my Homelab I posted a while back. I have it running on LLM experiments, which I...
It all started with using "the AI" to help me study for a big exam. Can it make some flashcards or questions? Then...
The creator of heretic p-e-w opened a pull request #211 with a new method called Arbitrary-Rank Ablation (ARA) the...
Let me pre-apologize for this long and rambling post but I get excited by stuff like this. I think a lot of folks here...
Quick context: I run a personal automation system built on Claude Code. It's model-agnostic, so switching to Ollama was...
I am genuinely surprised at how good the model is and that it can run on a 14-year-old device: 2nd gen i5 + 4GB DDR3 RAM.
The work I do involves customers that are sensitive to nation state politics. We cannot and do not use cloud API...
It just happens to be entirely against their will and TOS. I say: Distill Baby Distill!
It's quite ironic that they went for the censorship and authoritarian angles here. Full blog:
Why would they care about distillation when they probably have done the same with OpenAI models and the Chinese labs...
I’ve been working on a little side project comparing tokenizer efficiency across different companies’ models for...
Reading the comments, I’m guessing you didn’t bother to read this: "Safety and alignment at Meta Superintelligence."
I gave the zeroclaw agent a try (instead of the bloated and overhyped one). After a few hours of fuckery with configs...
Did you know that Qwen3 TTS utilizes voice embedding for voice cloning? Your voice is turned into a vector of 1024...
Three days ago, the following repository was published, which its “creator” has been aggressively promoting on various...
The first time I've seen a model exceed 3 trillion tokens per week on OpenRouter! The first time I've seen more than one model...
Hello everyone, A fast inference hardware startup, Taalas, has released a free chatbot interface and API endpoint...
I'm absolutely sure of it. The same usual suspects, the same language, the same who stole from whom the next million...
Inspired by this post from u/VoidAlchemy a few months back: Intrusive thoughts had me try to reproduce and extend the...
Hey r/LocalLLaMA, So I live in Ukraine during the war. Power goes out a lot here – russia regularly attacks our power...
Hello all, just wanted to note that RDIMM prices are wild. Stacking RDIMMs is starting to be as expensive as stacking...
Hey everyone, we just open-sourced KaniTTS2 - a text-to-speech model designed for real-time conversational use cases....
Llamas and Gentlemen, Heretic is the leading software for removing censorship from language models. In the three...
Hey everyone, made an uncensored version of GPT-OSS 120B. Quick specs: 117B total params, ~5.1B active (MoE with 128...
Hi all, I’m Anton from Nebius. We’ve updated the SWE-rebench leaderboard with our January runs on 48 fresh GitHub PR...
You can monitor quants begin to appear with this search:
Hi everyone, I present Dhi-5B: a 5-billion-parameter multimodal language model trained compute-optimally with just...
OpenHands reveals the model size in their announcement. Still waiting for the model to appear on HF.
Title somewhat says it all. I get that it's related but if links to new models are being discussed shouldn't it be a...
Only the official webpages have been released so far, but the benchmarks look very promising: SWE-Bench Verified 80.2%, Multi-SWE-Bench 51.3%...
We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks. Scaling is still one of...
Hey r/LocalLlama! We’re excited to introduce ~12x faster Mixture of Experts (MoE) training with >35% less VRAM and...
Kimi > ChatGPT = Claude
Qwen team just released Qwen-Image-2.0. Before anyone asks - no open weights yet, it's API-only on Alibaba Cloud...
I know it has already been done but this is my AI trained on Epstein Emails. Surprisingly hard to do, as most LLMs will...
Like many of you, I like to use LLM as tools to help improve my daily life, from editing my emails, to online search....
I hacked together a small tool that lets you upload a .gguf file and visualize its internals in a 3D-ish way (layers /...
I've tried lots of "small" models < 60 GB in the past. GLM 4.5 Air, GLM 4.7 Flash, GPT OSS 20B and 120B, Magistral,...
Looking at the code at src/transformers/models/qwen3_5/modeling_qwen3_5.py, it looks like the Qwen3.5 series will have VLMs...
OK, so I've been working and experimenting with my own simple architecture. I call it Strawberry. Here's the repo for...
We moved to self-hosted models specifically to avoid sending customer data to external APIs. Everything was working...
Been playing around with llama.cpp and some 30-80B parameter models with CPU offloading. Currently have one 3090 and 32...
Here we go! As expected by most of us here. Jason Meller from 1Password argues that OpenClaw’s agent “skills” ecosystem...
Hey everyone, last week I shared preliminary results on a new subquadratic attention mechanism. Following up with the...
While it’s great that so many people on LocalLLaMA are pushing the envelope with what can be done locally with...
I installed qwen3-235b on my desktop system, and I had to join here to brag about it. It's such a careful model, the...
If you own a copy of Balatro, you can make your local LLM play it. I built tools to let LLMs play Balatro autonomously....
About 2 weeks ago, I posted about running GLM-4.7-Flash on 16 GB of VRAM here...
Voxtral Mini 4B Realtime 2602 is a multilingual, realtime speech-transcription model and among the first open-source...
It’s already supported in Comfy. MIT license. HuggingFace Demo is also available! Pretty much the whole package - LoRAs...
ACE-Step 1.5 is an open-source music model that can generate a full song in about 2 seconds on an A100, runs locally on...
Qwen3-Coder-Next is out!
For those who used Cline with local models, heads up that the core team appears to have joined OpenAI's Codex group...
The newly released LingBot-World framework offers the first high capability world model that is fully open source,...
Is this the JS framework hell moment of AI?
I haven't seen a system with this format before but with how successful the result was I figured I might as well share...
I tried many MoE models at 30B or under and all of them failed sooner or later in an agentic framework. If z.ai is not...
Disclaimer: I am from Germany and my English is not perfect, so I used an LLM to help me structure and write this post....
This is a sequel to my previous thread from 2024. I originally planned to pick up another pair of MI100s and an...
I’ve been trying to find an AI that’s genuinely unfiltered, technically advanced, and uncensored, something that can...
Hey peeps, I'm in a bit of an "omg, the world is ending" mood and have been amusing myself by downloading and...
DeepSeek AI released a new paper titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large...
Hi all, I’m Anton from Nebius. We’ve updated the SWE-rebench leaderboard with our December runs on 48 fresh GitHub PR...
Thank you guys, thanks to everyone who took the time to write a comment or a post explaining, teaching people how...
Originally this was my gaming rig but I went ITX and basically bought a new computer. So I had the case, fans, AIO, 64...
Just want to sing the praises of this model. I am stunned at how intelligent it is for a 30b model. Comparing it to...
Hey r/LocalLlama! We're excited to show how Unsloth now enables 7x longer context lengths (up to 12x) for Reinforcement...
Nvidia has essentially killed off supply for the RTX 5070 Ti. Also supply of RTX 5060 Ti 16 GB has been significantly...
I ran passages from Project Gutenberg through GPT-4o-mini 10 times over, each time telling it to "make it read far...
Hey everyone, The team at Neuphonic is back with a new open-source release: NeuTTS Nano. After NeuTTS Air trended #1 on...
Hello everyone! Today, I am announcing Soprano 1.1! I’ve designed it for massively improved stability and audio quality...
I’ve seen some arguments we’ve reached AGI, it’s just about putting the separate pieces together in the right context....