My GPU-poor comrades, GLM 4.7 Flash is your local agent


I tried many MoE models at 30B or under and all of them failed sooner or later in an agentic framework. Unless z.ai is redirecting my requests to another model, GLM 4.7 Flash is finally the reliable (and soon local) agent that I desperately wanted. I've been running it on opencode for more than half an hour now and it has produced hundreds of thousands of tokens in one session (with context compacting, obviously) without any tool calling errors. It clones GitHub repos, runs all kinds of commands, edits files, commits changes, all perfect, not a single error yet. Can't wait for the GGUFs to try this locally.
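
For anyone wondering what the setup looks like before the GGUFs land: opencode just talks to an OpenAI-compatible endpoint, so the tool calling it relies on boils down to a loop like the sketch below. The base URL, model id, and the `run_shell` tool are my assumptions for illustration, not taken from z.ai's docs, so check those before copying anything.

```python
import json
import subprocess
from openai import OpenAI

# Assumed endpoint and key; replace with the values from your z.ai account.
client = OpenAI(
    base_url="https://api.z.ai/api/paas/v4",  # assumption: OpenAI-compatible base URL
    api_key="YOUR_ZAI_API_KEY",
)

# A single illustrative tool: run a shell command and return its output.
tools = [{
    "type": "function",
    "function": {
        "name": "run_shell",
        "description": "Run a shell command and return stdout/stderr.",
        "parameters": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
}]

messages = [{"role": "user",
             "content": "Clone https://github.com/octocat/Hello-World and list its files."}]

for _ in range(5):  # cap the agent loop so it can't run away
    resp = client.chat.completions.create(
        model="glm-4.7-flash",  # assumed model id
        messages=messages,
        tools=tools,
    )
    msg = resp.choices[0].message
    messages.append(msg)

    if not msg.tool_calls:          # model answered in plain text: done
        print(msg.content)
        break

    for call in msg.tool_calls:     # execute each requested tool call
        args = json.loads(call.function.arguments)
        out = subprocess.run(args["command"], shell=True,
                             capture_output=True, text=True)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": out.stdout + out.stderr,
        })
```

The "no tool calling errors" part is exactly this: every `call.function.arguments` the model emits has to be valid JSON matching the declared schema, every single time, or the loop above falls over. That's where the other small MoE models kept failing for me.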
