Posted by u/Technical-Love-8479
DeepSeek Engram : A static memory unit for LLMs
DeepSeek AI released a new paper titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models", introducing **Engram**. The key idea: instead of recomputing static knowledge (entities, facts, common patterns) every time through expensive transformer layers, Engram **adds a native memory lookup**. Think of it as separating **remembering from reasoning**: traditional MoE does conditional computation, Engram adds **conditional memory**. Together they let LLMs reason deeper, handle long contexts better, and offload early-layer compute from GPUs. There's a rough sketch of the lookup idea below the links.

**Key highlights:**

* Knowledge is **looked up in O(1)** instead of recomputed.
* Uses **explicit parametric memory**, not just implicit weights.
* Improves reasoning, math, and code performance.
* Enables massive memory scaling **without GPU limits**.
* Frees attention for **global reasoning** rather than static knowledge.

Paper: https://github.com/deepseek-ai/Engram/blob/main/Engram_paper.pdf

Video explanation: https://youtu.be/btDV86sButg?si=fvSpHgfQpagkwiub
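For intuition, here's a minimal PyTorch sketch of the "conditional memory" idea as I read it, not the paper's actual architecture: a hashed token n-gram keys into a large embedding table (the O(1) lookup), and a learned gate mixes the retrieved vector into the hidden state so the transformer layers don't have to recompute that knowledge. The class name, the hash, and the gating scheme are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class EngramStyleMemory(nn.Module):
    """Sketch of conditional memory: O(1) lookup of static knowledge,
    gated into the hidden state instead of recomputed by early layers."""

    def __init__(self, d_model: int, num_slots: int = 65_536, prime: int = 1_000_003):
        super().__init__()
        self.num_slots = num_slots
        self.prime = prime
        # Explicit parametric memory: a big, cheaply indexable table.
        # In a real system this could be far larger and live off-GPU.
        self.memory = nn.Embedding(num_slots, d_model)
        self.gate = nn.Linear(d_model, 1)

    def _hash_bigrams(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Pair each token with its predecessor, then hash the pair into
        # a fixed number of slots: constant-time lookup per position.
        prev = torch.roll(token_ids, shifts=1, dims=-1)
        prev[..., 0] = 0  # first position has no predecessor
        return (token_ids * self.prime + prev) % self.num_slots

    def forward(self, token_ids: torch.Tensor, hidden: torch.Tensor) -> torch.Tensor:
        # Retrieve static knowledge by index; no extra matmuls through layers.
        retrieved = self.memory(self._hash_bigrams(token_ids))
        # Learned gate decides, per position, how much memory to mix in.
        g = torch.sigmoid(self.gate(hidden))
        return hidden + g * retrieved

# Usage: augment hidden states at an early layer with looked-up memory.
mem = EngramStyleMemory(d_model=64)
ids = torch.randint(0, 32_000, (2, 16))   # (batch, seq) token ids
h = torch.randn(2, 16, 64)                # hidden states
out = mem(ids, h)                         # same shape, memory-augmented
print(out.shape)                          # torch.Size([2, 16, 64])
```

The point of the sketch: the table lookup is an index operation, so its cost doesn't grow with model depth, and the table's size is decoupled from GPU compute, which is where the "remembering vs reasoning" split comes from.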