Posted by u/Technical-Love-8479
DeepSeek Engram : A static memory unit for LLMs
DeepSeek AI released a new paper titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models", introducing **Engram**. The key idea: instead of recomputing static knowledge (entities, facts, common patterns) every time through expensive transformer layers, Engram **adds a native memory lookup**. Think of it as separating **remembering from reasoning**: traditional MoE does conditional computation, Engram adds **conditional memory**. Together they let LLMs reason deeper, handle long contexts better, and offload early-layer compute from GPUs. There's a rough sketch of the lookup idea below the links.

**Key highlights:**

* Knowledge is **looked up in O(1)** instead of recomputed.
* Uses **explicit parametric memory**, not just implicit weights.
* Improves reasoning, math, and code performance.
* Enables massive memory scaling **without GPU limits**.
* Frees attention for **global reasoning** rather than static knowledge.

Paper: https://github.com/deepseek-ai/Engram/blob/main/Engram_paper.pdf

Video explanation: https://youtu.be/btDV86sButg?si=fvSpHgfQpagkwiub
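For intuition, here's a minimal PyTorch sketch of the "conditional memory" idea as I read it, not the paper's actual architecture: a hashed token n-gram keys into a large embedding table (the O(1) lookup), and a learned gate mixes the retrieved vector into the hidden state so the transformer layers don't have to recompute that knowledge. The class name, the hash, and the gating scheme are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class EngramStyleMemory(nn.Module):
    """Sketch of conditional memory: O(1) lookup of static knowledge,
    gated into the hidden state instead of recomputed by early layers."""

    def __init__(self, d_model: int, num_slots: int = 65_536, prime: int = 1_000_003):
        super().__init__()
        self.num_slots = num_slots
        self.prime = prime
        # Explicit parametric memory: a big, cheaply indexable table.
        # In a real system this could be far larger and live off-GPU.
        self.memory = nn.Embedding(num_slots, d_model)
        self.gate = nn.Linear(d_model, 1)

    def _hash_bigrams(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Pair each token with its predecessor, then hash the pair into
        # a fixed number of slots: constant-time lookup per position.
        prev = torch.roll(token_ids, shifts=1, dims=-1)
        prev[..., 0] = 0  # first position has no predecessor
        return (token_ids * self.prime + prev) % self.num_slots

    def forward(self, token_ids: torch.Tensor, hidden: torch.Tensor) -> torch.Tensor:
        # Retrieve static knowledge by index; no extra matmuls through layers.
        retrieved = self.memory(self._hash_bigrams(token_ids))
        # Learned gate decides, per position, how much memory to mix in.
        g = torch.sigmoid(self.gate(hidden))
        return hidden + g * retrieved

# Usage: augment hidden states at an early layer with looked-up memory.
mem = EngramStyleMemory(d_model=64)
ids = torch.randint(0, 32_000, (2, 16))   # (batch, seq) token ids
h = torch.randn(2, 16, 64)                # hidden states
out = mem(ids, h)                         # same shape, memory-augmented
print(out.shape)                          # torch.Size([2, 16, 64])
```

The point of the sketch: the table lookup is an index operation, so its cost doesn't grow with model depth, and the table's size is decoupled from GPU compute, which is where the "remembering vs reasoning" split comes from.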