Posted by u/obxsurfer06
LingBot-World achieves the "Holy Grail" of video generation: Emergent Object Permanence without a 3D engine
The newly open-sourced LingBot-World report describes a breakthrough capability: the model effectively builds an implicit map of the world rather than just hallucinating pixels based on probability. This emergent understanding lets it reason about spatial logic and unobserved states purely through next-frame prediction.

The "Stonehenge Test" demonstrates this well. You can observe a complex landmark, turn the camera away for a full 60 seconds, and when you return, the structure is still intact with its original geometry preserved.

It even simulates unseen dynamics. If a vehicle drives out of the frame, the model keeps tracking its trajectory off-screen. When you pan the camera back, the car appears where its earlier motion says it should be, rather than vanishing or freezing in place. This signals a fundamental shift from models that merely dream visuals to those that actually simulate physical laws.
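For anyone who wants to sanity-check the off-screen car claim on their own clips, here is a rough sketch of what "the right location" could mean. It is not from the report: it assumes the car moves at constant velocity in a flat 2D ground plane, and the function names `expected_position` and `position_error` are made up for illustration.

```python
# Hypothetical check, not part of LingBot-World: compare where a car reappears
# against a simple constant-velocity extrapolation of its last observed motion.

def expected_position(last_pos, velocity, seconds_off_screen):
    """Extrapolate a 2D position, assuming constant velocity while unobserved."""
    x, y = last_pos
    vx, vy = velocity
    return (x + vx * seconds_off_screen, y + vy * seconds_off_screen)


def position_error(expected, observed):
    """Euclidean distance between the extrapolated and the observed position."""
    dx = observed[0] - expected[0]
    dy = observed[1] - expected[1]
    return (dx * dx + dy * dy) ** 0.5


# Car last seen at the origin moving 2 units/s along x, off-screen for 5 s.
predicted = expected_position((0.0, 0.0), (2.0, 0.0), 5.0)
# Suppose the car reappears at (13, 4) once the camera pans back.
error = position_error(predicted, (13.0, 4.0))
print(predicted, error)
```

A small error would be consistent with the model tracking the car while it was out of view; a car that froze in place would show an error that grows with the time spent off-screen. Real footage would also need camera pose and scale estimates, which this sketch ignores.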
External link:
https://v.redd.it/71403i6q8agg1