
Imagine creating a world as easily as writing a message to a friend. You describe it in a few sentences, and an island, an ancient city, or a space station appears in front of you. What's more, you can not only look at it but also walk the streets, feel the atmosphere, and interact with objects. Genie 3 is a new model from Google DeepMind that creates interactive 3D virtual worlds from a text description. Read on for the details.
Unlike image or video generators, which produce a static image or a short clip, Genie 3 is a world model. It doesn’t just render a picture; it simulates the logic of a world: space, movement, interaction with objects, and the consequences of actions. The user provides a text description, and the system builds a dynamic scene that can be navigated in real time. According to DeepMind, the key parameters are 24 fps and 720p resolution, and for now a world can be simulated for several minutes.
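To put those numbers in perspective, here is a small back-of-the-envelope sketch. It only does arithmetic on the published figures; the 1280x720 frame size is an assumption (the usual 16:9 meaning of "720p"), and the function names are made up for illustration, not part of any Genie 3 interface.

```python
# Back-of-the-envelope arithmetic for a real-time world model.
# Assumption: "720p" means a 1280x720 frame (standard 16:9); the source
# only states "720p" and "24 fps".

FPS = 24
WIDTH, HEIGHT = 1280, 720


def frame_budget_ms(fps: int) -> float:
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps


def frames_in_session(fps: int, minutes: float) -> int:
    """Number of frames generated over a session of the given length."""
    return int(fps * 60 * minutes)


def pixels_per_second(fps: int, width: int, height: int) -> int:
    """Raw pixel throughput the model has to sustain."""
    return fps * width * height


if __name__ == "__main__":
    print(f"Per-frame budget: {frame_budget_ms(FPS):.1f} ms")
    print(f"Frames in a 3-minute session: {frames_in_session(FPS, 3)}")
    print(f"Pixels per second: {pixels_per_second(FPS, WIDTH, HEIGHT):,}")
```

In other words, each new frame has to arrive in roughly 42 milliseconds, and a session of a few minutes already means thousands of frames that must stay consistent with one another.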
Until now, world models were more like laboratory demos that quickly fell apart under prolonged interaction. Genie 3 demonstrates a shift from “watching” a simulation to “living inside” it, and it does so stably, with noticeably higher visual quality than its predecessor, Genie 2. It is this tangible interactivity that prompts many to talk about a new stage in the development of agent systems and even a step towards AGI.
How it works in practice
It works like any other generative AI: you write a prompt, such as “storm over the coastal road”, “Japanese rock garden”, or “drone flights in an Icelandic canyon”, and the system generates a scene and responds to keystrokes or other user actions. There are also “promptable world events”: in the middle of a session, you can make it rain, change the lighting, or add an object or a character, and the simulation adjusts on the fly without breaking the continuity of events.
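The interaction pattern described here (a prompt, then a stream of user actions, with world events injected mid-session) can be sketched as a toy loop. This is purely illustrative: Genie 3 has no public API referenced in this article, and `ToyWorld`, `step`, and `inject_event` are hypothetical names invented for this example. The "world" is just a position on a grid plus a set of active conditions.

```python
# A toy illustration of the prompt -> actions -> mid-session events pattern.
# Everything here is hypothetical; it does not model Genie 3 itself.
from dataclasses import dataclass, field


@dataclass
class ToyWorld:
    prompt: str                                    # initial text description
    position: tuple = (0, 0)                       # where the user currently is
    conditions: set = field(default_factory=set)   # active world events
    history: list = field(default_factory=list)    # everything that happened

    # Keyboard keys mapped to movement on a 2D grid.
    MOVES = {"w": (0, 1), "s": (0, -1), "a": (-1, 0), "d": (1, 0)}

    def step(self, key: str) -> dict:
        """Apply one user action and return the new observation."""
        dx, dy = self.MOVES.get(key, (0, 0))  # unknown keys leave us in place
        self.position = (self.position[0] + dx, self.position[1] + dy)
        self.history.append(("move", key))
        return self.observe()

    def inject_event(self, description: str) -> None:
        """Add a world event mid-session without resetting earlier state."""
        self.conditions.add(description)
        self.history.append(("event", description))

    def observe(self) -> dict:
        return {
            "prompt": self.prompt,
            "position": self.position,
            "conditions": sorted(self.conditions),
        }


if __name__ == "__main__":
    world = ToyWorld("storm over the coastal road")
    world.step("w")
    world.step("w")
    world.inject_event("rain")   # the session continues; nothing is reset
    print(world.step("d"))
```

The point of the sketch is the last part: the event changes the state of the world, but the position and the history built up before it are preserved.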
Why you need it outside of games
Yes, this will obviously help game development a lot: quick prototyping of levels, testing mechanics without building out a content pipeline, and usability tests of navigation and cameras. But the spectrum is wider than that. Education will get living laboratories where students can interact with phenomena ranging from waves on the water to lava flows, without the risks or costs. Robotics will get a practically unlimited simulator in which agents and robots can learn under a variety of conditions before entering the physical world.
AI in the mind of another AI
The fun begins when agents, not people, enter these worlds. DeepMind is already showing experiments with its own SIMA agent: it receives goals (reach a certain point, collect objects) and interacts with Genie 3 as a full-fledged environment. It is literally “one AI playing in the imaginary world of another”, a perfect sandbox for learning.
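The general idea of an agent pursuing a goal inside an environment can be shown with a deliberately tiny example. This is a sketch under heavy assumptions: `ToyGridEnv`, `greedy_policy`, and `run_episode` are hypothetical names, the environment is a small grid rather than a generated world, and the hand-written policy stands in for a learned agent. It is not how SIMA or Genie 3 are implemented.

```python
# A toy agent-environment loop: the agent is given a goal and acts until
# it reaches it. Hypothetical stand-in, not SIMA or Genie 3.

ACTIONS = {"right": (1, 0), "left": (-1, 0), "up": (0, 1), "down": (0, -1)}


class ToyGridEnv:
    def __init__(self, size: int, goal: tuple):
        self.size = size
        self.goal = goal
        self.agent = (0, 0)

    def reset(self) -> tuple:
        """Put the agent back at the origin and return its position."""
        self.agent = (0, 0)
        return self.agent

    def step(self, action: str):
        """Move the agent, clamped to the grid; report whether the goal is reached."""
        dx, dy = ACTIONS[action]
        x = min(max(self.agent[0] + dx, 0), self.size - 1)
        y = min(max(self.agent[1] + dy, 0), self.size - 1)
        self.agent = (x, y)
        return self.agent, self.agent == self.goal


def greedy_policy(position: tuple, goal: tuple) -> str:
    """Close the horizontal gap first, then the vertical one."""
    if position[0] < goal[0]:
        return "right"
    if position[0] > goal[0]:
        return "left"
    if position[1] < goal[1]:
        return "up"
    return "down"


def run_episode(env: ToyGridEnv, max_steps: int = 50):
    """Return the number of steps needed to reach the goal, or None."""
    position = env.reset()
    for steps in range(1, max_steps + 1):
        position, done = env.step(greedy_policy(position, env.goal))
        if done:
            return steps
    return None


if __name__ == "__main__":
    env = ToyGridEnv(size=5, goal=(3, 2))
    print("Steps to goal:", run_episode(env))
```

In the real setup, the grid would be replaced by a generated world and the hand-written policy by a learned agent, but the loop of observing, acting, and checking the goal has the same shape.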
Compared to previous models, this one keeps the scene coherent for longer, behaves more naturally with water and light materials, and responds more plausibly to actions: transitions between objects, obstacles, and weather changes do not break the world. For the user, this means that experiments, tests, and demos stop being 30-second clips and become full-fledged short sessions.
But there are still limits…
The developers openly admit the limitations: generated text is not always readable; accurately reproducing real-world locations is beyond the model's current capabilities; and interaction lasts minutes, not hours. The range of actions an agent can perform is also limited, and some events have to be prompted. These caveats matter: the model is fine for prototypes, but it still needs work before it can be used in real products. Future versions of Genie are expected to address these issues.
For now, Genie 3 is open to a limited number of researchers and creators as a research preview, so the team can collect feedback and work out safety protocols. Access is expected to be extended gradually, once the team is confident that the remaining issues have been resolved.
Real-time world generation is not just another feature of generative AI, but an entirely new language for interacting with digital environments. Genie 3 shows that this language is already within arm’s reach.