Introducing Project Astra. We created a demo in which a tester interacts with a prototype AI agent supported by our multimodal foundation model, Gemini.
The demo consists of two continuous takes: one with the prototype running on a Google Pixel phone and the other with it running on a prototype glasses device.
The agent takes in a constant stream of audio and video input. It can reason about its environment in real time and interact with the tester in a conversation about what it is seeing.