

Gemini Robotics: Bringing AI to the physical world

Mar 12, 2025

Our new Gemini Robotics model brings Gemini 2.0 to the physical world. It's our most advanced vision-language-action model, enabling robots that are interactive, dexterous, and general. Learn more about how we're enabling the next generation of robotic AI agents at deepmind.google/robotics

"Gemini Robotics brings AI into the physical world"

by Carolina Parada
March 12, 2025
 

Introducing EmbeddingGemma: The Best-in-Class Open Model for On-Device Embeddings

Sep 4, 2025

Discover EmbeddingGemma, a state-of-the-art 308 million parameter text embedding model designed to power generative AI experiences directly on your hardware. Ideal for mobile-first AI, EmbeddingGemma brings powerful capabilities to your applications, enabling features like semantic search, information retrieval, and custom classification, all while running efficiently on-device. In this video, Alice Lisak and Lucas Gonzalez from the Gemma team introduce EmbeddingGemma and explain how it works. Learn how you can run this model on less than 200 MB of RAM with quantization, customize its output dimensions with Matryoshka Representation Learning (MRL), and build powerful offline AI features.

"Introducing EmbeddingGemma: The Best-in-Class Open Model for On-Device Embeddings"

by Min Choi, Sahil Dua, Alice Lisak
September 4, 2025
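
As a concrete illustration of the MRL-based dimension truncation mentioned above, here is a minimal sketch using the sentence-transformers library. The Hugging Face model ID google/embeddinggemma-300m and the 256-dimension truncation value are assumptions for illustration, not details taken from the post.

```python
from sentence_transformers import SentenceTransformer

# truncate_dim exploits the Matryoshka property: the leading dimensions of
# the embedding carry the most information, so a 256-d slice of the full
# vector stays useful for retrieval while cutting memory and storage.
# Model ID and dimension are assumptions for illustration.
model = SentenceTransformer("google/embeddinggemma-300m", truncate_dim=256)

query = "Which planets have rings?"
docs = [
    "Saturn is famous for its prominent ring system.",
    "Mercury has no rings and no moons.",
]

# encode() returns one vector per input; similarity() scores query vs. docs.
query_emb = model.encode([query])
doc_embs = model.encode(docs)
print(model.similarity(query_emb, doc_embs))  # higher score = more relevant
```

Truncating at encode time, rather than after the fact, keeps the on-device memory footprint small throughout the pipeline, which is the point of running the model locally.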
 

Gemini Robotics 1.5: Enabling robots to plan, think and use tools to solve complex tasks

Sep 25, 2025

We’re powering an era of physical agents with Gemini Robotics 1.5 — enabling robots to perceive, plan, think, use tools and act to better solve complex, multi-step tasks.

🤖 Gemini Robotics 1.5 is our most capable vision-language-action (VLA) model that turns visual information and instructions into motor commands for a robot to perform a task. This model thinks before taking action and shows its process, helping robots assess and complete complex tasks more transparently. It also learns across embodiments, accelerating skill learning.

🤖 Gemini Robotics-ER 1.5 is our most capable vision-language model (VLM) that reasons about the physical world, natively calls digital tools and creates detailed, multi-step plans to complete a mission. This model now achieves state-of-the-art performance across spatial understanding benchmarks.

We’re making Gemini Robotics-ER 1.5 available to developers via the Gemini API in Google AI Studio and Gemini Robotics 1.5 to select partners.
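
Since Gemini Robotics-ER 1.5 is reachable through the Gemini API, a minimal sketch of such a call with the google-genai Python SDK might look like the following. The model ID gemini-robotics-er-1.5-preview, the image file name, and the prompt are assumptions for illustration, not confirmed details from the post.

```python
from google import genai
from google.genai import types

# Client authenticated with an API key from Google AI Studio.
client = genai.Client(api_key="YOUR_API_KEY")

# Send an image of a scene plus a planning prompt, asking the model to
# reason about the physical world and produce a multi-step plan.
with open("workbench.jpg", "rb") as f:  # hypothetical scene image
    image_bytes = f.read()

response = client.models.generate_content(
    # Model ID is an assumption for illustration.
    model="gemini-robotics-er-1.5-preview",
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
        "List the objects on the table, then give a step-by-step plan "
        "to clear it.",
    ],
)
print(response.text)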
 