Scaling Robot Learning
Quan Vuong
Quan Vuong is a researcher on the robotic manipulation team at Google DeepMind. He obtained his Ph.D. from the University of California San Diego, advised by Dr. Henrik Christensen and Dr. Hao Su. He recently led the robotics-transformer-x project, a collaboration with 176 researchers across 34 different research labs to build large-scale datasets and generalist foundation models for robotics. His work was recently featured in The New York Times, TechCrunch, and various other news outlets.
In this talk, I will discuss the historical development of the robotic manipulation team at Google and recent efforts to scale robot learning at Google DeepMind. I will begin by describing our model scaling effort, starting with robotics-transformer-1, a scalable architecture for learning low-level robotic actions. I will then introduce robotics-transformer-2, our next-generation model built on top of vision-language models, demonstrating the transfer of web knowledge directly to low-level control. Afterwards, I will present our data scaling effort, robotics-transformer-x, a foundation model for low-level robotic control that exhibits positive transfer across datasets from different robotic embodiments. robotics-transformer-x is a collaboration with 34 research labs across 22 different institutions around the world. To end the talk, I will discuss our efforts to go beyond high-quality demonstrations, with q-transformer for leveraging suboptimal data, and to go beyond language as a conditioning modality, using trajectories and sketches to convey the intended tasks.