|
Leveraging Locality to Boost Sample Efficiency in Robotic Manipulation
Tong Zhang, Yingdong Hu, Jiacheng You, Yang Gao
CoRL, 2024
project page / arXiv / code / X summary
We introduce SGRv2, an imitation learning framework that enhances sample efficiency through improved visual and action representations. Central to the design of SGRv2 is a critical inductive bias, action locality, which posits that a robot's actions are predominantly influenced by the target object and its interactions with the local environment.
|
|
Reinforcement Learning with Foundation Priors: Let the Embodied Agent Efficiently Learn on Its Own
Weirui Ye, Yunsheng Zhang, Haoyang Weng, Xianfan Gu, Shengjie Wang, Tong Zhang, Mengchen Wang, Pieter Abbeel, Yang Gao
CoRL, 2024 (Oral Presentation)
project page / arXiv / code
We propose Reinforcement Learning with Foundation Priors (RLFP) to utilize guidance and feedback from policy, value, and success-reward foundation models. Within this framework, we introduce the Foundation-guided Actor-Critic (FAC) algorithm, which enables embodied agents to explore more efficiently with automatic reward functions.
|
|
General Flow as Foundation Affordance for Scalable Robot Learning
Chengbo Yuan, Chuan Wen, Tong Zhang, Yang Gao
CoRL, 2024
project page / arXiv / code
We build a 3D flow prediction model directly from large-scale RGBD human video datasets. Based on this model, we achieve stable zero-shot human-to-robot skill transfer in the real world.
|
|
Look Before You Leap: Unveiling the Power of GPT-4V in Robotic Vision-Language Planning
Yingdong Hu*, Fanqi Lin*, Tong Zhang, Li Yi, Yang Gao
ICRA Workshop on Vision-Language Models for Navigation and Manipulation, 2024
project page / arXiv
We introduce ViLa, a novel approach for long-horizon robotic planning that leverages GPT-4V to generate a sequence of actionable steps. ViLa empowers robots to execute complex tasks with a profound understanding of the visual world.
|
|
A Universal Semantic-Geometric Representation for Robotic Manipulation
Tong Zhang*, Yingdong Hu*, Hanchen Cui, Hang Zhao, Yang Gao
CoRL, 2023
CVPR Workshop on 3D Vision and Robotics, 2023
project page / arXiv / code
We present Semantic-Geometric Representation (SGR), a universal perception module for robotics that leverages the rich semantic information of large-scale pre-trained 2D models and inherits the merits of 3D spatial reasoning.
|