hypes.news
Papers · 5 days ago

DeVI: Dexterous Video Imitation from Synthetic Videos for Physically Plausible Hand-Object Interaction

DeVI enables physically plausible control of a dexterous agent by imitating text-conditioned synthetic videos, and it outperforms prior methods based on 3D demonstrations in hand-object interaction fidelity. The framework uses a hybrid tracking reward that combines 3D human tracking with robust 2D object tracking to compensate for the imprecise cues in generated videos. It generalizes zero-shot across diverse objects and interaction types, and is validated in multi-object scenes and on diverse text-driven actions.
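To make the idea of a hybrid tracking reward concrete, here is a minimal Python sketch. It is not the paper's formulation: the exponential-of-error form, the pinhole projection, the function names, and the weights `k_human` and `k_object` are all assumptions chosen for illustration. It only shows how a 3D error on the hand joints and a 2D image-space error on the object could be combined into one scalar reward.

```python
import math


def mean_distance(points_a, points_b):
    """Mean Euclidean distance between paired points of equal dimension."""
    if len(points_a) != len(points_b) or not points_a:
        raise ValueError("point lists must be non-empty and the same length")
    total = sum(math.dist(a, b) for a, b in zip(points_a, points_b))
    return total / len(points_a)


def project_pinhole(point_3d, focal=1.0):
    """Project a camera-frame 3D point to 2D (assumed ideal pinhole camera)."""
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal * x / z, focal * y / z)


def hybrid_tracking_reward(sim_joints_3d, ref_joints_3d,
                           sim_object_3d, ref_object_2d,
                           k_human=5.0, k_object=5.0, focal=1.0):
    """Hypothetical hybrid reward: 3D human term times 2D object term.

    sim_joints_3d / ref_joints_3d: simulated and reference hand joints in 3D.
    sim_object_3d: simulated object keypoints in the camera frame.
    ref_object_2d: reference object keypoints tracked in the video (2D).
    """
    # 3D term: how closely the simulated hand follows the reference joints.
    human_err = mean_distance(sim_joints_3d, ref_joints_3d)
    # 2D term: project the simulated object and compare in image space,
    # so no 3D object pose from the generated video is required.
    sim_object_2d = [project_pinhole(p, focal) for p in sim_object_3d]
    object_err = mean_distance(sim_object_2d, ref_object_2d)
    # Assumed combination: product of exponentials, equal to 1.0 when
    # both errors are zero and decaying as either error grows.
    return math.exp(-k_human * human_err) * math.exp(-k_object * object_err)
```

In this sketch the object term never needs a 3D object pose from the video, which is one plausible reading of why a 2D object term would tolerate imprecise generated footage better than a full 3D one.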

Visual Computing Lab
