“Pick up A (from B) and drop it on/in C”, where A is an object and B and C are places in a real-world environment such as a home.
Open-home, open-vocabulary object navigation
This module handles spatial reconstruction, coarse object localization, and robot navigation. Methods used:
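CLIP-Fields [[CLIP-Fields- Weakly Supervised Semantic Fields for Robotic Memory]]: an RGB-D video of the home -> a sequence of posed (with camera pose and position) RGB-D images, used to reconstruct the environment. The study also uses this data to capture the floor surface next to objects and receptacles in the environment.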
VoxelMap: similar to the object-centric memory of CLIP-Fields [[CLIP-Fields- Weakly Supervised Semantic Fields for Robotic Memory]]. It is built from a CLIP semantic vector for each point in the point cloud; each 5 cm voxel stores the detector-confidence weighted average of the CLIP embeddings that fall inside it.
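Querying the memory module: the language query is first converted into a CLIP semantic vector. The voxel whose CLIP embedding in the VoxelMap is semantically closest to that vector is then selected, which localizes the object.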
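The VoxelMap construction and query steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the embeddings are toy 3-D vectors standing in for real CLIP features, cosine similarity is assumed as the closeness measure, and the names `build_voxel_map` and `query_voxel_map` are hypothetical.

```python
import numpy as np

VOXEL_SIZE = 0.05  # 5 cm voxels, as described in the notes


def build_voxel_map(points, embeddings, confidences, voxel_size=VOXEL_SIZE):
    """Map each voxel index to the confidence-weighted mean of its embeddings."""
    sums, weights = {}, {}
    keys = np.floor(points / voxel_size).astype(int)  # integer voxel indices
    for key, emb, conf in zip(map(tuple, keys), embeddings, confidences):
        sums[key] = sums.get(key, 0.0) + conf * emb
        weights[key] = weights.get(key, 0.0) + conf
    return {k: sums[k] / weights[k] for k in sums}


def query_voxel_map(voxel_map, query_vec, voxel_size=VOXEL_SIZE):
    """Return the centre (in metres) of the voxel most similar to the query."""
    keys = list(voxel_map)
    feats = np.stack([voxel_map[k] for k in keys])
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    q = query_vec / np.linalg.norm(query_vec)
    best = int(np.argmax(feats @ q))  # highest cosine similarity (assumed metric)
    return (np.array(keys[best]) + 0.5) * voxel_size


# Toy data: three points, two of which fall into the same voxel.
points = np.array([[0.01, 0.01, 0.01], [0.02, 0.03, 0.04], [1.02, 0.52, 0.01]])
embeddings = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
confidences = np.array([0.9, 0.1, 0.8])  # hypothetical detector confidences

voxel_map = build_voxel_map(points, embeddings, confidences)
# In a real system the query vector would come from a CLIP text encoder.
target = query_voxel_map(voxel_map, np.array([0.0, 0.0, 1.0]))
print(target)
```

In this sketch the returned voxel centre would serve as the coarse object location that navigation aims for.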
Experiment
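Imitation learning (IL) is an alternative to traditional hand-programming for giving robots autonomous capabilities. IL lets a machine learn the desired behavior from demonstrations (a human expert demonstrating the behavior), which removes the need for explicit programming or task-specific reward functions. IL has two main categories: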
Behavior cloning (BC)
Inverse reinforcement learning (IRL)
Behavior Cloning
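BC is an IL technique that treats learning a behavior as a supervised learning task. A model is trained to replicate the expert's behavior by learning a mapping from environment states to the corresponding expert actions. The expert's behavior is recorded as a set of state-action pairs, also called demonstrations. During training, the model uses these demonstrations to learn a function that maps the current state to the corresponding expert action. Once trained, the model uses this learned function to generate actions for the new states it encounters.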
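BC requires no knowledge of the environment's underlying dynamics, is computationally efficient, and is a relatively simple method.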
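As a rough illustration of BC as supervised learning, the sketch below fits a linear policy to state-action pairs with least squares. The linear expert controller `K_expert` and the 2-D state are assumptions made up for this toy example; a real setup would typically use recorded demonstrations and a neural network policy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical expert: a linear controller a = K s (an assumption for this toy).
K_expert = np.array([[-1.0, -0.5]])

# Demonstrations: state-action pairs recorded from the expert.
states = rng.normal(size=(200, 2))
actions = states @ K_expert.T

# Behavior cloning: plain supervised regression from states to actions.
W, *_ = np.linalg.lstsq(states, actions, rcond=None)


def policy(state):
    """Learned policy: maps a state to the predicted expert action."""
    return state @ W


new_state = np.array([0.3, -0.2])
print(policy(new_state))
```

Note that nothing here models the environment dynamics; the policy is learned from the demonstrations alone.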
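The covariate shift problem: the state distribution observed at test time may differ from the one observed during training, so the agent is prone to errors when it encounters unseen states and has no clear guidance on how to act. The problem with BC's supervised approach is that when the agent drifts into out-of-distribution states, it does not know how to return to the demonstrated states.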
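A toy sketch of why out-of-distribution states hurt a cloned policy: a linear policy is fit to a nonlinear expert on a narrow range of states and then evaluated both inside and outside that range. The expert `sin(s)` and both state ranges are hypothetical choices for illustration only, and the example measures one-step prediction error rather than simulating the drift over a full rollout.

```python
import numpy as np


def expert(s):
    """Hypothetical nonlinear expert policy (an assumption for this toy)."""
    return np.sin(s)


# Demonstrations only cover states in [-1, 1].
train_states = np.linspace(-1.0, 1.0, 101)
X = np.stack([train_states, np.ones_like(train_states)], axis=1)
coef, *_ = np.linalg.lstsq(X, expert(train_states), rcond=None)


def policy(s):
    """Cloned linear policy: slope and bias fitted on the demonstrations."""
    return coef[0] * s + coef[1]


def mean_abs_error(states):
    """Average gap between the cloned policy and the expert on given states."""
    return float(np.mean(np.abs(policy(states) - expert(states))))


in_dist_error = mean_abs_error(np.linspace(-1.0, 1.0, 50))   # seen range
out_dist_error = mean_abs_error(np.linspace(2.0, 3.0, 50))   # unseen range
print(in_dist_error, out_dist_error)
```

The cloned policy tracks the expert closely on the demonstrated range, but its error grows sharply on states the demonstrations never covered.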
The agent strives to deceive the discriminator by generating trajectories closely resembling those of the expert.
Imitation From Observation
Learning uses only image sequences; no joint-level action data is required.
Unlike traditional methods, IfO presents a more organic approach to learning from experts, mirroring how humans and animals approach imitation. Humans often learn new behaviors by observing others without detailed knowledge of their actions (e.g., the muscle commands). People learn a diverse range of tasks, from weaving to swimming to playing games, by watching online videos. Despite differences in body shapes, sensory inputs, and timing, humans exhibit an impressive ability to apply knowledge gained from online demonstrations.