Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
(Mindmap) Part-level Scene Understanding for Robots
![](Part-level Scene Understanding for Robots/Pasted_image_20250414142333.png)
A scene graph is a structural representation that captures detailed semantics by explicitly modeling objects, their attributes, and the relations between them.
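The definition above can be sketched as a small data structure; a minimal illustration (all names are my own, not from any of the papers below), with object nodes carrying categories and attributes, and typed relation edges between them:

```python
from dataclasses import dataclass, field

@dataclass
class ObjectNode:
    name: str                                  # unique instance id, e.g. "cup_1"
    category: str                              # semantic class, e.g. "cup"
    attributes: dict = field(default_factory=dict)

@dataclass
class SceneGraph:
    nodes: dict = field(default_factory=dict)  # name -> ObjectNode
    edges: list = field(default_factory=list)  # (subject, relation, object) triples

    def add_object(self, node: ObjectNode):
        self.nodes[node.name] = node

    def relate(self, subj: str, relation: str, obj: str):
        self.edges.append((subj, relation, obj))

    def relations_of(self, name: str):
        # All relation triples that mention this object.
        return [(s, r, o) for (s, r, o) in self.edges if name in (s, o)]

g = SceneGraph()
g.add_object(ObjectNode("cup_1", "cup", {"color": "red"}))
g.add_object(ObjectNode("table_1", "table", {"material": "wood"}))
g.relate("cup_1", "on_top_of", "table_1")
```

Keeping edges as explicit triples makes the graph easy to serialize into text for an LLM prompt, which is how the planning systems below consume it.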
(Roadmap) Deeper Scene Graph For Robots
![](Deeper Scene Graph For Robots/Pasted_image_20250328103811.png)
Robotic planning and execution in open-world environments is a complex problem due to the vast state spaces and high variability of task embodiment.
For example, in household scenarios:
ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning

An LLM is used to judge the spatial relations between objects, and these relations are then used to build the scene graph.
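A hedged sketch of this step, assuming (as an illustration, not the paper's actual pipeline) that nearby object pairs are found by distance pruning and then an LLM is queried per pair; `query_llm` is a stand-in for a real model call:

```python
import itertools

def query_llm(prompt: str) -> str:
    # Stand-in for a real LLM API call; always answers "on top of" in this demo.
    return "on top of"

def build_relation_edges(objects, centers, max_dist=1.0):
    """For each pair of objects closer than max_dist, ask the LLM for their
    spatial relation and record it as a scene-graph edge (subj, rel, obj)."""
    edges = []
    for a, b in itertools.combinations(objects, 2):
        dx = centers[a][0] - centers[b][0]
        dy = centers[a][1] - centers[b][1]
        if (dx * dx + dy * dy) ** 0.5 > max_dist:
            continue  # prune far-apart pairs before spending an LLM query
        prompt = f"What is the spatial relation between {a} and {b}?"
        edges.append((a, query_llm(prompt), b))
    return edges

objects = ["mug", "shelf", "sofa"]
centers = {"mug": (0.0, 0.0), "shelf": (0.2, 0.1), "sofa": (5.0, 5.0)}
edges = build_relation_edges(objects, centers)
# Only the mug/shelf pair is close enough to produce an edge.
```

The distance pruning matters in practice: querying an LLM for every object pair is quadratic, so only plausible neighbors are worth a query.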
SayPlan: Grounding Large Language Models using 3D Scene Graphs for Scalable Robot Task Planning

The main idea is captured in the pseudocode above: by expanding only part of the scene graph (a strict hierarchy), the size of the scene graph fed to the LLM is kept under control.
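This collapse/expand idea can be illustrated with a toy hierarchy (floor → rooms → objects; the node names and the `serialize` helper are my own, not SayPlan's API). Only rooms the LLM has chosen to expand contribute their objects to the serialized graph, so the prompt stays bounded:

```python
# Strict hierarchy: each node lists its children.
HIERARCHY = {
    "floor_1": ["kitchen", "bedroom"],
    "kitchen": ["fridge", "sink"],
    "bedroom": ["bed", "lamp"],
}

def serialize(expanded):
    """Return the node list visible to the LLM: the floor and its rooms are
    always shown, but a room's objects appear only once it has been expanded."""
    visible = ["floor_1"] + HIERARCHY["floor_1"]
    for room in expanded:
        visible += HIERARCHY.get(room, [])
    return visible

expanded = set()
collapsed_view = serialize(expanded)   # floor + rooms only, no objects yet
expanded.add("kitchen")                # the LLM issues expand(kitchen)
expanded_view = serialize(expanded)    # kitchen's objects now visible
```

During search, the planner would also collapse rooms that turned out to be irrelevant, keeping the token cost of the serialized graph roughly constant as the environment grows.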