
ConceptGraphs= Open-Vocabulary 3D Scene Graphs for Perception and Planning
SayPlan= Grounding Large Language Models using 3D Scene Graphs for Scalable Robot Task Planning
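The main idea is captured in the pseudocode above: by expanding only part of the scene graph (which has a strict hierarchical structure), SayPlan limits the size of the scene graph that is fed to the LLM.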
A scalable approach to ground LLM-based task planners across environments spanning multiple rooms and floors
The scene graph is represented with networkx (a Python package).
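A minimal sketch of the partial-expansion idea, assuming a toy floor -> room -> object hierarchy stored in a networkx `DiGraph`. The helper names (`build_scene_graph`, `collapse`, `expand`, `contract`, `llm_view`) and the example nodes are hypothetical and chosen only for illustration; this is not SayPlan's actual code.

```python
import networkx as nx


def build_scene_graph() -> nx.DiGraph:
    """Build a toy, strictly hierarchical scene graph: floor -> room -> object."""
    g = nx.DiGraph()
    g.add_node("floor_1", level="floor")
    rooms = {"kitchen": ["fridge", "mug"], "office": ["desk", "laptop"]}
    for room, objects in rooms.items():
        g.add_node(room, level="room")
        g.add_edge("floor_1", room)
        for obj in objects:
            g.add_node(obj, level="object")
            g.add_edge(room, obj)
    return g


def collapse(g: nx.DiGraph, root: str) -> set:
    """Initial view: only the root and its direct children are visible."""
    return {root, *g.successors(root)}


def expand(g: nx.DiGraph, visible: set, node: str) -> set:
    """Reveal the direct children of `node`."""
    return visible | set(g.successors(node))


def contract(g: nx.DiGraph, visible: set, node: str) -> set:
    """Hide everything below `node` again."""
    return visible - nx.descendants(g, node)


def llm_view(g: nx.DiGraph, visible: set) -> nx.Graph:
    """Subgraph that would be serialized and passed to the LLM."""
    return g.subgraph(visible)


g = build_scene_graph()
visible = collapse(g, "floor_1")          # floor and rooms only
visible = expand(g, visible, "kitchen")   # objects in the kitchen become visible
print(sorted(llm_view(g, visible).nodes))
```

Only the expanded rooms contribute object nodes, so the text given to the LLM grows with the part of the graph being explored rather than with the whole environment.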
Clio= Real-time Task-Driven Open-Set 3D Scene Graphs
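Contributions:
Proposes that different tasks require semantic information at different granularities. The paper achieves this by combining SAM and [[CLIP多模态预训练模型]], but it ignores predicate relations and parent-child relations between objects. In essence, it can still only perform basic operations: navigate, pick up, and put down.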
Factorizable Net= An Efficient Subgraph-based Framework for Scene Graph Generation
My idea is to first run panoptic segmentation on the scene and then perform hierarchical part relation detection on each object, which reaches a similar result by a different route.