Posted 2025-04-16 | Updated 2025-05-06 | Note | a few seconds read (About 3 words)
Vision-Language Interpreter for Robot Task Planning
Posted 2025-04-15 | Updated 2025-05-06 | Review | a few seconds read (About 30 words)
Pixtral 12B
Web: https://mistral.ai/news/pixtral-12b
Demo: https://chat.mistral.ai/chat
Finetune: https://github.com/2U1/Pixtral-Finetune
Model: https://huggingface.co/mistralai/Pixtral-12B-2409
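Posted 2025-03-19 | Updated 2025-05-06 | Review | a few seconds read (About 42 words)
ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning
Uses an LLM to judge the positional relations between objects and builds the scene graph from them. It can still only judge object-level spatial relations, so it cannot support part-level manipulation.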
Posted 2025-03-13 | Updated 2025-05-06 | Review | a few seconds read (About 6 words)
From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models