Part-level Scene Reconstruction Affords Robot Interaction
DETR

Semantic-SAM

MaskDINO

ALBEF

ViLT

ZegCLIP

BLIP

GLIP

Extract Free Dense Labels from CLIP
