Grounded-SAM

Scene-LLM

3D-LLM

PointLLM

ProgPrompt

Momentum Contrast for Unsupervised Visual Representation Learning

Vision Transformers Need Registers

DINOv2: Learning Robust Visual Features without Supervision

An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale

DINO
