ALBEF

GLIP

Grounding-DINO

Momentum Contrast for Unsupervised Visual Representation Learning

CLIP
