Semantic-SAM Repository Application

(UVtransE) Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation

PyTorch Einsum

PyTorch Einsum

NSFC

Momentum Contrast for Unsupervised Visual Representation Learning

Vision Transformers Need Registers

AN IMAGE IS WORTH 16X16 WORDS- TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE

DINO