Vision Transformers Need Registers

DINO

CLIP

LERF: Language Embedded Radiance Fields

Some Thoughts Regarding "Reconstruct Anything"

CLIP-Fields: Weakly Supervised Semantic Fields for Robotic Memory