Posted 2025-02-16 | Updated 2026-04-16 | Review | a minute read (About 216 words) | Grounding-DINO | #CV Object-Detection Research-paper Transformer MultiModal Contrastive-Learning Open-Vocabulary DINO Image-Grounding
Posted 2025-02-16 | Updated 2026-04-16 | Review | a few seconds read (About 17 words) | Grounded-SAM | https://github.com/IDEA-Research/Grounded-Segment-Anything | Combines [[Grounding-DINO]] + SAM to achieve open-vocabulary detection and segmentation | #CV Object-Detection Research-paper Semantic Open-Vocabulary Segmentation
Posted 2025-01-09 | Updated 2026-04-16 | Note | 5 minutes read (About 722 words) | Momentum Contrast for Unsupervised Visual Representation Learning | Pseudocode: | #CV Research-paper Representation-Learning Contrastive-Learning
Posted 2025-01-09 | Updated 2026-04-16 | Note | a few seconds read (About 0 words) | Vision Transformers Need Registers | #CV Research-paper Transformer ViT
Posted 2025-01-09 | Updated 2026-04-16 | Note | a few seconds read (About 0 words) | DINOv2- Learning Robust Visual Features without Supervision | #CV Research-paper Transformer Representation-Learning DINO ViT
Posted 2025-01-09 | Updated 2026-04-16 | Note | a few seconds read (About 71 words) | AN IMAGE IS WORTH 16X16 WORDS- TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE | https://www.youtube.com/watch?v=j3VNqtJUoz0&t=16s | #CV Research-paper Transformer ViT
Posted 2025-01-08 | Updated 2026-04-16 | Note | 4 minutes read (About 561 words) | DINO | https://github.com/facebookresearch/dino/tree/main | #CV Research-paper Transformer Representation-Learning DINO ViT
Posted 2025-01-06 | Updated 2026-04-16 | Note | a minute read (About 197 words) | CLIP | https://blog.csdn.net/h661975/article/details/135116957 | #CV Research-paper Image2Text MultiModal CLIP Contrastive-Learning VLP Image-Text
Posted 2025-01-06 | Updated 2026-04-16 | Note | 5 minutes read (About 790 words) | LERF- Language Embedded Radiance Fields | NeRF+CLIP | #CV Research-paper LLM Reconstruct Semantic CLIP 3D-Scene Embodied-AI
Posted 2025-01-06 | Updated 2026-04-16 | Note | a few seconds read (About 3 words) | Simple Open-Vocabulary Object Detection with Vision Transformers | #CV Object-Detection Research-paper Transformer Open-Vocabulary ViT