Posted 2025-03-11 | Updated 2026-02-15 | Review | a minute read (About 161 words) | DETR | DETR is an object detection model that uses a transformer as its basic architecture. | #Research-paper CV Transformer Object-Detection
Posted 2025-03-06 | Updated 2026-02-15 | Review | a few seconds read (About 26 words) | MaskDINO | Note: this DINO is not the self-distillation, self-supervised [[DINO]]; it is derived from [[DETR]]. | #Research-paper CV Transformer Object-Detection SemanticSegmentation MultiModal
Posted 2025-03-04 | Updated 2026-02-15 | Review | a few seconds read (About 3 words) | ViLT | #Research-paper CV Transformer Image2Text MultiModal VLP Image-Text
Posted 2025-02-16 | Updated 2026-02-15 | Review | a minute read (About 216 words) | Grounding-DINO | #Research-paper CV Transformer Object-Detection Open-Vocabulary Contrastive-Learning MultiModal DINO Image-Grounding
Posted 2025-01-09 | Updated 2026-02-15 | Note | a few seconds read (About 0 words) | Vision Transformers Need Registers | #Research-paper CV Transformer ViT
Posted 2025-01-09 | Updated 2026-02-15 | Note | a few seconds read (About 0 words) | DINOv2: Learning Robust Visual Features without Supervision | #Research-paper CV Transformer Representation-Learning ViT DINO
Posted 2025-01-09 | Updated 2026-02-15 | Note | a few seconds read (About 71 words) | An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale | https://www.youtube.com/watch?v=j3VNqtJUoz0&t=16s | #Research-paper CV Transformer ViT
Posted 2025-01-08 | Updated 2026-02-15 | Note | 4 minutes read (About 561 words) | DINO | https://github.com/facebookresearch/dino/tree/main | #Research-paper CV Transformer Representation-Learning ViT DINO
Posted 2025-01-06 | Updated 2026-02-15 | Note | a few seconds read (About 3 words) | Simple Open-Vocabulary Object Detection with Vision Transformers | #Research-paper CV Transformer Object-Detection Open-Vocabulary ViT
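Posted 2024-09-27 | Updated 2026-02-15 | Review | a minute read (About 203 words) | Attention Is All You Need | The Transformer is a sequence prediction model built on attention mechanisms, with no recurrent or convolutional networks at all, and it is easier to train. | #Research-paper Transformer NN ML NLP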