Posted 2025-03-11 | Updated 2026-03-08 | Review | a minute read (About 161 words) | DETR | DETR is an object detection model that uses the transformer as its base architecture. | #Research-paper Transformer CV Object-Detection
Posted 2025-03-06 | Updated 2026-03-08 | Review | a few seconds read (About 26 words) | MaskDINO | Note: this DINO is not the self-distillation, self-supervised [[DINO]]; it is derived from [[DETR]]. | #Research-paper Transformer CV Object-Detection SemanticSegmentation MultiModal
Posted 2025-03-04 | Updated 2026-03-08 | Review | a few seconds read (About 3 words) | ViLT | #Research-paper Transformer Image2Text CV MultiModal VLP Image-Text
Posted 2025-02-16 | Updated 2026-03-08 | Review | a minute read (About 216 words) | Grounding-DINO | #Research-paper Transformer CV Object-Detection Open-Vocabulary Contrastive-Learning MultiModal DINO Image-Grounding
Posted 2025-01-09 | Updated 2026-03-08 | Note | a few seconds read (About 0 words) | Vision Transformers Need Registers | #Research-paper Transformer CV ViT
Posted 2025-01-09 | Updated 2026-03-08 | Note | a few seconds read (About 0 words) | DINOv2: Learning Robust Visual Features without Supervision | #Research-paper Transformer CV Representation-Learning ViT DINO
Posted 2025-01-09 | Updated 2026-03-08 | Note | a few seconds read (About 71 words) | AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE | https://www.youtube.com/watch?v=j3VNqtJUoz0&t=16s | #Research-paper Transformer CV ViT
Posted 2025-01-08 | Updated 2026-03-08 | Note | 4 minutes read (About 561 words) | DINO | https://github.com/facebookresearch/dino/tree/main | #Research-paper Transformer CV Representation-Learning ViT DINO
Posted 2025-01-06 | Updated 2026-03-08 | Note | a few seconds read (About 3 words) | Simple Open-Vocabulary Object Detection with Vision Transformers | #Research-paper Transformer CV Object-Detection Open-Vocabulary ViT
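Posted 2024-09-27 | Updated 2026-03-08 | Review | a minute read (About 203 words) | Attention Is All You Need | The Transformer is a sequence prediction model based on attention mechanisms that needs no recurrent or convolutional networks at all and is easier to train. | #Research-paper Transformer NN ML NLP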