Posted 2025-02-19 | Updated 2026-05-05 | Review | 2 minutes read (About 273 words)
GLIP
GLIP is a model that learns object-level, language-aware, and semantic-rich visual representations. It unifies object detection and phrase grounding for pre-training.
#Research-paper CV Object-Detection Multi-modal CLIP Contrastive-Learning VLP Image-Grounding

Posted 2025-02-16 | Updated 2026-05-05 | Review | a minute read (About 216 words)
Grounding-DINO
#Research-paper CV Object-Detection Transformer MultiModal Open-Vocabulary Contrastive-Learning DINO Image-Grounding