Chen Yulin's Blog

Posted 2025-03-14 | Updated 2025-07-24 | Review | a few seconds read (About 3 words)

Visual Relationship Detection with Language Priors

Posted 2025-03-13 | Updated 2025-07-24 | Review | a few seconds read (About 0 words)

Image generation from scene graphs

Posted 2025-03-13 | Updated 2025-07-24 | Review | a few seconds read (About 3 words)

Scene Graph Generation With Hierarchical Context

Posted 2025-03-13 | Updated 2025-07-24 | Review | a few seconds read (About 20 words)

Energy-Based Learning for Scene Graph Generation

I could not follow this paper... The approach can effectively use structural information in the output space.

Posted 2025-03-13 | Updated 2025-07-24 | Review | a few seconds read (About 0 words)

Iterative Scene Graph Generation

Posted 2025-03-13 | Updated 2025-07-24 | Review | a few seconds read (About 21 words)

Image Retrieval using Scene Graphs

Building an efficient structured representation that captures comprehensive semantic knowledge is a crucial step towards a deeper understanding of visual scenes.

Posted 2025-03-13 | Updated 2025-07-24 | Review | a few seconds read (About 6 words)

From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models

Posted 2025-03-13 | Updated 2025-07-24 | Review | a minute read (About 180 words)

SGTR+: End-to-end Scene Graph Generation with Transformer

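SGTR is a top-down approach. It first uses a Transformer-based generator to produce a set of learnable triplet queries (subject-predicate-object), then uses a cascaded triplet detector to progressively refine these queries and produce the final scene graph. It also proposes an entity-aware relationship representation based on a structured generator, which exploits the compositional property of relationships.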

Top-down approach (SGTR):
- Starts with higher-level structures (triplet queries) and refines them
- Begins by generating complete subject-predicate-object triplet candidates
- Then progressively refines these triplets to match the image content
- Works with the complete structural units from the beginning
- Analogous to starting with a rough sketch of the entire tree and then refining each branch
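
The top-down flow listed above can be illustrated with a toy sketch. This is not SGTR's actual implementation: the `TripletQuery` class, the helper names, and the fixed `target` that stands in for image evidence are all assumptions made for illustration, and the simple blending step replaces the Transformer decoder layers a real model would use. It only shows the shape of the idea: start from complete triplet candidates, then refine them in a cascade.

```python
import math
import random
from dataclasses import dataclass
from typing import List


@dataclass
class TripletQuery:
    """One complete subject-predicate-object candidate (hypothetical toy structure)."""
    subject: List[float]
    predicate: List[float]
    obj: List[float]


def init_queries(num_queries: int, dim: int, seed: int = 0) -> List[TripletQuery]:
    """Create full triplet candidates up front, before looking at any image."""
    rng = random.Random(seed)

    def vec() -> List[float]:
        return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

    return [TripletQuery(vec(), vec(), vec()) for _ in range(num_queries)]


def refine(query: TripletQuery, target: TripletQuery, rate: float = 0.5) -> TripletQuery:
    """One refinement stage: move every part of the triplet toward the evidence.

    Assumption: `target` is a stand-in for image features; a real model would
    update the query with attention over the image instead of blending.
    """
    def blend(q: List[float], t: List[float]) -> List[float]:
        return [qi + rate * (ti - qi) for qi, ti in zip(q, t)]

    return TripletQuery(
        blend(query.subject, target.subject),
        blend(query.predicate, target.predicate),
        blend(query.obj, target.obj),
    )


def cascade(queries: List[TripletQuery], target: TripletQuery,
            num_stages: int = 3, rate: float = 0.5) -> List[TripletQuery]:
    """Apply several refinement stages in sequence to every triplet query."""
    for _ in range(num_stages):
        queries = [refine(q, target, rate) for q in queries]
    return queries


def distance(a: TripletQuery, b: TripletQuery) -> float:
    """Euclidean distance over the concatenated subject, predicate, and object parts."""
    pairs = zip(a.subject + a.predicate + a.obj, b.subject + b.predicate + b.obj)
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs))
```

Calling `cascade(init_queries(4, 2), target)` with any `target` of matching dimension returns four refined triplets, each closer to `target` than its starting query. The subject, predicate, and object are updated together at every stage, which is the sense in which the approach works with complete structural units from the beginning.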