
process:
- trained:
  - swint_mcm
  - swint_cb_mcm
  - swint_cb_scm
  - resdcn_scm
  - resdcn_cb_scm
  - resdcn_cb_mcm
  - resdcn_mcm
  - swint_scm
- training:
- evaluated:
  - swint_mcm
  - swint_cb_scm
  - swint_cb_mcm
  - resdcn_cb_scm
  - resdcn_cb_mcm
  - resdcn_scm
  - resdcn_mcm
  - swint_scm
- evaluating:

evaluated on 1000 samples
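
The run names in the checklist appear to encode the same three settings that the results table uses as columns. A minimal sketch of that mapping follows, assuming `swint`/`resdcn` name the backbone, an optional `cb` token means class balance is on, and `mcm`/`scm` correspond to the MCM/Single augmentation settings. This reading is inferred from the names and is not confirmed by these notes; `RunConfig` and `parse_run_name` are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    backbone: str
    class_balance: bool
    augmentation: str


# Assumed token meanings, inferred from the run names and the table columns.
BACKBONES = {"swint": "SwinT", "resdcn": "ResDCN"}
AUGMENTATIONS = {"mcm": "MCM", "scm": "Single"}


def parse_run_name(name: str) -> RunConfig:
    """Split a run name such as 'swint_cb_mcm' into its assumed settings."""
    parts = name.split("_")
    middle = parts[1:-1]  # either empty or exactly ['cb']
    if middle not in ([], ["cb"]):
        raise ValueError(f"unexpected run name: {name!r}")
    return RunConfig(
        backbone=BACKBONES[parts[0]],
        class_balance=bool(middle),
        augmentation=AUGMENTATIONS[parts[-1]],
    )
```

Under these assumptions, `parse_run_name("swint_cb_mcm")` describes the SwinT / MCM / class balance "yes" row.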
KAF-Net: Part-level Kinematic Relation Graph Generation For Robot Manipulation
| backbone | In. Aug | class balance | $mAP_{50}$ | $R@10/20/40$ | $mR@10/20/40$ |
|---|---|---|---|---|---|
| SwinT | MCM | yes | 24.2 | 33.4/53.8/73.9 | 25.1/52.9/69.2 |
| SwinT | MCM | no | 23.4 | 43.5/62.3/78.7 | 26.9/55.3/67.9 |
| SwinT | Single | yes | 23.5 | 32.3/51.6/69.2 | 34.4/52.9/65.4 |
| SwinT | Single | no | | | |
| ResDCN | MCM | yes | 23.1 | 32.8/52.8/70.1 | 24.8/52.8/67.7 |
| ResDCN | MCM | no | train~ | | |
| ResDCN | Single | yes | 20.5 | 39.9/54.3/69.3 | 37.7/51.2/65.1 |
| ResDCN | Single | no | 22.3 | 33.8/52.4/69.4 | 35.7/55.6/66.7 |
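
For reference, the R@K and mR@K columns can be read as recall over the top-K scored relation triplets, and the same recall averaged per predicate class. The sketch below is a simplified illustration, not the evaluation code used for these numbers. It assumes triplets are matched by exact equality of `(subject, predicate, object)` with no box-IoU matching, and it pools per-predicate counts over the whole dataset. The function names and the example predicates are hypothetical.

```python
from collections import defaultdict


def _top_k(preds, k):
    """Return the set of the k highest-scoring triplets from (score, triplet) pairs."""
    ranked = sorted(preds, key=lambda p: p[0], reverse=True)
    return {triplet for _, triplet in ranked[:k]}


def recall_at_k(predictions, ground_truth, k):
    """Average over images of the fraction of GT triplets found in the top k."""
    recalls = []
    for preds, gt in zip(predictions, ground_truth):
        if not gt:
            continue  # images without ground-truth relations are skipped
        recalls.append(len(_top_k(preds, k) & gt) / len(gt))
    return sum(recalls) / len(recalls) if recalls else 0.0


def mean_recall_at_k(predictions, ground_truth, k):
    """Recall computed per predicate class, then averaged over classes."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for preds, gt in zip(predictions, ground_truth):
        top = _top_k(preds, k)
        for triplet in gt:
            predicate = triplet[1]
            totals[predicate] += 1
            hits[predicate] += triplet in top
    if not totals:
        return 0.0
    return sum(hits[p] / totals[p] for p in totals) / len(totals)


# Hypothetical single-image example: one list of scored predictions, one GT set.
example_predictions = [[
    (0.9, ("drawer", "slides_in", "cabinet")),
    (0.8, ("door", "hinged_to", "cabinet")),
    (0.1, ("handle", "fixed_to", "door")),
]]
example_ground_truth = [{
    ("drawer", "slides_in", "cabinet"),
    ("handle", "fixed_to", "door"),
}]
```

Because mR@K weights every predicate class equally, it is more sensitive to rare relations than R@K, which is why the two columns can move in different directions when class balancing is toggled.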
On swint_cb_mcm:

| Image-Mask Branch | $mAP_{50}$ | $mR@10/20/40$ |
|---|---|---|
| Yes | | |
| No | | |

| VLM | VI | VR | unVR |
|---|---|---|---|
| Gemini 2.5 Flash | | | |
| Pixtral 12B | | | |
Task Planning:
example: