
Cosypose modification

Setup

Repository: https://github.com/Simple-Robotics/cosypose

git clone --recurse-submodules https://github.com/Simple-Robotics/cosypose.git
cd cosypose
conda env create -n cosypose --file environment.yaml

Note: during this step, pip will complain that setuptools and matplotlib-inline are incompatible with Python 3.7.6; install compatible versions into the environment manually:

conda activate cosypose
pip install setuptools==63.4.1
pip install matplotlib-inline==0.1.6
git lfs pull
python setup.py install
python setup.py develop

Download the data following the README.
Note that the first block of commands fails to download. According to https://bop.felk.cvut.cz/datasets/, the download links have migrated to Hugging Face; the test set can be downloaded manually from https://huggingface.co/datasets/bop-benchmark/datasets/tree/main/ycbv and placed in local_data/bop_datasets/ycbv/test.
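
A scripted alternative, as a minimal sketch using the huggingface_hub package (the archive pattern and target directory here are assumptions; check the repo's actual file layout before extracting):

from huggingface_hub import snapshot_download

# Fetch only the ycbv test archives from the Hub (the pattern is an assumed naming).
snapshot_download(
    repo_id="bop-benchmark/datasets",
    repo_type="dataset",
    allow_patterns=["ycbv/*test*"],
    local_dir="local_data/bop_downloads",  # unpack into local_data/bop_datasets/ycbv/test afterwards
)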

Set up the models used for testing:

cp -r ./local_data/bop_datasets/ycbv/model_bop_compat_eval ./local_data/bop_datasets/ycbv/models

Debug

np.where(mask)[0].item()

When running

export CUDA_VISIBLE_DEVICES=0 
python -m cosypose.scripts.run_cosypose_eval --config ycbv

the following error occurs:

Traceback (most recent call last):
  File "/home/cyl/.conda/envs/cosypose/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/cyl/.conda/envs/cosypose/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/cyl/cosypose/cosypose/scripts/run_cosypose_eval.py", line 491, in <module>
    main()
  File "/home/cyl/cosypose/cosypose/scripts/run_cosypose_eval.py", line 332, in main
    scene_ds = make_scene_dataset(ds_name)
  File "/home/cyl/cosypose/cosypose/datasets/datasets_cfg.py", line 68, in make_scene_dataset
    ids.append(np.where(mask)[0].item())
ValueError: can only convert an array of size 1 to a Python scalar

Adding debug output yields:

Debug - scene_id: 48, view_id: 1
Debug - mask matches: 1
Debug - where result shape: (1,), values: [225]
Debug - scene_id: 48, view_id: 36
Debug - mask matches: 1
Debug - where result shape: (1,), values: [226]
Debug - scene_id: 48, view_id: 47
Debug - mask matches: 1
Debug - where result shape: (1,), values: [227]
Debug - scene_id: 48, view_id: 83
Debug - mask matches: 1
Debug - where result shape: (1,), values: [228]
Debug - scene_id: 48, view_id: 112
Debug - mask matches: 1
Debug - where result shape: (1,), values: [229]
Debug - scene_id: 48, view_id: 135
Debug - mask matches: 0
Debug - where result shape: (0,), values: []
0:00:00.912023 - Expected exactly one match, got 0 matches for scene_id=48, view_id=135

It turns out that the downloaded test set does not contain every frame listed in the dataset's keyframe.txt, so some keyframes cannot be found.
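
A possible workaround is to make make_scene_dataset tolerant of the missing frames. A hedged sketch of a patch to datasets_cfg.py (variable names follow the debug output above; this is not the upstream fix):

# Hypothetical tolerant replacement for ids.append(np.where(mask)[0].item()):
# skip keyframes listed in keyframe.txt but absent from the local test set.
matches = np.where(mask)[0]
if len(matches) == 1:
    ids.append(matches.item())
else:
    print(f'Skipping scene_id={scene_id}, view_id={view_id}: {len(matches)} matches')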

If a run is interrupted partway

To restart a fresh run: clear local_data/joblib_cache.

Framework

Prediction Script cosypose.scripts.run_cosypose_eval

AI explanation

The script predicts object poses based on multi-view input by following these steps:

  1. Dataset Loading: It first loads the dataset using the make_scene_dataset function, which prepares the scene data for evaluation. The dataset is wrapped in a MultiViewWrapper to handle multiple views.

  2. Model Loading: The script loads pre-trained models for pose prediction using the load_models function. It loads both coarse and refiner models based on the configuration specified in the command-line arguments.

  3. Prediction Setup: The script sets up the prediction parameters, including the number of iterations for coarse and refiner models, and whether to skip multi-view processing based on the number of views specified.

  4. Multi-view Prediction: The MultiviewScenePredictor is initialized with the mesh database, which is used to predict poses across multiple views. The MultiviewPredictionRunner is then used to run predictions on the dataset, leveraging the multi-view setup to improve pose estimation accuracy.

  5. Pose Estimation: The script uses the loaded models to predict object poses. It processes detections from either pix2pose or posecnn depending on the dataset, and refines these predictions using the refiner model.

  6. Evaluation: After predictions, the script evaluates the predicted poses using the PoseEvaluation class. It calculates various metrics like ADD-S and AUC to assess the accuracy of the pose predictions.

  7. Results Logging: Finally, the script logs the results, including evaluation metrics, and saves them to a specified directory.

The multi-view approach allows the script to leverage information from different viewpoints, which can help resolve ambiguities and improve the robustness of the pose estimation.

Prediction Script run_custom_scenario

Terms

TCO

Transformation from Camera to Object.
It represents the transformation matrix or parameters that describe the pose of an object relative to the camera’s coordinate system

TWO

Transformation from World to Object.
It represents the transformation matrix or parameters that describe the pose of an object relative to the world’s coordinate system
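
In code, with the T_AB convention that a matrix maps frame-B coordinates into frame A, the two transforms are related through the camera pose. A minimal numpy sketch (identity matrices stand in for real poses):

import numpy as np

TWC = np.eye(4)           # camera pose in the world frame
TWO = np.eye(4)           # object pose in the world frame
TCW = np.linalg.inv(TWC)  # world frame expressed in the camera frame
TCO = TCW @ TWO           # object pose relative to the camera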

Model dataset

class MeshDataBase:
    def __init__(self, obj_list):
        self.infos = {obj['label']: obj for obj in obj_list}
        self.meshes = {l: trimesh.load(obj['mesh_path']) for l, obj in self.infos.items()}

    @staticmethod
    def from_object_ds(object_ds):
        obj_list = [object_ds[n] for n in range(len(object_ds))]
        return MeshDataBase(obj_list)
    ...

The typical initialization:

object_ds = BOPObjectDataset(scenario_dir / 'models')
mesh_db = MeshDataBase.from_object_ds(object_ds)

It can also be loaded together with the models via load_models:

predictor, mesh_db = load_models(coarse_run_id, refiner_run_id, n_workers=n_plotters, object_set=object_set)

Important Classes

MultiViewWrapper

Purpose:
Wraps a scene_dataset and splits the data into groups of n_views views per scene, making it easy to iterate over the scene elements (all ground truth here).
Each item yields:

  • n_views RGB images from different viewpoints
  • n_views corresponding masks
  • n_views corresponding observations
    • poses and labels of the recognized objects
    • camera pose and intrinsics
    • frame_info, rarely needed

scene_ds_pred = MultiViewWrapper(scene_ds, n_views=n_views)
scene_ds_pred[0][2] # scene 48, multiview group 1's observations in five views
[
{'objects':
[
{'label': 'obj_000001',
'name': 'obj_000001',
'TWO': array([[-0.02062261, -0.99870347, -0.04654345, -0.05380909],
[ 0.99854439, -0.022895 , 0.04883047, 0.00189095],
[-0.04983272, -0.04546878, 0.9977229 , 0.07060698],
[ 0. , 0. , 0. , 1. ]]),
'T0O': array([[-0.02062261, -0.99870347, -0.04654345, -0.05380909],
[ 0.99854439, -0.022895 , 0.04883047, 0.00189095],
[-0.04983272, -0.04546878, 0.9977229 , 0.07060698],
[ 0. , 0. , 0. , 1. ]]),
'visib_fract': 0.7769277845777234,
'id_in_segm': 1,
'bbox': [347, 210, 467, 374]},
{'label': 'obj_000006',
'name': 'obj_000006',
'TWO': array([[-0.40056693, 0.91475543, -0.05262471, 0.03103553],
[-0.91622629, -0.39934108, 0.03248866, -0.02365388],
[ 0.00870386, 0.06123014, 0.9980863 , 0.01391488],
[ 0. , 0. , 0. , 1. ]]),
'T0O': array([[-0.40056693, 0.91475543, -0.05262471, 0.03103553],
[-0.91622629, -0.39934108, 0.03248866, -0.02365388],
[ 0.00870386, 0.06123014, 0.9980863 , 0.01391488],
[ 0. , 0. , 0. , 1. ]]),
'visib_fract': 0.9990349353406678,
'id_in_segm': 2,
'bbox': [328, 343, 422, 405]},
{'label': 'obj_000014',
'name': 'obj_000014',
'TWO': array([[ 0.24178672, -0.96941339, -0.04215706, -0.05206396],
[ 0.96977496, 0.2399519 , 0.0442575 , 0.0179453 ],
[-0.03278805, -0.05158388, 0.99813144, 0.16636215],
[ 0. , 0. , 0. , 1. ]]),
'T0O': array([[ 0.24178672, -0.96941339, -0.04215706, -0.05206396],
[ 0.96977496, 0.2399519 , 0.0442575 , 0.0179453 ],
[-0.03278805, -0.05158388, 0.99813144, 0.16636215],
[ 0. , 0. , 0. , 1. ]]),
'visib_fract': 0.9938250428816466,
'id_in_segm': 3,
'bbox': [372, 143, 490, 241]},
{'label': 'obj_000019',
'name': 'obj_000019',
'TWO': array([[-0.69888905, 0.1926738 , -0.68878937, 0.01412755],
[ 0.711967 , 0.27928957, -0.64428215, 0.05127768],
[ 0.06823575, -0.94067797, -0.33237011, 0.06472594],
[ 0. , 0. , 0. , 1. ]]),
'T0O': array([[-0.69888905, 0.1926738 , -0.68878937, 0.01412755],
[ 0.711967 , 0.27928957, -0.64428215, 0.05127768],
[ 0.06823575, -0.94067797, -0.33237011, 0.06472594],
[ 0. , 0. , 0. , 1. ]]),
'visib_fract': 0.9890470974808324,
'id_in_segm': 4,
'bbox': [419, 222, 527, 410]},
{'label': 'obj_000020',
'name': 'obj_000020',
'TWO': array([[-0.74512542, -0.66691536, 0.00352083, 0.07854437],
[-0.6669148 , 0.74507458, -0.00940455, -0.15283599],
[ 0.00364864, -0.00935569, -0.99995023, 0.01854317],
[ 0. , 0. , 0. , 1. ]]),
'T0O': array([[-0.74512542, -0.66691536, 0.00352083, 0.07854437],
[-0.6669148 , 0.74507458, -0.00940455, -0.15283599],
[ 0.00364864, -0.00935569, -0.99995023, 0.01854317],
[ 0. , 0. , 0. , 1. ]]),
'visib_fract': 0.9953060637992145,
'id_in_segm': 5,
'bbox': [92, 328, 288, 442]}],
'camera':
{'T0C': array([[-0.0792652 , 0.241296 , -0.967209 , 0.946419 ],
[ 0.996102 , 0.0568396 , -0.0674529 , -0.02116569],
[ 0.0386997 , -0.968786 , -0.244861 , 0.36645836],
[ 0. , 0. , 0. , 1. ]]),
'K': array([[1.066778e+03, 0.000000e+00, 3.129869e+02],
[0.000000e+00, 1.067487e+03, 2.413109e+02],
[0.000000e+00, 0.000000e+00, 1.000000e+00]]),
'TWC': array([[-0.0792652 , 0.241296 , -0.967209 , 0.946419 ],
[ 0.996102 , 0.0568396 , -0.0674529 , -0.02116569],
[ 0.0386997 , -0.968786 , -0.244861 , 0.36645836],
[ 0. , 0. , 0. , 1. ]]),
'resolution': torch.Size([480, 640])},
'frame_info':
{
'scene_id': 48,
'cam_id': 'cam',
'view_id': 1626,
'cam_name': 'cam',
'group_id': 0
}
},
... # other views
]
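
For instance, the ground-truth intrinsics and object poses can be read directly out of such an observation (a small sketch based on the structure above):

obs = scene_ds_pred[0][2][0]  # first view of the first multiview group
K = obs['camera']['K']        # 3x3 camera intrinsics
TWC = obs['camera']['TWC']    # camera pose in the world frame
for obj in obs['objects']:
    print(obj['label'], obj['TWO'][:3, 3])  # object translation in the world frame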

MultiviewPredictionRunner

Purpose:
Takes a MultiViewWrapper as input and produces predictions.

First, the dataset is ingested:

dataloader = DataLoader(scene_ds, batch_size=batch_size,
                        num_workers=n_workers,
                        sampler=sampler,
                        collate_fn=self.collate_fn)

collate_fn is used to process the raw data (the comments near the end mark the data that is actually used):

def collate_fn(self, batch):
    batch_im_id = -1

    cam_infos, K = [], []
    det_infos, bboxes = [], []
    for n, data in enumerate(batch):  # normally only one batch
        assert n == 0
        images, masks, obss = data
        for c, obs in enumerate(obss):  # iterate over the different views
            batch_im_id += 1
            frame_info = obs['frame_info']
            im_info = {k: frame_info[k] for k in ('scene_id', 'view_id', 'group_id')}  # info for the image
            im_info.update(batch_im_id=batch_im_id)
            cam_info = im_info.copy()  # info for the camera

            K.append(obs['camera']['K'])  # camera intrinsics
            cam_infos.append(cam_info)

            for o, obj in enumerate(obs['objects']):
                obj_info = dict(
                    label=obj['name'],
                    score=1.0,
                )
                obj_info.update(im_info)  # add key-value pairs from im_info to obj_info
                bboxes.append(obj['bbox'])
                det_infos.append(obj_info)

    gt_detections = tc.PandasTensorCollection(
        infos=pd.DataFrame(det_infos),
        bboxes=torch.as_tensor(np.stack(bboxes)),
    )  # basic info and bounding box for every ground-truth detection
    cameras = tc.PandasTensorCollection(
        infos=pd.DataFrame(cam_infos),
        K=torch.as_tensor(np.stack(K)),
    )  # basic info for each view's camera (same fields as the detection info), plus intrinsics
    data = dict(
        images=images,
        cameras=cameras,
        gt_detections=gt_detections,
    )
    return data

The most important function: get_predictions

def get_predictions(self, pose_predictor, mv_predictor,
                    detections=None,
                    n_coarse_iterations=1, n_refiner_iterations=1,
                    sv_score_th=0.0, skip_mv=True,
                    use_detections_TCO=False):

Responsible for generating predictions for object poses in a scene using both single-view and multi-view approaches; a call sketch follows the list below.

  1. Input Parameters:

    • pose_predictor: the single-view predictor; for the ycbv dataset, the detections come from the PoseCNN model.
    • mv_predictor: An object or function that predicts scene states using multi-view information.
    • detections: A collection of detected objects with associated information, pre-generated and saved in a .pkl file
    • n_coarse_iterations, n_refiner_iterations: Number of iterations for coarse and refinement pose estimation.
    • sv_score_th: Score threshold for single-view detections.
    • skip_mv: A flag to skip multi-view predictions.
    • use_detections_TCO: A flag to use detections for initial pose estimation.
  2. Filtering Detections:
    Note that the detections used here come directly from pre-saved detection results (not ground truth):

    posecnn_detections = load_posecnn_results()
    • The function filters the input detections based on the sv_score_th threshold.
    • It assigns a unique detection ID to each detection and creates an index based on scene_id and view_id.
  3. Iterating Over Data:

    • The function iterates over batches of data from the dataloader.
    • For each batch, it extracts images, camera information, and ground truth detections.
  4. Matching Detections:

    • It matches the detections with the current batch of data using the index created earlier.
    • It filters and prepares the detections for processing.
  5. Pose Prediction:

    • If there are detections, it uses the pose_predictor to get single-view predictions.
    • It registers the initial bounding boxes with the candidates.
  6. Multi-View Prediction:

    • If skip_mv is False, it uses the mv_predictor to predict the scene state using multi-view information.
  7. Collecting Predictions:

    • It collects the single-view and multi-view predictions into a dictionary.
  8. Concatenating Results:

    • It concatenates the predictions across all batches and returns the final predictions.
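
Putting the steps together, a hedged sketch of how run_cosypose_eval wires the runner up (constructor arguments such as batch_size and n_workers are illustrative, not verified values):

# Assumed wiring, following the steps above.
detections = load_posecnn_results()            # pre-saved single-view detections
pred_runner = MultiviewPredictionRunner(scene_ds_pred, batch_size=1,
                                        n_workers=n_workers)
predictions = pred_runner.get_predictions(pose_predictor, mv_predictor,
                                          detections=detections,
                                          n_coarse_iterations=1,
                                          n_refiner_iterations=4,
                                          skip_mv=False)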

MultiviewScenePredictor

Purpose:
Used by MultiviewPredictionRunner.get_predictions.
In run_cosypose_eval, the MultiviewScenePredictor is initialized like this:

mv_predictor = MultiviewScenePredictor(mesh_db)

In the MultiviewScenePredictor we use the mesh_db to initialize MultiviewRefinement and solve:

problem = MultiviewRefinement(candidates=candidates_n,
                              cameras=cameras,
                              pairs_TC1C2=pairs_TC1C2,
                              mesh_db=self.mesh_db_ba)
ba_outputs = problem.solve(
    n_iterations=ba_n_iter,
    optimize_cameras=not use_known_camera_poses,
)

The solve function of MultiviewRefinement:

def solve(self, sample_n_init=1, **lm_kwargs):
    timer_init = Timer()
    timer_opt = Timer()
    timer_misc = Timer()

    timer_init.start()
    TWO_9d_init, TCW_9d_init = self.robust_initialization_TWO_TCW(n_init=sample_n_init)
    timer_init.pause()

    timer_opt.start()
    TWO_9d_opt, TCW_9d_opt, history = self.optimize_lm(
        TWO_9d_init, TCW_9d_init, **lm_kwargs)
    timer_opt.pause()

    timer_misc.start()
    objects, cameras = self.make_scene_infos(TWO_9d_opt, TCW_9d_opt)
    objects_init, cameras_init = self.make_scene_infos(TWO_9d_init, TCW_9d_init)
    history = self.convert_history(history)
    timer_misc.pause()

    outputs = dict(
        objects_init=objects_init,
        cameras_init=cameras_init,
        objects=objects,
        cameras=cameras,
        history=history,
        time_init=timer_init.stop(),
        time_opt=timer_opt.stop(),
        time_misc=timer_misc.stop(),
    )
    return outputs

Adaptation

The plan is to adapt run_custom_scenario.
It is invoked as follows:

python -m cosypose.scripts.run_custom_scenario --scenario=example
Setting OMP and MKL num threads to 1.
pybullet build time: Jan 28 2022 20:13:03
0:00:00.000859 - --------------------------------------------------------------------------------
0:00:00.000921 - scenario: example
0:00:00.000942 - sv_score_th: 0.3
0:00:00.000956 - n_symmetries_rot: 64
0:00:00.000968 - ransac_n_iter: 2000
0:00:00.000980 - ransac_dist_threshold: 0.02
0:00:00.001002 - nms_th: 0.04
0:00:00.001015 - no_visualization: False
0:00:00.001026 - --------------------------------------------------------------------------------
0:00:00.569089 - Loaded 796 candidates in 8 views.
0:00:00.570278 - Loaded cameras intrinsics.
0:00:00.690990 - Loaded 30 3D object models.
0:00:00.691047 - Running stage 2 and 3 of CosyPose...
0:00:01.145408 - Num candidates: 107
0:00:01.145468 - Num views: 8
0:00:01.145728 - Estimating camera poses using RANSAC.
0:00:04.588304 - Matched candidates: 49
0:00:04.588375 - RANSAC time_models: 0:00:02.390068
0:00:04.588398 - RANSAC time_score: 0:00:00.990740
0:00:04.588415 - RANSAC time_misc: 0:00:00.061626
0:00:04.902268 - BA time_init: 0:00:00.005349
0:00:04.902333 - BA time_opt: 0:00:00.091822
0:00:04.902351 - BA time_misc: 0:00:00.004793
0:00:04.491746 - Subscene 0 has 8 objects and 7 cameras.
0:00:04.512850 - Wrote predicted scene (objects+cameras): /home/cyl/cosypose/local_data/custom_scenarios/example/results/subscene=0/predicted_scene.json
0:00:04.512906 - Wrote predicted objects with pose expressed in camera frame: /home/cyl/cosypose/local_data/custom_scenarios/example/results/subscene=0/scene_reprojected.csv

The script only consumes the candidates, the mesh_db, and the camera intrinsics (K), then directly runs the mv_predictor.

Here is a function that builds the candidates from plain list input:

def read_list_candidates_cameras(self, data_list, cameras_K_list):
    """
    Creates PandasTensorCollections from lists of candidate and camera information.

    Args:
        data_list (list): Each element is a dictionary with keys:
            - "candidates" (list of dict): Each candidate dictionary includes:
                - "label" (str): The label of the object.
                - "score" (float): The confidence score of the object.
                - "pose" (torch.Tensor): A [4, 4] torch.Tensor representing the pose matrix.
        cameras_K_list (list of torch.Tensor): One [3, 3] intrinsics matrix per view.

    Returns:
        (PandasTensorCollection, PandasTensorCollection): candidate poses + infos,
        and per-view camera intrinsics + infos.
    """
    all_poses = []
    all_infos = []
    all_K = []

    # Initialize view_id to be assigned automatically
    view_id = 0
    scene_id = 0  # Fixed value for scene_id

    for view, K in zip(data_list, cameras_K_list):
        all_K.append(K)
        for candidate in view["candidates"]:
            label = candidate["label"]
            score = candidate["score"]
            pose = candidate["pose"]

            # Append the pose tensor
            all_poses.append(pose)

            # Append the metadata
            all_infos.append({
                "view_id": view_id,
                "scene_id": scene_id,
                "score": score,
                "label": label
            })

        # Increment view_id for the next set of candidates
        view_id += 1

    K_tensor = torch.stack(all_K).to(dtype=torch.float32, device="cuda:0")

    # Stack poses into a single tensor
    poses_tensor = torch.stack(all_poses).to(dtype=torch.float32, device="cuda:0")

    # Create a Pandas DataFrame for infos
    infos_df = pd.DataFrame(all_infos)
    # Return the PandasTensorCollection-like structure
    ptc_candidate = tc.PandasTensorCollection(poses=poses_tensor, infos=infos_df)
    cam_info = infos_df.loc[:, ["view_id"]]
    cam_info = cam_info.drop_duplicates()
    ptc_cam = tc.PandasTensorCollection(K=K_tensor, infos=cam_info)
    return ptc_candidate, ptc_cam
# Example usage (calling the method directly, without the self argument):
example_data = [
    {
        "candidates": [
            {"label": "obj_000017", "score": 0.829675, "pose": torch.eye(4)},
            {"label": "obj_000010", "score": 0.820436, "pose": torch.eye(4) * 2},
        ]
    },
    {
        "candidates": [
            {"label": "obj_000005", "score": 0.104478, "pose": torch.eye(4) * 3},
        ]
    }
]
example_cameras_K = [
    torch.eye(3),
    torch.eye(3) * 2,
]

cd, cam = read_list_candidates_cameras(example_data, example_cameras_K)
cd, cam
(PandasTensorCollection(
poses: torch.Size([3, 4, 4]) torch.float32 cuda:0,
----------------------------------------
infos:
view_id scene_id score label
0 0 0 0.829675 obj_000017
1 0 0 0.820436 obj_000010
2 1 0 0.104478 obj_000005
),
PandasTensorCollection(
K: torch.Size([2, 3, 3]) torch.float32 cuda:0,
----------------------------------------
infos:
view_id
0 0
1 1
))

Then MultiviewScenePredictor.predict_scene_state() is called as usual to estimate the scene:

predictions = self.mv_predictor.predict_scene_state(candidates, cameras,
                                                    score_th=self.sv_score_th,
                                                    use_known_camera_poses=False,
                                                    ransac_n_iter=self.ransac_n_iter,
                                                    ransac_dist_threshold=self.ransac_dist_threshold,
                                                    ba_n_iter=self.ba_n_iter)

Afterwards, non-maximum suppression aggregates objects that were detected multiple times:

objects = predictions['scene/objects']
cameras = predictions['scene/cameras']
reproj = predictions['ba_output']
#print(predictions)
for view_group in np.unique(objects.infos['view_group']):
    objects_ = objects[np.where(objects.infos['view_group'] == view_group)[0]]
    cameras_ = cameras[np.where(cameras.infos['view_group'] == view_group)[0]]
    reproj_ = reproj[np.where(reproj.infos['view_group'] == view_group)[0]]
    objects_ = nms3d(objects_, th=self.nms_th, poses_attr='TWO')

The final output objects_:

PandasTensorCollection(
TWO: torch.Size([10, 4, 4]) torch.float32 cuda:0,
----------------------------------------
infos:
obj_id score label n_cand view_group group_id scene_id
0 2 5.469747 obj_000016 7 0 0 16
1 0 5.450335 obj_000017 8 0 0 16
2 4 4.098602 obj_000012 8 0 0 16
3 1 3.380887 obj_000010 6 0 0 16
4 5 2.771779 obj_000015 6 0 0 16
5 3 1.453180 obj_000011 4 0 0 16
6 9 1.183983 obj_000014 3 0 0 16
7 8 1.106775 obj_000013 2 0 0 16
)
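
To express these fused poses back in a camera frame, as run_custom_scenario does when writing scene_reprojected.csv, they can be combined with the predicted camera poses. A sketch, assuming the predicted cameras_ carry a per-view TWC tensor matching the TWC fields shown earlier:

# Assumption: cameras_.TWC holds per-view camera poses in the world frame.
TCO = torch.inverse(cameras_.TWC[0]) @ objects_.TWO  # object poses in the first camera's frame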

Usage

Please refer to the notebook custom_scene.ipynb.


6d pose -- unity coordinate

6d pose -> unity coordinate

Unity uses a left-handed coordinate system, while most 6-D algorithms use a right-handed one, so after obtaining [R|t] a reflection across the y axis must be applied.

import torch

def right_to_left_hand_pose_R(R):
    # define the reflection matrix (flip the y axis)
    M = torch.tensor([
        [ 1,  0, 0],
        [ 0, -1, 0],
        [ 0,  0, 1]
    ], dtype=R.dtype)

    # convert the rotation matrix
    R_prime = M @ R @ M

    return R_prime

def right_to_left_hand_pose_t(t):
    # define the reflection matrix (flip the y axis)
    M = torch.tensor([
        [ 1,  0, 0],
        [ 0, -1, 0],
        [ 0,  0, 1]
    ], dtype=t.dtype)

    # convert the translation vector
    t_prime = M @ t

    return t_prime
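
A quick usage sketch on a full 4x4 pose, splitting it into its rotation block and translation vector:

# Example: converting a right-handed pose [R|t] to Unity's left-handed frame.
T = torch.eye(4)                              # placeholder 6-D pose estimate
R_left = right_to_left_hand_pose_R(T[:3, :3])
t_left = right_to_left_hand_pose_t(T[:3, 3])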

The results turn out quite well.


6-D Pose Estimation Survey

Model based (CAD model)

State of the art: FoundationPose (https://github.com/NVlabs/FoundationPose)

RGB

CASAPose (https://github.com/fraunhoferhhi/casapose?tab=readme-ov-file)

MegaPose (https://github.com/megapose6d/megapose6d)

RGB-D

MegaPose (https://github.com/megapose6d/megapose6d)

OVE6D (https://github.com/dingdingcai/OVE6D-pose)

Non-model

Gen6D (https://github.com/liuyuan-pal/Gen6D)

Yolov5 (https://github.com/cviviers/YOLOv5-6D-Pose)
