vis4d.op.track3d.cc_3dt

CC-3DT graph.

Functions

cam_to_global(boxes_3d_list, extrinsics)

Convert camera coordinates to global coordinates.

get_track_3d_out(boxes_3d, class_ids, ...)

Get track 3D output.

Classes

CC3DTrackAssociation([init_score_thr, ...])

Data association relying on quasi-dense instance similarity and 3D cues.

class CC3DTrackAssociation(init_score_thr=0.8, obj_score_thr=0.5, match_score_thr=0.5, nms_backdrop_iou_thr=0.3, nms_class_iou_thr=0.7, nms_conf_thr=0.5, with_cats=True, bbox_affinity_weight=0.5)[source]

Data association relying on quasi-dense instance similarity and 3D cues.

This class assigns detection candidates to a given memory of existing tracks and backdrops. Backdrops are low-score detections kept in case they have high similarity with a high-score detection in succeeding frames.

Creates an instance of the class.

Parameters:
  • init_score_thr (float) – Confidence threshold for initializing a new track.

  • obj_score_thr (float) – Confidence threshold above which a detection is considered in the track / detection matching process.

  • match_score_thr (float) – Similarity score threshold for matching a detection to an existing track.

  • nms_backdrop_iou_thr (float) – Maximum IoU of a backdrop with another detection.

  • nms_class_iou_thr (float) – Maximum IoU of a high score detection with another of a different class.

  • nms_conf_thr (float) – Confidence threshold for NMS.

  • with_cats (bool) – Whether to consider category information for tracking (i.e., all detections within a track must have consistent category labels).

  • bbox_affinity_weight (float) – Weight of the bbox affinity in the overall affinity score.
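The `bbox_affinity_weight` parameter suggests the overall affinity is a weighted blend of box affinity and appearance-embedding similarity. A minimal sketch of such a weighting (the helper name and the complementary-weight assumption are illustrative, not part of the vis4d API):

```python
def combined_affinity(bbox_affinity, embed_affinity, bbox_weight=0.5):
    """Blend box affinity with appearance-embedding similarity.

    Hypothetical illustration of how ``bbox_affinity_weight`` could be
    applied; the actual vis4d implementation may differ.
    """
    return bbox_weight * bbox_affinity + (1.0 - bbox_weight) * embed_affinity

# A detection that overlaps a track well but looks different in appearance:
score = combined_affinity(0.9, 0.3, bbox_weight=0.5)
```

With equal weights, strong spatial overlap can compensate for a weaker appearance match, and vice versa.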

__call__(detections, camera_ids, detection_scores, detections_3d, detection_scores_3d, detection_class_ids, detection_embeddings, memory_boxes_3d=None, memory_track_ids=None, memory_class_ids=None, memory_embeddings=None, memory_boxes_3d_predict=None, memory_velocities=None, with_depth_confidence=True)[source]

Process inputs, match detections with existing tracks.

Parameters:
  • detections (Tensor) – [N, 4] detected boxes.

  • camera_ids (Tensor) – [N,] camera ids.

  • detection_scores (Tensor) – [N,] confidence scores.

  • detections_3d (Tensor) – [N, 7] detected boxes in 3D.

  • detection_scores_3d (Tensor) – [N,] confidence scores in 3D.

  • detection_class_ids (Tensor) – [N,] class indices.

  • detection_embeddings (Tensor) – [N, C] appearance embeddings.

  • memory_boxes_3d (Tensor) – [M, 7] boxes in memory.

  • memory_track_ids (Tensor) – [M,] track ids in memory.

  • memory_class_ids (Tensor) – [M,] class indices in memory.

  • memory_embeddings (Tensor) – [M, C] appearance embeddings in memory.

  • memory_boxes_3d_predict (Tensor) – [M, 7] predicted boxes in memory.

  • memory_velocities (Tensor) – [M, 7] velocities in memory.

Returns:

Track ids of active tracks and the selected detection indices corresponding to those tracks.

Return type:

tuple[Tensor, Tensor]
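At its core, association assigns each detection to at most one memory track when their similarity exceeds `match_score_thr`. A plain-Python greedy sketch of that idea (illustrative only; the real `__call__` additionally uses class consistency, backdrop NMS, and 3D motion cues on batched Tensors):

```python
def greedy_match(similarity, match_score_thr=0.5):
    """Greedily assign each detection (row) to its best unassigned track.

    ``similarity`` is a per-detection list of similarity scores to tracks
    in memory. Returns one track index per detection, or -1 when no
    remaining track exceeds ``match_score_thr``. Hypothetical sketch of
    the matching step, not the vis4d implementation.
    """
    assigned = set()
    matches = []
    for sims in similarity:
        best, best_sim = -1, match_score_thr
        for track_idx, sim in enumerate(sims):
            if track_idx not in assigned and sim > best_sim:
                best, best_sim = track_idx, sim
        if best >= 0:
            assigned.add(best)
        matches.append(best)
    return matches

# Two detections, two memory tracks: the second detection's best track is
# already taken and its remaining score falls below the threshold.
matches = greedy_match([[0.9, 0.2], [0.8, 0.4]])
```

Detections left unmatched (index -1) are candidates for starting a new track if their score exceeds `init_score_thr`, or for being kept as backdrops otherwise.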

static depth_ordering(obsv_boxes_3d, memory_boxes_3d_predict, memory_boxes_3d, memory_velocities)[source]

Depth ordering matching.

Return type:

Tensor
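The docstring does not specify the depth-ordering formula. A plausible hedged sketch is an affinity that decays with the gap between an observed box's depth and each track's motion-predicted depth (all names and the exponential form here are assumptions, not the vis4d implementation):

```python
import math

def depth_order_affinity(obs_depth, predicted_depths, scale=1.0):
    """Affinity in (0, 1] that decays with the absolute depth gap between
    an observation and each track's motion-predicted depth.

    Illustrative only; vis4d's ``depth_ordering`` may use a different
    formulation over full 3D boxes and velocities.
    """
    return [math.exp(-abs(obs_depth - d) / scale) for d in predicted_depths]

# An observation at 10 m matches a track predicted at 10 m far better
# than one predicted at 14 m.
aff = depth_order_affinity(10.0, [10.0, 14.0])
```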

cam_to_global(boxes_3d_list, extrinsics)[source]

Convert camera coordinates to global coordinates.

Return type:

list[Tensor]
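The math behind `cam_to_global` is a rigid transform of camera-frame coordinates by a 4x4 extrinsic matrix. A pure-Python sketch for a single point (the real function operates on batched box Tensors and also transforms box orientation and velocity):

```python
def transform_point(extrinsics, point):
    """Apply a 4x4 camera-to-global extrinsic matrix to a 3D point.

    ``extrinsics`` is a row-major 4x4 nested list; ``point`` is (x, y, z)
    in camera coordinates. Illustrative sketch of the coordinate
    conversion, not the vis4d implementation.
    """
    x, y, z = point
    hom = (x, y, z, 1.0)  # homogeneous coordinates
    return [sum(extrinsics[r][c] * hom[c] for c in range(4)) for r in range(3)]

# Identity rotation with a translation of (1, 2, 3):
extrinsics = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]
global_xyz = transform_point(extrinsics, (4.0, 5.0, 6.0))
```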

get_track_3d_out(boxes_3d, class_ids, scores_3d, track_ids)[source]

Get track 3D output.

Parameters:
  • boxes_3d (Tensor) – (N, 12) boxes as (x, y, z, h, w, l, rx, ry, rz, vx, vy, vz).

  • class_ids (Tensor) – (N,)

  • scores_3d (Tensor) – (N,)

  • track_ids (Tensor) – (N,)

Returns:

output

Return type:

Track3DOut
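Conceptually, this step bundles per-detection outputs by track id. A plain-Python illustration of that grouping (the dict structure here is hypothetical; the real function returns a vis4d `Track3DOut` built from Tensors):

```python
def group_by_track(boxes_3d, class_ids, scores_3d, track_ids):
    """Bundle per-detection outputs by track id.

    Illustrative stand-in for the kind of per-track grouping a
    ``Track3DOut`` container holds; not the vis4d implementation.
    """
    out = {}
    for box, cls, score, tid in zip(boxes_3d, class_ids, scores_3d, track_ids):
        out.setdefault(tid, []).append(
            {"box_3d": box, "class_id": cls, "score_3d": score}
        )
    return out

# Two detections belonging to the same track (id 7):
tracks = group_by_track(
    boxes_3d=[[0.0] * 12, [1.0] * 12],
    class_ids=[0, 1],
    scores_3d=[0.9, 0.8],
    track_ids=[7, 7],
)
```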