vis4d.op.track3d.cc_3dt

CC-3DT graph.

Functions

cam_to_global(boxes_3d_list, extrinsics)

Convert camera coordinates to global coordinates.

get_track_3d_out(boxes_3d, class_ids, ...)

Get track 3D output.

Classes

CC3DTrackAssociation([init_score_thr, ...])

Data association relying on quasi-dense instance similarity and 3D cues.

class CC3DTrackAssociation(init_score_thr=0.8, obj_score_thr=0.5, match_score_thr=0.5, nms_backdrop_iou_thr=0.3, nms_class_iou_thr=0.7, nms_conf_thr=0.5, with_cats=True, bbox_affinity_weight=0.5)[source]

Data association relying on quasi-dense instance similarity and 3D cues.

This class assigns detection candidates to a given memory of existing tracks and backdrops. Backdrops are low-score detections kept in case they have high similarity with a high-score detection in succeeding frames.

Creates an instance of the class.

Parameters:
  • init_score_thr (float) – Confidence threshold for initializing a new track.

  • obj_score_thr (float) – Confidence threshold above which a detection is considered in the track / detection matching process.

  • match_score_thr (float) – Similarity score threshold for matching a detection to an existing track.

  • nms_backdrop_iou_thr (float) – Maximum IoU of a backdrop with another detection.

  • nms_class_iou_thr (float) – Maximum IoU of a high score detection with another of a different class.

  • nms_conf_thr (float) – Confidence threshold for NMS.

  • with_cats (bool) – Whether to consider category information for tracking (i.e., all detections within a track must have consistent category labels).

  • bbox_affinity_weight (float) – Weight of the bbox affinity in the overall affinity score.
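The `bbox_affinity_weight` parameter suggests the overall affinity is a weighted blend of box affinity and appearance-embedding similarity. A minimal sketch of such a weighting (the helper name and the complementary-weight assumption are illustrative, not part of the vis4d API):

```python
def combined_affinity(bbox_affinity, embed_affinity, bbox_weight=0.5):
    """Blend box affinity with appearance-embedding similarity.

    Hypothetical illustration of how ``bbox_affinity_weight`` could be
    applied; the actual vis4d implementation may differ.
    """
    return bbox_weight * bbox_affinity + (1.0 - bbox_weight) * embed_affinity

# A detection that overlaps a track well but looks different in appearance:
score = combined_affinity(0.9, 0.3, bbox_weight=0.5)
```

With equal weights, strong spatial overlap can compensate for a weaker appearance match, and vice versa.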

__call__(detections, camera_ids, detection_scores, detections_3d, detection_scores_3d, detection_class_ids, detection_embeddings, memory_boxes_3d=None, memory_track_ids=None, memory_class_ids=None, memory_embeddings=None, memory_boxes_3d_predict=None, memory_velocities=None, with_depth_confidence=True)[source]

Process inputs, match detections with existing tracks.

Parameters:
  • detections (Tensor) – [N, 4] detected boxes.

  • camera_ids (Tensor) – [N,] camera ids.

  • detection_scores (Tensor) – [N,] confidence scores.

  • detections_3d (Tensor) – [N, 7] detected boxes in 3D.

  • detection_scores_3d (Tensor) – [N,] confidence scores in 3D.

  • detection_class_ids (Tensor) – [N,] class indices.

  • detection_embeddings (Tensor) – [N, C] appearance embeddings.

  • memory_boxes_3d (Tensor) – [M, 7] boxes in memory.

  • memory_track_ids (Tensor) – [M,] track ids in memory.

  • memory_class_ids (Tensor) – [M,] class indices in memory.

  • memory_embeddings (Tensor) – [M, C] appearance embeddings in memory.

  • memory_boxes_3d_predict (Tensor) – [M, 7] predicted boxes in memory.

  • memory_velocities (Tensor) – [M, 7] velocities in memory.

Returns:

Track ids of active tracks and the selected detection indices corresponding to those tracks.

Return type:

tuple[Tensor, Tensor]
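At its core, association assigns each detection to at most one memory track when their similarity exceeds `match_score_thr`. A plain-Python greedy sketch of that idea (illustrative only; the real `__call__` additionally uses class consistency, backdrop NMS, and 3D motion cues on batched Tensors):

```python
def greedy_match(similarity, match_score_thr=0.5):
    """Greedily assign each detection (row) to its best unassigned track.

    ``similarity`` is a per-detection list of similarity scores to tracks
    in memory. Returns one track index per detection, or -1 when no
    remaining track exceeds ``match_score_thr``. Hypothetical sketch of
    the matching step, not the vis4d implementation.
    """
    assigned = set()
    matches = []
    for sims in similarity:
        best, best_sim = -1, match_score_thr
        for track_idx, sim in enumerate(sims):
            if track_idx not in assigned and sim > best_sim:
                best, best_sim = track_idx, sim
        if best >= 0:
            assigned.add(best)
        matches.append(best)
    return matches

# Two detections, two memory tracks: the second detection's best track is
# already taken and its remaining score falls below the threshold.
matches = greedy_match([[0.9, 0.2], [0.8, 0.4]])
```

Detections left unmatched (index -1) are candidates for starting a new track if their score exceeds `init_score_thr`, or for being kept as backdrops otherwise.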

static depth_ordering(obsv_boxes_3d, memory_boxes_3d_predict, memory_boxes_3d, memory_velocities)[source]

Depth ordering matching.

Return type:

Tensor
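The docstring does not specify the depth-ordering formula. A plausible hedged sketch is an affinity that decays with the gap between an observed box's depth and each track's motion-predicted depth (all names and the exponential form here are assumptions, not the vis4d implementation):

```python
import math

def depth_order_affinity(obs_depth, predicted_depths, scale=1.0):
    """Affinity in (0, 1] that decays with the absolute depth gap between
    an observation and each track's motion-predicted depth.

    Illustrative only; vis4d's ``depth_ordering`` may use a different
    formulation over full 3D boxes and velocities.
    """
    return [math.exp(-abs(obs_depth - d) / scale) for d in predicted_depths]

# An observation at 10 m matches a track predicted at 10 m far better
# than one predicted at 14 m.
aff = depth_order_affinity(10.0, [10.0, 14.0])
```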

cam_to_global(boxes_3d_list, extrinsics)[source]

Convert camera coordinates to global coordinates.

Return type:

list[Tensor]
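The math behind `cam_to_global` is a rigid transform of camera-frame coordinates by a 4x4 extrinsic matrix. A pure-Python sketch for a single point (the real function operates on batched box Tensors and also transforms box orientation and velocity):

```python
def transform_point(extrinsics, point):
    """Apply a 4x4 camera-to-global extrinsic matrix to a 3D point.

    ``extrinsics`` is a row-major 4x4 nested list; ``point`` is (x, y, z)
    in camera coordinates. Illustrative sketch of the coordinate
    conversion, not the vis4d implementation.
    """
    x, y, z = point
    hom = (x, y, z, 1.0)  # homogeneous coordinates
    return [sum(extrinsics[r][c] * hom[c] for c in range(4)) for r in range(3)]

# Identity rotation with a translation of (1, 2, 3):
extrinsics = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]
global_xyz = transform_point(extrinsics, (4.0, 5.0, 6.0))
```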

get_track_3d_out(boxes_3d, class_ids, scores_3d, track_ids)[source]

Get track 3D output.

Parameters:
  • boxes_3d (Tensor) – (N, 12) boxes as (x, y, z, h, w, l, rx, ry, rz, vx, vy, vz).

  • class_ids (Tensor) – (N,)

  • scores_3d (Tensor) – (N,)

  • track_ids (Tensor) – (N,)

Returns:

output

Return type:

Track3DOut
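Conceptually, this step bundles per-detection outputs by track id. A plain-Python illustration of that grouping (the dict structure here is hypothetical; the real function returns a vis4d `Track3DOut` built from Tensors):

```python
def group_by_track(boxes_3d, class_ids, scores_3d, track_ids):
    """Bundle per-detection outputs by track id.

    Illustrative stand-in for the kind of per-track grouping a
    ``Track3DOut`` container holds; not the vis4d implementation.
    """
    out = {}
    for box, cls, score, tid in zip(boxes_3d, class_ids, scores_3d, track_ids):
        out.setdefault(tid, []).append(
            {"box_3d": box, "class_id": cls, "score_3d": score}
        )
    return out

# Two detections belonging to the same track (id 7):
tracks = group_by_track(
    boxes_3d=[[0.0] * 12, [1.0] * 12],
    class_ids=[0, 1],
    scores_3d=[0.9, 0.8],
    track_ids=[7, 7],
)
```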