vis4d.op.track.qdtrack

Quasi-dense embedding similarity based graph.

Functions

get_default_box_matcher()

Get default box matcher of qdtrack.

get_default_box_sampler()

Get default box sampler of qdtrack.

Classes

QDSimilarityHead([proposal_pooler, in_dim, ...])

Instance embedding head for quasi-dense similarity learning.

QDTrackAssociation([init_score_thr, ...])

Data association relying on quasi-dense instance similarity.

QDTrackHead([similarity_head, box_sampler, ...])

QDTrack - quasi-dense instance similarity learning.

QDTrackInstanceSimilarityLoss([softmax_temp])

Instance similarity loss as in QDTrack.

QDTrackInstanceSimilarityLosses(track_loss, ...)

QDTrack losses return type.

QDTrackOut(key_embeddings, ref_embeddings, ...)

Output of QDTrack during training.

class QDSimilarityHead(proposal_pooler=None, in_dim=256, num_convs=4, conv_out_dim=256, conv_has_bias=False, num_fcs=1, fc_out_dim=1024, embedding_dim=256, norm='GroupNorm', num_groups=32, start_level=2)[source]

Instance embedding head for quasi-dense similarity learning.

Given a set of input feature maps and RoIs, pool RoI representations from the feature maps and process them into one embedding vector per RoI.

Creates an instance of the class.

Parameters:
  • proposal_pooler (None | RoIPooler, optional) – RoI pooling module. Defaults to None.

  • in_dim (int, optional) – Input feature dimension. Defaults to 256.

  • num_convs (int, optional) – Number of convolutional layers inside the head. Defaults to 4.

  • conv_out_dim (int, optional) – Output dimension of the last conv layer. Defaults to 256.

  • conv_has_bias (bool, optional) – Whether the conv layers have a bias parameter. Defaults to False.

  • num_fcs (int, optional) – Number of fully connected layers following the conv layers. Defaults to 1.

  • fc_out_dim (int, optional) – Output dimension of the last fully connected layer. Defaults to 1024.

  • embedding_dim (int, optional) – Dimensionality of the output instance embedding. Defaults to 256.

  • norm (str, optional) – Normalization of the layers inside the head. One of BatchNorm2d, GroupNorm. Defaults to “GroupNorm”.

  • num_groups (int, optional) – Number of groups for the GroupNorm normalization. Defaults to 32.

  • start_level (int, optional) – Starting level of the feature maps. Defaults to 2.

__call__(features, boxes)[source]

Type definition.

Return type:

list[Tensor]

forward(features, boxes)[source]

Similarity head forward pass.

Parameters:
  • features (list[Tensor]) – A feature pyramid. The list index represents the level, which has a downsampling ratio of 2^index. features[0] is a feature map at image resolution, not the original image itself.

  • boxes (list[Tensor]) – A list of [N, 4] 2D bounding boxes per batch element.

Returns:

An embedding vector per input box.

Return type:

list[Tensor]
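To make the level indexing concrete, the following sketch computes the spatial size of each pyramid level under the 2^index convention described above. The helper `pyramid_shapes` and the 512x768 input size are illustrative assumptions and not part of vis4d. Rounding odd sizes up is also an assumption.

```python
def pyramid_shapes(image_hw, num_levels):
    """Return the (height, width) of each pyramid level.

    Level i is assumed to be downsampled by 2**i relative to the image,
    so level 0 has the image resolution. Hypothetical helper, not vis4d API.
    """
    height, width = image_hw
    shapes = []
    for level in range(num_levels):
        stride = 2**level
        # Ceiling division: an assumption about how odd sizes are rounded.
        shapes.append((-(-height // stride), -(-width // stride)))
    return shapes


shapes = pyramid_shapes((512, 768), num_levels=6)
# With start_level=2 (the documented default), the first level used has stride 4.
start_level = 2
print(shapes[start_level])
```

Under these assumptions, a box given in image coordinates would be scaled by 1 / 2^index before it lines up with the feature map at that level.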

class QDTrackAssociation(init_score_thr=0.7, obj_score_thr=0.3, match_score_thr=0.5, nms_conf_thr=0.5, nms_backdrop_iou_thr=0.3, nms_class_iou_thr=0.7, with_cats=True)[source]

Data association relying on quasi-dense instance similarity.

This class assigns detection candidates to a given memory of existing tracks and backdrops. Backdrops are low-score detections kept in case they have high similarity with a high-score detection in succeeding frames.

init_score_thr

Confidence threshold for initializing a new track.

obj_score_thr

Confidence threshold above which a detection is considered in the track / detection matching process.

match_score_thr

Similarity score threshold for matching a detection to an existing track.

memo_backdrop_frames

Number of timesteps to keep backdrops.

memo_momentum

Momentum of embedding memory for smoothing embeddings.

nms_backdrop_iou_thr

Maximum IoU of a backdrop with another detection.

nms_class_iou_thr

Maximum IoU of a high-score detection with another detection of a different class.

with_cats

Whether to consider category information for tracking (i.e. all detections within a track must have consistent category labels).

Creates an instance of the class.

__call__(detections, detection_scores, detection_class_ids, detection_embeddings, memory_track_ids=None, memory_class_ids=None, memory_embeddings=None)[source]

Process inputs, match detections with existing tracks.

Parameters:
  • detections (Tensor) – [N, 4] detected boxes.

  • detection_scores (Tensor) – [N,] confidence scores.

  • detection_class_ids (Tensor) – [N,] class indices.

  • detection_embeddings (Tensor) – [N, C] appearance embeddings.

  • memory_track_ids (Tensor) – [M,] track ids in memory.

  • memory_class_ids (Tensor) – [M,] class indices in memory.

  • memory_embeddings (Tensor) – [M, C] appearance embeddings in memory.

Returns:

Track ids of active tracks and the selected detection indices corresponding to those tracks.

Return type:

tuple[Tensor, Tensor]
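The role of the three score thresholds can be illustrated with a deliberately simplified sketch. It is not the vis4d implementation: it uses plain Python lists instead of tensors, plain cosine similarity as a stand-in similarity measure, and it omits NMS, backdrops and the category check. The function names `cosine` and `associate` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def associate(det_scores, det_embs, mem_ids, mem_embs,
              init_score_thr=0.7, obj_score_thr=0.3, match_score_thr=0.5):
    """Greedy matching sketch; returns (track_ids, det_indices).

    Simplified illustration only: no NMS, no backdrops, no class check.
    """
    next_id = max(mem_ids, default=-1) + 1
    taken = set()
    track_ids, det_indices = [], []
    # Visit detections from most to least confident.
    order = sorted(range(len(det_scores)), key=lambda i: -det_scores[i])
    for i in order:
        if det_scores[i] <= obj_score_thr:
            continue  # too weak to take part in matching
        # Best still-unmatched memory entry above the match threshold.
        best_j, best_sim = -1, match_score_thr
        for j, emb in enumerate(mem_embs):
            sim = cosine(det_embs[i], emb)
            if j not in taken and sim > best_sim:
                best_j, best_sim = j, sim
        if best_j >= 0:
            taken.add(best_j)
            track_ids.append(mem_ids[best_j])
            det_indices.append(i)
        elif det_scores[i] > init_score_thr:
            track_ids.append(next_id)  # confident enough to start a new track
            det_indices.append(i)
            next_id += 1
    return track_ids, det_indices


track_ids, det_indices = associate(
    det_scores=[0.9, 0.8, 0.4, 0.1],
    det_embs=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]],
    mem_ids=[5],
    mem_embs=[[0.9, 0.1]],
)
print(track_ids, det_indices)
```

In this toy input, the first detection continues track 5, the second is confident enough to open a new track, and the last two are dropped: one matches nothing and scores below `init_score_thr`, the other falls below `obj_score_thr`.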

class QDTrackHead(similarity_head=None, box_sampler=None, box_matcher=None, proposal_append_gt=True)[source]

QDTrack - quasi-dense instance similarity learning.

Creates an instance of the class.

__call__(features, det_boxes, target_boxes=None, target_track_ids=None)[source]

Type definition for call implementation.

Return type:

QDTrackOut

forward(features, det_boxes, target_boxes=None, target_track_ids=None)[source]

Forward function.

Return type:

QDTrackOut

class QDTrackInstanceSimilarityLoss(softmax_temp=-1)[source]

Instance similarity loss as in QDTrack.

Given a number of key frame embeddings and a number of reference frame embeddings along with their track identities, compute two losses:
  1. Multi-positive cross-entropy loss.
  2. Cosine similarity loss (auxiliary).

Creates an instance of the class.

Parameters:

softmax_temp (float, optional) – Temperature parameter for multi-positive cross-entropy loss. Defaults to -1.

__call__(key_embeddings, ref_embeddings, key_track_ids, ref_track_ids)[source]

Type definition.

Return type:

QDTrackInstanceSimilarityLosses

forward(key_embeddings, ref_embeddings, key_track_ids, ref_track_ids)[source]

The QDTrack instance similarity loss.

Key inputs are of type list[Tensor/Boxes2D], where each list has length N. Ref inputs are of type list[list[Tensor/Boxes2D]], where the lists are of length MxN. M is the number of reference views and N is the number of batch elements.

NOTE: This only works if key contains only positives and all negatives in ref have track_id -1.

Parameters:
  • key_embeddings (list[Tensor]) – key frame embeddings.

  • ref_embeddings (list[list[Tensor]]) – reference frame embeddings.

  • key_track_ids (list[Tensor]) – associated track ids per embedding in key frame.

  • ref_track_ids (list[list[Tensor]]) – associated track ids per embedding in reference frame(s).

Returns:

Scalar loss tensors.

Return type:

QDTrackInstanceSimilarityLosses
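The interaction between track ids and the multi-positive cross-entropy term can be sketched as follows. This is a minimal illustration, assuming the common form log(1 + sum over positive/negative pairs of exp(s_neg - s_pos)) and a mean over key embeddings. It uses precomputed similarity scores in plain Python lists, ignores `softmax_temp` and the auxiliary loss, and the names `match_targets` and `multi_pos_cross_entropy` are hypothetical rather than taken from this page.

```python
import math


def match_targets(key_track_ids, ref_track_ids):
    """targets[i][j] is True when key i and reference j share a track id.

    Because keys are assumed to hold only positives (ids >= 0), a reference
    with track id -1 never matches and therefore acts as a negative.
    """
    return [[k == r for r in ref_track_ids] for k in key_track_ids]


def multi_pos_cross_entropy(sims, is_positive):
    """log(1 + sum over (pos, neg) pairs of exp(s_neg - s_pos)) for one key."""
    pos = [s for s, p in zip(sims, is_positive) if p]
    neg = [s for s, p in zip(sims, is_positive) if not p]
    return math.log1p(sum(math.exp(n - p) for p in pos for n in neg))


key_ids = [3, 7]
ref_ids = [7, -1, 3]
# Made-up similarity scores: one row per key, one column per reference.
sims = [[0.1, 0.2, 2.0], [1.5, 0.0, -0.5]]

targets = match_targets(key_ids, ref_ids)
per_key = [multi_pos_cross_entropy(s, t) for s, t in zip(sims, targets)]
track_loss = sum(per_key) / len(per_key)  # mean reduction is an assumption
print(targets, track_loss)
```

The loss for a key shrinks as its positive similarities rise above its negative ones, and it is zero for a key with no negatives or no positives.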

class QDTrackInstanceSimilarityLosses(track_loss: Tensor, track_loss_aux: Tensor)[source]

QDTrack losses return type. Consists of two scalar loss tensors.

Create new instance of QDTrackInstanceSimilarityLosses(track_loss, track_loss_aux)

track_loss: Tensor

Alias for field number 0

track_loss_aux: Tensor

Alias for field number 1
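The "Alias for field number" entries indicate named-tuple semantics: each field can be read by name or by position. The stand-in class below mirrors the two documented fields with floats instead of tensors so that it runs on its own. `Losses` is a hypothetical name, and summing the two terms is only one possible way to combine them.

```python
from typing import NamedTuple


class Losses(NamedTuple):
    """Stand-in mirroring the two documented fields (floats, not Tensors)."""

    track_loss: float
    track_loss_aux: float


losses = Losses(track_loss=0.75, track_loss_aux=0.25)

# Access by name and by position refers to the same values.
same_value = losses.track_loss == losses[0]

# Tuple unpacking follows the field order.
main, aux = losses
total = main + aux
print(same_value, total)
```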

class QDTrackOut(key_embeddings: list[Tensor], ref_embeddings: list[list[Tensor]] | None, key_track_ids: list[Tensor] | None, ref_track_ids: list[list[Tensor]] | None)[source]

Output of QDTrack during training.

Create new instance of QDTrackOut(key_embeddings, ref_embeddings, key_track_ids, ref_track_ids)

key_embeddings: list[Tensor]

Alias for field number 0

key_track_ids: list[Tensor] | None

Alias for field number 2

ref_embeddings: list[list[Tensor]] | None

Alias for field number 1

ref_track_ids: list[list[Tensor]] | None

Alias for field number 3
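Since three of the four fields may be None, consumers need to guard before iterating the nested reference lists. The sketch below uses a stand-in named tuple with the documented field order and plain lists in place of tensors. The name `Out`, the helper `count_ref_entries`, and the assumed layout (outer list over reference views, inner list over batch elements) are illustrative assumptions.

```python
from typing import NamedTuple, Optional


class Out(NamedTuple):
    """Stand-in with the documented field order; lists replace Tensors."""

    key_embeddings: list
    ref_embeddings: Optional[list]
    key_track_ids: Optional[list]
    ref_track_ids: Optional[list]


def count_ref_entries(out):
    """Count (reference view, batch element) pairs, or 0 without references."""
    if out.ref_embeddings is None:
        return 0
    # Assumed layout: outer list over M views, inner list over N batch elements.
    return sum(len(per_view) for per_view in out.ref_embeddings)


# M = 3 reference views, N = 2 batch elements, 1-dimensional toy embeddings.
with_refs = Out(
    key_embeddings=[[0.1], [0.2]],
    ref_embeddings=[[[0.3], [0.4]], [[0.5], [0.6]], [[0.7], [0.8]]],
    key_track_ids=[[1], [2]],
    ref_track_ids=[[[1], [2]], [[1], [2]], [[1], [2]]],
)
without_refs = Out([[0.1]], None, None, None)

print(count_ref_entries(with_refs), count_ref_entries(without_refs))
```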

get_default_box_matcher()[source]

Get default box matcher of qdtrack.

Return type:

MaxIoUMatcher

get_default_box_sampler()[source]

Get default box sampler of qdtrack.

Return type:

CombinedSampler