vis4d.op.box.encoder.delta_xywh

XYWH Delta coder for 2D boxes.

Modified from mmdetection (https://github.com/open-mmlab/mmdetection).

Functions

bbox2delta(proposals, gt_boxes[, means, stds])

Compute deltas of proposals w.r.t. gt.

delta2bbox(rois, deltas[, means, stds, ...])

Apply deltas to shift/scale base boxes.

Classes

DeltaXYWHBBoxDecoder([target_means, ...])

Delta XYWH BBox decoder.

DeltaXYWHBBoxEncoder([target_means, target_stds])

Delta XYWH BBox encoder.

class DeltaXYWHBBoxDecoder(target_means=(0.0, 0.0, 0.0, 0.0), target_stds=(1.0, 1.0, 1.0, 1.0), wh_ratio_clip=0.016)[source]

Delta XYWH BBox decoder.

Following the practice in R-CNN, it decodes delta (dx, dy, dw, dh) back to original bbox (x1, y1, x2, y2).

Creates an instance of the class.

Parameters:
  • target_means (tuple, optional) – Denormalizing means of target for delta coordinates. Defaults to (0.0, 0.0, 0.0, 0.0).

  • target_stds (tuple, optional) – Denormalizing standard deviation of target for delta coordinates. Defaults to (1.0, 1.0, 1.0, 1.0).

  • wh_ratio_clip (float, optional) – Maximum aspect ratio for boxes. Defaults to 16/1000.

__call__(boxes, box_deltas)[source]

Apply the box offsets box_deltas to the base boxes.

Parameters:
  • boxes (Tensor) – Basic boxes. Shape (B, N, 4) or (N, 4)

  • box_deltas (Tensor) – Encoded offsets with respect to each box. Has shape (B, N, num_classes * 4), (B, N, 4), (N, num_classes * 4), or (N, 4). Note that N = num_anchors * W * H when boxes is a grid of anchors. Offset encoding follows [1].

Returns:

Decoded boxes.

Return type:

Tensor
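
The decoding step can be sketched in plain NumPy. This is a hedged illustration of the delta-to-box math described above, not the vis4d implementation, which operates on torch.Tensor and also handles the batched and per-class shapes:

```python
import numpy as np

def decode_deltas(boxes, deltas, means=(0., 0., 0., 0.),
                  stds=(1., 1., 1., 1.), wh_ratio_clip=0.016):
    """Shift/scale (x1, y1, x2, y2) boxes by (dx, dy, dw, dh) deltas."""
    # Denormalize the network outputs.
    deltas = deltas * np.asarray(stds) + np.asarray(means)
    dx, dy, dw, dh = deltas.T
    # Clamp dw/dh so exp() cannot produce extreme aspect ratios.
    max_ratio = abs(np.log(wh_ratio_clip))
    dw = np.clip(dw, -max_ratio, max_ratio)
    dh = np.clip(dh, -max_ratio, max_ratio)
    # Base box centers and sizes.
    px = (boxes[:, 0] + boxes[:, 2]) * 0.5
    py = (boxes[:, 1] + boxes[:, 3]) * 0.5
    pw = boxes[:, 2] - boxes[:, 0]
    ph = boxes[:, 3] - boxes[:, 1]
    # Shift the centers, scale the sizes, convert back to corners.
    gx = px + pw * dx
    gy = py + ph * dy
    gw = pw * np.exp(dw)
    gh = ph * np.exp(dh)
    return np.stack([gx - gw * 0.5, gy - gh * 0.5,
                     gx + gw * 0.5, gy + gh * 0.5], axis=-1)

boxes = np.array([[0., 0., 10., 10.]])
# Zero deltas leave the box unchanged; dx = 0.1 shifts the center by 0.1 * w.
print(decode_deltas(boxes, np.array([[0.1, 0., 0., 0.]])))  # [[1. 0. 11. 10.]]
```

Note how dx and dy are scaled by the base box size before shifting, so deltas are scale-invariant across anchors of different sizes.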

class DeltaXYWHBBoxEncoder(target_means=(0.0, 0.0, 0.0, 0.0), target_stds=(1.0, 1.0, 1.0, 1.0))[source]

Delta XYWH BBox encoder.

Following the practice in R-CNN, it encodes bbox (x1, y1, x2, y2) into delta (dx, dy, dw, dh).

Creates an instance of the class.

Parameters:
  • target_means (tuple, optional) – Denormalizing means of target for delta coordinates. Defaults to (0.0, 0.0, 0.0, 0.0).

  • target_stds (tuple, optional) – Denormalizing standard deviation of target for delta coordinates. Defaults to (1.0, 1.0, 1.0, 1.0).

__call__(boxes, targets)[source]

Get box regression transformation deltas.

Used to transform target boxes into target regression parameters.

Parameters:
  • boxes (Tensor) – Source boxes, e.g., object proposals.

  • targets (Tensor) – Target of the transformation, e.g., ground-truth boxes.

Returns:

Box transformation deltas

Return type:

Tensor

bbox2delta(proposals, gt_boxes, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0))[source]

Compute deltas of proposals w.r.t. gt.

We usually compute the deltas of x, y, w, h of proposals w.r.t. ground-truth boxes to obtain regression targets. This is the inverse function of delta2bbox().

Parameters:
  • proposals (Tensor) – Boxes to be transformed, shape (N, …, 4).

  • gt_boxes (Tensor) – Gt boxes to be used as base, shape (N, …, 4).

  • means (Sequence[float]) – Denormalizing means for delta coordinates.

  • stds (Sequence[float]) – Denormalizing standard deviation for delta coordinates.

Returns:

Deltas with shape (N, 4), where the columns represent dx, dy, dw, dh.

Return type:

Tensor
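
The encoding side can likewise be sketched in plain NumPy. This is a hedged illustration of the bbox2delta math, not the vis4d implementation, which operates on torch.Tensor:

```python
import numpy as np

def encode_deltas(proposals, gt_boxes, means=(0., 0., 0., 0.),
                  stds=(1., 1., 1., 1.)):
    """Compute (dx, dy, dw, dh) regression targets for proposals w.r.t. gt."""
    # Proposal centers and sizes.
    px = (proposals[:, 0] + proposals[:, 2]) * 0.5
    py = (proposals[:, 1] + proposals[:, 3]) * 0.5
    pw = proposals[:, 2] - proposals[:, 0]
    ph = proposals[:, 3] - proposals[:, 1]
    # Ground-truth centers and sizes.
    gx = (gt_boxes[:, 0] + gt_boxes[:, 2]) * 0.5
    gy = (gt_boxes[:, 1] + gt_boxes[:, 3]) * 0.5
    gw = gt_boxes[:, 2] - gt_boxes[:, 0]
    gh = gt_boxes[:, 3] - gt_boxes[:, 1]
    # Relative center offsets and log-space size ratios.
    deltas = np.stack([(gx - px) / pw, (gy - py) / ph,
                       np.log(gw / pw), np.log(gh / ph)], axis=-1)
    # Normalize so the regression targets are roughly unit-variance.
    return (deltas - np.asarray(means)) / np.asarray(stds)

proposals = np.array([[0., 0., 10., 10.]])
gt_boxes = np.array([[1., 0., 11., 10.]])
print(encode_deltas(proposals, gt_boxes))  # [[0.1 0. 0. 0.]]
```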

delta2bbox(rois, deltas, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0), wh_ratio_clip=0.016)[source]

Apply deltas to shift/scale base boxes.

Typically the rois are anchor or proposed bounding boxes and the deltas are network outputs used to shift/scale those boxes. This is the inverse function of bbox2delta().

Parameters:
  • rois (Tensor) – Boxes to be transformed. Has shape (N, 4).

  • deltas (Tensor) – Encoded offsets relative to each roi. Has shape (N, num_classes * 4) or (N, 4). Note that N = num_base_anchors * W * H when rois is a grid of anchors. Offset encoding follows [1].

  • means (Sequence[float]) – Denormalizing means for delta coordinates. Default (0., 0., 0., 0.).

  • stds (Sequence[float]) – Denormalizing standard deviation for delta coordinates. Default (1., 1., 1., 1.).

  • wh_ratio_clip (float) – Maximum aspect ratio for boxes. Default 16 / 1000.

Returns:

Boxes with shape (N, num_classes * 4) or (N, 4), where the columns represent tl_x, tl_y, br_x, br_y.

Return type:

Tensor
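
Since delta2bbox() is the inverse of bbox2delta(), encoding and then decoding with the same means/stds recovers the original boxes. A compact NumPy round-trip sketch under that assumption (wh_ratio_clip clipping is omitted here, since the deltas stay well within range):

```python
import numpy as np

def bbox2delta_np(proposals, gt, means=(0., 0., 0., 0.), stds=(1., 1., 1., 1.)):
    # Centers and sizes of proposals and ground truth.
    px, py = (proposals[:, 0] + proposals[:, 2]) / 2, (proposals[:, 1] + proposals[:, 3]) / 2
    pw, ph = proposals[:, 2] - proposals[:, 0], proposals[:, 3] - proposals[:, 1]
    gx, gy = (gt[:, 0] + gt[:, 2]) / 2, (gt[:, 1] + gt[:, 3]) / 2
    gw, gh = gt[:, 2] - gt[:, 0], gt[:, 3] - gt[:, 1]
    d = np.stack([(gx - px) / pw, (gy - py) / ph,
                  np.log(gw / pw), np.log(gh / ph)], axis=-1)
    return (d - np.asarray(means)) / np.asarray(stds)

def delta2bbox_np(rois, deltas, means=(0., 0., 0., 0.), stds=(1., 1., 1., 1.)):
    # Denormalize, then shift centers and scale sizes.
    d = deltas * np.asarray(stds) + np.asarray(means)
    px, py = (rois[:, 0] + rois[:, 2]) / 2, (rois[:, 1] + rois[:, 3]) / 2
    pw, ph = rois[:, 2] - rois[:, 0], rois[:, 3] - rois[:, 1]
    gx, gy = px + pw * d[:, 0], py + ph * d[:, 1]
    gw, gh = pw * np.exp(d[:, 2]), ph * np.exp(d[:, 3])
    return np.stack([gx - gw / 2, gy - gh / 2, gx + gw / 2, gy + gh / 2], axis=-1)

rois = np.array([[0., 0., 10., 10.], [5., 5., 20., 30.]])
gt = np.array([[2., 1., 12., 9.], [4., 8., 18., 32.]])
stds = (0.1, 0.1, 0.2, 0.2)  # typical non-unit stds used in R-CNN heads
recovered = delta2bbox_np(rois, bbox2delta_np(rois, gt, stds=stds), stds=stds)
print(np.allclose(recovered, gt))  # True
```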

References

[1] Girshick et al., Rich feature hierarchies for accurate object detection and semantic segmentation, https://arxiv.org/abs/1311.2524