vis4d.op.box.anchor

Anchor and point generators.

class AnchorGenerator(strides, ratios, scales=None, base_sizes=None, scale_major=True, octave_base_scale=None, scales_per_octave=None, centers=None, center_offset=0.0)[source]

Standard anchor generator for 2D anchor-based detectors.

Examples

>>> from vis4d.op.box.anchor import AnchorGenerator
>>> self = AnchorGenerator([16], [1.], [1.], [9])
>>> all_anchors = self.grid_priors([(2, 2)], device='cpu')
>>> print(all_anchors)
[tensor([[-4.5000, -4.5000,  4.5000,  4.5000],
        [11.5000, -4.5000, 20.5000,  4.5000],
        [-4.5000, 11.5000,  4.5000, 20.5000],
        [11.5000, 11.5000, 20.5000, 20.5000]])]
>>> self = AnchorGenerator([16, 32], [1.], [1.], [9, 18])
>>> all_anchors = self.grid_priors([(2, 2), (1, 1)], device='cpu')
>>> print(all_anchors)
[tensor([[-4.5000, -4.5000,  4.5000,  4.5000],
        [11.5000, -4.5000, 20.5000,  4.5000],
        [-4.5000, 11.5000,  4.5000, 20.5000],
        [11.5000, 11.5000, 20.5000, 20.5000]]), tensor([[-9., -9., 9., 9.]])]

Creates an instance of the class.

Parameters:
  • strides (list[int] | list[tuple[int, int]]) – Strides of anchors in multiple feature levels in order (w, h).

  • ratios (list[float]) – The list of ratios between the height and width of anchors in a single level.

  • scales (list[int] | None) – Anchor scales for anchors in a single level. It cannot be set together with octave_base_scale and scales_per_octave.

  • base_sizes (list[int] | None) – The basic sizes of anchors in multiple levels. If None is given, strides will be used as base_sizes. (If strides are non square, the shortest stride is taken.)

  • scale_major (bool) – Whether to multiply scales first when generating base anchors. If True, the anchors in the same row will have the same scales. By default it is True in V2.0.

  • octave_base_scale (int) – The base scale of octave.

  • scales_per_octave (int) – Number of scales for each octave. octave_base_scale and scales_per_octave are usually used in RetinaNet, and scales should be None when they are set.

  • centers (list[tuple[float, float]] | None) – The centers of the anchors relative to the feature grid center in multiple feature levels. Defaults to None, in which case it is not used. If a list of float tuples is given, they are used to shift the centers of the anchors.

  • center_offset (float) – The offset of center in proportion to anchors’ width and height. By default it is 0 in V2.0.
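The parameter list does not spell out how octave_base_scale and scales_per_octave turn into scales. A minimal sketch, assuming the RetinaNet-style convention in which one octave (a doubling of size) is split into scales_per_octave geometric steps; the helper name octave_scales is hypothetical and is not part of vis4d:

```python
def octave_scales(octave_base_scale, scales_per_octave):
    """Return scales spaced geometrically within one octave.

    Assumption: scale_i = octave_base_scale * 2 ** (i / scales_per_octave).
    """
    return [
        octave_base_scale * 2 ** (i / scales_per_octave)
        for i in range(scales_per_octave)
    ]


# Three scales per octave, starting from a base scale of 4.
scales = octave_scales(octave_base_scale=4, scales_per_octave=3)
print([round(s, 3) for s in scales])
```

Under this assumption the last scale stays below twice the base scale, so the next feature level (with a doubled stride) continues the progression without a gap.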

__repr__()[source]

Return a string that describes the module.

Return type:

str

gen_base_anchors()[source]

Generate base anchors.

Returns:

Base anchors of a feature grid in multiple feature levels.

Return type:

list[torch.Tensor]

gen_single_level_base_anchors(base_size, scales, ratios, center=None)[source]

Generate base anchors of a single level.

Parameters:
  • base_size (int) – Basic size of an anchor.

  • scales (Tensor) – Scales of the anchor.

  • ratios (Tensor) – The ratios between the height and width of anchors in a single level.

  • center (tuple[float], optional) – The center of the base anchor relative to a single feature grid. Defaults to None.

Returns:

Anchors of a single-level feature map.

Return type:

Tensor
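To make the roles of base_size, scales and ratios concrete, here is a plain-Python sketch of the arithmetic. It assumes the common convention that a ratio r = height / width rescales the height by sqrt(r) and the width by 1 / sqrt(r), so the anchor area is preserved. The function name and the ordering of the returned anchors are illustrative, not the library's code:

```python
import math


def base_anchors_sketch(base_size, scales, ratios, center_offset=0.0):
    """Return [x1, y1, x2, y2] boxes centred at center_offset * base_size."""
    x_center = center_offset * base_size
    y_center = center_offset * base_size
    anchors = []
    for ratio in ratios:
        h_ratio = math.sqrt(ratio)  # ratio is height / width (assumed)
        w_ratio = 1.0 / h_ratio
        for scale in scales:
            w = base_size * w_ratio * scale
            h = base_size * h_ratio * scale
            anchors.append(
                [
                    x_center - 0.5 * w,
                    y_center - 0.5 * h,
                    x_center + 0.5 * w,
                    y_center + 0.5 * h,
                ]
            )
    return anchors


# base_size=9, one scale and one ratio, as in the class example.
print(base_anchors_sketch(9, scales=[1.0], ratios=[1.0]))
```

With base_size=9, a single scale of 1 and a single ratio of 1, this reproduces the box [-4.5, -4.5, 4.5, 4.5] that appears as the first anchor in the class example.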

grid_priors(featmap_sizes, dtype=torch.float32, device=device(type='cpu'))[source]

Generate grid anchors in multiple feature levels.

Parameters:
  • featmap_sizes (list[tuple]) – List of feature map sizes in multiple feature levels.

  • dtype (torch.dtype) – Dtype of priors. Default: torch.float32.

  • device (torch.device) – The device where the anchors will be put on.

Returns:

Anchors in multiple feature levels. The size of each tensor should be [N, 4], where N = width * height * num_base_anchors, width and height are the sizes of the corresponding feature level, and num_base_anchors is the number of anchors for that level.

Return type:

list[Tensor]
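The grid anchors of one level are the base anchors shifted to every cell of the feature map. The plain-Python sketch below shows that shifting for a single level; the function name grid_anchors_sketch is hypothetical, and the (H, W) order of featmap_size and the x-fastest ordering are assumptions inferred from the class example:

```python
def grid_anchors_sketch(base_anchors, featmap_size, stride):
    """Shift every base anchor to every feature-map cell.

    featmap_size is (H, W) and stride is (stride_w, stride_h); both
    orderings are assumptions for this sketch.
    """
    feat_h, feat_w = featmap_size
    stride_w, stride_h = stride
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):  # x varies fastest
            shift_x, shift_y = x * stride_w, y * stride_h
            for x1, y1, x2, y2 in base_anchors:
                anchors.append(
                    [x1 + shift_x, y1 + shift_y, x2 + shift_x, y2 + shift_y]
                )
    return anchors


base = [[-4.5, -4.5, 4.5, 4.5]]
for anchor in grid_anchors_sketch(base, featmap_size=(2, 2), stride=(16, 16)):
    print(anchor)
```

For a 2 x 2 feature map with stride 16 this yields the same four boxes as the first level of the class example, and the result has width * height * num_base_anchors rows.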

single_level_grid_priors(featmap_size, level_idx, dtype=torch.float32, device=device(type='cpu'))[source]

Generate grid anchors of a single level.

Parameters:
  • featmap_size (tuple[int, int]) – Size of the feature maps.

  • level_idx (int) – The index of corresponding feature map level.

  • dtype (torch.dtype, optional) – Data type of points. Defaults to torch.float32.

  • device (torch.device) – The device the tensor will be put on.

Returns:

Anchors in the overall feature maps.

Return type:

Tensor

property num_base_priors: list[int]

The number of priors at a point on the feature grid.

Type:

list[int]

property num_levels: int

Number of feature levels that the generator will be applied to.

Type:

int

class MlvlPointGenerator(strides, offset=0.5)[source]

Standard points generator for multi-level feature maps.

Used for 2D points-based detectors.

Parameters:
  • strides (list[int] | list[tuple[int, int]]) – Strides of anchors in multiple feature levels in order (w, h).

  • offset (float) – The offset of points, normalized by the corresponding stride. Defaults to 0.5.

Init.

grid_priors(featmap_sizes, dtype=torch.float32, device=device(type='cuda'), with_stride=False)[source]

Generate grid points of multiple feature levels.

Parameters:
  • featmap_sizes (list[tuple[int, int]]) – List of feature map sizes in multiple feature levels, each (H, W).

  • dtype (torch.dtype) – Dtype of priors. Defaults to torch.float32.

  • device (torch.device) – The device where the points will be put on. Defaults to torch.device("cuda").

  • with_stride (bool) – Whether to concatenate the stride to the last dimension of points. Defaults to False.

Returns:

Points of multiple feature levels.

The size of each tensor should be (N, 2) when with_stride is False, where N = width * height, width and height are the sizes of the corresponding feature level, and the last dimension 2 represents (coord_x, coord_y). Otherwise the shape should be (N, 4), and the last dimension 4 represents (coord_x, coord_y, stride_w, stride_h).

Return type:

list[torch.Tensor]
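The point coordinates follow from the stride and the normalized offset. A plain-Python sketch for one level, assuming each point is placed at (index + offset) * stride; the function name grid_points_sketch is hypothetical and the real method returns tensors rather than lists:

```python
def grid_points_sketch(featmap_size, stride, offset=0.5, with_stride=False):
    """Return one point per feature-map cell, x varying fastest.

    featmap_size is (H, W); stride is (stride_w, stride_h).
    """
    feat_h, feat_w = featmap_size
    stride_w, stride_h = stride
    points = []
    for y in range(feat_h):
        for x in range(feat_w):
            point = [(x + offset) * stride_w, (y + offset) * stride_h]
            if with_stride:
                # Append (stride_w, stride_h) to get an (N, 4) layout.
                point += [stride_w, stride_h]
            points.append(point)
    return points


print(grid_points_sketch((1, 2), stride=(8, 8)))
print(grid_points_sketch((1, 2), stride=(8, 8), with_stride=True))
```

With offset=0.5 every point sits at the centre of its stride-sized cell in image coordinates.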

single_level_grid_priors(featmap_size, level_idx, dtype=torch.float32, device=device(type='cuda'), with_stride=False)[source]

Generate grid points of a single level.

Note

This function is usually called by method self.grid_priors.

Parameters:
  • featmap_size (tuple[int, int]) – Size of the feature maps, (H, W).

  • level_idx (int) – The index of corresponding feature map level.

  • dtype (torch.dtype) – Dtype of priors. Defaults to torch.float32.

  • device (torch.device) – The device where the tensors will be put on. Defaults to torch.device("cuda").

  • with_stride (bool) – Whether to concatenate the stride to the last dimension of points. Defaults to False.

Returns:

Points of a single feature level.

The shape of the tensor should be (N, 2) when with_stride is False, where N = width * height, width and height are the sizes of the corresponding feature level, and the last dimension 2 represents (coord_x, coord_y). Otherwise the shape should be (N, 4), and the last dimension 4 represents (coord_x, coord_y, stride_w, stride_h).

Return type:

Tensor

single_level_valid_flags(featmap_size, valid_size, device=device(type='cuda'))[source]

Generate the valid flags of points of a single feature map.

Parameters:
  • featmap_size (tuple[int, int]) – The size of feature maps, (H, W).

  • valid_size (tuple[int, int]) – The valid size of the feature maps, (H, W).

  • device (torch.device, optional) – The device where the flags will be put on. Defaults to torch.device("cuda").

Returns:

The valid flags of each point in a single-level feature map.

Return type:

torch.Tensor
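As an illustration of what the flags encode, the sketch below marks a point as valid when both its row and column index fall inside valid_size. It is a plain-Python approximation under that assumption, with a hypothetical function name, and it returns a flat list instead of a tensor:

```python
def valid_flags_sketch(featmap_size, valid_size):
    """Return one flag per feature-map cell, flattened with x fastest.

    Both sizes are (H, W). A cell is valid if it lies in the top-left
    valid_h x valid_w region.
    """
    feat_h, feat_w = featmap_size
    valid_h, valid_w = valid_size
    return [
        y < valid_h and x < valid_w
        for y in range(feat_h)
        for x in range(feat_w)
    ]


# A 2 x 3 feature map in which only the first row and first two columns are valid.
print(valid_flags_sketch((2, 3), (1, 2)))
```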

valid_flags(featmap_sizes, pad_shape, device=device(type='cuda'))[source]

Generate valid flags of points of multiple feature levels.

Parameters:
  • featmap_sizes (list[tuple[int, int]]) – List of feature map sizes in multiple feature levels, each (H, W).

  • pad_shape (tuple[int, int]) – The padded shape of the image, (H, W).

  • device (torch.device) – The device where the flags will be put on. Defaults to torch.device("cuda").

Returns:

Valid flags of points of multiple levels.

Return type:

list[torch.Tensor]
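How pad_shape maps to a per-level valid size is not stated here. One plausible reading, shown as a sketch only, divides the padded image size by the level's stride, rounds up, and clips to the feature map size. The rounding rule and the helper name valid_size_sketch are assumptions:

```python
import math


def valid_size_sketch(featmap_size, pad_shape, stride):
    """Return the assumed valid (H, W) of one feature level.

    featmap_size and pad_shape are (H, W); stride is (stride_w, stride_h).
    """
    feat_h, feat_w = featmap_size
    pad_h, pad_w = pad_shape
    stride_w, stride_h = stride
    valid_h = min(math.ceil(pad_h / stride_h), feat_h)
    valid_w = min(math.ceil(pad_w / stride_w), feat_w)
    return valid_h, valid_w


# A 20 x 30 padded image on a 4 x 4 feature map with stride 8.
print(valid_size_sketch((4, 4), pad_shape=(20, 30), stride=(8, 8)))
```

The result could then be passed as valid_size to single_level_valid_flags for the corresponding level.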

property num_base_priors: list[int]

The number of points generated at each location on the feature grid.

property num_levels: int

Number of feature levels.

anchor_inside_image(flat_anchors, img_shape, allowed_border=0)[source]

Check whether the anchors are inside the border.

Parameters:
  • flat_anchors (Tensor) – Flattened anchors, shape (n, 4).

  • img_shape (tuple(int)) – Shape of current image.

  • allowed_border (int) – The border to allow the valid anchor. Defaults to 0.

Returns:

Flags indicating whether the anchors are inside a valid range.

Return type:

Tensor
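The check can be pictured as four comparisons per anchor. The sketch below assumes that img_shape starts with (H, W), that anchors are [x1, y1, x2, y2], and that the lower bounds are inclusive while the upper bounds are strict. These conventions and the function name are assumptions, so the library's exact boundary handling may differ:

```python
def inside_image_sketch(flat_anchors, img_shape, allowed_border=0):
    """Return one flag per anchor: True if it lies within the image
    extended by allowed_border pixels on every side."""
    img_h, img_w = img_shape[:2]
    return [
        x1 >= -allowed_border
        and y1 >= -allowed_border
        and x2 < img_w + allowed_border
        and y2 < img_h + allowed_border
        for x1, y1, x2, y2 in flat_anchors
    ]


anchors = [[-4.5, -4.5, 4.5, 4.5], [2.0, 2.0, 10.0, 10.0]]
print(inside_image_sketch(anchors, img_shape=(32, 32)))
print(inside_image_sketch(anchors, img_shape=(32, 32), allowed_border=5))
```

The first anchor crosses the top-left corner, so it is only accepted once allowed_border is large enough to cover the overhang.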

Modules

vis4d.op.box.anchor.anchor_generator

Anchor generator for 2D bounding boxes.

vis4d.op.box.anchor.point_generator

Point generator for 2D bounding boxes.

vis4d.op.box.anchor.util

Anchor utils.