vis4d.op.detect.rpn

Faster RCNN RPN Head.

Functions

get_default_rpn_box_codec([target_means, ...])

Get the default bounding box encoder and decoder for RPN.

Classes

RPN2RoI(anchor_generator[, box_decoder, ...])

Generate Proposals (RoIs) from RPN network output.

RPNHead(num_anchors[, num_convs, ...])

Faster RCNN RPN Head.

RPNLoss(anchor_generator, box_encoder[, ...])

Loss of region proposal network.

RPNLosses(rpn_loss_cls, rpn_loss_bbox)

RPN loss container.

RPNOut(cls, box)

Output of RPN head.

class RPN2RoI(anchor_generator, box_decoder=None, num_proposals_pre_nms_train=2000, num_proposals_pre_nms_test=1000, max_per_img=1000, proposal_nms_threshold=0.7, min_proposal_size=(0, 0))

Generate Proposals (RoIs) from RPN network output.

This class acts as a stateless functor that does the following:

1. Create the anchor grid for the feature grids (classification and regression outputs) at all scales.

2. For each image:

  a. For each level, take a top-k pre-selection of the flattened classification scores and box energies from the feature output before NMS.

  b. Decode the class scores and box energies into proposal boxes and apply NMS.

3. Return the proposal boxes for all images.
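The pre-selection and NMS steps can be illustrated with a minimal sketch in plain Python. It is not the vis4d implementation: the helper names iou and topk_then_nms are hypothetical, the boxes are assumed to be already decoded into (x1, y1, x2, y2) form, and the greedy NMS shown here is the textbook variant. How the library batches levels and images is not specified on this page.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def topk_then_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    pre_nms_topk: int,
    nms_threshold: float,
    max_out: int,
) -> List[int]:
    """Return indices of kept boxes (hypothetical helper, not a vis4d API)."""
    # Pre-selection: keep only the highest-scoring candidates.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    order = order[:pre_nms_topk]
    keep: List[int] = []
    # Greedy NMS: accept a box unless it overlaps an accepted box too much.
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in keep):
            keep.append(i)
        if len(keep) == max_out:
            break
    return keep


boxes = [(0, 0, 10, 10), (0.5, 0.5, 10.5, 10.5), (20, 20, 30, 30), (0, 0, 10, 10.5)]
scores = [0.9, 0.8, 0.7, 0.1]
kept = topk_then_nms(boxes, scores, pre_nms_topk=3, nms_threshold=0.7, max_out=1000)
```

In this toy input the lowest-scoring box is dropped by the top-k cut before NMS runs, which is the role of num_proposals_pre_nms_train and num_proposals_pre_nms_test. max_out plays the role of max_per_img.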

Creates an instance of the class.

Parameters:
  • anchor_generator (AnchorGenerator) – Creates the anchor grid that serves as the prior for bounding box regression.

  • box_decoder (DeltaXYWHBBoxDecoder, optional) – Decodes box energies predicted by the network into 2D bounding box parameters. Defaults to None. If None, uses the default decoder.

  • num_proposals_pre_nms_train (int, optional) – How many boxes are kept prior to NMS during training. Defaults to 2000.

  • num_proposals_pre_nms_test (int, optional) – How many boxes are kept prior to NMS during inference. Defaults to 1000.

  • max_per_img (int, optional) – Maximum boxes per image. Defaults to 1000.

  • proposal_nms_threshold (float, optional) – NMS threshold on proposal boxes. Defaults to 0.7.

  • min_proposal_size (tuple[int, int], optional) – Minimum size of a proposal box. Defaults to (0, 0).

forward(class_outs, regression_outs, images_hw)

Compute proposals from RPN network outputs.

Generate the anchor grid for all scales. For each batch element, compute classification, regression and anchor pairs for all scales, decode those pairs into proposals, and post-process them with NMS.

Parameters:
  • class_outs (list[torch.Tensor]) – [N, 1 * A, H, W] per scale.

  • regression_outs (list[torch.Tensor]) – [N, 4 * A, H, W] per scale.

  • images_hw (list[tuple[int, int]]) – Image sizes as (height, width), one per image.

Returns:

proposal boxes and scores.

Return type:

Proposals
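To make the [N, 1 * A, H, W] and [N, 4 * A, H, W] notation concrete, the following sketch computes the expected per-scale shapes. The function names and the stride values are assumptions for illustration. The actual strides depend on the backbone and feature pyramid in use, which this page does not state.

```python
import math
from typing import List, Tuple

Shape = Tuple[int, int, int, int]


def rpn_output_shapes(
    batch_size: int,
    num_anchors: int,
    image_hw: Tuple[int, int],
    strides: Tuple[int, ...],
) -> List[Tuple[Shape, Shape]]:
    """Return (class_out shape, regression_out shape) per scale (hypothetical helper)."""
    shapes = []
    for stride in strides:
        # Assumption: each level downsamples the batched image by `stride`.
        h = math.ceil(image_hw[0] / stride)
        w = math.ceil(image_hw[1] / stride)
        cls_shape = (batch_size, 1 * num_anchors, h, w)  # one score per anchor
        reg_shape = (batch_size, 4 * num_anchors, h, w)  # four deltas per anchor
        shapes.append((cls_shape, reg_shape))
    return shapes


def candidates_per_image(shapes: List[Tuple[Shape, Shape]]) -> int:
    """Total number of anchor candidates per image before any top-k cut."""
    return sum(a * h * w for (_, a, h, w), _ in shapes)


shapes = rpn_output_shapes(2, num_anchors=3, image_hw=(512, 640), strides=(4, 8))
total = candidates_per_image(shapes)
```

The candidate count grows quickly with resolution, which is why a top-k pre-selection is applied before NMS.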

class RPNHead(num_anchors, num_convs=1, in_channels=256, feat_channels=256, start_level=2)

Faster RCNN RPN Head.

Creates RPN network output from a multi-scale feature map input.

Creates an instance of the class.

Parameters:
  • num_anchors (int) – Number of anchors per cell.

  • num_convs (int, optional) – Number of conv layers before RPN heads. Defaults to 1.

  • in_channels (int, optional) – Feature channel size of input feature maps. Defaults to 256.

  • feat_channels (int, optional) – Feature channel size of conv layers. Defaults to 256.

  • start_level (int, optional) – Starting level of feature maps. Defaults to 2.

__call__(features)

Type definition.

Return type:

RPNOut

forward(features)

Forward pass of RPN.

Return type:

RPNOut

class RPNLoss(anchor_generator, box_encoder, matcher=None, sampler=None, loss_cls=<function binary_cross_entropy_with_logits>, loss_bbox=<function l1_loss>)

Loss of region proposal network.

Creates an instance of the class.

Parameters:
  • anchor_generator (AnchorGenerator) – Generates anchor grid priors.

  • box_encoder (DeltaXYWHBBoxEncoder) – Encodes bounding boxes to the desired network output.

  • matcher (Matcher) – Matches ground truth boxes to anchor grid priors. Defaults to None. If None, uses MaxIoUMatcher.

  • sampler (Sampler) – Samples anchors for training. Defaults to None. If None, uses RandomSampler.

  • loss_cls (TorchLossFunc) – Classification loss function. Defaults to F.binary_cross_entropy_with_logits.

  • loss_bbox (TorchLossFunc) – Regression loss function. Defaults to l1_loss.

forward(cls_outs, reg_outs, target_boxes, images_hw, target_class_ids=None)

Compute RPN classification and regression losses.

Parameters:
  • cls_outs (list[torch.Tensor]) – Network classification outputs at all scales.

  • reg_outs (list[torch.Tensor]) – Network regression outputs at all scales.

  • target_boxes (list[torch.Tensor]) – Target bounding boxes.

  • images_hw (list[tuple[int, int]]) – Image dimensions without padding.

  • target_class_ids (list[torch.Tensor] | None) – Target class labels.

Returns:

Classification and regression losses.

Return type:

DenseAnchorHeadLosses
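The two default loss terms can be sketched on scalar inputs as follows. This is an illustration only, written in plain Python so that it runs without PyTorch. The function names are hypothetical, the anchors are assumed to be already matched and sampled, and averaging as the reduction is an assumption. The normalization the library actually applies is not described on this page.

```python
import math
from typing import List, Sequence, Tuple


def bce_with_logits(logit: float, label: float) -> float:
    """Numerically stable binary cross entropy on a raw logit."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def l1(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute difference between two equally long sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def rpn_loss_sketch(
    logits: Sequence[float],
    labels: Sequence[int],
    box_preds: Sequence[Sequence[float]],
    box_targets: Sequence[Sequence[float]],
) -> Tuple[float, float]:
    """Return (classification loss, regression loss) for sampled anchors.

    labels: 1 for a positive anchor, 0 for a negative one (assumed encoding).
    """
    # Classification term: every sampled anchor contributes.
    loss_cls = sum(bce_with_logits(x, y) for x, y in zip(logits, labels)) / len(logits)
    # Regression term: only positive anchors have a box target.
    pos = [i for i, y in enumerate(labels) if y == 1]
    flat_pred: List[float] = [v for i in pos for v in box_preds[i]]
    flat_tgt: List[float] = [v for i in pos for v in box_targets[i]]
    loss_bbox = l1(flat_pred, flat_tgt) if pos else 0.0
    return loss_cls, loss_bbox


loss_cls, loss_bbox = rpn_loss_sketch(
    logits=[0.0, 0.0],
    labels=[1, 0],
    box_preds=[(0.0, 0.0, 0.0, 0.0), (5.0, 5.0, 5.0, 5.0)],
    box_targets=[(1.0, 1.0, 1.0, 1.0), (0.0, 0.0, 0.0, 0.0)],
)
```

The second anchor is a negative, so its box prediction does not influence the regression term.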

class RPNLosses(rpn_loss_cls: torch.Tensor, rpn_loss_bbox: torch.Tensor)

RPN loss container.

Create new instance of RPNLosses(rpn_loss_cls, rpn_loss_bbox)

rpn_loss_bbox: Tensor

Alias for field number 1

rpn_loss_cls: Tensor

Alias for field number 0
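The "Alias for field number" entries indicate named-tuple semantics, so the fields can be read by name, by index, or by unpacking. The sketch below uses a hypothetical stand-in class, RPNLossesSketch, with float fields so that it runs without PyTorch. The real container holds torch.Tensor values.

```python
from typing import NamedTuple


class RPNLossesSketch(NamedTuple):
    """Hypothetical stand-in for RPNLosses with float fields."""

    rpn_loss_cls: float
    rpn_loss_bbox: float


losses = RPNLossesSketch(rpn_loss_cls=0.25, rpn_loss_bbox=0.5)

# Field 0 and field 1 are reachable by name, by index, or by unpacking.
loss_cls, loss_bbox = losses

# Summing the tuple gives a single scalar to optimize.
total = sum(losses)

# A name-to-value mapping is convenient for logging.
as_dict = losses._asdict()
```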

class RPNOut(cls: list[torch.Tensor], box: list[torch.Tensor])

Output of RPN head.

Create new instance of RPNOut(cls, box)

box: list[Tensor]

Alias for field number 1

cls: list[Tensor]

Alias for field number 0

get_default_rpn_box_codec(target_means=(0.0, 0.0, 0.0, 0.0), target_stds=(1.0, 1.0, 1.0, 1.0))

Get the default bounding box encoder and decoder for RPN.

Return type:

tuple[DeltaXYWHBBoxEncoder, DeltaXYWHBBoxDecoder]
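The role of target_means and target_stds can be illustrated with the delta (x, y, w, h) parameterization commonly used in Faster R-CNN. The sketch below is an assumption about what such a codec computes, not the vis4d implementation. The function names are hypothetical, and details such as clamping of the size deltas may differ in the library.

```python
import math
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def _center_size(box: Sequence[float]) -> Tuple[float, float, float, float]:
    """Convert corner format to (center x, center y, width, height)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return x1 + 0.5 * w, y1 + 0.5 * h, w, h


def encode_delta_xywh(
    anchor: Sequence[float],
    target: Sequence[float],
    means: Sequence[float] = (0.0, 0.0, 0.0, 0.0),
    stds: Sequence[float] = (1.0, 1.0, 1.0, 1.0),
) -> Tuple[float, ...]:
    """Encode a target box as normalized offsets relative to an anchor."""
    ax, ay, aw, ah = _center_size(anchor)
    tx, ty, tw, th = _center_size(target)
    raw = ((tx - ax) / aw, (ty - ay) / ah, math.log(tw / aw), math.log(th / ah))
    return tuple((d - m) / s for d, m, s in zip(raw, means, stds))


def decode_delta_xywh(
    anchor: Sequence[float],
    deltas: Sequence[float],
    means: Sequence[float] = (0.0, 0.0, 0.0, 0.0),
    stds: Sequence[float] = (1.0, 1.0, 1.0, 1.0),
) -> Box:
    """Invert encode_delta_xywh: recover a box from an anchor and offsets."""
    ax, ay, aw, ah = _center_size(anchor)
    dx, dy, dw, dh = (d * s + m for d, m, s in zip(deltas, means, stds))
    cx, cy = ax + dx * aw, ay + dy * ah
    w, h = aw * math.exp(dw), ah * math.exp(dh)
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)


anchor = (0.0, 0.0, 10.0, 10.0)
target = (2.0, 1.0, 14.0, 9.0)
deltas = encode_delta_xywh(anchor, target)
restored = decode_delta_xywh(anchor, deltas)
```

Under this parameterization, encoding and decoding with the same means and stds are inverse operations, so the encoder and decoder returned as a pair should share those values.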