vis4d.model.detect.mask_rcnn

Mask RCNN model implementation and runtime.

Classes

MaskDetectionOut(boxes, masks)

Mask detection output.

MaskRCNN(num_classes[, basemodel, ...])

Mask RCNN model.

MaskRCNNOut(boxes, masks)

Mask RCNN output.

class MaskDetectionOut(boxes: DetOut, masks: MaskOut)[source]

Mask detection output.

Create new instance of MaskDetectionOut(boxes, masks)

boxes: DetOut

Alias for field number 0

masks: MaskOut

Alias for field number 1
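
Both output containers appear to be NamedTuples (their fields are documented as field-number aliases above), so the fields can be read by name, by index, or via tuple unpacking. A minimal sketch, using None placeholders in place of real DetOut / MaskOut values:

    from vis4d.model.detect.mask_rcnn import MaskDetectionOut

    # Placeholders only; in practice `boxes` is a DetOut and `masks` a MaskOut
    # returned by MaskRCNN in testing mode.
    out = MaskDetectionOut(boxes=None, masks=None)
    assert out.boxes is out[0] and out.masks is out[1]
    boxes, masks = out  # tuple-style unpacking also works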

class MaskRCNN(num_classes, basemodel=None, faster_rcnn_head=None, mask_head=None, rcnn_box_decoder=None, no_overlap=False, weights=None)[source]

Mask RCNN model.

Parameters:
  • num_classes (int) – Number of classes.

  • basemodel (BaseModel, optional) – Base model network. Defaults to None. If None, will use ResNet50.

  • faster_rcnn_head (FasterRCNNHead, optional) – Faster RCNN head. Defaults to None. If None, will use the default FasterRCNNHead.

  • mask_head (MaskRCNNHead, optional) – Mask RCNN head. Defaults to None. If None, will use the default MaskRCNNHead.

  • rcnn_box_decoder (DeltaXYWHBBoxDecoder, optional) – Decoder for RCNN bounding boxes. Defaults to None.

  • no_overlap (bool, optional) – Whether to remove overlapping pixels between masks. Defaults to False.

  • weights (None | str, optional) – Weights to load for model. If set to “mmdet”, will load MMDetection pre-trained weights. Defaults to None.

Creates an instance of the class.
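
A minimal construction sketch is given below. Only num_classes is required; leaving the optional arguments as None selects the documented defaults (ResNet50 base model, default FasterRCNNHead and MaskRCNNHead). The class count of 80 is an illustrative choice, not part of the API.

    from vis4d.model.detect.mask_rcnn import MaskRCNN

    # Default configuration: ResNet50 base model, default heads.
    model = MaskRCNN(num_classes=80)

    # Optionally load MMDetection pre-trained weights.
    model_mmdet = MaskRCNN(num_classes=80, weights="mmdet")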

forward(images, input_hw, boxes2d=None, boxes2d_classes=None, original_hw=None)[source]

Forward pass.

Parameters:
  • images (torch.Tensor) – Input images.

  • input_hw (list[tuple[int, int]]) – Input image resolutions.

  • boxes2d (None | list[torch.Tensor], optional) – Bounding box labels. Required for training. Defaults to None.

  • boxes2d_classes (None | list[torch.Tensor], optional) – Class labels. Required for training. Defaults to None.

  • original_hw (None | list[tuple[int, int]], optional) – Original image resolutions (before padding and resizing). Required for testing. Defaults to None.

Returns:

Either raw model outputs (for training) or predicted outputs (for testing).

Return type:

MaskRCNNOut | MaskDetectionOut
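
An inference sketch follows, assuming the standard PyTorch convention that model.eval() selects the testing branch of forward(). Tensor shapes and resolutions are illustrative only.

    import torch

    from vis4d.model.detect.mask_rcnn import MaskRCNN

    model = MaskRCNN(num_classes=80)
    model.eval()

    images = torch.rand(1, 3, 512, 512)  # one padded/resized input image
    input_hw = [(512, 512)]              # resolution fed to the network
    original_hw = [(480, 640)]           # resolution to map predictions back to

    with torch.no_grad():
        out = model(images, input_hw, original_hw=original_hw)
    # `out` is a MaskDetectionOut with per-image boxes and masks.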

forward_test(images, images_hw, original_hw)[source]

Forward testing stage.

Parameters:
  • images (torch.Tensor) – Input images.

  • images_hw (list[tuple[int, int]]) – Input image resolutions.

  • original_hw (list[tuple[int, int]]) – Original image resolutions (before padding and resizing).

Returns:

Predicted outputs.

Return type:

MaskDetectionOut

forward_train(images, images_hw, target_boxes, target_classes)[source]

Forward training stage.

Parameters:
  • images (torch.Tensor) – Input images.

  • images_hw (list[tuple[int, int]]) – Input image resolutions.

  • target_boxes (list[torch.Tensor]) – Bounding box labels.

  • target_classes (list[torch.Tensor]) – Class labels.

Returns:

Raw model outputs.

Return type:

MaskRCNNOut
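
A training sketch, with illustrative placeholder targets; box coordinates are assumed to be in (x1, y1, x2, y2) pixel format, one tensor per image.

    import torch

    from vis4d.model.detect.mask_rcnn import MaskRCNN

    model = MaskRCNN(num_classes=80)
    model.train()

    images = torch.rand(2, 3, 512, 512)
    images_hw = [(512, 512), (512, 512)]
    # One ground-truth box and class label per image (placeholder values,
    # assumed (x1, y1, x2, y2) pixel coordinates).
    target_boxes = [
        torch.tensor([[10.0, 20.0, 200.0, 220.0]]),
        torch.tensor([[50.0, 60.0, 300.0, 310.0]]),
    ]
    target_classes = [torch.tensor([1]), torch.tensor([3])]

    raw_out = model.forward_train(images, images_hw, target_boxes, target_classes)
    # `raw_out` is a MaskRCNNOut holding the raw detection and mask head outputs.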

class MaskRCNNOut(boxes: FRCNNOut, masks: MaskRCNNHeadOut)[source]

Mask RCNN output.

Create new instance of MaskRCNNOut(boxes, masks)

boxes: FRCNNOut

Alias for field number 0

masks: MaskRCNNHeadOut

Alias for field number 1