vis4d.op.mask.util

Utility functions for segmentation masks.

Functions

clip_mask(mask, target_shape)

Clip mask to the target shape.

masks2boxes(masks)

Obtain the tight bounding boxes of binary masks.

nhw_to_hwc_mask(masks, class_ids[, ignore_class])

Convert N binary HxW masks to HxW semantic mask.

paste_masks_in_image(masks, boxes, image_shape)

Paste masks that are of a fixed resolution into an image.

postprocess_segms(segms, images_hw, original_hw)

Postprocess segmentations.

remove_overlap(mask, score)

Remove overlapping pixels between masks.

clip_mask(mask, target_shape)

Clip mask to the target shape.

Parameters:
  • mask (Tensor) – Mask with shape [C, H, W].

  • target_shape (tuple[int, int]) – Target shape (Ht, Wt).

Returns:

Clipped mask with shape [C, Ht, Wt].

Return type:

Tensor
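As a rough illustration of the shapes involved, the sketch below crops each channel to its top-left Ht x Wt region. The cropping rule is an assumption made for illustration (the description above only fixes the output shape), and `clip_mask_sketch` is a hypothetical helper that uses nested lists in place of a Tensor so that it runs without PyTorch.

```python
def clip_mask_sketch(mask, target_shape):
    """Crop a [C, H, W] nested-list mask to [C, Ht, Wt].

    Assumption: clipping keeps the top-left Ht x Wt region of each channel.
    """
    target_h, target_w = target_shape
    return [[row[:target_w] for row in channel[:target_h]] for channel in mask]


# One channel of size 3 x 4, clipped to 2 x 3.
mask = [[[1, 1, 0, 0],
         [0, 1, 1, 0],
         [0, 0, 1, 1]]]
clipped = clip_mask_sketch(mask, (2, 3))
```

Here `clipped` has shape [1, 2, 3]: the last row and last column of the input are dropped.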

masks2boxes(masks)

Obtain the tight bounding boxes of binary masks.

Parameters:

masks (Tensor) – Binary mask of shape (N, H, W).

Returns:

Boxes with shape (N, 4) of positive region in binary mask.

Return type:

Tensor
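To make "tight bounding box" concrete, here is a minimal sketch in plain Python. `masks2boxes_sketch` is a hypothetical stand-in, not the library code. It assumes boxes are ordered [x1, y1, x2, y2] with inclusive pixel indices and that an empty mask maps to an all-zero box; the description above does not state the corner convention, so the library may differ (for example by using an exclusive lower-right corner).

```python
def masks2boxes_sketch(masks):
    """Return one [x1, y1, x2, y2] box per binary H x W mask.

    Assumptions: corners are inclusive pixel indices, and an empty mask
    yields an all-zero box.
    """
    boxes = []
    for mask in masks:
        # Rows and columns that contain at least one positive pixel.
        ys = [y for y, row in enumerate(mask) if any(row)]
        xs = [x for x, col in enumerate(zip(*mask)) if any(col)]
        if xs and ys:
            boxes.append([xs[0], ys[0], xs[-1], ys[-1]])
        else:
            boxes.append([0, 0, 0, 0])
    return boxes


masks = [
    [[0, 0, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 0]],
    [[0, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]],
]
boxes = masks2boxes_sketch(masks)
```

The first mask covers columns 1 to 2 and rows 1 to 2, so its box is [1, 1, 2, 2] under the inclusive convention assumed here.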

nhw_to_hwc_mask(masks, class_ids, ignore_class=255)

Convert N binary HxW masks to HxW semantic mask.

Parameters:
  • masks (Tensor) – Masks with shape [N, H, W].

  • class_ids (Tensor) – Class IDs with shape [N, 1].

  • ignore_class (int, optional) – Ignore label. Defaults to 255.

Returns:

Masks with shape [H, W], where each location indicates the class label.

Return type:

Tensor
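The conversion can be pictured as painting each instance mask onto a canvas that starts out filled with the ignore label. The sketch below is a hypothetical plain-Python version (`nhw_to_hw_sketch` is not a library function). It assumes masks are painted in order, so a later mask overwrites an earlier one where they overlap, and it takes `class_ids` as a flat list rather than an [N, 1] Tensor.

```python
def nhw_to_hw_sketch(masks, class_ids, ignore_class=255):
    """Merge N binary H x W masks into one H x W label map.

    Assumption: masks are applied in order, so a later mask overwrites an
    earlier one where they overlap. class_ids is a flat list here.
    """
    height, width = len(masks[0]), len(masks[0][0])
    # Pixels not covered by any mask keep the ignore label.
    semantic = [[ignore_class] * width for _ in range(height)]
    for mask, class_id in zip(masks, class_ids):
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    semantic[y][x] = class_id
    return semantic


masks = [
    [[1, 1, 0],
     [0, 0, 0]],
    [[0, 1, 0],
     [0, 1, 0]],
]
semantic = nhw_to_hw_sketch(masks, class_ids=[3, 7])
```

Pixels covered by no mask stay at 255, and the pixel shared by both masks ends up with the class of the second mask under the ordering assumed here.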

paste_masks_in_image(masks, boxes, image_shape, threshold=0.5, bytes_per_float=4, gpu_mem_limit=1073741824)

Paste masks that are of a fixed resolution into an image.

The location, height, and width for pasting each mask are determined by its corresponding bounding box in boxes.

This implementation is modified from https://github.com/facebookresearch/detectron2/

Parameters:
  • masks (Tensor) – Masks with shape [N, Hmask, Wmask], where N is the number of detected object instances in the image and Hmask, Wmask are the height and width of the predicted mask (e.g., Hmask = Wmask = 28). Values are in [0, 1].

  • boxes (Tensor) – Boxes with shape [N, 4]. boxes[i] and masks[i] correspond to the same object instance.

  • image_shape (tuple[int, int]) – Image resolution (width, height).

  • threshold (float, optional) – Threshold for discretization of mask. Defaults to 0.5.

  • bytes_per_float (int, optional) – Number of bytes per float. Defaults to 4.

  • gpu_mem_limit (int, optional) – GPU memory limit. Defaults to 1024**3.

Returns:

Masks with shape [N, Himage, Wimage], where N is the number of detected object instances and Himage, Wimage are the image height and width.

Return type:

Tensor
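The bytes_per_float and gpu_mem_limit parameters suggest that pasting is done in chunks to bound memory use. Assuming the same rule as the detectron2 implementation this function is modified from, the chunk count would be ceil(N * Himage * Wimage * bytes_per_float / gpu_mem_limit). The helper below, `num_chunks_sketch`, is hypothetical and only works through that arithmetic; it does not paste any masks.

```python
import math


def num_chunks_sketch(num_masks, image_hw, bytes_per_float=4,
                      gpu_mem_limit=1024**3):
    """Estimate how many chunks keep one float mask batch under the limit.

    Assumption: follows the detectron2-style rule
    ceil(N * H * W * bytes_per_float / gpu_mem_limit), with at least 1 chunk.
    """
    height, width = image_hw
    total_bytes = num_masks * height * width * bytes_per_float
    return max(1, math.ceil(total_bytes / gpu_mem_limit))


# 100 instances on a 1080 x 1920 image need 829,440,000 bytes (under 1 GiB).
chunks_small = num_chunks_sketch(100, (1080, 1920))
# 500 instances need 4,147,200,000 bytes (about 3.86 GiB).
chunks_large = num_chunks_sketch(500, (1080, 1920))
```

With the default limit of 1024**3 bytes, the first case fits in a single chunk and the second needs four.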

postprocess_segms(segms, images_hw, original_hw)

Postprocess segmentations.

Parameters:
  • segms (Tensor) – Segmentations with shape [B, C, H, W].

  • images_hw (list[tuple[int, int]]) – Image resolutions.

  • original_hw (list[tuple[int, int]]) – Original image resolutions.

Returns:

Post-processed segmentations.

Return type:

Tensor

remove_overlap(mask, score)

Remove overlapping pixels between masks.

Parameters:
  • mask (Tensor) – Mask with shape [N, H, W].

  • score (Tensor) – Score with shape [N].

Returns:

Mask with shape [N, H, W].

Return type:

Tensor
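One plausible way to resolve overlaps with the given scores is to give each contested pixel to the highest-scoring mask. The sketch below illustrates that rule in plain Python. The tie-breaking rule is an assumption (the description above does not say how score is used), and `remove_overlap_sketch` is a hypothetical helper operating on nested lists rather than Tensors.

```python
def remove_overlap_sketch(masks, scores):
    """Resolve overlaps so that each pixel belongs to at most one mask.

    Assumption: a contested pixel goes to the mask with the highest score.
    """
    height, width = len(masks[0]), len(masks[0][0])
    taken = [[False] * width for _ in range(height)]
    result = [[[0] * width for _ in range(height)] for _ in masks]
    # Visit masks from highest to lowest score.
    order = sorted(range(len(masks)), key=lambda i: scores[i], reverse=True)
    for i in order:
        for y in range(height):
            for x in range(width):
                if masks[i][y][x] and not taken[y][x]:
                    result[i][y][x] = 1
                    taken[y][x] = True
    return result


masks = [
    [[1, 1, 0]],
    [[0, 1, 1]],
]
cleaned = remove_overlap_sketch(masks, scores=[0.6, 0.9])
```

The middle pixel is claimed by both masks; the second mask has the higher score, so it keeps the pixel and the first mask loses it. The output keeps the input shape [N, H, W].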