vis4d.op.mask.util
Utility functions for segmentation masks.
Functions
clip_mask | Clip mask. |
masks2boxes | Obtain the tight bounding boxes of binary masks. |
nhw_to_hwc_mask | Convert N binary HxW masks to HxW semantic mask. |
paste_masks_in_image | Paste masks that are of a fixed resolution into an image. |
postprocess_segms | Postprocess segmentations. |
| Remove overlapping pixels between masks. |
- clip_mask(mask, target_shape)
Clip mask.
- Parameters:
mask (Tensor) – Mask with shape [C, H, W].
target_shape (tuple[int, int]) – Target shape (Ht, Wt).
- Returns:
Clipped mask with shape [C, Ht, Wt].
- Return type:
Tensor
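The shapes above can be illustrated with a minimal sketch. It assumes that clipping means keeping the top-left [Ht, Wt] region of every channel, which the docstring does not state explicitly; `clip_mask_sketch` is a hypothetical stand-in, not the vis4d function.

```python
from __future__ import annotations

import torch
from torch import Tensor


def clip_mask_sketch(mask: Tensor, target_shape: tuple[int, int]) -> Tensor:
    """Crop a [C, H, W] mask to [C, Ht, Wt] (assumed: top-left region is kept)."""
    target_h, target_w = target_shape
    return mask[:, :target_h, :target_w]


mask = torch.ones(3, 8, 10)
clipped = clip_mask_sketch(mask, (6, 7))
print(clipped.shape)
```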
- masks2boxes(masks)
Obtain the tight bounding boxes of binary masks.
- Parameters:
masks (Tensor) – Binary masks with shape (N, H, W).
- Returns:
Boxes with shape (N, 4) of positive region in binary mask.
- Return type:
Tensor
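One way to compute tight boxes is sketched below. The box layout (x1, y1, x2, y2), the exclusive x2/y2 convention, and the all-zero box for an empty mask are assumptions that the docstring does not confirm; `masks2boxes_sketch` is a hypothetical name, not the vis4d function.

```python
from __future__ import annotations

import torch
from torch import Tensor


def masks2boxes_sketch(masks: Tensor) -> Tensor:
    """Return (N, 4) boxes as (x1, y1, x2, y2); empty masks give all zeros."""
    masks = masks.bool()
    num_masks = masks.shape[0]
    boxes = torch.zeros((num_masks, 4), dtype=torch.float32, device=masks.device)
    cols_hit = masks.any(dim=1)  # (N, W): columns containing a positive pixel
    rows_hit = masks.any(dim=2)  # (N, H): rows containing a positive pixel
    for i in range(num_masks):
        xs = torch.where(cols_hit[i])[0]
        ys = torch.where(rows_hit[i])[0]
        if len(xs) > 0 and len(ys) > 0:
            # Assumption: x2 and y2 are exclusive (last positive index + 1).
            boxes[i] = torch.stack(
                [xs[0], ys[0], xs[-1] + 1, ys[-1] + 1]
            ).float()
    return boxes


masks = torch.zeros(2, 5, 6, dtype=torch.uint8)
masks[0, 1:3, 2:5] = 1  # rows 1-2, columns 2-4; second mask stays empty
boxes = masks2boxes_sketch(masks)
print(boxes)
```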
- nhw_to_hwc_mask(masks, class_ids, ignore_class=255)
Convert N binary HxW masks to HxW semantic mask.
- Parameters:
masks (Tensor) – Masks with shape [N, H, W].
class_ids (Tensor) – Class IDs with shape [N, 1].
ignore_class (int, optional) – Ignore label. Defaults to 255.
- Returns:
Mask with shape [H, W], where each location indicates the class label.
- Return type:
Tensor
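The conversion can be sketched as follows. This is an illustration under stated assumptions, not the vis4d code: uncovered pixels receive `ignore_class`, and where masks overlap the later mask overwrites the earlier one. `nhw_to_semantic_sketch` is a hypothetical name.

```python
from __future__ import annotations

import torch
from torch import Tensor


def nhw_to_semantic_sketch(
    masks: Tensor, class_ids: Tensor, ignore_class: int = 255
) -> Tensor:
    """Merge [N, H, W] binary masks into one [H, W] map of class labels."""
    semantic = torch.full(
        masks.shape[1:], ignore_class, dtype=torch.long, device=masks.device
    )
    # Assumption: masks are applied in order, so later masks win on overlap.
    for mask, class_id in zip(masks, class_ids.flatten().long()):
        semantic[mask > 0] = class_id
    return semantic


masks = torch.zeros(2, 2, 3, dtype=torch.uint8)
masks[0, 0, :2] = 1  # first instance covers pixels (0, 0) and (0, 1)
masks[1, 0, 1:] = 1  # second instance covers pixels (0, 1) and (0, 2)
class_ids = torch.tensor([[3], [7]])
semantic = nhw_to_semantic_sketch(masks, class_ids)
print(semantic)
```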
- paste_masks_in_image(masks, boxes, image_shape, threshold=0.5, bytes_per_float=4, gpu_mem_limit=1073741824)
Paste masks that are of a fixed resolution into an image.
The location, height, and width for pasting each mask are determined by its corresponding bounding box in boxes.
This implementation is modified from https://github.com/facebookresearch/detectron2/
- Parameters:
masks (Tensor) – Masks with shape [N, Hmask, Wmask], where N is the number of detected object instances in the image and Hmask, Wmask are the height and width of the predicted mask (e.g., Hmask = Wmask = 28). Values are in [0, 1].
boxes (Tensor) – Boxes with shape [N, 4]. boxes[i] and masks[i] correspond to the same object instance.
image_shape (tuple[int, int]) – Image resolution (width, height).
threshold (float, optional) – Threshold for discretization of mask. Defaults to 0.5.
bytes_per_float (int, optional) – Number of bytes per float. Defaults to 4.
gpu_mem_limit (int, optional) – GPU memory limit in bytes. Defaults to 1024**3 (1 GiB).
- Returns:
Masks with shape [N, Himage, Wimage], where N is the number of detected object instances and Himage, Wimage are the image height and width.
- Return type:
Tensor
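The pasting step can be sketched with `torch.nn.functional.grid_sample`, in the spirit of the Detectron2 approach referenced above. This is a simplified illustration, not the vis4d implementation: it omits the chunking governed by bytes_per_float and gpu_mem_limit, assumes boxes are (x1, y1, x2, y2) in pixels with non-zero width and height, and returns a boolean tensor. `paste_masks_sketch` is a hypothetical name.

```python
from __future__ import annotations

import torch
import torch.nn.functional as F
from torch import Tensor


def paste_masks_sketch(
    masks: Tensor,
    boxes: Tensor,
    image_shape: tuple[int, int],
    threshold: float = 0.5,
) -> Tensor:
    """Resample [N, Hmask, Wmask] masks into their boxes on the full image."""
    img_w, img_h = image_shape  # documented order: (width, height)
    x1, y1, x2, y2 = boxes.float().split(1, dim=1)  # each [N, 1]
    # Pixel centers of the full image.
    ys = torch.arange(img_h, dtype=torch.float32, device=masks.device) + 0.5
    xs = torch.arange(img_w, dtype=torch.float32, device=masks.device) + 0.5
    # Map pixel centers to [-1, 1] relative to each box, as grid_sample expects.
    ys = (ys[None] - y1) / (y2 - y1) * 2 - 1  # [N, img_h]
    xs = (xs[None] - x1) / (x2 - x1) * 2 - 1  # [N, img_w]
    grid = torch.stack(
        [
            xs[:, None, :].expand(-1, img_h, -1),
            ys[:, :, None].expand(-1, -1, img_w),
        ],
        dim=3,
    )  # [N, img_h, img_w, 2]; the last dimension is (x, y)
    # Samples outside a box fall outside [-1, 1] and are zero-padded.
    pasted = F.grid_sample(masks[:, None].float(), grid, align_corners=False)
    return pasted[:, 0] >= threshold


masks = torch.ones(1, 4, 4)  # one fully positive 4x4 mask
boxes = torch.tensor([[2.0, 1.0, 6.0, 4.0]])  # x1, y1, x2, y2
pasted = paste_masks_sketch(masks, boxes, image_shape=(8, 5))
print(pasted.shape, int(pasted.sum()))
```

In this example the positive region covers rows 1 to 3 and columns 2 to 5 of the 5x8 output, which is the pixel area enclosed by the box.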
- postprocess_segms(segms, images_hw, original_hw)
Postprocess segmentations.
- Parameters:
segms (Tensor) – Segmentations with shape [B, C, H, W].
images_hw (list[tuple[int, int]]) – Image resolutions.
original_hw (list[tuple[int, int]]) – Original image resolutions.
- Returns:
Post-processed segmentations.
- Return type:
Tensor
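The docstring does not spell out the post-processing steps. One plausible reading, sketched below, crops each segmentation to its image resolution to drop batch padding, resizes it to the original resolution, and takes the per-pixel argmax over classes. Each of these steps is an assumption, as is the requirement that all original resolutions match so that the results can be stacked; `postprocess_segms_sketch` is a hypothetical name, not the vis4d function.

```python
from __future__ import annotations

import torch
import torch.nn.functional as F
from torch import Tensor


def postprocess_segms_sketch(
    segms: Tensor,
    images_hw: list[tuple[int, int]],
    original_hw: list[tuple[int, int]],
) -> Tensor:
    """Crop, resize, and argmax [B, C, H, W] class scores (assumed steps)."""
    results = []
    for segm, (img_h, img_w), (orig_h, orig_w) in zip(
        segms, images_hw, original_hw
    ):
        valid = segm[:, :img_h, :img_w]  # assumed: remove batch padding
        resized = F.interpolate(
            valid[None],
            size=(orig_h, orig_w),
            mode="bilinear",
            align_corners=False,
        )[0]
        results.append(resized.argmax(dim=0))  # [orig_h, orig_w] class indices
    # Stacking requires every entry of original_hw to be identical.
    return torch.stack(results)


segms = torch.randn(2, 4, 16, 16)
out = postprocess_segms_sketch(
    segms, images_hw=[(12, 16), (16, 10)], original_hw=[(24, 32), (24, 32)]
)
print(out.shape)
```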