vis4d.vis.image.util

Utility functions for image processing operations.

Functions

get_intersection_point(point1, point2, ...)

Get point intersecting with camera near plane on line point1 -> point2.

preprocess_boxes(boxes[, scores, class_ids, ...])

Preprocesses bounding boxes.

preprocess_boxes3d(image_hw, boxes3d, intrinsics)

Preprocesses 3D bounding boxes.

preprocess_image(image[, mode])

Validate and convert input image.

preprocess_masks(masks[, class_ids, ...])

Preprocesses predicted semantic or instance segmentation masks.

project_point(point, intrinsics)

Project a single point into the image plane.

get_intersection_point(point1, point2, camera_near_clip)[source]

Get point intersecting with camera near plane on line point1 -> point2.

The line is defined by two points in camera coordinates and their depth.

Parameters:
  • point1 (tuple[float x 3]) – First point in camera coordinates.

  • point2 (tuple[float x 3]) – Second point in camera coordinates.

  • camera_near_clip (float) – Distance of the camera near clipping plane.

Returns:

The intersection point in camera coordinates.

Return type:

tuple[float, float, float]
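The geometry behind this helper can be sketched as a line-plane intersection: find the parameter t where the segment point1 -> point2 crosses the plane z = camera_near_clip (OpenCV convention, camera looking down +z). A minimal illustrative reimplementation, not the library's actual code:

```python
def get_intersection_point(point1, point2, camera_near_clip):
    """Intersect the segment point1 -> point2 with the plane z = camera_near_clip."""
    x1, y1, z1 = point1
    x2, y2, z2 = point2
    # Interpolation factor at which the segment's depth equals the near plane.
    t = (camera_near_clip - z1) / (z2 - z1)
    # Linearly interpolate x and y; z is the near-clip depth by construction.
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1), camera_near_clip)
```

This is typically used when clipping 3D box edges that cross behind the camera before projecting them.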

preprocess_boxes(boxes, scores=None, class_ids=None, track_ids=None, color_palette=[(0, 114, 178), (121, 178, 0), (142, 178, 0), (164, 0, 178), (178, 42, 0), (0, 135, 178), (99, 0, 178), (0, 49, 178), (0, 178, 114), (178, 85, 0), (0, 7, 178), (35, 0, 178), (0, 157, 178), (14, 0, 178), (0, 178, 28), (178, 149, 0), (57, 178, 0), (178, 0, 107), (178, 0, 42), (0, 92, 178), (35, 178, 0), (0, 71, 178), (0, 28, 178), (14, 178, 0), (178, 0, 171), (0, 178, 71), (178, 0, 149), (178, 171, 0), (78, 178, 0), (0, 178, 178), (178, 107, 0), (0, 178, 7), (142, 0, 178), (178, 0, 21), (178, 21, 0), (99, 178, 0), (78, 0, 178), (0, 178, 157), (178, 128, 0), (0, 178, 135), (57, 0, 178), (0, 178, 92), (0, 178, 49), (164, 178, 0), (121, 0, 178), (178, 0, 85), (178, 64, 0), (178, 0, 0), (178, 0, 64), (178, 0, 128)], class_id_mapping=None, default_color=(255, 0, 0))[source]

Preprocesses bounding boxes.

Converts the given predicted bounding boxes and class/track information into lists of corners, labels and colors.

Parameters:
  • boxes (ArrayLikeFloat) – Boxes of shape [N, 4] where N is the number of boxes and the second channel consists of (x1,y1,x2,y2) box coordinates.

  • scores (ArrayLikeFloat, optional) – Scores for each box, shape [N].

  • class_ids (ArrayLikeInt, optional) – Class id for each box, shape [N].

  • track_ids (ArrayLikeInt, optional) – Track id for each box, shape [N].

  • color_palette (list[tuple[float, float, float]]) – Color palette used to assign a color per id.

  • class_id_mapping (dict[int, str], optional) – Mapping from class id to class name.

  • default_color (tuple[int, int, int]) – Fallback color used when no class or track id is given.

Returns:

boxes_proc (list[tuple[float, float, float, float]]): List of box corners.
labels_proc (list[str]): List of labels.
colors_proc (list[tuple[int, int, int]]): List of colors.

Return type:

tuple[list[tuple[float, float, float, float]], list[str], list[tuple[int, int, int]]]
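The conversion can be sketched as follows. Note this is an illustrative stand-in (hence the `_sketch` suffix), and the label format ("name, score") is an assumption, not necessarily what vis4d produces:

```python
def preprocess_boxes_sketch(boxes, scores, class_ids, color_palette, class_id_mapping):
    """Turn raw detections into per-box corner tuples, label strings, and colors."""
    corners, labels, colors = [], [], []
    for box, score, cid in zip(boxes, scores, class_ids):
        # (x1, y1, x2, y2) corners as a float tuple.
        corners.append(tuple(float(c) for c in box))
        # Resolve the class name, falling back to the raw id (assumed format).
        name = class_id_mapping.get(cid, str(cid))
        labels.append(f"{name}, {score:.1%}")
        # Pick a deterministic color per class id from the palette.
        colors.append(color_palette[cid % len(color_palette)])
    return corners, labels, colors
```
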

preprocess_boxes3d(image_hw, boxes3d, intrinsics, extrinsics=None, scores=None, class_ids=None, track_ids=None, color_palette=[(0, 114, 178), (121, 178, 0), (142, 178, 0), (164, 0, 178), (178, 42, 0), (0, 135, 178), (99, 0, 178), (0, 49, 178), (0, 178, 114), (178, 85, 0), (0, 7, 178), (35, 0, 178), (0, 157, 178), (14, 0, 178), (0, 178, 28), (178, 149, 0), (57, 178, 0), (178, 0, 107), (178, 0, 42), (0, 92, 178), (35, 178, 0), (0, 71, 178), (0, 28, 178), (14, 178, 0), (178, 0, 171), (0, 178, 71), (178, 0, 149), (178, 171, 0), (78, 178, 0), (0, 178, 178), (178, 107, 0), (0, 178, 7), (142, 0, 178), (178, 0, 21), (178, 21, 0), (99, 178, 0), (78, 0, 178), (0, 178, 157), (178, 128, 0), (0, 178, 135), (57, 0, 178), (0, 178, 92), (0, 178, 49), (164, 178, 0), (121, 0, 178), (178, 0, 85), (178, 64, 0), (178, 0, 0), (178, 0, 64), (178, 0, 128)], class_id_mapping=None, default_color=(255, 0, 0), axis_mode=AxisMode.OPENCV)[source]

Preprocesses 3D bounding boxes.

Converts the given predicted bounding boxes and class/track information into lists of centers, corners, labels, colors and track_ids.

Return type:

tuple[list[tuple[float, float, float]], list[list[tuple[float, float, float]]], list[str], list[tuple[int, int, int]], list[int]]
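The corner lists in the return value are the eight vertices of each 3D box. How such corners are derived from a parameterized box can be sketched as below; the corner ordering and the box parameterization (center, dimensions w/h/l, yaw about the vertical axis in OpenCV camera coordinates) are assumptions and may differ from vis4d's internal convention:

```python
import numpy as np

def box3d_corners(center, dims, yaw):
    """Eight corners of a 3D box from center, (w, h, l) dims, and yaw (assumed layout)."""
    w, h, l = dims
    # Axis-aligned half-extent offsets for the eight corners.
    x = np.array([1, 1, -1, -1, 1, 1, -1, -1]) * w / 2
    y = np.array([1, -1, -1, 1, 1, -1, -1, 1]) * h / 2
    z = np.array([1, 1, 1, 1, -1, -1, -1, -1]) * l / 2
    # Rotate around the vertical (y) axis, then translate to the box center.
    c, s = np.cos(yaw), np.sin(yaw)
    xr = c * x + s * z
    zr = -s * x + c * z
    return np.stack([xr + center[0], y + center[1], zr + center[2]], axis=1)
```
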

preprocess_image(image, mode='RGB')[source]

Validate and convert input image.

Parameters:
  • image (ArrayLike) – CHW or HWC image with C = 3.

  • mode (str) – Input channel format (e.g. BGR, HSV).

Returns:

Processed image in RGB format.

Return type:

np.ndarray[uint8]
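The validation and conversion can be sketched roughly as below. This is a simplified stand-in: the CHW-detection heuristic and the BGR-only channel handling are assumptions, and the real function supports more modes:

```python
import numpy as np

def preprocess_image_sketch(image, mode="RGB"):
    """Validate a 3-channel image and return it as HWC uint8 RGB (simplified)."""
    image = np.asarray(image)
    assert image.ndim == 3, "expected a CHW or HWC image"
    # Heuristic: if channels lead and don't trail, assume CHW and move them last.
    if image.shape[0] == 3 and image.shape[-1] != 3:
        image = image.transpose(1, 2, 0)
    # BGR -> RGB by reversing the channel axis (only BGR handled in this sketch).
    if mode == "BGR":
        image = image[..., ::-1]
    return image.astype(np.uint8)
```
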

preprocess_masks(masks, class_ids=None, color_mapping=[(0, 114, 178), (121, 178, 0), (142, 178, 0), (164, 0, 178), (178, 42, 0), (0, 135, 178), (99, 0, 178), (0, 49, 178), (0, 178, 114), (178, 85, 0), (0, 7, 178), (35, 0, 178), (0, 157, 178), (14, 0, 178), (0, 178, 28), (178, 149, 0), (57, 178, 0), (178, 0, 107), (178, 0, 42), (0, 92, 178), (35, 178, 0), (0, 71, 178), (0, 28, 178), (14, 178, 0), (178, 0, 171), (0, 178, 71), (178, 0, 149), (178, 171, 0), (78, 178, 0), (0, 178, 178), (178, 107, 0), (0, 178, 7), (142, 0, 178), (178, 0, 21), (178, 21, 0), (99, 178, 0), (78, 0, 178), (0, 178, 157), (178, 128, 0), (0, 178, 135), (57, 0, 178), (0, 178, 92), (0, 178, 49), (164, 178, 0), (121, 0, 178), (178, 0, 85), (178, 64, 0), (178, 0, 0), (178, 0, 64), (178, 0, 128)])[source]

Preprocesses predicted semantic or instance segmentation masks.

Parameters:
  • masks (ArrayLikeUInt) – Masks of shape [H, W] or [N, H, W]. If the masks are of shape [H, W], they are assumed to be semantic segmentation masks, i.e. each pixel contains the class id. If the masks are of shape [N, H, W], they are assumed to be the binary masks of N instances.

  • class_ids (ArrayLikeInt, None) – An array with class ids for each mask shape [N]. If None, then the masks must be semantic segmentation masks and the class ids are extracted from the masks.

  • color_mapping (list[tuple[int, int, int]]) – Color mapping for each class.

Returns:

A list with all masks of shape [H, W], as well as a list with the corresponding colors.

Return type:

tuple[list[masks], list[colors]]

Raises:

ValueError – If the masks have an invalid shape.
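The two input conventions described above (semantic [H, W] vs. instance [N, H, W]) can be sketched like this; an illustrative stand-in, not the library's implementation:

```python
import numpy as np

def preprocess_masks_sketch(masks, class_ids=None,
                            color_mapping=((0, 114, 178), (121, 178, 0), (142, 178, 0))):
    """Split masks into per-class/per-instance binary masks with colors (simplified)."""
    masks = np.asarray(masks)
    if masks.ndim == 2:
        # Semantic mask: each pixel holds a class id; extract ids from the values.
        ids = np.unique(masks)
        binary = [(masks == cid).astype(np.uint8) for cid in ids]
    elif masks.ndim == 3:
        # Instance masks: one binary [H, W] mask per instance.
        ids = class_ids if class_ids is not None else range(masks.shape[0])
        binary = [m.astype(np.uint8) for m in masks]
    else:
        raise ValueError(f"Invalid mask shape: {masks.shape}")
    colors = [color_mapping[int(cid) % len(color_mapping)] for cid in ids]
    return binary, colors
```
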

project_point(point, intrinsics)[source]

Project a single point into the image plane.

Return type:

tuple[float, float]
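For reference, the standard pinhole projection this performs, assuming a 3x3 intrinsics matrix K with focal lengths on the diagonal and the principal point in the last column (a sketch, not the library's exact code):

```python
import numpy as np

def project_point_sketch(point, intrinsics):
    """Pinhole projection: (u, v) = (fx * x / z + cx, fy * y / z + cy)."""
    x, y, z = point
    K = np.asarray(intrinsics)
    u = K[0, 0] * x / z + K[0, 2]
    v = K[1, 1] * y / z + K[1, 2]
    return (float(u), float(v))
```
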