vis4d.data.datasets.scalabel

Scalabel type dataset.

Functions

add_data_path(data_root, frames)

Add filepath to frame using data_root.

boxes2d_from_scalabel(labels, class_to_idx)

Convert 2D bounding boxes from scalabel format to Vis4D.

boxes3d_from_scalabel(labels, class_to_idx)

Convert 3D bounding boxes from scalabel format to Vis4D.

discard_labels_outside_set(dataset, class_set)

Discard labels outside given set of classes.

filter_frames_by_attributes(frames, ...)

Filter frames based on attributes.

instance_ids_to_global(frames, ...)

Use local (per video) instance ids to produce global ones.

instance_masks_from_scalabel(labels, ...[, ...])

Convert instance masks from scalabel format to Vis4D.

load_extrinsics(extrinsics)

Transform extrinsics from Scalabel to Vis4D.

load_image(url, backend, image_channel_mode)

Load image tensor from url.

load_intrinsics(intrinsics)

Transform intrinsic camera matrix according to augmentations.

load_pointcloud(url, backend)

Load pointcloud tensor from url.

nhw_to_hwc_mask(masks, class_ids[, ignore_class])

Convert N binary HxW masks to HxW semantic mask.

prepare_labels(frames, class_list[, ...])

Add category id and instance id to labels, return class frequencies.

remove_empty_samples(frames)

Remove empty samples.

semantic_masks_from_scalabel(labels, ...[, ...])

Convert semantic masks from scalabel format to Vis4D.

Classes

Scalabel(data_root, annotation_path[, ...])

Scalabel type dataset.

class Scalabel(data_root, annotation_path, keys_to_load=('images', 'boxes2d'), category_map=None, config_path=None, global_instance_ids=False, bg_as_class=False, skip_empty_samples=False, attributes_to_load=None, cache_as_binary=False, cached_file_path=None, **kwargs)[source]

Scalabel type dataset.

This class loads scalabel format data into Vis4D.

Creates an instance of the class.

Parameters:
  • data_root (str) – Root directory of the data.

  • annotation_path (str) – Path to the annotation json(s).

  • keys_to_load (Sequence[str, ...], optional) – Keys to load from the dataset. Defaults to (K.images, K.boxes2d).

  • category_map (None | CategoryMap, optional) – Mapping from a Scalabel category string to an integer index. If None, the standard mapping in the dataset config will be used. Defaults to None.

  • config_path (None | str | Config, optional) – Path to the dataset config, can be added if it is not provided together with the labels or should be modified. Defaults to None.

  • global_instance_ids (bool) – Whether to convert the tracking IDs of annotations into dataset-wide global IDs or keep the local, per-video IDs. Defaults to False.

  • bg_as_class (bool) – Whether to include background pixels as an additional class for masks. Defaults to False.

  • skip_empty_samples (bool) – Whether to skip samples without annotations. Defaults to False.

  • attributes_to_load (Sequence[dict[str, str]]) – List of attributes dictionaries to load. Each dictionary is a mapping from the attribute name to its desired value. If any of the attributes dictionaries is matched, the corresponding frame will be loaded. Defaults to None.

  • cache_as_binary (bool) – Whether to cache the dataset as binary. Defaults to False.

  • cached_file_path (str | None) – Path to a cached file. If the cached file exists, it is loaded instead of generating the data mapping. Defaults to None.

__getitem__(index)[source]

Get item from dataset at given index.

Return type:

Dict[str, Any]

__len__()[source]

Length of dataset.

Return type:

int
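As a rough usage sketch, the constructor arguments documented above can be wired together as follows. The helper name build_scalabel_dataset is hypothetical, and the import is done lazily inside the function so that defining the helper does not require Vis4D to be installed. Only parameters that appear in the signature above are used.

```python
def build_scalabel_dataset(data_root: str, annotation_path: str):
    """Construct a Scalabel dataset from a data root and an annotation path.

    Hypothetical convenience wrapper; only keyword arguments listed in the
    documented signature are passed through.
    """
    # Lazy import: Vis4D is only needed when the helper is actually called.
    from vis4d.data.datasets.scalabel import Scalabel

    return Scalabel(
        data_root=data_root,
        annotation_path=annotation_path,
        keys_to_load=("images", "boxes2d"),  # the documented default
        skip_empty_samples=True,  # drop frames without annotations
        global_instance_ids=False,  # keep local, per-video IDs
    )
```

Called with the paths of an actual dataset, the returned object supports len(dataset) and dataset[index], the latter returning a Dict[str, Any] as documented above.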

add_data_path(data_root, frames)[source]

Add filepath to frame using data_root.

Return type:

None
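A minimal sketch of what adding a file path from data_root could look like, using plain dictionaries in place of Scalabel Frame objects. The field names name, videoName and url, and the rule of nesting frames under their video name, are assumptions made for illustration and are not confirmed by this page.

```python
import posixpath


def add_data_path_sketch(data_root, frames):
    """Set a 'url' entry on each frame dict, in place (returns None)."""
    for frame in frames:
        parts = [data_root]
        # Assumption: frames that belong to a video live in a sub-directory
        # named after that video.
        if frame.get("videoName"):
            parts.append(frame["videoName"])
        parts.append(frame["name"])
        frame["url"] = posixpath.join(*parts)
```

Like the documented function, the sketch mutates the frames and returns nothing.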

boxes2d_from_scalabel(labels, class_to_idx, label_id_to_idx=None)[source]

Convert 2D bounding boxes from scalabel format to Vis4D.

NOTE: The box definition in Scalabel treats the bottom-right corner (x2, y2) as part of the box area, whereas Vis4D and other software libraries like detectron2 and mmdet do not, which is why boxes are converted via box2d_to_xyxy.

Parameters:
  • labels (list[Label]) – list of scalabel labels.

  • class_to_idx (dict[str, int]) – mapping from class name to index.

  • label_id_to_idx (dict[str, int] | None, optional) – mapping from label id to index. Defaults to None.

Returns:

boxes, classes, track_ids

Return type:

tuple[NDArrayF32, NDArrayI64, NDArrayI64]
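The difference between the two box conventions mentioned in the note can be illustrated with a small sketch. It only shows how the same corner coordinates yield different sizes under each convention; it does not reproduce what box2d_to_xyxy does internally, and both helper names are hypothetical.

```python
def box_size_inclusive(x1, y1, x2, y2):
    """Width and height when (x2, y2) is part of the box."""
    return (x2 - x1 + 1, y2 - y1 + 1)


def box_size_exclusive(x1, y1, x2, y2):
    """Width and height when (x2, y2) lies just outside the box."""
    return (x2 - x1, y2 - y1)


# The same corners describe a 10x5 box under the inclusive convention
# and a 9x4 box under the exclusive one.
print(box_size_inclusive(0, 0, 9, 4))
print(box_size_exclusive(0, 0, 9, 4))
```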

boxes3d_from_scalabel(labels, class_to_idx, label_id_to_idx=None)[source]

Convert 3D bounding boxes from scalabel format to Vis4D.

Return type:

tuple[ndarray[Any, dtype[float32]], ndarray[Any, dtype[int64]], ndarray[Any, dtype[int64]]]

discard_labels_outside_set(dataset, class_set)[source]

Discard labels outside given set of classes.

Parameters:
  • dataset (list[Frame]) – List of frames to filter.

  • class_set (list[str]) – List of classes to keep.

Return type:

None
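The filtering can be sketched on plain dictionaries as follows. The labels and category keys stand in for the corresponding Scalabel fields and are assumptions for illustration; the in-place update mirrors the documented return type of None.

```python
def discard_labels_outside_set_sketch(dataset, class_set):
    """Drop labels whose category is not in class_set, in place."""
    keep = set(class_set)
    for frame in dataset:
        labels = frame.get("labels")
        if labels is None:
            # Frames without labels are left untouched.
            continue
        frame["labels"] = [lbl for lbl in labels if lbl["category"] in keep]
```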

filter_frames_by_attributes(frames, attributes_to_load)[source]

Filter frames based on attributes.

Return type:

list[Frame]
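The matching rule described for attributes_to_load (a frame is kept if any one of the attribute dictionaries matches) might be sketched like this. Frames are plain dictionaries here, the attributes key is an assumption, and returning all frames when attributes_to_load is None is likewise an assumption.

```python
def filter_frames_by_attributes_sketch(frames, attributes_to_load):
    """Keep frames whose attributes match at least one dictionary."""
    if attributes_to_load is None:
        # Assumption: no filter requested means every frame is kept.
        return frames
    kept = []
    for frame in frames:
        attrs = frame.get("attributes") or {}
        # A dictionary matches when all of its key/value pairs are present;
        # a frame is kept when any dictionary matches.
        if any(
            all(attrs.get(key) == value for key, value in wanted.items())
            for wanted in attributes_to_load
        ):
            kept.append(frame)
    return kept
```

For example, passing [{"weather": "clear"}, {"timeofday": "night"}] keeps frames that are either clear or taken at night, while a single dictionary containing both entries would require both conditions at once.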

instance_ids_to_global(frames, local_instance_ids)[source]

Use local (per video) instance ids to produce global ones.

Return type:

None
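One way to turn per-video IDs into dataset-wide ones is to key each instance by the pair of video name and local ID. The sketch below takes that approach on plain dictionaries; it omits the local_instance_ids argument of the documented function, and the instance_id key it writes is an assumption, so it should be read as an illustration of the idea rather than of the actual implementation.

```python
def instance_ids_to_global_sketch(frames):
    """Assign a dataset-wide integer ID to each (video, local id) pair."""
    global_ids = {}
    for frame in frames:
        video = frame.get("videoName", "")
        for label in frame.get("labels") or []:
            key = (video, label["id"])
            if key not in global_ids:
                # The next free integer becomes the global ID.
                global_ids[key] = len(global_ids)
            label["instance_id"] = global_ids[key]
```

The same local ID in two different videos therefore maps to two different global IDs, while repeated occurrences within one video share a single ID.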

instance_masks_from_scalabel(labels, class_to_idx, image_size=None)[source]

Convert instance masks from scalabel format to Vis4D.

Parameters:
  • labels (list[Label]) – list of scalabel labels.

  • class_to_idx (dict[str, int]) – mapping from class name to index.

  • image_size (ImageSize, optional) – image size. Defaults to None.

Returns:

instance masks.

Return type:

NDArrayUI8

load_extrinsics(extrinsics)[source]

Transform extrinsics from Scalabel to Vis4D.

Return type:

ndarray[Any, dtype[float32]]

load_image(url, backend, image_channel_mode)[source]

Load image tensor from url.

Return type:

ndarray[Any, dtype[float32]]

load_intrinsics(intrinsics)[source]

Transform intrinsic camera matrix according to augmentations.

Return type:

ndarray[Any, dtype[float32]]

load_pointcloud(url, backend)[source]

Load pointcloud tensor from url.

Return type:

ndarray[Any, dtype[float32]]

nhw_to_hwc_mask(masks, class_ids, ignore_class=255)[source]

Convert N binary HxW masks to HxW semantic mask.

Parameters:
  • masks (NDArrayUI8) – Masks with shape [N, H, W].

  • class_ids (NDArrayI64) – Class IDs with shape [N, 1].

  • ignore_class (int, optional) – Ignore label. Defaults to 255.

Returns:

Masks with shape [H, W], where each location indicates the class label.

Return type:

NDArrayUI8
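The conversion can be sketched without any array library by using nested lists. This is an illustration under stated assumptions, not the library code: class_ids is a flat list rather than an [N, 1] array, at least one mask is required so that H and W can be inferred, and where masks overlap the later mask overwrites the earlier one.

```python
def nhw_to_hw_mask_sketch(masks, class_ids, ignore_class=255):
    """Merge N binary HxW masks into one HxW map of class labels."""
    if not masks:
        raise ValueError("need at least one mask to infer H and W")
    height, width = len(masks[0]), len(masks[0][0])
    # Start with every pixel set to the ignore label.
    semantic = [[ignore_class] * width for _ in range(height)]
    for mask, class_id in zip(masks, class_ids):
        for y in range(height):
            for x in range(width):
                if mask[y][x] > 0:
                    # Assumption: later masks win where masks overlap.
                    semantic[y][x] = class_id
    return semantic
```

Pixels covered by no mask keep the ignore_class value, which matches the role of the ignore label described above.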

prepare_labels(frames, class_list, global_instance_ids=False)[source]

Add category id and instance id to labels, return class frequencies.

Parameters:
  • frames (list[Frame]) – List of frames.

  • class_list (list[str]) – List of classes.

  • global_instance_ids (bool) – Whether to use global instance ids. Defaults to False.

Return type:

dict[str, int]
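A simplified sketch of the category part of this step is shown below. It covers only the category ID assignment and the frequency count, not the instance ID handling, and the labels, category and category_id keys are assumptions chosen for illustration.

```python
def prepare_labels_sketch(frames, class_list):
    """Attach a category index to each label and count labels per class."""
    class_to_idx = {name: idx for idx, name in enumerate(class_list)}
    frequencies = {name: 0 for name in class_list}
    for frame in frames:
        for label in frame.get("labels") or []:
            category = label["category"]
            if category not in class_to_idx:
                # Labels outside class_list are left without an index.
                continue
            label["category_id"] = class_to_idx[category]
            frequencies[category] += 1
    return frequencies
```

The returned dictionary maps each class name to the number of labels of that class, in line with the documented dict[str, int] return type.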

remove_empty_samples(frames)[source]

Remove empty samples.

Return type:

list[Frame]
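As a small sketch, and assuming that an empty sample means a frame with no labels (frames are plain dictionaries here and the labels key is an assumption), the filtering could look like this:

```python
def remove_empty_samples_sketch(frames):
    """Return only the frames that carry at least one label."""
    # A missing, None or empty 'labels' entry counts as an empty sample.
    return [frame for frame in frames if frame.get("labels")]
```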

semantic_masks_from_scalabel(labels, class_to_idx, image_size=None, bg_as_class=False)[source]

Convert semantic masks from scalabel format to Vis4D.

Parameters:
  • labels (list[Label]) – list of scalabel labels.

  • class_to_idx (dict[str, int]) – mapping from class name to index.

  • image_size (ImageSize, optional) – image size. Defaults to None.

  • bg_as_class (bool, optional) – whether to include background as a class. Defaults to False.

Returns:

semantic masks.

Return type:

NDArrayUI8