vis4d.data.datasets.util

Utility functions for datasets.

Functions

filter_by_keys(data_dict, keys_to_keep)

Filter a dictionary by keys.

get_used_data_groups(data_groups, keys)

Get the data groups that are used by the given keys.

im_decode(im_bytes[, mode, backend])

Decode to image (numpy array, RGB) from bytes.

npy_decode(npy_bytes[, key])

Decode to numpy array from npy/npz file bytes.

ply_decode(ply_bytes[, mode])

Decode to point clouds (numpy array) from bytes.

print_class_histogram(class_frequencies)

Prints out given class frequencies.

to_onehot(categories, num_classes)

Transform integer categorical labels to onehot vectors.

Classes

CacheMappingMixin()

Caches a mapping for fast I/O and multi-processing.

DatasetFromList(lst[, deepcopy, serialize])

Wrap a list to a torch Dataset.

class CacheMappingMixin[source]

Caches a mapping for fast I/O and multi-processing.

This class provides functionality for caching a mapping from dataset index requested by a call on __getitem__ to a dictionary that holds relevant information for loading the sample in question from the disk. Caching the mapping reduces startup time by loading the mapping instead of re-computing it at every startup.

NOTE: Make sure your annotations file is up-to-date. Otherwise, the mapping will be wrong and you will get wrong samples.
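
The caching idea described above can be illustrated with a minimal sketch. This is not the vis4d implementation: `load_or_build_mapping`, `build_mapping`, and the pickle cache file are hypothetical names chosen for illustration only.

```python
import os
import pickle
import tempfile
from typing import Callable


def load_or_build_mapping(cache_path: str, build_fn: Callable[[], list]) -> list:
    """Load a cached index-to-sample-info mapping, or build and cache it."""
    if os.path.exists(cache_path):
        # Fast path: reuse the mapping computed by an earlier run.
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    # Slow path: compute the mapping once and store it for later runs.
    mapping = build_fn()
    with open(cache_path, "wb") as f:
        pickle.dump(mapping, f)
    return mapping


def build_mapping() -> list:
    # Hypothetical stand-in for an expensive pass over an annotations file.
    return [{"image_path": f"img_{i}.jpg", "label": i % 2} for i in range(4)]


cache_file = os.path.join(tempfile.mkdtemp(), "mapping.pkl")
first = load_or_build_mapping(cache_file, build_mapping)   # builds and writes
second = load_or_build_mapping(cache_file, build_mapping)  # reads the cache
```

In a scheme like this, a stale cache file is returned unchanged even when the annotations have changed, which is why the note above asks for up-to-date annotations.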

class DatasetFromList(lst, deepcopy=False, serialize=True)[source]

Wrap a list to a torch Dataset.

We serialize and wrap big python objects in a torch.Dataset due to a memory leak when dealing with large python objects using multiple workers. See: https://github.com/pytorch/pytorch/issues/13246

Creates an instance of the class.

Parameters:
  • lst (list[Any]) – a list which contains elements to produce.

  • deepcopy (bool) – whether to deepcopy the element when producing it, so that the result can be modified in place without affecting the source in the list.

  • serialize (bool) – whether to hold memory using serialized objects. When enabled, data loader workers can use shared RAM from the master process instead of making a copy.

__getitem__(idx)[source]

Return item of list at idx.

Return type:

Any

__len__()[source]

Return len of list.

Return type:

int
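
The effect of the `deepcopy` and `serialize` flags can be sketched in plain Python. `ListDatasetSketch` is a hypothetical stand-in written for this page; it does not subclass `torch.utils.data.Dataset`, and the real class may store the serialized data differently.

```python
import copy
import pickle
from typing import Any


class ListDatasetSketch:
    """Plain-Python stand-in for a list-backed dataset (assumption, no torch)."""

    def __init__(self, lst: list, deepcopy: bool = False, serialize: bool = True) -> None:
        self._deepcopy = deepcopy
        self._serialize = serialize
        # With serialize=True, keep one bytes object per element instead of
        # the original Python objects.
        self._lst = [pickle.dumps(x) for x in lst] if serialize else lst

    def __len__(self) -> int:
        return len(self._lst)

    def __getitem__(self, idx: int) -> Any:
        if self._serialize:
            # Deserializing yields a fresh object on every access.
            return pickle.loads(self._lst[idx])
        if self._deepcopy:
            return copy.deepcopy(self._lst[idx])
        return self._lst[idx]


source = [{"boxes": [1, 2]}, {"boxes": [3]}]
ds = ListDatasetSketch(source, deepcopy=True, serialize=False)
item = ds[0]
item["boxes"].append(99)  # modifies the copy, not the source list
```

With `deepcopy=False` and `serialize=False`, the same append would change `source[0]` as well, because the dataset would hand out the original object.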

cvtColor(src, code[, dst[, dstCn]]) -> dst

Converts an image from one color space to another.

The function converts an input image from one color space to another. In case of a transformation to-from RGB color space, the order of the channels should be specified explicitly (RGB or BGR). Note that the default color format in OpenCV is often referred to as RGB but it is actually BGR (the bytes are reversed). So the first byte in a standard (24-bit) color image will be an 8-bit Blue component, the second byte will be Green, and the third byte will be Red. The fourth, fifth, and sixth bytes would then be the second pixel (Blue, then Green, then Red), and so on.

The conventional ranges for R, G, and B channel values are:
  • 0 to 255 for CV_8U images

  • 0 to 65535 for CV_16U images

  • 0 to 1 for CV_32F images

In case of linear transformations, the range does not matter. But in case of a non-linear transformation, an input RGB image should be normalized to the proper value range to get the correct results, for example, for the RGB → L*u*v* transformation. For example, if you have a 32-bit floating-point image directly converted from an 8-bit image without any scaling, then it will have the 0..255 value range instead of 0..1 assumed by the function. So, before calling cvtColor, you first need to scale the image down:

    img *= 1./255;
    cvtColor(img, img, COLOR_BGR2Luv);

If you use cvtColor with 8-bit images, some information will be lost in the conversion. For many applications, this will not be noticeable, but it is recommended to use 32-bit images in applications that need the full range of colors or that convert an image before an operation and then convert back.

If conversion adds the alpha channel, its value will be set to the maximum of the corresponding channel range: 255 for CV_8U, 65535 for CV_16U, 1 for CV_32F.

Parameters:
  • src – input image: 8-bit unsigned, 16-bit unsigned ( CV_16UC… ), or single-precision floating-point.

  • dst – output image of the same size and depth as src.

  • code – color space conversion code (see ColorConversionCodes).

  • dstCn – number of channels in the destination image; if the parameter is 0, the number of the channels is derived automatically from src and code.

See also: imgproc_color_conversions
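
The scaling step described for floating-point images can be sketched with NumPy alone. The pixel values are made up for illustration, and the conversion call itself is left out so that the example does not depend on OpenCV being installed.

```python
import numpy as np

# One made-up pixel in B, G, R order, stored as 8-bit unsigned values.
img_u8 = np.array([[[0, 128, 255]]], dtype=np.uint8)

# A plain cast keeps the 0..255 range, which a non-linear conversion
# on float input would misinterpret.
img_f32 = img_u8.astype(np.float32)

# Scale down to the 0..1 range expected for 32-bit float images.
img_scaled = img_f32 / np.float32(255.0)
```

After this step, `img_scaled` is the array one would pass to a non-linear conversion such as BGR to L*u*v*.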

filter_by_keys(data_dict, keys_to_keep)[source]

Filter a dictionary by keys.

Parameters:
  • data_dict (DictData) – The dictionary to filter.

  • keys_to_keep (list[str]) – The keys to keep.

Returns:

The filtered dictionary.

Return type:

DictData
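
A minimal sketch of the filtering behaviour described above. `filter_by_keys_sketch` is a hypothetical re-implementation; in particular, silently skipping requested keys that are missing from the dictionary is an assumption, not confirmed behaviour of the vis4d function.

```python
from typing import Any


def filter_by_keys_sketch(
    data_dict: dict[str, Any], keys_to_keep: list[str]
) -> dict[str, Any]:
    """Return a new dict holding only the requested keys that are present."""
    # Assumption: requested keys that are absent are skipped, not an error.
    return {key: data_dict[key] for key in keys_to_keep if key in data_dict}


# Made-up sample with placeholder values.
sample = {"images": "img", "boxes2d": "b", "depth_maps": "d"}
filtered = filter_by_keys_sketch(sample, ["images", "boxes2d", "masks"])
```

The input dictionary is left untouched; only the returned dictionary is reduced to the kept keys.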

get_used_data_groups(data_groups, keys)[source]

Get the data groups that are used by the given keys.

Parameters:
  • data_groups (dict[str, list[str]]) – The data groups.

  • keys (list[str]) – The keys to check.

Returns:

The used data groups.

Return type:

list[str]
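
One plausible reading of this function is sketched below: a group counts as used when at least one of the given keys belongs to it. `used_data_groups_sketch` and the group names are hypothetical, and the real function may apply extra rules (for example, special handling of a default group).

```python
def used_data_groups_sketch(
    data_groups: dict[str, list[str]], keys: list[str]
) -> list[str]:
    """Return the names of groups that contain at least one requested key."""
    return [
        group_name
        for group_name, group_keys in data_groups.items()
        if any(key in group_keys for key in keys)
    ]


# Made-up groups mapping a group name to the data keys it provides.
groups = {"annotations": ["boxes2d", "masks"], "depth": ["depth_maps"]}
used = used_data_groups_sketch(groups, ["images", "boxes2d"])
```

Here only the "annotations" group is reported, since "boxes2d" is the only requested key that belongs to any group.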

im_decode(im_bytes, mode='RGB', backend='PIL')[source]

Decode to image (numpy array, RGB) from bytes.

Return type:

ndarray[Any, dtype[uint8]]

imdecode(buf, flags) -> retval

Reads an image from a buffer in memory.

The function imdecode reads an image from the specified buffer in the memory. If the buffer is too short or contains invalid data, the function returns an empty matrix ( Mat::data==NULL ).

See cv::imread for the list of supported formats and flags description.

NOTE: In the case of color images, the decoded images will have the channels stored in B G R order.

Parameters:
  • buf – Input array or vector of bytes.

  • flags – The same flags as in cv::imread, see cv::ImreadModes.
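
Because imdecode stores color channels in B G R order while im_decode is documented as returning RGB, a channel swap is needed somewhere between the two. A minimal NumPy sketch of such a swap, using a made-up pixel and no OpenCV call, is:

```python
import numpy as np

# One pure-blue pixel in B, G, R order, shape (height, width, channels).
bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)

# Reversing the last axis turns B, G, R into R, G, B.
rgb = bgr[..., ::-1]
```

Whether the library performs the swap this way or through a color conversion call is not stated in this page.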

npy_decode(npy_bytes, key=None)[source]

Decode to numpy array from npy/npz file bytes.

Return type:

Union[ndarray[Any, dtype[float32]], ndarray[Any, dtype[float64]]]
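
Decoding npy/npz bytes can be sketched with `numpy.load` on an in-memory buffer. `npy_decode_sketch` is a hypothetical helper; treating `key` as the name of an array inside an npz archive is an assumption based on the signature.

```python
import io

import numpy as np


def npy_decode_sketch(npy_bytes: bytes, key=None) -> np.ndarray:
    """Decode .npy bytes, or one array of an .npz archive when key is given."""
    loaded = np.load(io.BytesIO(npy_bytes))
    if key is not None:
        # Assumption: key names one array inside an .npz archive.
        return loaded[key]
    return loaded


# Round trip for a single array stored in .npy format.
buf = io.BytesIO()
np.save(buf, np.arange(3, dtype=np.float32))
arr = npy_decode_sketch(buf.getvalue())

# Round trip for a named array stored in an .npz archive.
zbuf = io.BytesIO()
np.savez(zbuf, depth=np.ones((2, 2), dtype=np.float32))
depth = npy_decode_sketch(zbuf.getvalue(), key="depth")
```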

ply_decode(ply_bytes, mode='XYZI')[source]

Decode to point clouds (numpy array) from bytes.

Parameters:
  • ply_bytes (bytes) – The bytes of the ply file.

  • mode (str, optional) – The point format of the ply file. If “XYZI”, the intensity channel will be included, otherwise only the XYZ coordinates. Defaults to “XYZI”.

Return type:

Union[ndarray[Any, dtype[float32]], ndarray[Any, dtype[float64]]]

print_class_histogram(class_frequencies)[source]

Prints out given class frequencies.

Return type:

None

to_onehot(categories, num_classes)[source]

Transform integer categorical labels to onehot vectors.

Parameters:
  • categories (NDArrayI64) – Integer categorical labels of shape (N, ).

  • num_classes (int) – Number of classes.

Returns:

Onehot vector of shape (N, num_classes).

Return type:

NDArrayFloat
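
The shape contract above, (N, ) in and (N, num_classes) out, can be illustrated with a short NumPy sketch. `to_onehot_sketch` is a hypothetical re-implementation, and the float32 output dtype is an assumption.

```python
import numpy as np


def to_onehot_sketch(categories: np.ndarray, num_classes: int) -> np.ndarray:
    """Turn integer labels of shape (N,) into onehot rows of shape (N, num_classes)."""
    num_samples = categories.shape[0]
    # Assumption: float32 output; the documented type only says "float".
    onehot = np.zeros((num_samples, num_classes), dtype=np.float32)
    # Row i gets a 1 in the column given by label i.
    onehot[np.arange(num_samples), categories] = 1.0
    return onehot


labels = np.array([0, 2, 1], dtype=np.int64)
onehot = to_onehot_sketch(labels, num_classes=3)
```

Each row contains exactly one 1, so the row sums are all 1 and `onehot.argmax(axis=1)` recovers the original labels.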