vis4d.eval.common

Common evaluation code.

class BinaryEvaluator(threshold=0.5)[source]

Creates a new evaluator that evaluates binary predictions.

Creates a new binary evaluator.

Parameters:

threshold (float) – Threshold for converting predictions to binary. All predictions higher than this value are assigned the ‘True’ label.

evaluate(metric)[source]

Evaluate predictions.

Returns a dict containing the raw data and a short description string containing a readable result.

Parameters:

metric (str) – Metric to use. See @property metrics.

Return type:

tuple[Dict[str, Union[float, int, Tensor]], str]

Returns:

metric_data, description tuple containing the metric data (a dict with metric names and values) as well as a short, human-readable summary string.

Raises:
  • RuntimeError – if no data has been registered to be evaluated.

  • ValueError – if metric is not supported.

process_batch(prediction, groundtruth)[source]

Processes a new batch of predictions.

Calculates the metrics and caches them internally.

Parameters:
  • prediction (ArrayLike) – The prediction (continuous values or binary), shape (Batch x Pts)

  • groundtruth (ArrayLike) – The groundtruth (binary), shape (Batch x Pts)

Return type:

None

reset()[source]

Reset the saved predictions to start a new round of evaluation.

Return type:

None

property metrics: list[str]

Supported metrics.
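
Example

A minimal usage sketch with random placeholder data, assuming BinaryEvaluator can be imported from vis4d.eval.common; the supported metric names are read from the metrics property rather than hard-coded.

    import numpy as np

    from vis4d.eval.common import BinaryEvaluator  # assumed import path

    # Placeholder inputs: continuous scores and binary labels, shape (Batch x Pts).
    rng = np.random.default_rng(0)
    prediction = rng.random((2, 100))
    groundtruth = (rng.random((2, 100)) > 0.5).astype(np.int64)

    evaluator = BinaryEvaluator(threshold=0.5)
    evaluator.process_batch(prediction, groundtruth)

    # Query the supported metrics instead of hard-coding metric names.
    for metric in evaluator.metrics:
        metric_data, description = evaluator.evaluate(metric)
        print(metric, description)

    evaluator.reset()  # start a new evaluation round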

class ClassificationEvaluator[source]

Multi-class classification evaluator.

Initialize the classification evaluator.

evaluate(metric)[source]

Evaluate predictions.

Returns a dict containing the raw data and a short description string containing a readable result.

Parameters:

metric (str) – Metric to use. See @property metrics.

Return type:

tuple[Dict[str, Union[float, int, Tensor]], str]

Returns:

metric_data, description tuple containing the metric data (a dict with metric names and values) as well as a short, human-readable summary string.

Raises:
  • RuntimeError – if no data has been registered to be evaluated.

  • ValueError – if the metric is not supported.

gather(gather_func)[source]

Accumulate predictions across processes.

Return type:

None

process_batch(prediction, groundtruth)[source]

Process a batch of predictions and groundtruths.

Parameters:
  • prediction (ArrayLike) – Prediction, in shape (N, C).

  • groundtruth (ArrayLike) – Groundtruth, in shape (N, ).

reset()[source]

Reset the evaluator for a new round of evaluation.

Return type:

None

property metrics: list[str]

Supported metrics.
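
Example

A minimal usage sketch with random placeholder data, assuming ClassificationEvaluator can be imported from vis4d.eval.common; the input shapes follow the process_batch documentation above.

    import numpy as np

    from vis4d.eval.common import ClassificationEvaluator  # assumed import path

    # Placeholder inputs: class scores (N, C) and integer labels (N,).
    rng = np.random.default_rng(0)
    prediction = rng.random((8, 10))
    groundtruth = rng.integers(0, 10, size=(8,))

    evaluator = ClassificationEvaluator()
    evaluator.process_batch(prediction, groundtruth)

    # Query the supported metrics instead of hard-coding metric names.
    for metric in evaluator.metrics:
        metric_data, description = evaluator.evaluate(metric)
        print(metric, description)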

class DepthEvaluator(min_depth=0.0, max_depth=80.0, scale=1.0, epsilon=0.001)[source]

Depth estimation evaluator.

Initialize the depth evaluator.

Parameters:
  • min_depth (float) – Minimum depth to evaluate. Defaults to 0.0.

  • max_depth (float) – Maximum depth to evaluate. Defaults to 80.0.

  • scale (float) – Scale factor for depth. Defaults to 1.0.

  • epsilon (float) – Small value to avoid logarithms of small values. Defaults to 1e-3.

__repr__()[source]

Concise representation of the evaluator.

Return type:

str

evaluate(metric)[source]

Evaluate predictions.

Returns a dict containing the raw data and a short description string containing a readable result.

Parameters:

metric (str) – Metric to use. See @property metrics.

Return type:

tuple[Dict[str, Union[float, int, Tensor]], str]

Returns:

metric_data, description tuple containing the metric data (a dict with metric names and values) as well as a short, human-readable summary string.

Raises:
  • RuntimeError – if no data has been registered to be evaluated.

  • ValueError – if metric is not supported.

gather(gather_func)[source]

Accumulate predictions across processes.

Return type:

None

process_batch(prediction, groundtruth)[source]

Process a batch of data.

Parameters:
  • prediction (np.array) – Predicted depth map.

  • groundtruth (np.array) – Target depth map.

Return type:

None

reset()[source]

Reset the evaluator for a new round of evaluation.

Return type:

None

property metrics: list[str]

Supported metrics.
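
Example

A minimal usage sketch with random placeholder data, assuming DepthEvaluator can be imported from vis4d.eval.common and that process_batch accepts per-image depth maps of shape (H, W); this shape is an assumption and is not documented above.

    import numpy as np

    from vis4d.eval.common import DepthEvaluator  # assumed import path

    # Placeholder depth maps; the (H, W) shape is an assumption.
    rng = np.random.default_rng(0)
    prediction = rng.uniform(0.5, 60.0, size=(240, 320))
    groundtruth = rng.uniform(0.5, 60.0, size=(240, 320))

    evaluator = DepthEvaluator(min_depth=0.0, max_depth=80.0)
    evaluator.process_batch(prediction, groundtruth)

    # Query the supported metrics instead of hard-coding metric names.
    for metric in evaluator.metrics:
        metric_data, description = evaluator.evaluate(metric)
        print(metric, description)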

class OpticalFlowEvaluator(max_flow=400.0, use_degrees=False, scale=1.0, epsilon=1e-06)[source]

Optical flow evaluator.

Initialize the optical flow evaluator.

Parameters:
  • max_flow (float, optional) – Maximum flow value. Defaults to 400.0.

  • use_degrees (bool, optional) – Whether to use degrees for angular error. Defaults to False.

  • scale (float, optional) – Scale factor for the optical flow. Defaults to 1.0.

  • epsilon (float, optional) – Epsilon value for numerical stability. Defaults to 1e-06.

evaluate(metric)[source]

Evaluate predictions.

Returns a dict containing the raw data and a short description string containing a readable result.

Parameters:

metric (str) – Metric to use. See @property metrics.

Return type:

tuple[Dict[str, Union[float, int, Tensor]], str]

Returns:

metric_data, description tuple containing the metric data (a dict with metric names and values) as well as a short, human-readable summary string.

Raises:
  • RuntimeError – if no data has been registered to be evaluated.

  • ValueError – if metric is not supported.

gather(gather_func)[source]

Accumulate predictions across processes.

Return type:

None

process_batch(prediction, groundtruth)[source]

Process a batch of data.

Parameters:
  • prediction (NDArrayNumber) – Prediction optical flow, in shape (N, H, W, 2).

  • groundtruth (NDArrayNumber) – Target optical flow, in shape (N, H, W, 2).

Return type:

None

reset()[source]

Reset the evaluator for a new round of evaluation.

Return type:

None

property metrics: list[str]

Supported metrics.
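
Example

A minimal usage sketch with random placeholder data, assuming OpticalFlowEvaluator can be imported from vis4d.eval.common; the (N, H, W, 2) shapes follow the process_batch documentation above.

    import numpy as np

    from vis4d.eval.common import OpticalFlowEvaluator  # assumed import path

    # Placeholder flow fields, shape (N, H, W, 2).
    rng = np.random.default_rng(0)
    prediction = rng.normal(size=(1, 240, 320, 2)).astype(np.float32)
    groundtruth = rng.normal(size=(1, 240, 320, 2)).astype(np.float32)

    evaluator = OpticalFlowEvaluator(max_flow=400.0)
    evaluator.process_batch(prediction, groundtruth)

    # Query the supported metrics instead of hard-coding metric names.
    for metric in evaluator.metrics:
        metric_data, description = evaluator.evaluate(metric)
        print(metric, description)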

class SegEvaluator(num_classes=None, class_to_ignore=None, class_mapping=None)[source]

Creates an evaluator that calculates mIoU score and confusion matrix.

Creates a new evaluator.

Parameters:
  • num_classes (int | None) – Number of semantic classes

  • class_to_ignore (int | None) – Groundtruth class that should be ignored

  • class_mapping (dict[int, str] | None) – Dictionary mapping each class id to a readable name

calc_confusion_matrix(prediction, groundtruth)[source]

Calculates the confusion matrix for multi-class predictions.

Parameters:
  • prediction (array) – Class predictions

  • groundtruth (array) – Groundtruth classes

Return type:

ndarray[Any, dtype[int64]]

Returns:

Confusion Matrix of dimension n_classes x n_classes.

evaluate(metric)[source]

Evaluate predictions.

Returns a dict containing the raw data and a short description string containing a readable result.

Parameters:

metric (str) – Metric to use. See @property metrics.

Return type:

tuple[Dict[str, Union[float, int, Tensor]], str]

Returns:

(dict, str) containing the raw data and a short description string.

Raises:

ValueError – If metric is not supported.

process_batch(prediction, groundtruth)[source]

Process sample and update confusion matrix.

Parameters:
  • prediction (ArrayLike) – Predictions of shape [N, C, …] or [N, …], with C being the number of channels. Note: if the channel dimension C is present, the prediction is converted to target labels by taking the argmax along the second axis

  • groundtruth (ArrayLike) – Groundtruth of shape [N, …] of integer type

Return type:

None

reset()[source]

Reset the saved predictions to start a new round of evaluation.

Return type:

None

property metrics: list[str]

Supported metrics.
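
Example

A minimal usage sketch with random placeholder data, assuming SegEvaluator can be imported from vis4d.eval.common; per-pixel class scores of shape [N, C, H, W] and integer labels of shape [N, H, W] follow the process_batch documentation above.

    import numpy as np

    from vis4d.eval.common import SegEvaluator  # assumed import path

    num_classes = 5
    # Placeholder inputs: class scores [N, C, H, W] and integer labels [N, H, W].
    rng = np.random.default_rng(0)
    prediction = rng.random((1, num_classes, 64, 64))
    groundtruth = rng.integers(0, num_classes, size=(1, 64, 64))

    evaluator = SegEvaluator(num_classes=num_classes)
    evaluator.process_batch(prediction, groundtruth)

    # Query the supported metrics instead of hard-coding metric names.
    for metric in evaluator.metrics:
        metric_data, description = evaluator.evaluate(metric)
        print(metric, description)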

Modules

vis4d.eval.common.binary

Binary occupancy evaluator.

vis4d.eval.common.cls

Image classification evaluator.

vis4d.eval.common.depth

Depth estimation evaluator.

vis4d.eval.common.flow

Optical flow evaluator.

vis4d.eval.common.seg

Common segmentation evaluator.