Getting started
This notebook illustrates the basic usage of Vis4D. We run Faster R-CNN on COCO images.
CLI
The following example uses the provided Faster R-CNN toy config file to run training and inference on COCO images.
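The commands in this section pass dotted overrides such as `--config.params.num_epochs 1`, where the part after `--config.` names a field inside the config file. The sketch below illustrates that idea in plain Python with a nested dictionary. It is not Vis4D's actual implementation: the `apply_override` helper and the initial values are hypothetical and only show how a dotted path can address a nested field.

```python
from typing import Any


def apply_override(config: dict[str, Any], dotted_key: str, value: Any) -> None:
    """Set a nested field addressed by a dotted key such as "params.num_epochs"."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node[name]  # raises KeyError if the path does not exist
    if leaf not in node:
        raise KeyError(f"unknown config field: {dotted_key}")
    node[leaf] = value


# Hypothetical stand-in for the config defined in faster_rcnn_example.py;
# the starting values here are made up for illustration.
config = {"params": {"num_epochs": 12, "lr": 0.02}}

# Rough equivalent of passing `--config.params.num_epochs 1` on the command line.
apply_override(config, "params.num_epochs", 1)
print(config["params"]["num_epochs"])
```

Rejecting unknown fields, as this sketch does, turns a typo in an override into an error rather than a silently ignored setting.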
Inference
Run inference on the validation set.
!vis4d test --config faster_rcnn_example.py --config.params.num_epochs 1
[09/06 00:38:10 Vis4D]: Environment info: PyTorch version: 2.0.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: macOS 13.5.1 (arm64)
GCC version: Could not collect
Clang version: 14.0.3 (clang-1403.0.22.14.1)
CMake version: Could not collect
Libc version: N/A
Python version: 3.10.8 (main, Nov 24 2022, 08:08:27) [Clang 14.0.6 ] (64-bit runtime)
Python platform: macOS-13.5.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Apple M2
Versions of relevant libraries:
[pip3] mypy==1.3.0
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.24.3
[pip3] pytorch-lightning==2.0.3
[pip3] torch==2.0.1
[pip3] torchaudio==2.0.2
[pip3] torchmetrics==0.11.4
[pip3] torchvision==0.15.2
[conda] numpy 1.24.3 pypi_0 pypi
[conda] numpy-base 1.23.4 py310haf87e8b_0
[conda] pytorch 1.13.0 py3.10_0 pytorch
[conda] pytorch-lightning 2.0.3 pypi_0 pypi
[conda] torch 2.0.1 pypi_0 pypi
[conda] torchaudio 2.0.2 pypi_0 pypi
[conda] torchmetrics 0.11.4 pypi_0 pypi
[conda] torchvision 0.15.2 pypi_0 pypi
[09/06 00:38:10 Vis4D]: Load checkpoint from http path: https://download.pytorch.org/models/resnet50-0676ba61.pth
/Users/royyang/Workspace/vis4d/vis4d/common/ckpt.py:374: UserWarning: The model and loaded state dict do not match exactly
unexpected key in source state_dict: fc.weight, fc.bias
rank_zero_warn(err_msg)
[09/06 00:38:10 Vis4D]: Load checkpoint from http path: https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
[09/06 00:38:10 Vis4D]: Generating COCODataset(root=data/coco_test, split=train, use_pascal_voc_cats=False) data mapping...
[09/06 00:38:10 Vis4D]: Loading COCODataset(root=data/coco_test, split=train, use_pascal_voc_cats=False) takes 0.00 seconds.
[09/06 00:38:12 Vis4D]: Testing: 1/2, ETA: 0:00:01, 1.71s/it
[09/06 00:38:13 Vis4D]: Testing: 2/2, ETA: 0:00:00, 1.61s/it
[09/06 00:38:13 Vis4D]: Running evaluator CocoEvaluator(annotation_path=data/coco_test/annotations/instances_train.json)...
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
[09/06 00:38:13 Vis4D]: Det/AP: 0.3974
[09/06 00:38:13 Vis4D]: Det/AP50: 0.5497
[09/06 00:38:13 Vis4D]: Det/AP75: 0.4414
[09/06 00:38:13 Vis4D]: Det/APs: 0.2868
[09/06 00:38:13 Vis4D]: Det/APm: 0.6932
[09/06 00:38:13 Vis4D]: Det/APl: 0.6000
[09/06 00:38:13 Vis4D]: Showing results for metric: Det
[09/06 00:38:13 Vis4D]:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.397
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.550
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.441
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.287
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.693
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.600
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.278
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.429
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.454
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.348
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.693
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.633
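The `IoU=0.50:0.95` entries in the table above refer to the COCO convention of averaging over ten intersection-over-union thresholds from 0.50 to 0.95 in steps of 0.05. As a minimal sketch of what such a threshold means for a single prediction, the snippet below computes the IoU of two boxes and counts the thresholds at which the prediction would count as a match. The `(x1, y1, x2, y2)` box layout and the example boxes are assumptions made for illustration; this is not the evaluator's code.

```python
Box = tuple[float, float, float, float]  # (x1, y1, x2, y2), an assumed layout


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds behind "IoU=0.50:0.95": 0.50, 0.55, ..., 0.95.
thresholds = [t / 100 for t in range(50, 100, 5)]

# Made-up boxes: a prediction shifted one pixel to the right of the ground truth.
iou = box_iou((0, 0, 10, 10), (1, 0, 11, 10))
num_matched = sum(iou >= t for t in thresholds)
print(f"IoU = {iou:.3f}, a match at {num_matched} of {len(thresholds)} thresholds")
```

A prediction that overlaps loosely can therefore contribute to AP50 without contributing to AP75, which is one reason AP50 (0.550) is higher than AP75 (0.441) in the results above.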
Training
Train for one epoch, then run inference on the validation set.
!vis4d fit --config faster_rcnn_example.py --config.params.num_epochs 1
[09/06 00:38:21 Vis4D]: Environment info: PyTorch version: 2.0.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: macOS 13.5.1 (arm64)
GCC version: Could not collect
Clang version: 14.0.3 (clang-1403.0.22.14.1)
CMake version: Could not collect
Libc version: N/A
Python version: 3.10.8 (main, Nov 24 2022, 08:08:27) [Clang 14.0.6 ] (64-bit runtime)
Python platform: macOS-13.5.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Apple M2
Versions of relevant libraries:
[pip3] mypy==1.3.0
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.24.3
[pip3] pytorch-lightning==2.0.3
[pip3] torch==2.0.1
[pip3] torchaudio==2.0.2
[pip3] torchmetrics==0.11.4
[pip3] torchvision==0.15.2
[conda] numpy 1.24.3 pypi_0 pypi
[conda] numpy-base 1.23.4 py310haf87e8b_0
[conda] pytorch 1.13.0 py3.10_0 pytorch
[conda] pytorch-lightning 2.0.3 pypi_0 pypi
[conda] torch 2.0.1 pypi_0 pypi
[conda] torchaudio 2.0.2 pypi_0 pypi
[conda] torchmetrics 0.11.4 pypi_0 pypi
[conda] torchvision 0.15.2 pypi_0 pypi
[09/06 00:38:21 Vis4D]: Load checkpoint from http path: https://download.pytorch.org/models/resnet50-0676ba61.pth
/Users/royyang/Workspace/vis4d/vis4d/common/ckpt.py:374: UserWarning: The model and loaded state dict do not match exactly
unexpected key in source state_dict: fc.weight, fc.bias
rank_zero_warn(err_msg)
[09/06 00:38:21 Vis4D]: Load checkpoint from http path: https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
[09/06 00:38:21 Vis4D]: [rank 0] Global seed set to 1740432514
[09/06 00:38:22 Vis4D]: Generating COCODataset(root=data/coco_test, split=train, use_pascal_voc_cats=False) data mapping...
[09/06 00:38:22 Vis4D]: Loading COCODataset(root=data/coco_test, split=train, use_pascal_voc_cats=False) takes 0.00 seconds.
[09/06 00:38:22 Vis4D]: Generating COCODataset(root=data/coco_test, split=train, use_pascal_voc_cats=False) data mapping...
[09/06 00:38:22 Vis4D]: Loading COCODataset(root=data/coco_test, split=train, use_pascal_voc_cats=False) takes 0.00 seconds.
[09/06 00:38:26 Vis4D]: Epoch 1: 1/2, ETA: 0:00:04, 4.03s/it, loss: 0.9780, RPNLoss.loss_cls: 0.0689, RPNLoss.loss_bbox: 0.0667, RCNNLoss.rcnn_loss_cls: 0.4852, RCNNLoss.rcnn_loss_bbox: 0.3572
[09/06 00:38:29 Vis4D]: Epoch 1: 2/2, ETA: 0:00:00, 3.97s/it, loss: 1.0499, RPNLoss.loss_cls: 0.0906, RPNLoss.loss_bbox: 0.1142, RCNNLoss.rcnn_loss_cls: 0.4352, RCNNLoss.rcnn_loss_bbox: 0.4099
[09/06 00:38:32 Vis4D]: Testing: 1/2, ETA: 0:00:01, 1.27s/it
[09/06 00:38:33 Vis4D]: Testing: 2/2, ETA: 0:00:00, 1.24s/it
[09/06 00:38:33 Vis4D]: Running evaluator CocoEvaluator(annotation_path=data/coco_test/annotations/instances_train.json)...
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
[09/06 00:38:33 Vis4D]: Det/AP: 0.3975
[09/06 00:38:33 Vis4D]: Det/AP50: 0.5500
[09/06 00:38:33 Vis4D]: Det/AP75: 0.4414
[09/06 00:38:33 Vis4D]: Det/APs: 0.2868
[09/06 00:38:33 Vis4D]: Det/APm: 0.6932
[09/06 00:38:33 Vis4D]: Det/APl: 0.6000
[09/06 00:38:33 Vis4D]: Showing results for metric: Det
[09/06 00:38:33 Vis4D]:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.397
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.550
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.441
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.287
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.693
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.600
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.278
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.429
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.454
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.348
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.693
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.633
[09/06 00:38:33 Vis4D]: `Trainer.fit` stopped: `num_epochs=1` reached.
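In the two training log lines above, the reported `loss` matches the sum of the four component losses to the printed precision. The check below reproduces that arithmetic using values copied from the log. This is an observation about this particular run, assuming all components are weighted equally; it is not a statement about how Vis4D combines losses in general.

```python
import math

# Values copied from the "Epoch 1: 1/2" and "Epoch 1: 2/2" log lines.
steps = [
    {
        "loss": 0.9780,
        "parts": {
            "RPNLoss.loss_cls": 0.0689,
            "RPNLoss.loss_bbox": 0.0667,
            "RCNNLoss.rcnn_loss_cls": 0.4852,
            "RCNNLoss.rcnn_loss_bbox": 0.3572,
        },
    },
    {
        "loss": 1.0499,
        "parts": {
            "RPNLoss.loss_cls": 0.0906,
            "RPNLoss.loss_bbox": 0.1142,
            "RCNNLoss.rcnn_loss_cls": 0.4352,
            "RCNNLoss.rcnn_loss_bbox": 0.4099,
        },
    },
]

# Allow for rounding, since the log prints four decimal places.
consistent = []
for index, step in enumerate(steps, start=1):
    summed = sum(step["parts"].values())
    consistent.append(math.isclose(summed, step["loss"], abs_tol=5e-4))
    print(f"step {index}: sum of parts = {summed:.4f}, logged loss = {step['loss']:.4f}")
```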
Python API
You can also compose the model and dataset through the Python API.
First, import the necessary components from the library.
from vis4d.model.detect.faster_rcnn import FasterRCNN
from vis4d.data.const import CommonKeys as K
from vis4d.vis.functional.image import imshow_bboxes
from vis4d.config import instantiate_classes
from vis4d.zoo.base.datasets.coco import get_coco_detection_cfg
Now, let’s create the dataset and fetch the first batch of images from it.
# Create dataloader for COCO using the default config
dataloader_cfg = get_coco_detection_cfg(
    "data/coco_test/",
    train_split="train",
    test_split="train",
    samples_per_gpu=1,
    workers_per_gpu=0,
    cache_as_binary=False,
)
test_dataloader = instantiate_classes(dataloader_cfg.test_dataloader)[0]
batch = next(iter(test_dataloader))
inputs, images_hw = (
    batch[K.images],
    batch[K.input_hw],
)
Generating COCODataset(root=data/coco_test/, split=train, use_pascal_voc_cats=False) data mapping...
Loading COCODataset(root=data/coco_test/, split=train, use_pascal_voc_cats=False) takes 0.00 seconds.
Next, we can initialize and run the model on it.
faster_rcnn = FasterRCNN(num_classes=80, weights="mmdet")
faster_rcnn.eval()
dets = faster_rcnn(inputs, images_hw, original_hw=images_hw)
Load checkpoint from http path: https://download.pytorch.org/models/resnet50-0676ba61.pth
/Users/royyang/Workspace/vis4d/vis4d/common/ckpt.py:374: UserWarning: The model and loaded state dict do not match exactly
unexpected key in source state_dict: fc.weight, fc.bias
rank_zero_warn(err_msg)
Load checkpoint from http path: https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
Finally, let’s visualize the result.
print(inputs[0].shape, dets.boxes[0].shape)
imshow_bboxes(inputs[0], dets.boxes[0], dets.scores[0], dets.class_ids[0])
torch.Size([3, 800, 1248]) torch.Size([52, 4])
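The printed shapes show 52 boxes with 4 coordinates each for a single image. A common follow-up step before visualization is to keep only confident detections. The sketch below shows that filtering on plain Python data so it runs without the model. The `Detection` type, the `filter_by_score` helper, the 0.5 threshold, and all values are hypothetical and not part of Vis4D.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    box: tuple[float, float, float, float]  # assumed (x1, y1, x2, y2) layout
    score: float
    class_id: int


def filter_by_score(detections: list[Detection], threshold: float) -> list[Detection]:
    """Keep detections whose confidence score is at least `threshold`."""
    return [det for det in detections if det.score >= threshold]


# Made-up detections standing in for one image's model output.
detections = [
    Detection((10.0, 20.0, 200.0, 300.0), 0.98, 0),
    Detection((50.0, 60.0, 120.0, 180.0), 0.41, 17),
    Detection((300.0, 40.0, 640.0, 400.0), 0.73, 2),
]

confident = filter_by_score(detections, threshold=0.5)
for det in confident:
    print(det.class_id, det.score, det.box)
```

If the model outputs are torch tensors, as the printed `torch.Size` values suggest, the same idea can be expressed with a boolean mask: compute `keep = dets.scores[0] >= 0.5`, then index `dets.boxes[0][keep]`, `dets.scores[0][keep]`, and `dets.class_ids[0][keep]` before passing them to `imshow_bboxes`.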