vis4d.op.base.vgg¶
VGG networks for classification.
Classes
VGG – Wrapper for torchvision VGG.
- class VGG(vgg_name, trainable_layers=None, pretrained=False)[source]¶
Wrapper for torchvision VGG.
Initialize the VGG base model from torchvision.
- Parameters:
vgg_name (str) – name of the VGG variant. Choices in [“vgg11”, “vgg13”, “vgg16”, “vgg19”, “vgg11_bn”, “vgg13_bn”, “vgg16_bn”, “vgg19_bn”].
trainable_layers (int, optional) – Number of layers to train or fine-tune. None means all layers can be fine-tuned. Defaults to None.
pretrained (bool, optional) – Whether to load ImageNet pre-trained weights. Defaults to False.
- Raises:
ValueError – If the VGG name is not supported.
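The supported variant names and the ValueError above can be restated as a small sketch. This is a hypothetical illustration of the check, not the actual vis4d source:

```python
# Hypothetical sketch of the constructor's name validation; the real
# vis4d implementation may differ in detail.
VALID_VGG_NAMES = {
    "vgg11", "vgg13", "vgg16", "vgg19",
    "vgg11_bn", "vgg13_bn", "vgg16_bn", "vgg19_bn",
}

def check_vgg_name(vgg_name: str) -> None:
    """Raise ValueError if the VGG variant is not supported."""
    if vgg_name not in VALID_VGG_NAMES:
        raise ValueError(f"The VGG name {vgg_name!r} is not supported.")
```

Variants with the `_bn` suffix add batch normalization after each convolution, matching the torchvision naming scheme.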
- forward(images)[source]¶
VGG feature forward without classification head.
- Parameters:
images (Tensor[N, C, H, W]) – Image input to process. Expected to be of type float32, with values in the range 0..255.
- Returns:
The output feature pyramid. The list index represents the level, which has a downsampling ratio of 2^index. fp[0] and fp[1] are references to the input images. The last feature map downsamples the input image by 64.
- Return type:
fp (list[torch.Tensor])
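The 2^index downsampling relation stated above can be sketched as follows; this is an illustration of the docstring's contract, not code from vis4d:

```python
def level_hw(h: int, w: int, index: int) -> tuple[int, int]:
    """Expected (H, W) of pyramid level `index`, assuming the documented
    2**index downsampling ratio (illustrative, not the vis4d source)."""
    ratio = 2 ** index  # the deepest level's ratio is 64 == 2**6
    return h // ratio, w // ratio
```

For example, a 512x512 input yields an 8x8 feature map at the deepest level (index 6, ratio 64).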
- property out_channels: list[int]¶
Get the number of channels for each level of the feature pyramid.
- Returns:
Number of channels per feature pyramid level.
- Return type:
list[int]