vis4d.op.base.vgg

VGG networks for classification.

Classes

VGG(vgg_name[, trainable_layers, pretrained])

Wrapper for torch vision VGG.

class VGG(vgg_name, trainable_layers=None, pretrained=False)[source]

Wrapper for torch vision VGG.

Initialize the VGG base model from torchvision.

Parameters:
  • vgg_name (str) – name of the VGG variant. Choices in [“vgg11”, “vgg13”, “vgg16”, “vgg19”, “vgg11_bn”, “vgg13_bn”, “vgg16_bn”, “vgg19_bn”].

  • trainable_layers (int, optional) – Number of layers to train or fine-tune. None means all layers can be fine-tuned.

  • pretrained (bool, optional) – Whether to load ImageNet pre-trained weights. Defaults to False.

Raises:

ValueError – The VGG name is not supported.
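As a sketch, the supported-name check described above can be illustrated in plain Python. The list of variants is taken from the parameter description; the helper name and the exact error message are hypothetical, not vis4d's actual code:

```python
# Supported variants, as listed in the vgg_name parameter description.
VGG_VARIANTS = [
    "vgg11", "vgg13", "vgg16", "vgg19",
    "vgg11_bn", "vgg13_bn", "vgg16_bn", "vgg19_bn",
]


def check_vgg_name(vgg_name: str) -> str:
    """Hypothetical helper mirroring the documented ValueError behavior."""
    if vgg_name not in VGG_VARIANTS:
        raise ValueError(f"VGG name {vgg_name!r} is not supported")
    return vgg_name
```

The `_bn` suffix selects the batch-normalized variants provided by torchvision.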

forward(images)[source]

VGG feature forward without classification head.

Parameters:

images (Tensor[N, C, H, W]) – Image input to process. Expected to be of type float32 with values in the range 0..255.

Returns:

The output feature pyramid. The list index represents the level, which has a downsampling ratio of 2^index. fp[0] and fp[1] are references to the input images. The last feature map downsamples the input image by 64.

Return type:

fp (list[torch.Tensor])
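The level-to-resolution relation stated above can be sketched in plain Python. The helper below is hypothetical and only restates the documented rule: level i has a downsampling ratio of 2^i, fp[0] and fp[1] alias the full-resolution input, and the last level downsamples by 64:

```python
def pyramid_sizes(h: int, w: int, num_levels: int = 7) -> list[tuple[int, int]]:
    """Hypothetical sketch of the spatial size at each pyramid level.

    fp[0] and fp[1] keep the input resolution; from level 2 onward the
    documented ratio 2**index applies, ending at 2**6 == 64.
    """
    sizes = []
    for i in range(num_levels):
        ratio = 1 if i < 2 else 2 ** i
        sizes.append((h // ratio, w // ratio))
    return sizes
```

For a 512x512 input this yields (8, 8) at the last level, matching the stated 64x downsampling.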

property out_channels: list[int]

Get the number of channels for each level of the feature pyramid.

Returns:

Number of channels per feature pyramid level.

Return type:

list[int]
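For intuition, the per-level channel counts can be reconstructed from torchvision's vgg16 layer configuration. This is a sketch under two assumptions that are not confirmed by this page: that one pyramid level is emitted per max-pool ("M") stage, and that the first two levels alias the 3-channel input (as the forward() docs suggest):

```python
# torchvision's "D" configuration, used to build vgg16:
# numbers are conv output widths, "M" marks a max-pool stage boundary.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def pyramid_channels(cfg: list, in_channels: int = 3) -> list[int]:
    """Hypothetical reconstruction of out_channels from a VGG config.

    The first two entries stand for fp[0]/fp[1], which alias the input
    images; each "M" then contributes the last conv width of its stage.
    """
    channels = [in_channels, in_channels]
    last = in_channels
    for v in cfg:
        if v == "M":
            channels.append(last)
        else:
            last = v
    return channels
```

Under these assumptions, vgg16 would yield [3, 3, 64, 128, 256, 512, 512]; consult the vis4d source for the authoritative values.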