vis4d.op.base.csp_darknet

CSP-Darknet base network used in YOLOX.

Modified from mmdetection (https://github.com/open-mmlab/mmdetection).

Classes

`CSPDarknet`([arch, deepen_factor, ...])	CSP-Darknet backbone used in YOLOv5 and YOLOX.
`Focus`(in_channels, out_channels[, ...])	Focus width and height information into channel space.
`SPPBottleneck`(in_channels, out_channels[, ...])	Spatial pyramid pooling layer used in YOLOv3-SPP.

class CSPDarknet(arch='P5', deepen_factor=1.0, widen_factor=1.0, out_indices=(2, 3, 4), frozen_stages=-1, arch_ovewrite=None, spp_kernal_sizes=(5, 9, 13), norm_eval=False)

CSP-Darknet backbone used in YOLOv5 and YOLOX.

Parameters:

arch (str) – Architecture of CSP-Darknet, from {P5, P6}. Default: P5.
deepen_factor (float) – Depth multiplier, multiply number of blocks in CSP layer by this amount. Default: 1.0.
widen_factor (float) – Width multiplier, multiply number of channels in each layer by this amount. Default: 1.0.
out_indices (Sequence[int]) – Output from which stages. Default: (2, 3, 4).
frozen_stages (int) – Stages to be frozen (stop grad and set eval mode). -1 means not freezing any parameters. Default: -1.
use_depthwise (bool) – Whether to use depthwise separable convolution. Default: False.
arch_ovewrite (list[list[int]], optional) – Overwrite default arch settings. Defaults to None.
spp_kernal_sizes (Sequence[int]) – Sequence of kernel sizes of the SPP layers. Default: (5, 9, 13).
norm_eval (bool) – Whether to set norm layers to eval mode, i.e. freeze their running statistics (mean and variance). Only affects Batch Norm and its variants. Default: False.
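The effect of deepen_factor and widen_factor can be illustrated with a short sketch. The rounding rules below (truncating channel counts, rounding block counts with a minimum of one) are an assumption modeled on the mmdetection code this module is modified from, and the base stage setting is hypothetical; neither is confirmed by this page.

```python
def scale_stage(in_channels, out_channels, num_blocks,
                widen_factor=1.0, deepen_factor=1.0):
    """Apply width and depth multipliers to one stage setting.

    Assumed rounding: channels are truncated to int, block counts are
    rounded and clamped to at least one block.
    """
    scaled_in = int(in_channels * widen_factor)
    scaled_out = int(out_channels * widen_factor)
    scaled_blocks = max(round(num_blocks * deepen_factor), 1)
    return scaled_in, scaled_out, scaled_blocks


# Hypothetical base setting: (in_channels, out_channels, num_blocks).
base_stage = (256, 512, 9)

print(scale_stage(*base_stage))
print(scale_stage(*base_stage, widen_factor=0.5, deepen_factor=0.33))
```

Under these assumptions, halving the width halves both channel counts, and a depth multiplier of 0.33 reduces nine blocks to three.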

Example

>>> import torch
>>> from vis4d.op.base import CSPDarknet
>>> model = CSPDarknet().eval()
>>> inputs = torch.rand(1, 3, 416, 416)
>>> level_outputs = model(inputs)
>>> for level_out in level_outputs:
...     print(tuple(level_out.shape))
...
(1, 256, 52, 52)
(1, 512, 26, 26)
(1, 1024, 13, 13)
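The shapes printed above follow from the total stride of each output stage. The helper below is a sketch that reproduces this arithmetic without building the network. The (channels, stride) table is read off the documented example only, and the function assumes input sizes divisible by 32 and channel counts that scale with widen_factor by truncation; the helper name is hypothetical and not part of vis4d.

```python
# (channels at widen_factor=1.0, total stride) per output index, taken from
# the documented 416x416 example. Other indices are deliberately omitted.
STAGE_INFO = {2: (256, 8), 3: (512, 16), 4: (1024, 32)}


def expected_shapes(batch, height, width, widen_factor=1.0,
                    out_indices=(2, 3, 4)):
    """Return the expected (B, C, H, W) shape for each requested stage."""
    shapes = []
    for idx in out_indices:
        channels, stride = STAGE_INFO[idx]
        shapes.append(
            (batch, int(channels * widen_factor),
             height // stride, width // stride)
        )
    return shapes


for shape in expected_shapes(1, 416, 416):
    print(shape)
```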

Init.

forward(images)

Forward pass.

Parameters: images (torch.Tensor) – Input images.
Return type: list[Tensor]

train(mode=True)

Override the train mode for the model.

Parameters: mode (bool) – Whether to set training mode to True.
Return type: CSPDarknet
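How an overridden train() can honor frozen_stages and norm_eval is sketched below in plain PyTorch. TinyBackbone is a hypothetical two-stage stand-in, not the vis4d class, and the freezing logic (stages 0 through frozen_stages are put in eval mode with gradients disabled) is an assumption modeled on the mmdetection implementation.

```python
import torch
from torch import nn


class TinyBackbone(nn.Module):
    """Hypothetical two-stage model used only to illustrate the flags."""

    def __init__(self, frozen_stages=-1, norm_eval=False):
        super().__init__()
        self.frozen_stages = frozen_stages
        self.norm_eval = norm_eval
        self.stages = nn.ModuleList([
            nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8)),
            nn.Sequential(nn.Conv2d(8, 8, 3, padding=1), nn.BatchNorm2d(8)),
        ])

    def forward(self, x):
        for stage in self.stages:
            x = stage(x)
        return x

    def train(self, mode=True):
        super().train(mode)
        # Frozen stages stay in eval mode and receive no gradients.
        for i in range(self.frozen_stages + 1):
            self.stages[i].eval()
            for param in self.stages[i].parameters():
                param.requires_grad = False
        # With norm_eval, batch norm layers keep their running statistics.
        if mode and self.norm_eval:
            for module in self.modules():
                if isinstance(module, nn.modules.batchnorm._BatchNorm):
                    module.eval()
        return self


model = TinyBackbone(frozen_stages=0, norm_eval=True).train()
print(model.stages[0].training, model.stages[1][1].training)
```

Because train() returns the module itself, calls can be chained, which matches the CSPDarknet return type documented above.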

class Focus(in_channels, out_channels, kernel_size=1, stride=1)

Focus width and height information into channel space.

Parameters:

in_channels (int) – The input channels of this Module.
out_channels (int) – The output channels of this Module.
kernel_size (int, optional) – The kernel size of the convolution. Defaults to 1.
stride (int, optional) – The stride of the convolution. Defaults to 1.

Init.

forward(features)

Forward pass.

Parameters: features (torch.Tensor) – The input tensor of shape [B, C, W, H].
Return type: Tensor
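"Focusing" spatial information into channel space amounts to a space-to-depth rearrangement: every second pixel along each spatial axis is gathered into its own patch, and the four patches are stacked along the channel axis. The sketch below shows only this slicing step, assuming even spatial dimensions. The patch order is an assumption modeled on mmdetection, and the convolution that the module applies afterwards is omitted.

```python
import torch


def focus_slices(x):
    """Rearrange [B, C, S1, S2] into [B, 4C, S1/2, S2/2] by interleaving."""
    top_left = x[..., ::2, ::2]
    top_right = x[..., ::2, 1::2]
    bot_left = x[..., 1::2, ::2]
    bot_right = x[..., 1::2, 1::2]
    # Assumed concatenation order; the real module may order patches differently.
    return torch.cat((top_left, bot_left, top_right, bot_right), dim=1)


x = torch.arange(16.0).reshape(1, 1, 4, 4)
out = focus_slices(x)
print(tuple(out.shape))
```

No values are discarded: the output holds the same 16 numbers as the input, with both spatial dimensions halved and the channel count quadrupled.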

class SPPBottleneck(in_channels, out_channels, kernel_sizes=(5, 9, 13))

Spatial pyramid pooling layer used in YOLOv3-SPP.

Parameters:

in_channels (int) – Input channels.
out_channels (int) – Output channels.
kernel_sizes (Sequence[int], optional) – Sequence of kernel sizes of the pooling layers. Defaults to (5, 9, 13).

Init.

forward(features)

Forward pass.

Parameters: features (torch.Tensor) – Input features.
Return type: Tensor
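The idea behind spatial pyramid pooling can be sketched as follows: the input is pooled with several kernel sizes at stride 1, with padding chosen so the spatial size is preserved, and the results are concatenated with the unpooled features. SPPSketch is a hypothetical simplification, not the vis4d implementation. It uses plain 1x1 convolutions without normalization or activation, and the halving of channels before pooling is an assumption modeled on mmdetection.

```python
import torch
from torch import nn


class SPPSketch(nn.Module):
    """Simplified spatial pyramid pooling block (illustrative only)."""

    def __init__(self, in_channels, out_channels, kernel_sizes=(5, 9, 13)):
        super().__init__()
        mid_channels = in_channels // 2  # assumed channel reduction
        self.reduce = nn.Conv2d(in_channels, mid_channels, kernel_size=1)
        # stride=1 with padding=k // 2 keeps the spatial size for odd k.
        self.pools = nn.ModuleList(
            [nn.MaxPool2d(k, stride=1, padding=k // 2) for k in kernel_sizes]
        )
        fused_channels = mid_channels * (len(kernel_sizes) + 1)
        self.fuse = nn.Conv2d(fused_channels, out_channels, kernel_size=1)

    def forward(self, features):
        x = self.reduce(features)
        # Concatenate the unpooled features with each pooled version.
        x = torch.cat([x] + [pool(x) for pool in self.pools], dim=1)
        return self.fuse(x)


spp = SPPSketch(64, 32)
print(tuple(spp(torch.rand(2, 64, 13, 13)).shape))
```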