vis4d.op.base.csp_darknet
CSP-Darknet base network used in YOLOX.
Modified from mmdetection (https://github.com/open-mmlab/mmdetection).
Classes
| CSPDarknet | CSP-Darknet backbone used in YOLOv5 and YOLOX. |
| Focus | Focus width and height information into channel space. |
| SPPBottleneck | Spatial pyramid pooling layer used in YOLOv3-SPP. |
- class CSPDarknet(arch='P5', deepen_factor=1.0, widen_factor=1.0, out_indices=(2, 3, 4), frozen_stages=-1, arch_ovewrite=None, spp_kernal_sizes=(5, 9, 13), norm_eval=False)
CSP-Darknet backbone used in YOLOv5 and YOLOX.
- Parameters:
arch (str) – Architecture of CSP-Darknet, from {P5, P6}. Default: P5.
deepen_factor (float) – Depth multiplier; the number of blocks in each CSP layer is multiplied by this amount. Default: 1.0.
widen_factor (float) – Width multiplier; the number of channels in each layer is multiplied by this amount. Default: 1.0.
out_indices (Sequence[int]) – Indices of the stages whose outputs are returned. Default: (2, 3, 4).
frozen_stages (int) – Stages to be frozen (stop grad and set eval mode). -1 means not freezing any parameters. Default: -1.
use_depthwise (bool) – Whether to use depthwise separable convolution. Default: False.
arch_ovewrite (list[list[int]], optional) – Overwrite default arch settings. Defaults to None.
spp_kernal_sizes (Sequence[int]) – Kernel sizes of the SPP pooling layers. Default: (5, 9, 13).
norm_eval (bool) – Whether to set norm layers to eval mode, i.e. freeze their running statistics (mean and variance). Only affects batch norm and its variants. Default: False.
Example
>>> import torch
>>> from vis4d.op.base import CSPDarknet
>>> model = CSPDarknet()
>>> model.eval()
>>> inputs = torch.rand(1, 3, 416, 416)
>>> level_outputs = model.forward(inputs)
>>> for level_out in level_outputs:
...     print(tuple(level_out.shape))
...
(1, 256, 52, 52)
(1, 512, 26, 26)
(1, 1024, 13, 13)
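The shapes printed in the example are consistent with stages 2, 3 and 4 producing 256, 512 and 1024 channels at strides 8, 16 and 32. The following dependency-free sketch estimates the output shapes for other input sizes and width multipliers. The helper name `expected_output_shapes`, the two lookup tables, and the rounding rule `int(base * widen_factor)` are assumptions made for illustration; they are not part of the vis4d API, and the actual rounding may differ.

```python
# Channels and strides of stages 2, 3 and 4, read off the documented example
# (416x416 input -> 256@52x52, 512@26x26, 1024@13x13).
BASE_CHANNELS = {2: 256, 3: 512, 4: 1024}
STRIDES = {2: 8, 3: 16, 4: 32}


def expected_output_shapes(height, width, widen_factor=1.0,
                           out_indices=(2, 3, 4), batch=1):
    """Estimate the (N, C, H, W) shape of each requested stage output.

    Assumptions (not stated in the reference): channels scale as
    int(base * widen_factor), and height/width are divisible by 32.
    """
    shapes = []
    for idx in out_indices:
        channels = int(BASE_CHANNELS[idx] * widen_factor)
        stride = STRIDES[idx]
        shapes.append((batch, channels, height // stride, width // stride))
    return shapes


for shape in expected_output_shapes(416, 416):
    print(shape)
# (1, 256, 52, 52)
# (1, 512, 26, 26)
# (1, 1024, 13, 13)
```

Under the same assumptions, `expected_output_shapes(640, 640, widen_factor=0.5)` gives 128, 256 and 512 channels at 80x80, 40x40 and 20x20.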
Init.
- forward(images)
Forward pass.
- Parameters:
images (torch.Tensor) – Input images.
- Return type:
list[Tensor]
- class Focus(in_channels, out_channels, kernel_size=1, stride=1)
Focus width and height information into channel space.
- Parameters:
in_channels (int) – The input channels of this Module.
out_channels (int) – The output channels of this Module.
kernel_size (int, optional) – The kernel size of the convolution. Defaults to 1.
stride (int, optional) – The stride of the convolution. Defaults to 1.
Init.
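To make "focusing width and height information into channel space" concrete, here is a minimal sketch of the rearrangement on plain nested lists. It shows only the pixel shuffle, not the convolution that presumably follows it. The function name `focus_rearrange` is hypothetical, and the patch order (top-left, bottom-left, top-right, bottom-right) is an assumption based on common YOLOX implementations rather than something this reference states.

```python
def focus_rearrange(image):
    """Turn a [C][H][W] nested list into a [4*C][H/2][W/2] nested list.

    Every 2x2 pixel block contributes one pixel to each of four
    half-resolution patches, which are then stacked along channels.
    Assumed patch order: top-left, bottom-left, top-right, bottom-right.
    """
    height, width = len(image[0]), len(image[0][0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")
    offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]  # (row offset, column offset)
    out = []
    for row_off, col_off in offsets:
        for channel in image:
            # Take every second row and column, starting at the offset.
            out.append([row[col_off::2] for row in channel[row_off::2]])
    return out


# One channel of 4x4 pixels holding 0..15 in row-major order.
image = [[[r * 4 + c for c in range(4)] for r in range(4)]]
patches = focus_rearrange(image)
print(len(patches), len(patches[0]), len(patches[0][0]))  # 4 2 2
print(patches[0])  # [[0, 2], [8, 10]]
```

No pixel is discarded: the spatial resolution halves while the channel count quadruples, so a convolution applied afterwards would see `4 * in_channels` input channels.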
- class SPPBottleneck(in_channels, out_channels, kernel_sizes=(5, 9, 13))
Spatial pyramid pooling layer used in YOLOv3-SPP.
- Parameters:
in_channels (int) – Input channels.
out_channels (int) – Output channels.
kernel_sizes (Sequence[int], optional) – Kernel sizes of the pooling layers. Defaults to (5, 9, 13).
Init.
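As a rough illustration of why several pooling kernels can be combined, the sketch below applies the standard pooling output-size formula to each default kernel size. It assumes stride 1 and padding `kernel_size // 2`, and that the pooled maps are concatenated with the unpooled input along the channel axis; these details follow common SPP implementations and are not confirmed by this reference. The helper names and the `mid_channels` argument are hypothetical.

```python
def pooled_size(size, kernel_size, stride=1, padding=None):
    """Output length of a pooling window along one spatial axis."""
    if padding is None:
        padding = kernel_size // 2  # assumed "same" padding for odd kernels
    return (size + 2 * padding - kernel_size) // stride + 1


def spp_concat_channels(mid_channels, kernel_sizes=(5, 9, 13)):
    """Channels after concatenating the input with one pooled copy per kernel."""
    return mid_channels * (len(kernel_sizes) + 1)


for k in (5, 9, 13):
    print(k, pooled_size(13, k))  # every kernel keeps a 13-pixel axis at 13
print(spp_concat_channels(512))  # 2048
```

Because each pooled map keeps the input's spatial size under these assumptions, the maps can be stacked along channels, and the final convolution would then reduce the concatenated channels to `out_channels`.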