vis4d.op.box.poolers.roi_pooler
Vis4D RoI Pooling module.
Classes
MultiScaleRoIAlign | RoI Align supporting multi-scale inputs.
MultiScaleRoIPool | RoI Pool supporting multi-scale inputs.
MultiScaleRoIPooler | Wrapper for roi pooling that supports multi-scale feature maps.
- class MultiScaleRoIAlign(sampling_ratio, *args, **kwargs)
RoI Align supporting multi-scale inputs.
Creates an instance of the class.
- class MultiScaleRoIPool(resolution, strides, canonical_box_size=224, canonical_level=4, aligned=True)
RoI Pool supporting multi-scale inputs.
Multi-scale version of arbitrary RoI pooling operations.
- Parameters:
resolution (tuple[int, int]) – Pooler resolution.
strides (list[int]) – Feature map strides relative to the input. The strides must be powers of 2 and a monotonically decreasing geometric sequence with a factor of 1/2.
canonical_box_size (int) – Canonical box size in pixels (sqrt(box area)). The default is heuristically defined as 224 pixels in the FPN paper (based on ImageNet pre-training).
canonical_level (int) – The feature map level index at which a canonically sized box should be placed. The default is defined as level 4 (stride=16) in the FPN paper, i.e., a box of size 224x224 will be placed on the feature map with stride=16. The placement of all other boxes is determined from their sizes relative to canonical_box_size. For example, a box whose area is 4x that of a canonical box should be used to pool features from feature level canonical_level+1.
aligned (bool) – For the roi_align op. Shift the box coordinates by -0.5 for a better alignment with the two neighboring pixel indices.
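The interaction between canonical_box_size and canonical_level can be illustrated with a minimal sketch of the FPN-style level assignment, level = floor(canonical_level + log2(sqrt(box area) / canonical_box_size)). This is not taken from the Vis4D source: the function name assign_level, the min_level and max_level arguments, the (x1, y1, x2, y2) box layout, and the small epsilon are all assumptions made for illustration.

```python
import math


def assign_level(box, min_level, max_level,
                 canonical_box_size=224, canonical_level=4):
    """Return the pyramid level index for one box (hypothetical helper).

    box is assumed to be (x1, y1, x2, y2) in input-image pixels.
    """
    x1, y1, x2, y2 = box
    # Box size is defined as sqrt(box area), as in the parameter description.
    box_size = math.sqrt(max(x2 - x1, 0.0) * max(y2 - y1, 0.0))
    # The epsilon is an assumption; it only guards against log2(0).
    level = math.floor(
        canonical_level + math.log2(box_size / canonical_box_size + 1e-8)
    )
    # Clamp to the levels that actually have a feature map.
    return min(max(level, min_level), max_level)


# A canonical 224x224 box lands on canonical_level.
level_canonical = assign_level((0, 0, 224, 224), min_level=2, max_level=5)
# A box with 4x the canonical area (448x448) lands one level higher.
level_large = assign_level((0, 0, 448, 448), min_level=2, max_level=5)
# A tiny box is clamped to the lowest available level.
level_tiny = assign_level((0, 0, 10, 10), min_level=2, max_level=5)
```

Doubling the side length of a box quadruples its area and moves it up exactly one level, which matches the canonical_level+1 example in the parameter description.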
- class MultiScaleRoIPooler(resolution, strides, canonical_box_size=224, canonical_level=4, aligned=True)
Wrapper for roi pooling that supports multi-scale feature maps.
Multi-scale version of arbitrary RoI pooling operations.
- Parameters:
resolution (tuple[int, int]) – Pooler resolution.
strides (list[int]) – Feature map strides relative to the input. The strides must be powers of 2 and a monotonically decreasing geometric sequence with a factor of 1/2.
canonical_box_size (int) – Canonical box size in pixels (sqrt(box area)). The default is heuristically defined as 224 pixels in the FPN paper (based on ImageNet pre-training).
canonical_level (int) – The feature map level index at which a canonically sized box should be placed. The default is defined as level 4 (stride=16) in the FPN paper, i.e., a box of size 224x224 will be placed on the feature map with stride=16. The placement of all other boxes is determined from their sizes relative to canonical_box_size. For example, a box whose area is 4x that of a canonical box should be used to pool features from feature level canonical_level+1.
aligned (bool) – For the roi_align op. Shift the box coordinates by -0.5 for a better alignment with the two neighboring pixel indices.
- forward(features, boxes)
Torchvision-based RoI pooling operation.
- Parameters:
features (list[Tensor]) – List of image feature tensors (e.g., FPN levels) in NCHW format.
boxes (list[Tensor]) – List of proposals (per image).
- Returns:
Pooled features in NCHW format, where N is the total number of boxes, HW is the RoI size, and C is the feature dimension. Boxes are concatenated along dimension 0 for all batch elements.
- Return type:
torch.Tensor