vis4d.op.detect3d.bevformer.bevformer
BEVFormer head.
Functions
bbox3d2result | Convert BEVFormer detection results to Detect3DOut.
Classes
BEVFormerHead | BEVFormer 3D detection head.
- class BEVFormerHead(num_classes=10, embed_dims=256, num_query=900, transformer=None, num_reg_fcs=2, num_cls_fcs=2, point_cloud_range=(-51.2, -51.2, -5.0, 51.2, 51.2, 3.0), bev_h=200, bev_w=200)
BEVFormer 3D detection head.
Initialize BEVFormerHead.
- Parameters:
num_classes (int, optional) – Number of classes. Defaults to 10.
embed_dims (int, optional) – Embedding dimensions. Defaults to 256.
num_query (int, optional) – Number of queries. Defaults to 900.
transformer (PerceptionTransformer, optional) – Transformer. Defaults to None. If None, a default transformer will be created.
num_reg_fcs (int, optional) – Number of fully connected layers in regression branch. Defaults to 2.
num_cls_fcs (int, optional) – Number of fully connected layers in classification branch. Defaults to 2.
point_cloud_range (Sequence[float], optional) – Point cloud range. Defaults to (-51.2, -51.2, -5.0, 51.2, 51.2, 3.0).
bev_h (int, optional) – BEV height. Defaults to 200.
bev_w (int, optional) – BEV width. Defaults to 200.
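The defaults for point_cloud_range, bev_h and bev_w together imply a BEV grid resolution. The sketch below works that arithmetic out in plain Python. It is not part of vis4d: the helper name bev_cell_size is hypothetical, and it assumes the grid covers the x/y extent of point_cloud_range uniformly, with bev_w cells along x and bev_h cells along y (the source does not state which axis maps to which).

```python
from typing import Sequence, Tuple


def bev_cell_size(
    point_cloud_range: Sequence[float], bev_h: int, bev_w: int
) -> Tuple[float, float]:
    """Return the (x, y) extent covered by one BEV cell.

    Assumption: the grid spans the x/y extent of point_cloud_range uniformly,
    with bev_w cells along x and bev_h cells along y.
    """
    x_min, y_min, _z_min, x_max, y_max, _z_max = point_cloud_range
    return (x_max - x_min) / bev_w, (y_max - y_min) / bev_h


# Defaults taken from the BEVFormerHead signature.
cell_x, cell_y = bev_cell_size((-51.2, -51.2, -5.0, 51.2, 51.2, 3.0), 200, 200)
print(f"{cell_x:.3f} x {cell_y:.3f} per cell")
```

Under these assumptions the default configuration gives cells of about 0.512 by 0.512, in the same units as point_cloud_range.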
- __call__(mlvl_feats, can_bus, images_hw, cam_intrinsics, cam_extrinsics, lidar_extrinsics, prev_bev=None)
Type definition.
- Return type:
tuple[Detect3DOut, Tensor]
- forward(mlvl_feats, can_bus, images_hw, cam_intrinsics, cam_extrinsics, lidar_extrinsics, prev_bev=None)
Forward function.
- Parameters:
mlvl_feats (list[Tensor]) – Multi-level features from the upstream network, each with shape (B, N, C, H, W).
can_bus (Tensor) – CAN bus data, with shape (B, 18).
images_hw (tuple[int, int]) – Image height and width.
cam_intrinsics (list[Tensor]) – Camera intrinsics.
cam_extrinsics (list[Tensor]) – Camera extrinsics.
lidar_extrinsics (list[Tensor]) – LiDAR extrinsics.
prev_bev (Tensor, optional) – Previous BEV feature map, with shape (B, C, H, W). Defaults to None.
- Returns:
Detection results and BEV feature map.
- Return type:
tuple[Detect3DOut, Tensor]
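Because forward returns a BEV feature map and also accepts one as prev_bev, a natural usage pattern is to thread the map through consecutive frames. The following is a minimal sketch of that loop, not vis4d code: run_sequence and the frame dictionary keys are hypothetical, and it assumes the returned BEV map can be passed back as prev_bev without reshaping, which the source does not confirm.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional


def run_sequence(
    head: Callable[..., tuple], frames: Iterable[Dict[str, Any]]
) -> List[Any]:
    """Run a head over consecutive frames, threading the BEV map through.

    `head` is any callable with the argument order documented for forward.
    """
    prev_bev: Optional[Any] = None  # no history is available for the first frame
    detections = []
    for frame in frames:
        dets, bev = head(
            frame["mlvl_feats"],
            frame["can_bus"],
            frame["images_hw"],
            frame["cam_intrinsics"],
            frame["cam_extrinsics"],
            frame["lidar_extrinsics"],
            prev_bev=prev_bev,
        )
        detections.append(dets)
        prev_bev = bev  # assumption: reusable as-is on the next frame
    return detections
```

Here head would be a BEVFormerHead instance. At the start of a new sequence prev_bev should presumably be reset to None, which calling run_sequence once per sequence does implicitly.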
- bbox3d2result(bbox_list, lidar2global)
Convert BEVFormer detection results to Detect3DOut.
- Parameters:
bbox_list (list[tuple[Tensor, Tensor, Tensor]]) – List of bounding boxes, scores and labels.
lidar2global (Tensor) – LiDAR-to-global transformation, with shape (B, 4, 4).
- Returns:
Detection results.
- Return type: