vis4d.op.layer.attention¶
Attention layer.
Classes
- Attention – ViT Attention Layer.
- MultiheadAttention – A wrapper for torch.nn.MultiheadAttention.
- class Attention(dim, num_heads=8, qkv_bias=False, attn_drop=0.0, proj_drop=0.0)[source]¶
ViT Attention Layer.
Modified from timm (https://github.com/huggingface/pytorch-image-models).
Init attention layer.
- Parameters:
dim (int) – Input tensor’s dimension.
num_heads (int, optional) – Number of attention heads. Defaults to 8.
qkv_bias (bool, optional) – Whether to add a bias to the qkv projection. Defaults to False.
attn_drop (float, optional) – Dropout rate for attention. Defaults to 0.0.
proj_drop (float, optional) – Dropout rate for projection. Defaults to 0.0.
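Example (a minimal sketch; the forward pass is assumed to take a token tensor of shape (batch, num_tokens, dim) and return a tensor of the same shape, following the timm implementation this layer is modified from):

>>> import torch
>>> from vis4d.op.layer.attention import Attention
>>> attn = Attention(dim=256, num_heads=8, qkv_bias=True)
>>> tokens = torch.rand(2, 197, 256)  # e.g. 196 patch tokens + 1 class token
>>> out = attn(tokens)
>>> out.shape
torch.Size([2, 197, 256])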
- class MultiheadAttention(embed_dims, num_heads, attn_drop=0.0, proj_drop=0.0, dropout_layer=None, batch_first=False, **kwargs)[source]¶
A wrapper for torch.nn.MultiheadAttention.
This module implements multi-head attention with an identity connection, and positional encodings are also passed as input.
Init MultiheadAttention.
- Parameters:
embed_dims (int) – The embedding dimension.
num_heads (int) – Parallel attention heads.
attn_drop (float) – Dropout rate applied to attn_output_weights. Defaults to 0.0.
proj_drop (float) – Dropout rate applied after nn.MultiheadAttention. Defaults to 0.0.
dropout_layer (nn.Module | None, optional) – The dropout layer used when adding the shortcut. Defaults to None.
batch_first (bool) – If True, key, query and value have shape (batch, n, embed_dim); otherwise (n, batch, embed_dim). Defaults to False.
- forward(query, key=None, value=None, identity=None, query_pos=None, key_pos=None, attn_mask=None, key_padding_mask=None)[source]¶
Forward function for MultiheadAttention.
**kwargs allows passing a more general data flow when combining this module with other operations in a transformer layer.
- Parameters:
query (Tensor) – The input query with shape [num_queries, bs, embed_dims] if self.batch_first is False, else [bs, num_queries, embed_dims].
key (Tensor) – The key tensor with shape [num_keys, bs, embed_dims] if self.batch_first is False, else [bs, num_keys, embed_dims]. If None, the query will be used. Defaults to None.
value (Tensor) – The value tensor with the same shape as key. Same as in nn.MultiheadAttention.forward. If None, the key will be used. Defaults to None.
identity (Tensor) – This tensor, with the same shape as query, will be used for the identity link. If None, query will be used. Defaults to None.
query_pos (Tensor) – The positional encoding for query, with the same shape as query. If not None, it will be added to query before attention. Defaults to None.
key_pos (Tensor) – The positional encoding for key, with the same shape as key. If not None, it will be added to key before attention. If None and query_pos has the same shape as key, query_pos will be used for key_pos. Defaults to None.
attn_mask (Tensor) – ByteTensor mask with shape [num_queries, num_keys]. Same as in nn.MultiheadAttention.forward. Defaults to None.
key_padding_mask (Tensor) – ByteTensor with shape [bs, num_keys]. Defaults to None.
- Returns:
Forwarded results with shape [num_queries, bs, embed_dims] if self.batch_first is False, else [bs, num_queries, embed_dims].
- Return type:
Tensor
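Example (a minimal self-attention sketch; shapes follow the parameter descriptions above with batch_first=True, and key/value default to the query when left as None):

>>> import torch
>>> from vis4d.op.layer.attention import MultiheadAttention
>>> self_attn = MultiheadAttention(embed_dims=256, num_heads=8, batch_first=True)
>>> query = torch.rand(2, 100, 256)      # [bs, num_queries, embed_dims]
>>> query_pos = torch.rand(2, 100, 256)  # positional encoding for query
>>> out = self_attn(query=query, query_pos=query_pos)  # identity shortcut added internally
>>> out.shape
torch.Size([2, 100, 256])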