vis4d.model.motion.velo_lstm

VeloLSTM 3D motion model.

Functions

init_lstm_module(layer)

Initialize LSTM weights and biases.

Classes

VeloLSTM([num_frames, feature_dim, ...])

Estimates object locations in world coordinates.

VeloLSTMOut(loc_preds, loc_refines)

VeloLSTM output.

class VeloLSTM(num_frames=5, feature_dim=64, hidden_size=128, num_layers=2, loc_dim=7, dropout=0.1, weights=None)[source]

Estimates object locations in world coordinates.

Prediction LSTM:

Input: velocities from the previous 5 frames
Output: location at the next frame

Updating LSTM:

Input: predicted location and observed location
Output: refined location

Init.

forward(pred_traj)[source]

Forward pass of VeloLSTM in the training stage.

Return type:

VeloLSTMOut

init_hidden(device, batch_size=1)[source]

Initialize hidden state.

The axes semantics are (num_layers, minibatch_size, hidden_dim)

Return type:

tuple[Tensor, Tensor]
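
A minimal sketch of what a zero-initialized hidden state with these axis semantics looks like (a standalone function with assumed parameters, not the Vis4D method itself, which reads the sizes from the module):

```python
import torch


def init_hidden(
    num_layers: int,
    hidden_size: int,
    device: torch.device,
    batch_size: int = 1,
) -> tuple[torch.Tensor, torch.Tensor]:
    """Return a zero (hidden, cell) pair.

    Axes are (num_layers, minibatch_size, hidden_dim), as expected
    by torch.nn.LSTM.
    """
    shape = (num_layers, batch_size, hidden_size)
    return (
        torch.zeros(*shape, device=device),
        torch.zeros(*shape, device=device),
    )


h0, c0 = init_hidden(2, 128, torch.device("cpu"), batch_size=4)
```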

predict(vel_history, location, hc_0)[source]

Predict location at t+1 using updated location at t.

Return type:

tuple[Tensor, tuple[Tensor, Tensor]]

Input:

vel_history: (num_seq, num_batch, loc_dim), velocities from the previous num_seq updates
location: (num_batch, loc_dim), location from the previous update
hc_0: (num_layers, num_batch, hidden_size), tuple of hidden and cell states

Middle:

embed: (num_seq, num_batch, feature_dim), location features
out: (num_seq, num_batch, hidden_size), LSTM output
attention_logit: (num_seq, num_batch, loc_dim), the predicted residual

Output:

hc_n: (num_layers, num_batch, hidden_size), tuple of updated hidden and cell states
output_pred: (num_batch, loc_dim), predicted location
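
The shape flow above can be sketched with standard PyTorch modules (a sketch only: the embedding layers, how the per-step residuals are pooled, and the residual update onto the location are assumptions, not the Vis4D implementation):

```python
import torch
import torch.nn as nn

# Hypothetical dimensions matching the docstring shapes.
num_seq, num_batch = 5, 3
loc_dim, feature_dim, hidden_size, num_layers = 7, 64, 128, 2

vel2feat = nn.Linear(loc_dim, feature_dim)      # velocity -> location feature
lstm = nn.LSTM(feature_dim, hidden_size, num_layers)
residual_head = nn.Linear(hidden_size, loc_dim)  # hidden -> residual logits

vel_history = torch.randn(num_seq, num_batch, loc_dim)
location = torch.randn(num_batch, loc_dim)
hc_0 = (
    torch.zeros(num_layers, num_batch, hidden_size),
    torch.zeros(num_layers, num_batch, hidden_size),
)

embed = vel2feat(vel_history)         # (num_seq, num_batch, feature_dim)
out, hc_n = lstm(embed, hc_0)         # (num_seq, num_batch, hidden_size)
attention_logit = residual_head(out)  # (num_seq, num_batch, loc_dim)
# Pool the per-step residuals and apply them to the current location
# (mean pooling is an assumption; the real model may weight the steps).
output_pred = location + attention_logit.mean(dim=0)  # (num_batch, loc_dim)
```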

refine(location, observation, prev_location, confidence, hc_0)[source]

Refine predicted location using single frame estimation at t+1.

Return type:

tuple[Tensor, tuple[Tensor, Tensor]]

Input:

location: (num_batch, loc_dim), location from prediction
observation: (num_batch, loc_dim), location from single-frame estimation
prev_location: (num_batch, loc_dim), previously refined location
confidence: (num_batch, 1), depth estimation confidence
hc_0: (num_layers, num_batch, hidden_size), tuple of hidden and cell states

Middle:

loc_embed: (1, num_batch, feature_dim), predicted-location feature
obs_embed: (1, num_batch, feature_dim), single-frame location feature
conf_embed: (1, num_batch, feature_dim), depth-confidence feature
embed: (1, num_batch, 2*feature_dim), combined location feature
out: (1, num_batch, hidden_size), LSTM output

Output:

hc_n: (num_layers, num_batch, hidden_size), tuple of updated hidden and cell states
output_pred: (num_batch, loc_dim), predicted location
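
The refinement step can likewise be sketched end to end (assumed details: separate linear embeddings for each input, the confidence feature used as a multiplicative gate on the observation feature, and a residual update onto prev_location; none of this is the exact Vis4D code):

```python
import torch
import torch.nn as nn

num_batch = 3
loc_dim, feature_dim, hidden_size, num_layers = 7, 64, 128, 2

loc2feat = nn.Linear(loc_dim, feature_dim)
obs2feat = nn.Linear(loc_dim, feature_dim)
conf2feat = nn.Linear(1, feature_dim)
lstm = nn.LSTM(2 * feature_dim, hidden_size, num_layers)
refine_head = nn.Linear(hidden_size, loc_dim)

location = torch.randn(num_batch, loc_dim)      # from prediction
observation = torch.randn(num_batch, loc_dim)   # single-frame estimate
prev_location = torch.randn(num_batch, loc_dim)
confidence = torch.rand(num_batch, 1)           # depth confidence in [0, 1]
hc_0 = (
    torch.zeros(num_layers, num_batch, hidden_size),
    torch.zeros(num_layers, num_batch, hidden_size),
)

loc_embed = loc2feat(location).unsqueeze(0)      # (1, num_batch, feature_dim)
obs_embed = obs2feat(observation).unsqueeze(0)   # (1, num_batch, feature_dim)
conf_embed = conf2feat(confidence).unsqueeze(0)  # (1, num_batch, feature_dim)
# Gate the observation feature by confidence, then fuse with the prediction
# feature (the gating scheme here is an assumption).
embed = torch.cat([loc_embed, obs_embed * conf_embed], dim=-1)
out, hc_n = lstm(embed, hc_0)                    # (1, num_batch, hidden_size)
output_pred = prev_location + refine_head(out[0])  # (num_batch, loc_dim)
```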

class VeloLSTMOut(loc_preds: Tensor, loc_refines: Tensor)[source]

VeloLSTM output.

Create a new instance of VeloLSTMOut(loc_preds, loc_refines).

loc_preds: Tensor

Alias for field number 0

loc_refines: Tensor

Alias for field number 1
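
"Alias for field number N" is the standard NamedTuple wording: each named attribute is the same object as the corresponding tuple position. A minimal equivalent declaration (tensor shapes in the example are illustrative only):

```python
from typing import NamedTuple

import torch
from torch import Tensor


class VeloLSTMOut(NamedTuple):
    """VeloLSTM output: predicted and refined locations."""

    loc_preds: Tensor    # field number 0
    loc_refines: Tensor  # field number 1


out = VeloLSTMOut(
    loc_preds=torch.zeros(4, 7),
    loc_refines=torch.ones(4, 7),
)
# Named access and positional access return the same objects.
assert out[0] is out.loc_preds
assert out[1] is out.loc_refines
```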

init_lstm_module(layer)[source]

Initialize LSTM weights and biases.

Return type:

None
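
One common way to implement such an initializer is Xavier-uniform input weights, orthogonal recurrent weights, and zero biases; the scheme below is that generic pattern, not necessarily the one Vis4D uses:

```python
import torch.nn as nn


def init_lstm_module(layer: nn.LSTM) -> None:
    """Initialize LSTM weights and biases (one common scheme)."""
    for name, param in layer.named_parameters():
        if "weight_ih" in name:
            # input-to-hidden weights
            nn.init.xavier_uniform_(param)
        elif "weight_hh" in name:
            # hidden-to-hidden (recurrent) weights
            nn.init.orthogonal_(param)
        elif "bias" in name:
            nn.init.zeros_(param)


lstm = nn.LSTM(input_size=64, hidden_size=128, num_layers=2)
init_lstm_module(lstm)
```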