vis4d.model.motion.velo_lstm¶
VeloLSTM 3D motion model.
Functions
- Initialize LSTM weights and biases.

Classes
- VeloLSTM: Estimating object location in world coordinates.
- VeloLSTM output.
- class VeloLSTM(num_frames=5, feature_dim=64, hidden_size=128, num_layers=2, loc_dim=7, dropout=0.1, weights=None)[source]¶
Estimating object location in world coordinates.
- Prediction LSTM:
  Input: velocity over the previous 5 frames. Output: location at the next frame.
- Updating LSTM:
  Input: predicted location and observed location. Output: refined location.
Init.
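A minimal construction sketch based on the signature above; the import path mirrors the module name, and all argument values are the documented defaults:

```python
from vis4d.model.motion.velo_lstm import VeloLSTM

# Documented defaults: a 2-layer LSTM consuming 5 frames of motion
# history with a 7-dimensional (loc_dim) location state.
model = VeloLSTM(
    num_frames=5,
    feature_dim=64,
    hidden_size=128,
    num_layers=2,
    loc_dim=7,
    dropout=0.1,
)
```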
Initialize hidden state.
The axes semantics are (num_layers, minibatch_size, hidden_dim).
- Return type:
  tuple[Tensor, Tensor]
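The extracted page drops this method's name, but its documented return is a (hidden, cell) pair with the axis layout above. A hand-rolled equivalent, assuming an illustrative batch of 8 and the default num_layers=2 and hidden_size=128, would be:

```python
import torch

num_layers, num_batch, hidden_size = 2, 8, 128

# Zero-initialized (hidden, cell) pair with axes
# (num_layers, minibatch_size, hidden_dim), i.e. tuple[Tensor, Tensor].
hc_0 = (
    torch.zeros(num_layers, num_batch, hidden_size),
    torch.zeros(num_layers, num_batch, hidden_size),
)
```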
- predict(vel_history, location, hc_0)[source]¶
Predict location at t+1 using updated location at t.
- Return type:
  tuple[Tensor, tuple[Tensor, Tensor]]
- Input:
  - vel_history: (num_seq, num_batch, loc_dim), velocity from the previous num_seq updates
  - location: (num_batch, loc_dim), location from the previous update
  - hc_0: (num_layers, num_batch, hidden_size), tuple of hidden and cell states
- Middle:
  - embed: (num_seq, num_batch, feature_dim), location feature
  - out: (num_seq, num_batch, hidden_size), LSTM output
  - attention_logit: (num_seq, num_batch, loc_dim), the predicted residual
- Output:
  - hc_n: (num_layers, num_batch, hidden_size), tuple of updated hidden and cell states
  - output_pred: (num_batch, loc_dim), predicted location
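A hedged usage sketch for predict(), with dummy zero tensors shaped per the Input block above; the batch size of 8 is illustrative, and num_seq matches the default num_frames=5. The unpacking order follows the documented return type: the predicted location first, then the updated (hidden, cell) tuple.

```python
import torch

from vis4d.model.motion.velo_lstm import VeloLSTM

model = VeloLSTM()  # documented defaults: num_frames=5, num_layers=2, ...

num_seq, num_batch = 5, 8
loc_dim, num_layers, hidden_size = 7, 2, 128

vel_history = torch.zeros(num_seq, num_batch, loc_dim)  # past velocities
location = torch.zeros(num_batch, loc_dim)              # location from previous update
hc_0 = (
    torch.zeros(num_layers, num_batch, hidden_size),
    torch.zeros(num_layers, num_batch, hidden_size),
)

# Predicted next-frame location and updated (hidden, cell) states.
output_pred, hc_n = model.predict(vel_history, location, hc_0)
```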
- refine(location, observation, prev_location, confidence, hc_0)[source]¶
Refine predicted location using single frame estimation at t+1.
- Return type:
  tuple[Tensor, tuple[Tensor, Tensor]]
- Input:
  - location: (num_batch, loc_dim), location from prediction
  - observation: (num_batch, loc_dim), location from single-frame estimation
  - prev_location: (num_batch, loc_dim), refined location from the previous step
  - confidence: (num_batch, 1), depth estimation confidence
  - hc_0: (num_layers, num_batch, hidden_size), tuple of hidden and cell states
- Middle:
  - loc_embed: (1, num_batch, feature_dim), predicted location feature
  - obs_embed: (1, num_batch, feature_dim), single-frame location feature
  - conf_embed: (1, num_batch, feature_dim), depth estimation confidence feature
  - embed: (1, num_batch, 2*feature_dim), location feature
  - out: (1, num_batch, hidden_size), LSTM output
- Output:
  - hc_n: (num_layers, num_batch, hidden_size), tuple of updated hidden and cell states
  - output_pred: (num_batch, loc_dim), predicted location
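Analogously, a hedged sketch for refine(); all tensors are illustrative dummies shaped per the Input block above, and a constant confidence of 1 stands in for a real depth-confidence estimate.

```python
import torch

from vis4d.model.motion.velo_lstm import VeloLSTM

model = VeloLSTM()

num_batch, loc_dim = 8, 7
num_layers, hidden_size = 2, 128

location = torch.zeros(num_batch, loc_dim)       # predicted location at t+1
observation = torch.zeros(num_batch, loc_dim)    # single-frame estimate at t+1
prev_location = torch.zeros(num_batch, loc_dim)  # refined location from t
confidence = torch.ones(num_batch, 1)            # depth estimation confidence
hc_0 = (
    torch.zeros(num_layers, num_batch, hidden_size),
    torch.zeros(num_layers, num_batch, hidden_size),
)

# Refined location and updated (hidden, cell) states.
output_pred, hc_n = model.refine(
    location, observation, prev_location, confidence, hc_0
)
```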