UC San Diego, ECE 276A: Sensing & Estimation in Robotics · Winter Quarter 2026
This project tracks the 3D orientation of a rotating body using IMU data alone, then uses those orientation estimates to stitch a sequence of narrow camera frames into wide panoramic images. The core algorithm is projected gradient descent over a trajectory of unit quaternions, jointly optimizing a motion model (gyroscope kinematics) and an observation model (accelerometer gravity constraint).
The method is evaluated on 9 training sequences and 2 test sequences, each containing synchronized IMU and camera data; the training sequences also include VICON motion-capture ground truth. The dataset (training and test data) was provided by Prof. Nikolay Atanasov as part of ECE 276A.
Each dataset begins with a static period where the body is at rest. Sensor biases are estimated by averaging the gyroscope and accelerometer readings over this window and comparing them with the expected at-rest values: zero angular velocity for the gyroscope and 1 g along the upward axis for the accelerometer. After subtracting these biases, gyroscope-only integration already tracks roll and pitch reasonably well, as shown below.
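The bias estimate described above can be sketched in a few lines. This is a minimal illustration, not the project's code: the function names, the `n_static` parameter, and the assumption that readings are already converted to rad/s and g units are all hypothetical.

```python
from statistics import fmean

def estimate_biases(gyro, accel, n_static):
    """Estimate per-axis biases from the first n_static (at-rest) samples.

    gyro, accel: lists of (x, y, z) tuples, assumed to be in rad/s and g.
    At rest the gyroscope should read (0, 0, 0) and the accelerometer
    (0, 0, 1), i.e. 1 g along the upward axis.
    """
    expected_accel = (0.0, 0.0, 1.0)
    gyro_bias = tuple(fmean(s[i] for s in gyro[:n_static]) for i in range(3))
    accel_bias = tuple(
        fmean(s[i] for s in accel[:n_static]) - expected_accel[i]
        for i in range(3)
    )
    return gyro_bias, accel_bias

def remove_bias(samples, bias):
    """Subtract a constant per-axis bias from every sample."""
    return [tuple(v - b for v, b in zip(s, bias)) for s in samples]
```

The length of the static window is a per-dataset choice; it has to end before the body starts rotating, otherwise real motion leaks into the bias estimate.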
Gyro-only integration vs. VICON ground truth (datasets 1–3). Roll and pitch track well; yaw drifts as expected.
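For reference, gyro-only integration can be written as repeated quaternion multiplication with the exponential of each angular-velocity increment. The sketch below assumes scalar-first quaternions (w, x, y, z), body-frame angular velocity in rad/s, and a ZYX Euler convention for the comparison with VICON; none of these conventions are stated in the text, and the function names are illustrative.

```python
import math

def quat_mul(q, p):
    """Hamilton product of two scalar-first quaternions."""
    qw, qx, qy, qz = q
    pw, px, py, pz = p
    return (
        qw * pw - qx * px - qy * py - qz * pz,
        qw * px + qx * pw + qy * pz - qz * py,
        qw * py - qx * pz + qy * pw + qz * px,
        qw * pz + qx * py - qy * px + qz * pw,
    )

def quat_exp_rotvec(phi):
    """Unit quaternion exp([0, phi/2]) for a rotation vector phi (rad)."""
    angle = math.sqrt(sum(c * c for c in phi))
    if angle < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(angle / 2) / angle
    return (math.cos(angle / 2), s * phi[0], s * phi[1], s * phi[2])

def integrate_gyro(omegas, dts, q0=(1.0, 0.0, 0.0, 0.0)):
    """Propagate q_{t+1} = q_t * exp([0, dt * omega_t / 2])."""
    traj = [q0]
    for w, dt in zip(omegas, dts):
        dq = quat_exp_rotvec(tuple(dt * c for c in w))
        traj.append(quat_mul(traj[-1], dq))
    return traj

def quat_to_euler(q):
    """Roll, pitch, yaw in radians (ZYX convention, assumed)."""
    w, x, y, z = q
    roll = math.atan2(2 * (w * x + y * z), 1 - 2 * (x * x + y * y))
    pitch = math.asin(max(-1.0, min(1.0, 2 * (w * y - z * x))))
    yaw = math.atan2(2 * (w * z + x * y), 1 - 2 * (y * y + z * z))
    return roll, pitch, yaw
```

Because nothing in this recursion corrects the estimate, any residual gyroscope bias accumulates over time in all three angles.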
The full optimization minimizes a cost over the quaternion trajectory q_{1:T} with two terms: a motion model error penalizing deviation from gyroscope-predicted orientation, and an observation model error penalizing mismatch with the measured gravity direction. After each gradient step the quaternions are projected back to unit norm. The trajectory is initialized via gyroscope integration, which dramatically improves convergence.
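One way to make the two cost terms and the projection step concrete is the sketch below. It rests on several assumptions that the text does not confirm: equal weighting of the two terms, a motion error of the form 2·log(q_{t+1}⁻¹ ∘ f(q_t)), an observation model h(q) = q⁻¹ ∘ [0, 0, 0, 1] ∘ q in g units, a fixed first quaternion, and a central finite-difference gradient standing in for whatever gradient computation the project actually uses. All function names and the step size are illustrative.

```python
import math

def qmul(q, p):
    """Hamilton product of scalar-first quaternions."""
    qw, qx, qy, qz = q
    pw, px, py, pz = p
    return (qw * pw - qx * px - qy * py - qz * pz,
            qw * px + qx * pw + qy * pz - qz * py,
            qw * py - qx * pz + qy * pw + qz * px,
            qw * pz + qx * py - qy * px + qz * pw)

def qinv(q):
    """Quaternion inverse (valid off the unit sphere too)."""
    n2 = sum(c * c for c in q)
    return (q[0] / n2, -q[1] / n2, -q[2] / n2, -q[3] / n2)

def qexp(phi):
    """Unit quaternion exp([0, phi/2]) for a rotation vector phi."""
    angle = math.sqrt(sum(c * c for c in phi))
    if angle < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(angle / 2) / angle
    return (math.cos(angle / 2), s * phi[0], s * phi[1], s * phi[2])

def rotvec(q):
    """Rotation vector 2*log(q), computed from the direction of q only."""
    vn = math.sqrt(q[1] ** 2 + q[2] ** 2 + q[3] ** 2)
    if vn < 1e-12:
        return (0.0, 0.0, 0.0)
    k = 2.0 * math.atan2(vn, q[0]) / vn
    return (k * q[1], k * q[2], k * q[3])

def predict_gravity(q):
    """Assumed observation model h(q): world +z (1 g) seen in the body frame."""
    return qmul(qmul(qinv(q), (0.0, 0.0, 0.0, 1.0)), q)[1:]

def cost(traj, omegas, dts, accels):
    """Motion-model error plus observation-model error (equal weights assumed)."""
    c = 0.0
    for t in range(len(traj) - 1):
        pred = qmul(traj[t], qexp(tuple(dts[t] * w for w in omegas[t])))
        err = rotvec(qmul(qinv(traj[t + 1]), pred))
        c += 0.5 * sum(e * e for e in err)
    for q, a in zip(traj, accels):
        h = predict_gravity(q)
        c += 0.5 * sum((ai - hi) ** 2 for ai, hi in zip(a, h))
    return c

def pgd_step(traj, omegas, dts, accels, lr=0.02, eps=1e-6):
    """One projected gradient step; traj is a list and traj[0] stays fixed."""
    new = [traj[0]]
    for t in range(1, len(traj)):
        grad = []
        for i in range(4):
            plus, minus = list(traj[t]), list(traj[t])
            plus[i] += eps
            minus[i] -= eps
            cp = cost(traj[:t] + [tuple(plus)] + traj[t + 1:], omegas, dts, accels)
            cm = cost(traj[:t] + [tuple(minus)] + traj[t + 1:], omegas, dts, accels)
            grad.append((cp - cm) / (2 * eps))
        stepped = [c - lr * g for c, g in zip(traj[t], grad)]
        norm = math.sqrt(sum(c * c for c in stepped))
        new.append(tuple(c / norm for c in stepped))  # project to unit norm
    return new
```

Calling `pgd_step` repeatedly on a gyro-integrated initial trajectory and monitoring `cost` gives the basic loop. Finite differences cost O(T) evaluations per quaternion, so this form is only practical for short trajectories; sequences with thousands of samples would need an analytic or automatically differentiated gradient.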
After optimization, roll and pitch agree closely with VICON ground truth (typically within ±0.1 rad). Yaw is unobservable by the accelerometer and exhibits some drift, an expected fundamental limitation.
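The yaw limitation follows directly from the observation model. Assuming the predicted accelerometer reading takes the form h(q) = q⁻¹ ∘ [0, e_z] ∘ q (this form is an assumption, not stated above), applying any extra rotation q_ψ about the world vertical axis leaves the prediction unchanged:

```latex
\begin{aligned}
h(q) &= q^{-1} \circ [0,\, \mathbf{e}_z] \circ q, \qquad
q_\psi = \left[\cos\tfrac{\psi}{2},\; \sin\tfrac{\psi}{2}\,\mathbf{e}_z\right], \\
h(q_\psi \circ q)
  &= q^{-1} \circ \left(q_\psi^{-1} \circ [0,\, \mathbf{e}_z] \circ q_\psi\right) \circ q
   = q^{-1} \circ [0,\, \mathbf{e}_z] \circ q
   = h(q).
\end{aligned}
```

The middle step uses the fact that rotating e_z about e_z returns e_z. The accelerometer term therefore cannot distinguish trajectories that differ by a constant yaw offset, and only the gyroscope term constrains changes in yaw.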
Optimized orientation (gyro + accelerometer) vs. VICON ground truth (datasets 1–3).
Optimized orientation vs. VICON ground truth (datasets 7–9).
Each camera frame is projected through the estimated orientation onto a cylindrical panorama in world coordinates. For each pixel, the viewing ray is transformed from camera frame → IMU frame → world frame, then mapped to azimuth and elevation coordinates in the output image. Frames are stitched in chronological order by simple overwriting, so later frames replace earlier ones wherever they overlap.
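The per-pixel mapping can be sketched as follows. The field of view (60° × 45°), the camera axis convention (x forward, y left, z up), the identity camera-to-IMU rotation, and the linear azimuth/elevation layout of the output image are all placeholder assumptions, and the function names are illustrative rather than taken from the project.

```python
import math

IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))

def matvec(R, v):
    """Multiply a 3x3 rotation matrix by a 3-vector."""
    return tuple(sum(R[i][j] * v[j] for j in range(3)) for i in range(3))

def pixel_to_ray(row, col, height, width, hfov, vfov):
    """Unit viewing ray in the camera frame (x forward, y left, z up; assumed)."""
    az = (0.5 - (col + 0.5) / width) * hfov    # left columns -> positive azimuth
    el = (0.5 - (row + 0.5) / height) * vfov   # top rows -> positive elevation
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))

def ray_to_pano(ray, pano_h, pano_w):
    """Map a world-frame unit ray to (row, col) in the panorama."""
    az = math.atan2(ray[1], ray[0])
    el = math.asin(max(-1.0, min(1.0, ray[2])))
    col = int((0.5 - az / (2 * math.pi)) * pano_w) % pano_w
    row = min(pano_h - 1, max(0, int((0.5 - el / math.pi) * pano_h)))
    return row, col

def stitch_frame(pano, frame, R_world_imu, R_imu_cam=IDENTITY,
                 hfov=math.radians(60), vfov=math.radians(45)):
    """Write one frame into the panorama, overwriting earlier pixels."""
    height, width = len(frame), len(frame[0])
    pano_h, pano_w = len(pano), len(pano[0])
    for r in range(height):
        for c in range(width):
            ray = pixel_to_ray(r, c, height, width, hfov, vfov)
            ray = matvec(R_world_imu, matvec(R_imu_cam, ray))  # camera -> IMU -> world
            pr, pc = ray_to_pano(ray, pano_h, pano_w)
            pano[pr][pc] = frame[r][c]
    return pano
```

Calling `stitch_frame` once per frame in timestamp order, with `R_world_imu` taken from the orientation estimate closest in time to that frame, reproduces the overwrite-style stitching. Under this layout a yaw error of Δψ radians shifts a frame horizontally by about Δψ / 2π of the panorama width.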
All 11 datasets (9 training, 2 test) reached a final cost below 1.5, a reduction of roughly 1–3 orders of magnitude from the initial cost. Six of the nine training sets and both test sets converged before the 5000-iteration limit.
| Dataset | Split | Samples | Converged At | Initial Cost | Final Cost |
|---|---|---|---|---|---|
| 1 | Train | 5645 | 2057 | 161.4 | 0.434 |
| 2 | Train | 4698 | 5000* | 281.4 | 0.561 |
| 3 | Train | 3404 | 5000* | 11.2 | 1.187 |
| 4 | Train | 3156 | 5000* | 142.4 | 1.089 |
| 5 | Train | 3210 | 4811 | 271.1 | 1.242 |
| 6 | Train | 3211 | 2776 | 84.8 | 0.836 |
| 7 | Train | 3577 | 1978 | 230.7 | 1.466 |
| 8 | Train | 3501 | 2717 | 75.6 | 0.401 |
| 9 | Train | 2931 | 4117 | 301.7 | 0.328 |
| 10 | Test | 3078 | 3576 | 37.8 | 0.315 |
| 11 | Test | 5441 | 712 | 11.8 | 1.182 |
* Reached 5000-iteration limit; final cost change was <10⁻⁴, indicating near-convergence.
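The size of each cost reduction can be checked from the table with a few lines of arithmetic. The snippet below copies the initial and final costs from the table and expresses each reduction in orders of magnitude as log10(initial / final); the variable names are illustrative.

```python
import math

# (dataset, initial cost, final cost), copied from the table above
results = [
    (1, 161.4, 0.434), (2, 281.4, 0.561), (3, 11.2, 1.187),
    (4, 142.4, 1.089), (5, 271.1, 1.242), (6, 84.8, 0.836),
    (7, 230.7, 1.466), (8, 75.6, 0.401), (9, 301.7, 0.328),
    (10, 37.8, 0.315), (11, 11.8, 1.182),
]

# Reduction in orders of magnitude for each dataset
reductions = {d: math.log10(c0 / c1) for d, c0, c1 in results}

smallest = min(reductions, key=reductions.get)
largest = max(reductions, key=reductions.get)
worst_final = max(c1 for _, _, c1 in results)

for d in sorted(reductions):
    print(f"dataset {d:2d}: {reductions[d]:.2f} orders of magnitude")
```

The smallest reduction is just under one order of magnitude (dataset 3, which also started from the lowest initial cost), and the largest is just under three (dataset 9).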
Panoramas from the four camera-equipped training datasets show clear room structure with correct vertical orientation. Black regions correspond to directions not observed during the rotation sequence. Each is shown alongside its VICON-based reference.
Dataset 1 (optimized)
Dataset 1 (VICON ground truth)
Dataset 2 (optimized)
Dataset 2 (VICON ground truth)
Dataset 8 (optimized)
Dataset 8 (VICON ground truth)
Dataset 9 (optimized)
Dataset 9 (VICON ground truth)
Optimized panoramas closely match the VICON-based reference. The main visible difference is a slight horizontal shift in some datasets (e.g. dataset 2), caused by yaw drift, consistent with the known unobservability of yaw from accelerometer data alone.
No ground truth is available for the test datasets. Both sequences converged cleanly and produce recognizable panoramas.
Dataset 10 (estimated orientation)
Dataset 11 (estimated orientation)
Dataset 10 (panorama)
Dataset 11 (panorama)