Orientation Tracking and Panoramic Image Stitching

UC San Diego, ECE 276A: Sensing & Estimation in Robotics · Winter Quarter 2026



Overview

This project tracks the 3D orientation of a rotating body using IMU data alone, then uses those orientation estimates to stitch a sequence of narrow camera frames into wide panoramic images. The core algorithm is projected gradient descent over a trajectory of unit quaternions, jointly optimizing a motion model (gyroscope kinematics) and an observation model (accelerometer gravity constraint).

The method is evaluated on 9 training sequences and 2 test sequences, each containing synchronized IMU and camera data; the training sequences also include VICON motion-capture ground truth. The dataset (training and test sets) was provided by Prof. Nikolay Atanasov as part of ECE 276A.


Approach

IMU Calibration

Each dataset begins with a static period during which the body is at rest. Sensor biases are estimated by averaging the gyroscope and accelerometer readings over this window, using the fact that at rest the gyroscope should read zero and the accelerometer should read 1 g upward. After subtracting these biases, gyroscope-only integration already tracks roll and pitch reasonably well, as shown below.
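The bias estimation described above can be sketched as follows. This is a minimal illustration rather than the project's actual code: it assumes the readings are already converted to physical units (rad/s and g), that the body z-axis points up during the static window, and that the number of static samples `n_static` is known. The function names are hypothetical.

```python
import numpy as np

def estimate_biases(gyro, accel, n_static):
    """Estimate sensor biases from the first n_static at-rest samples.

    gyro, accel: (T, 3) arrays in rad/s and g (assumed already scaled).
    n_static: length of the initial static window (assumed known).
    """
    gyro_bias = gyro[:n_static].mean(axis=0)
    # At rest the accelerometer should read [0, 0, 1] g, so the bias is the
    # static mean minus that expected reading.
    accel_bias = accel[:n_static].mean(axis=0) - np.array([0.0, 0.0, 1.0])
    return gyro_bias, accel_bias

def calibrate(gyro, accel, n_static):
    """Return bias-corrected gyroscope and accelerometer readings."""
    gyro_bias, accel_bias = estimate_biases(gyro, accel, n_static)
    return gyro - gyro_bias, accel - accel_bias
```

In practice the static window would be chosen by inspecting the data; a window that overlaps real motion would bias both estimates.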

Gyro-only integration vs. VICON ground truth (datasets 1–3). Roll and pitch track well; yaw drifts as expected.
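Gyro-only integration of the kind compared above can be sketched with the quaternion exponential map. The code assumes quaternions in [w, x, y, z] order, body-frame angular velocity in rad/s, and an identity initial orientation; these conventions and the function names are assumptions, not details taken from the project.

```python
import numpy as np

def quat_mul(q, p):
    """Hamilton product of two quaternions in [w, x, y, z] order."""
    qw, qx, qy, qz = q
    pw, px, py, pz = p
    return np.array([
        qw * pw - qx * px - qy * py - qz * pz,
        qw * px + qx * pw + qy * pz - qz * py,
        qw * py - qx * pz + qy * pw + qz * px,
        qw * pz + qx * py - qy * px + qz * pw,
    ])

def quat_from_rotvec(v):
    """Unit quaternion exp([0, v/2]) for a rotation vector v (axis * angle)."""
    angle = np.linalg.norm(v)
    if angle < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    axis = v / angle
    return np.concatenate(([np.cos(angle / 2)], np.sin(angle / 2) * axis))

def integrate_gyro(omega, dt):
    """Integrate angular velocity into an orientation trajectory.

    omega: (T, 3) calibrated angular velocity; dt: (T,) time steps in seconds.
    Returns (T + 1, 4) unit quaternions, starting from identity.
    """
    q = np.zeros((len(omega) + 1, 4))
    q[0] = [1.0, 0.0, 0.0, 0.0]
    for t in range(len(omega)):
        q[t + 1] = quat_mul(q[t], quat_from_rotvec(omega[t] * dt[t]))
        q[t + 1] /= np.linalg.norm(q[t + 1])  # guard against numerical drift
    return q
```

Because each step only accumulates the measured rate, any residual gyroscope bias integrates into an orientation error that grows with time, which is the drift visible in yaw.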

Projected Gradient Descent

The full optimization minimizes a cost over the quaternion trajectory q_{1:T} with two terms: a motion model error that penalizes deviation from the gyroscope-predicted orientation, and an observation model error that penalizes mismatch with the measured gravity direction. After each gradient step, the quaternions are projected back to unit norm. The trajectory is initialized with the gyroscope-integrated estimate, which starts the optimizer close to the solution and speeds up convergence.
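The page does not spell out the cost, so the following is one common way to write a cost with these two terms, given as a sketch under assumed notation: τ_t is the time step, ω_t the calibrated angular velocity, a_t the calibrated acceleration in g, q_0 is fixed at the identity, and ∘ denotes quaternion multiplication. The exact weighting and sign conventions used in the project may differ.

```latex
\[
c(q_{1:T}) = \frac{1}{2}\sum_{t=0}^{T-1}\left\| 2\log\!\left(q_{t+1}^{-1}\circ f(q_t,\tau_t\omega_t)\right)\right\|_2^2
           + \frac{1}{2}\sum_{t=1}^{T}\left\| [0,\,a_t] - h(q_t)\right\|_2^2
\]
\[
f(q_t,\tau_t\omega_t) = q_t\circ\exp\!\left([0,\ \tau_t\omega_t/2]\right),\qquad
h(q_t) = q_t^{-1}\circ[0,0,0,1]\circ q_t,\qquad
\Pi(q) = \frac{q}{\|q\|_2}
\]
```

Here f is the gyroscope motion model, h predicts the gravity direction seen in the body frame, and Π is the projection applied to each quaternion after a gradient step.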

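After optimization, roll and pitch agree closely with VICON ground truth (typically within ±0.1 rad). Yaw is unobservable from the accelerometer, because gravity is unchanged by rotation about the vertical axis, so yaw still drifts. This is a fundamental limitation of estimating orientation from gyroscope and accelerometer data alone.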
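The unobservability of yaw can be checked numerically. The sketch below assumes quaternions in [w, x, y, z] order that map body to world coordinates, and an observation model that rotates the world up-vector into the body frame; the helper names are hypothetical and not taken from the project code.

```python
import numpy as np

def quat_mul(q, p):
    """Hamilton product of two quaternions in [w, x, y, z] order."""
    qw, qx, qy, qz = q
    pw, px, py, pz = p
    return np.array([
        qw * pw - qx * px - qy * py - qz * pz,
        qw * px + qx * pw + qy * pz - qz * py,
        qw * py - qx * pz + qy * pw + qz * px,
        qw * pz + qx * py - qy * px + qz * pw,
    ])

def quat_conj(q):
    """Conjugate, which equals the inverse for a unit quaternion."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def predicted_gravity(q):
    """World up-vector [0, 0, 1] expressed in the body frame: q^-1 * g * q."""
    g = np.array([0.0, 0.0, 0.0, 1.0])
    return quat_mul(quat_mul(quat_conj(q), g), q)[1:]

def yaw_quat(psi):
    """Rotation by psi radians about the world z-axis."""
    return np.array([np.cos(psi / 2), 0.0, 0.0, np.sin(psi / 2)])

# A body rolled by 0.3 rad, and the same body with 1.2 rad of extra yaw.
q_tilt = np.array([np.cos(0.15), np.sin(0.15), 0.0, 0.0])
q_yawed = quat_mul(yaw_quat(1.2), q_tilt)

# Both orientations predict the same accelerometer reading.
same_reading = np.allclose(predicted_gravity(q_tilt), predicted_gravity(q_yawed))
```

Since any added yaw leaves the predicted reading unchanged, the observation term cannot correct yaw error; only the gyroscope term constrains it.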

Optimized orientation (gyro + accelerometer) vs. VICON ground truth (datasets 1–3).

Optimized orientation vs. VICON ground truth (datasets 7–9).

Panorama Stitching

Each camera frame is projected through the estimated orientation onto a cylindrical panorama in world coordinates. For each pixel, the viewing ray is transformed from the camera frame to the IMU frame to the world frame, then mapped to azimuth and elevation coordinates in the output image. Frames are pasted in chronological order, with later frames overwriting earlier ones where they overlap.
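The per-pixel mapping can be sketched as below. Several details are assumptions made for illustration and are not stated on this page: a 60° by 45° field of view, a camera that looks along the body x-axis (y left, z up), coincident camera and IMU frames, and a panorama indexed linearly in azimuth and elevation. `R` is the body-to-world rotation matrix for the frame's timestamp.

```python
import numpy as np

def pixel_rays(h, w, hfov, vfov):
    """Unit viewing rays per pixel in the body frame (x forward, y left, z up)."""
    az = np.linspace(hfov / 2, -hfov / 2, w)   # left column looks toward +y
    el = np.linspace(vfov / 2, -vfov / 2, h)   # top row looks toward +z
    az, el = np.meshgrid(az, el)               # both shaped (h, w)
    return np.stack(
        [np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.sin(el)], axis=-1
    )

def paste_frame(pano, image, R, hfov=np.deg2rad(60), vfov=np.deg2rad(45)):
    """Overwrite panorama pixels with one camera frame.

    pano: (Hp, Wp, 3) output image, modified in place.
    image: (h, w, 3) camera frame. R: (3, 3) body-to-world rotation.
    """
    h, w = image.shape[:2]
    hp, wp = pano.shape[:2]
    rays = pixel_rays(h, w, hfov, vfov) @ R.T          # rotate rays into world
    az = np.arctan2(rays[..., 1], rays[..., 0])        # azimuth in [-pi, pi]
    el = np.arcsin(np.clip(rays[..., 2], -1.0, 1.0))   # elevation in [-pi/2, pi/2]
    cols = np.round((np.pi - az) / (2 * np.pi) * (wp - 1)).astype(int)
    rows = np.round((np.pi / 2 - el) / np.pi * (hp - 1)).astype(int)
    pano[rows, cols] = image                           # later frames overwrite
    return pano
```

Calling `paste_frame` once per frame in timestamp order, with `R` taken from the orientation estimate closest in time, reproduces the overwrite behavior. Overwriting is simple but leaves visible seams wherever the orientation estimate is slightly off.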


Convergence

All 11 datasets (9 training, 2 test) reached a final cost below 1.5, a reduction of roughly one to three orders of magnitude from the initial cost. Six of the nine training sets converged before the 5000-iteration limit; datasets 2, 3, and 4 ran to the limit.

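| Dataset | Split | Samples | Converged At (iteration) | Initial Cost | Final Cost |
|---|---|---|---|---|---|
| 1 | Train | 5645 | 2057 | 161.4 | 0.434 |
| 2 | Train | 4698 | 5000* | 281.4 | 0.561 |
| 3 | Train | 3404 | 5000* | 11.2 | 1.187 |
| 4 | Train | 3156 | 5000* | 142.4 | 1.089 |
| 5 | Train | 3210 | 4811 | 271.1 | 1.242 |
| 6 | Train | 3211 | 2776 | 84.8 | 0.836 |
| 7 | Train | 3577 | 1978 | 230.7 | 1.466 |
| 8 | Train | 3501 | 2717 | 75.6 | 0.401 |
| 9 | Train | 2931 | 4117 | 301.7 | 0.328 |
| 10 | Test | 3078 | 3576 | 37.8 | 0.315 |
| 11 | Test | 5441 | 712 | 11.8 | 1.182 |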

* Reached 5000-iteration limit; final cost change was <10⁻⁴, indicating near-convergence.
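The iteration limit and cost-change criterion referred to above fit a loop of the following shape. This is a generic sketch, not the project's optimizer: the step size and tolerance defaults are assumed values, and `cost_fn` and `grad_fn` are hypothetical callables (in practice the gradient would likely come from an automatic differentiation library).

```python
import numpy as np

def project_unit(q):
    """Project each row of a (T, 4) array back onto the unit sphere."""
    return q / np.linalg.norm(q, axis=1, keepdims=True)

def projected_gradient_descent(q0, cost_fn, grad_fn,
                               step=1e-2, max_iters=5000, tol=1e-6):
    """Minimize cost_fn over unit quaternions with projected gradient steps.

    Returns (q, final_cost, iterations_used, converged_flag).
    step and tol are assumed values chosen for illustration.
    """
    q = project_unit(np.asarray(q0, dtype=float))
    prev_cost = cost_fn(q)
    for it in range(1, max_iters + 1):
        q = project_unit(q - step * grad_fn(q))   # gradient step, then project
        cost = cost_fn(q)
        if abs(prev_cost - cost) < tol:           # cost has stopped changing
            return q, cost, it, True
        prev_cost = cost
    return q, prev_cost, max_iters, False         # hit the iteration limit
```

A run that returns with the flag set to `False` corresponds to the starred rows in the table: the loop stopped at the limit even though the cost was changing only slightly between iterations.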


Training Panoramas

Panoramas from the four camera-equipped training datasets show clear room structure with correct vertical orientation. Black regions correspond to directions not observed during the rotation sequence. Each is shown alongside its VICON-based reference.

Dataset 1 (optimized)

Dataset 1 (VICON ground truth)

Dataset 2 (optimized)

Dataset 2 (VICON ground truth)

Dataset 8 (optimized)

Dataset 8 (VICON ground truth)

Dataset 9 (optimized)

Dataset 9 (VICON ground truth)

Optimized panoramas closely match the VICON-based reference. The main visible difference is a slight horizontal shift in some datasets (e.g., dataset 2). This shift comes from yaw drift and is consistent with yaw being unobservable from accelerometer data alone.


Test Panoramas

No ground truth is available for the test datasets. Both sequences converged before the 5000-iteration limit and produce recognizable panoramas.

Dataset 10 (estimated orientation)

Dataset 11 (estimated orientation)

Dataset 10 (panorama)

Dataset 11 (panorama)

