14. Tracking

Fiducial Markers

A fiducial marker is any planar object introduced into the scene and in the field-of-view of a n imaging system to be used as a point of reference of measure.

Can be used for:

Object position estimation
Camera position estimation
Estimate transform/poses of robots

How:

Calibrate camera to determine and correct distortion
Homography: 3x3 matrix mapping between projection of two planes
- If $x_1$ and $x_2$ are corresponding points on two planes, $x_1 = H \cdot x_2$
Pose: 6 degrees of freedom; rotation + translation along each axis; can be represented in 3x3 rotation and 1x3 translation matrix. Called the extrinsics matrix
Process:
- Find marker outline
  - e.g. ArUco: adaptive thresholding, contour extraction, quad extraction
  - e.g. AprilTags: edge detection, line segment detection, quad extraction
- Calculate homography using corner locations
- Calculate extrinsics using focal length and marker size

Challenges:

Occlusion
Unfocused or motion blur
Dark/uneven light, vignetting
Jitter: exact position can vary between frames
False positives: not all squares are markers

Markerless Tracking

Use ‘natural’ features for tracking: corners, edges, points etc.

Also: templates - basically something whose representation stored in the system.

This is more difficult and usually much more computationally expensive.

Texture tracking:

Replace marker corners with keypoints
- SIFT, SURF, GLOH, etc.
- Apply detector to every single pixel
- Find the best set of keypoints (e.g. filter by strength/similarity/distance)
- Create descriptors; windows around the keypoints
Generate robust descriptors that allow differentiation between keypoints
- SIFT:
  - Estimate dominant keypoint orientation using gradients
  - Compensate for detected orientation
    - e.g. if keypoints on plane, transform all features as if the camera is normal to the plane
  - Describe keypoints in terms of surrounding gradient radially
Match descriptors against database (created offline)
- Build a database with all descriptors and their position on the original image
  - For robustness, search for corners at multiple scales
- May require data structures to allow for efficient searching
Outlier removal:
- Start with the cheapest techniques first (e.g. geometric)
- End with homography-based tests

Hybrid tracking: use gyroscope for prediction of camera orientation, and computer vision to correct gyroscope drift. Kalman filter?

Outdoor: lots of landmarks and planar features, but varying lighting conditions make it difficult.