Optical tracking:
- Specialized
- e.g. IR lights for VR controllers
- Marker-based
- TODO
- Markerless:
- Edge-based
- Template-based
- Interest point
Trackable managers:
- AR Foundation: a Trackable is anything that can be detected and tracked in the real world
- Planes, point clouds, anchors, images, environment probes, faces, 3D objects
- We are interested in planes, point clouds, anchors, and images
- Each type of Trackable has its own Trackable Manager (e.g. ARPlaneManager for planes), which sits on the same GameObject as the AR Session Origin
- Each Trackable Manager keeps a list of its Trackables
Computer vision: detecting objects and tracking their movement in 6 degrees of freedom
Vision is inferential: context, prior knowledge, etc. are required to arrive at a reasonable interpretation of a scene, because an infinite number of 3D objects can produce the same image.
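The ambiguity can be illustrated with a minimal sketch, assuming an ideal pinhole camera at the origin looking along +z (the function name and the focal length of 1.0 are illustrative choices, not from the lecture). Every 3D point on the same ray through the camera centre lands on the same image coordinate, so depth cannot be read off a single image without extra assumptions.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a camera-space point (x, y, z) to image coordinates."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Three different 3D points on one ray through the camera centre:
# the same direction, scaled to depths of 1, 2 and 8 units.
base = (0.5, -0.25, 1.0)
points = [tuple(scale * c for c in base) for scale in (1.0, 2.0, 8.0)]

# All three project to the same 2D location.
images = [project(p) for p in points]
for p, img in zip(points, images):
    print(p, "->", img)
```

A small nearby object and a large distant one are therefore indistinguishable from one view; the cues listed under 3D information recovery are what break the tie.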
3D information recovery:
- Motion
- Stereo vision
- Works up to ~3m
- Texture
- Shading
- Contour
- Can understand depth from a line drawing
- Time-of-flight sensors
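As a rough sketch of why stereo vision has a limited working range, assume a rectified stereo pair, where depth follows Z = f * B / d (f: focal length in pixels, B: baseline in metres, d: disparity in pixels). The focal length and baseline below are made-up values, and the helper names are hypothetical. Disparity shrinks as 1/Z, so a one-pixel matching error costs more and more depth accuracy as distance grows.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def disparity_at_depth(focal_px, baseline_m, depth_m):
    """Inverse relation: disparity in pixels produced by a point at depth_m."""
    return focal_px * baseline_m / depth_m


# Assumed camera parameters (illustrative only).
FOCAL_PX = 700.0
BASELINE_M = 0.06


def one_pixel_depth_error(depth_m):
    """Depth error in metres if the matched disparity is one pixel too small."""
    d = disparity_at_depth(FOCAL_PX, BASELINE_M, depth_m)
    return depth_from_disparity(FOCAL_PX, BASELINE_M, d - 1.0) - depth_m


for depth in (0.5, 1.0, 3.0, 10.0):
    d = disparity_at_depth(FOCAL_PX, BASELINE_M, depth)
    err = one_pixel_depth_error(depth)
    print(f"depth {depth:5.1f} m  disparity {d:6.2f} px  1-px error {err:.3f} m")
```

The exact cut-off depends on baseline, focal length and matching accuracy, so the ~3 m figure should be read as typical for a small-baseline setup rather than a hard limit.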
TODO
The human visual system is remarkably capable and is often taken for granted; replicating it with a computer is very difficult.
Cognitive processing of color depends on context: on neighboring colors, not absolute values. Hence, filtering with a fixed color mask is likely to fail unless you control the lighting.
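A minimal sketch of the color-mask problem, assuming 8-bit RGB pixels and modelling a lighting change as a simple gain on all channels (a simplification; the thresholds and pixel values are made up). The same red surface passes a fixed mask under the lighting it was tuned for and fails under dimmer or brighter light.

```python
def red_mask(pixel, min_red=150, max_other=100):
    """Fixed RGB threshold: True if the pixel counts as 'red'."""
    r, g, b = pixel
    return r >= min_red and g <= max_other and b <= max_other


def relight(pixel, gain):
    """Crude lighting model: scale every channel and clip to the 8-bit range."""
    return tuple(min(255, round(c * gain)) for c in pixel)


red_object = (200, 40, 30)          # the surface under the tuning conditions
dim = relight(red_object, 0.5)      # same surface in a darker room
bright = relight(red_object, 3.0)   # same surface in strong light (red clips)

for name, px in (("reference", red_object), ("dim", dim), ("bright", bright)):
    print(f"{name:9s} {px} -> masked as red: {red_mask(px)}")
```

A human viewer would call all three pixels "red" because the surroundings change along with them; the fixed mask only sees absolute values.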
Low-level image processing:
- Image compression
- Noise reduction
- Edge extraction
- Contrast enhancement
- Good for human viewers, but for a computer it just throws away information
- Segmentation
- Thresholding
- Morphology
- Image restoration
- e.g. if the camera velocity is known, motion blur can be corrected
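Two of the steps above, thresholding and morphology, can be sketched on a tiny grayscale image stored as nested lists. This is an illustrative pure-Python version with made-up pixel values and a 3x3 erosion; a real pipeline would normally use an image-processing library.

```python
def threshold(image, level):
    """Binary segmentation: 1 where the pixel is at least `level`, else 0."""
    return [[1 if v >= level else 0 for v in row] for row in image]


def erode(binary):
    """3x3 erosion: a pixel stays 1 only if it and all 8 neighbours are 1.

    Pixels outside the image count as 0, so the border is always cleared.
    """
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if all(binary[y + dy][x + dx]
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                out[y][x] = 1
    return out


# Dark background, one bright 3x3 blob, and one bright noise pixel (top right).
image = [
    [10,  12,  11,  10, 13, 200],
    [11, 180, 190, 185, 12,  10],
    [12, 175, 200, 195, 11,  12],
    [10, 185, 190, 180, 13,  11],
    [11,  10,  12,  11, 10,  12],
]

mask = threshold(image, 128)   # blob and noise pixel both become 1
eroded = erode(mask)           # noise pixel vanishes, blob shrinks to its centre

for row in eroded:
    print(row)
```

Erosion removes the isolated noise pixel but also shrinks the real object, which is why it is often followed by a dilation (together, an "opening").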
TODO
Recognition:
- Shading
TODO:
- Sports:
- American football: first-down line overlay
- Swimming: flags
Even with a perfect 3D point cloud, converting it into a 3D model is very difficult
Modelling the natural world: extremely difficult as there is variation. Manufacturing produces many copies of a single product, but nature does not.
Vision systems:
- Active:
- Laser scanner
- Structured light
- Project lots of dots; use dot size to determine distance
- Time of flight:
- Measure the time light takes to travel to the scene and back to the camera to determine distance; gives a distance value for every pixel
- Passive
- Stereo
- Cheap, works well in good lighting
- Structure from motion/3D reconstruction:
- Deep learning with moving camera to reconstruct 3D scene
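The time-of-flight idea from the list above reduces to distance = c * t / 2, since the light travels to the surface and back. A minimal sketch, assuming the sensor reports a round-trip time per pixel (real sensors often measure phase shift instead, and the timings below are made up):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def tof_distance(round_trip_s):
    """Distance in metres from a round-trip time; halved because light goes out and back."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def depth_map(round_trip_times):
    """Apply the conversion to every pixel of a 2D grid of round-trip times."""
    return [[tof_distance(t) for t in row] for row in round_trip_times]


# Hypothetical 2x2 sensor readout, in seconds.
times = [
    [10e-9, 20e-9],
    [20e-9, 40e-9],
]

depth = depth_map(times)
for row in depth:
    print([f"{d:.2f} m" for d in row])
```

At these scales one nanosecond of timing error corresponds to roughly 15 cm of depth, which gives a sense of the timing precision such sensors need.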
Color:
- Visible spectrum is a tiny part of the electromagnetic spectrum
- Sun: greatest energy output at visible wavelengths
TODO
- Natural Feature Tracking:
- Keypoint detection
- SIFT, SURF, GLOH, BRIEF, FREAK etc.
- Descriptor creation and TODO
- Outlier removal
- Pose estimation and refinement
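The middle of this pipeline can be sketched as follows. The notes leave the step after descriptor creation unfinished, so this assumes it is matching. It uses binary descriptors (the kind BRIEF and FREAK produce) compared by Hamming distance, with Lowe's ratio test as one common form of outlier removal. The 8-bit descriptors and the 0.75 ratio are toy values, and the function names are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match_descriptors(query, reference, ratio=0.75):
    """Brute-force nearest-neighbour matching with a ratio test.

    A match is kept only if the best distance is clearly smaller than the
    second best; ambiguous matches are discarded as likely outliers.
    Returns (query_index, reference_index, distance) tuples.
    """
    if len(reference) < 2:
        raise ValueError("ratio test needs at least two reference descriptors")
    matches = []
    for qi, q in enumerate(query):
        ranked = sorted((hamming(q, r), ri) for ri, r in enumerate(reference))
        (best_dist, best_idx), (second_dist, _) = ranked[0], ranked[1]
        if best_dist < ratio * second_dist:
            matches.append((qi, best_idx, best_dist))
    return matches


# Toy 8-bit descriptors; real ones are typically 128 to 512 bits.
reference = [0b11110000, 0b00001111, 0b10101010]
query = [
    0b11110001,  # one bit away from reference[0]: distinctive match
    0b11001100,  # equally far from every reference: ambiguous, rejected
]

matches = match_descriptors(query, reference)
print(matches)
```

The surviving correspondences would then feed the pose estimation step, typically together with a further geometric outlier check.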
Fiducial:
- No databases required
- Intrusive: markers must be placed in the environment
- Must be fully in-view