Example Question
Describe and provide a CV example for each:
- Supervised learning:
- Learn function from input to output based on label-data pairs
- e.g. road sign classification
- Weakly-supervised learning:
- Labels are ‘weaker’: noisy, limited or imprecise
- Semi-supervised learning:
- Small amount of labeled data, larger set of unlabeled data
- Use the model to assign labels to the unlabeled data, manually correct them, then retrain on the results
- Self-supervised learning:
- Use properties of the data to provide a supervision signal
- e.g. use an auxiliary task such as image completion to learn a mapping from image to feature vector, which can then define a similarity metric between images
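The semi-supervised loop above (predict labels for the unlabeled data, correct them, retrain) can be sketched as follows. This is only a minimal illustration under stated assumptions: the "model" is a toy nearest-centroid classifier on 2D feature vectors, and the manual-correction step is replaced by a confidence threshold, which is a stand-in and not what the notes describe. All function names and the `threshold` parameter are hypothetical.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    """Toy 'model': one mean feature vector per class."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict_with_confidence(centroids, X):
    """Predict the nearest centroid; confidence is a softmax over negative distances."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    logits = -d
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    return p.argmax(axis=1), p.max(axis=1)

def pseudo_label_round(X_lab, y_lab, X_unlab, n_classes, threshold=0.9):
    """One round: label the unlabeled data with the model, keep confident ones, retrain."""
    centroids = fit_centroids(X_lab, y_lab, n_classes)
    pseudo, conf = predict_with_confidence(centroids, X_unlab)
    keep = conf >= threshold  # assumption: stands in for manual correction
    X_new = np.concatenate([X_lab, X_unlab[keep]])
    y_new = np.concatenate([y_lab, pseudo[keep]])
    return fit_centroids(X_new, y_new, n_classes), int(keep.sum())

# Synthetic data: four labeled points, 100 unlabeled points in two clusters.
rng = np.random.default_rng(0)
X_lab = np.array([[0.0, 0.0], [0.5, 0.2], [10.0, 10.0], [9.5, 10.3]])
y_lab = np.array([0, 0, 1, 1])
X_unlab = np.concatenate([rng.normal(0, 1, (50, 2)), rng.normal(10, 1, (50, 2))])
centroids, n_added = pseudo_label_round(X_lab, y_lab, X_unlab, n_classes=2)
```

In practice the model would be a CNN and the loop would run for several rounds; the structure of each round stays the same.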
CNNs: what property of image matching CV algorithms enables self-supervised learning?
- Correct solutions can be verified - loss function can be written, allowing the ML algorithm to be supervised
How would this work for stereo/optical flow?
- Dense stereo/optical flow provides a correspondence between two images, so one image can be warped to match its counterpart. Comparing the warped image with its counterpart indicates how successful the warp is, and that comparison provides a loss function
- SLAM/SfM: matches are based on geometric consistency; badly matched keypoints fail a geometric consistency test and are discarded. Keypoints that pass/fail can be used as a positive/negative supervision signal
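The warping idea for dense stereo can be sketched as a photometric loss. This is a minimal NumPy sketch, assuming a rectified grayscale pair and the convention that a left pixel at column x matches the right image at column x - d; the function names are hypothetical. In a real self-supervised setup the disparity would come from the network and the loss would be written in an autodiff framework so that gradients can flow through the warp.

```python
import numpy as np

def warp_right_to_left(right, disparity):
    """Resample the right image at x - d(x, y), with linear interpolation along rows."""
    h, w = right.shape
    xs = np.arange(w, dtype=np.float64)[None, :] - disparity  # source x coordinates
    xs = np.clip(xs, 0, w - 1)                                # clamp at the image border
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = xs - x0
    rows = np.arange(h)[:, None]
    return (1 - frac) * right[rows, x0] + frac * right[rows, x1]

def photometric_loss(left, right, disparity):
    """Mean absolute difference between the left image and the warped right image."""
    return float(np.mean(np.abs(left - warp_right_to_left(right, disparity))))

# Synthetic pair with a true disparity of 3 px everywhere.
rng = np.random.default_rng(0)
left = rng.random((32, 64))
right = np.roll(left, -3, axis=1)
loss_true = photometric_loss(left, right, np.full(left.shape, 3.0))
loss_zero = photometric_loss(left, right, np.zeros(left.shape))
```

The correct disparity gives a much lower loss than the wrong one, which is the supervision signal: no ground-truth disparity labels are needed, only the two images.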
Last question in the exam: briefly describe four of the following class projects, naming at least four algorithmic steps (with algorithm names). Do not select your own/similar projects.
If person does not list four or more algorithms, won’t be selected.
Project Tips
- Academic paper: do not mention failures or running out of time. Phrase as positives in ‘future research’
- Remove the word ‘project’ from the paper; use ‘research’
- Avoid colloquialisms
- ‘The paper proposes a method’ not ‘the goal of this paper’
- ‘These results show the proposed approach can’ not ‘the system can’
- Do not mention the phrase ‘computer vision’; the paper is for a CV conference, so the phrase is too broad
- Worse results are fine; proposing a method, not selling a solution
- Only mention the framework, hardware etc. at the beginning of the results/methods section
Abstract:
- Not part of the paper. Self-contained, technical overview of the whole paper. Include algorithm names etc., mention at least one result number, hopefully a comparison with prior research
- Must at least attempt to compare it with prior research
Background:
- Critical review of prior research - mention limitations of prior research/algorithms
- e.g. static background required. Look at future research sections
Proposed methods:
- At least three CV algorithm names
- What algorithms are the DL networks using?
- Novel: can mean the tiniest tweak
Results:
- At the beginning, mention the OS, framework etc.
- Quantify results
- Try to quantitatively compare results with prior research
- Survey papers can be useful
Conclusions:
- Start with brief summary of results
- Quantitatively compare with prior research
- Future research sub-section
References:
- Be consistent
- Most should be less than 10 years old; justify any that are older
Real World Example: CV for a Grape Vine Pruning Robot
Approx. half the cost of vineyards is in pruning; hard to get enough workers, can’t prune in the rain etc.
Pruning: remove old wood and most new canes during the winter.
NZ:
- 90 million vines, mostly Sauvignon Blanc
- Hand-pruned. ~2 minutes per vine
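As a rough sense of scale, the two figures above can be multiplied out. This assumes every vine is hand-pruned once per winter at the quoted rate; the notes give only the vine count and the per-vine time, so the total is an estimate.

```python
vines = 90_000_000        # from the notes
minutes_per_vine = 2      # approximate, from the notes

total_hours = vines * minutes_per_vine / 60
print(f"{total_hours:,.0f} hours")  # 3,000,000 hours
```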
Large project: viticulture, robotics and AI experts, software + hardware engineers. ~5 years
~85% successful. Good enough for the government, but not good enough commercially.
Lighting:
- Extremely challenging: dynamic range far too large in sunlight
- Got a mobile caravan to control lighting: lights, blue screen background etc.
- Bike wheels
- Place lights to minimize shadows
Camera rig:
- 3 well-conditioned cameras. Allowed reconstruction in all directions
- Needed to align after every setup - bumping and vibrations caused movement
- 3D reconstruction:
- Many challenges: occlusion, depth discontinuities, self-similarity
- Solved using feature matching/bundle adjustment
- 2D feature extraction: canes, wires, posts
- Move away from pixels/point clouds to high-level features
- Correspondence between views, using knowledge of vines
- Customized every stage to use knowledge about vines (no sharp corners, vine thickness etc.)
- Made sequential chain of components that could be developed and parametrised in sequence and in isolation
- Rolling buffer of the last few dozen frames
- Now can use ML to get a very accurate 3D model, but was not available at the time
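The rolling buffer of recent frames mentioned above can be sketched with a bounded deque. This is a minimal sketch: the `FrameBuffer` class and the capacity of 36 are hypothetical (the notes only say "a few dozen"), and integers stand in for image frames.

```python
from collections import deque

class FrameBuffer:
    """Keeps only the most recent `capacity` frames; older ones are dropped automatically."""

    def __init__(self, capacity=36):
        self._frames = deque(maxlen=capacity)

    def push(self, frame):
        # Appending to a full deque discards the oldest frame.
        self._frames.append(frame)

    def window(self):
        """Return the buffered frames, oldest first."""
        return list(self._frames)

    def __len__(self):
        return len(self._frames)

buf = FrameBuffer(capacity=36)
for i in range(100):      # integers stand in for camera frames
    buf.push(i)
```

After 100 pushes the buffer holds only frames 64 to 99, so memory use stays fixed however long the robot runs.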
Main challenges were complexity and robustness.
- Lighting
- Even with artificial lighting, getting rid of shadows is hard
- Solution: MORE LIGHTS
- Occlusion
- 6 12 megapixel cameras with global shutters and bright lights to reduce motion blur
- Self-similarity: vines look the same TODO
Main challenges:
- Complexity
- Robustness