12. Vineyard Project

Example Question

Describe and provide CV example for:

Supervised learning:
- Learn function from input to output based on label-data pairs
- e.g. road sign classification
Weakly-supervised learning:
- Labels are ‘weaker’: noisy, limited or imprecise Semi-supervised learning:
- Small amount of labeled data, larger set of unlabeled data
- Use model to assign labels on the unsupervised data, manually correct, and then use results to retrain
Self-supervised learning:
- Use properties of the data to provide a supervision signal
- e.g. use auxiliary task like image completion to learn mapping from image to feature vector to define similarity metric between images

CNNs: what property of image matching CV algorithms enable self-supervised learning?

Correct solutions can be verified - loss function can be written, allowing the ML algorithm to be supervised

How would this work for stereo/optical flow?

Dense stereo/optical flow provide correspondence between two images; one image can be warped to match its counterpart. Hence, this allows a comparison to give an indication of how successful the warp is and hence provide a loss function
SLAM/SfM: matches based on geometric consistency; badly-matched key-points will fail a geometric consistency test and be discarded. Keypoints that pass/fail can be used as a positive/negative supervision signal

Last question in the exam: briefly describe four of the following class projects, naming at least four algorithmic steps (with algorithm names). Do not select your own/similar projects.

If person does not list four or more algorithms, won’t be selected.

Project Tips

Academic paper: do not mention failures or running out of time. Phrase as positives in ‘future research’
Remove the word ‘project’ from the paper; use ‘research’
Avoid colloquialisms
‘The paper proposes a method’ not ‘the goal of this paper’
‘These results show the proposed approach can’ not the ‘system can’
Do not motion the phrase ‘computer vision’; paper for a CV conference, so too broad
Worse results are fine; proposing a method, not selling a solution
Only mention the framework, hardware etc. at the beginning of the results/methods section

Abstract:

Not part of the paper. Self-contained, technical overview of the whole paper. Include algorithm names etc., mention at least one result number, hopefully a comparison with prior research
- Must at least attempt to compare it with prior research

Background:

Critical review of prior research - mention limitations of prior research/algorithms
e.g. static background required. Look at future research sections

Proposed methods:

At least three CV algorithm names
What algorithms are the DL networks using?
Novel: can mean tiniest minuscule tweak

Results:

At beginning, mention OS framework etc.
Quantify results
Try to quantitatively compare results with prior research
- Survey papers can be useful

Conclusions:

Start with brief summary of results
Quantitatively compare with prior research
Future research sub-section

References:

Be consistent
Most should be newer than 10 years ago, or justify

Real World Example: CV for a Grape Vine Pruning Robot

Approx. half the cost of vineyards is in pruning, hard to get get enough workers, can’t prune in the rain etc.

Pruning: remove old wood and most new canes during the winter.

NZ:

90 million vines, mostly Sauvignon Blanc
Hand-pruned. ~2 minutes per vine

Large project: viticulture, robotics and AI experts, software + hardware engineers. ~5 years

~85% successful. Good enough for the government, but not good enough commercially.

Lighting:

Extremely challenging: dynamic range far too large in sunlight
Got a mobile caravan to control lighting: lights, blue screen background etc.,
- Bike wheels
Place lights to minimize shadows

Camera rig:

3 well-conditioned cameras. Allowed reconstruction in all directions
- Needed to align after every setup - bumping and vibrations caused movement
3D reconstruction:
- Many challenges: occlusion, depth discontinuities, self-similarity
- Solved using feature matching/bundle adjustments
  - 2D feature extraction: canes, wires posts
    - Move away from pixels/point clouds to high-level features
  - Correspondence between views, using knowledge of vines
- Customized every stage to use knowledge about vines (no sharp corners, vine thickness etc.)
  - Made sequential chain of components that could be developed and parametrised in sequence and in isolation
- Rolling buffer of the last few dozen frames
- Now can use ML to get a very accurate 3D model, but was not available at the time

Main challenge was complexity and robustness.

Lighting
- Even with artificial lighting, getting rid of shadows is hard
- Solution: MORE LIGHTS
Occlusion
- 6 12 megapixel cameras with global shutters and bright lights to reduce motion blur
Self-simiarity: vines look the same TODO

Main challenges:

Complexity
Robustness