05. Edge Detection

Challenge: convert a 2D image into a set of curves.

That is, find all edge points that are (mostly) connected, then join them into curves.

Edges come from:

Surface-normal discontinuity (e.g. from shading)
Depth discontinuity
Surface color discontinuity
Illumination discontinuity

Edge profiles:

    |-----    /\        ||
----|        /  \    ___||____
Step         Roof      Line

Edge detection:

Detect short linear edge segments - edgels
Aggregate edgels into extended edges

Edges are the points at which the rate of change reaches a maximum - when the second derivative is zero.

\nabla f = \left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]

The gradient, which points in the direction of greatest change, can be represented by an angle and magnitude:

\begin{aligneD} \theta &= \tan^{-1}\left(\frac{\partial f}{\partial x} / \frac{\partial f}{\partial y}\right) \\ ||\nabla f|| &= \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2} \end{aligned}

However, on a discrete image, approximations can be made.

The Sobel operator:

\Delta_x = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix} \quad \Delta_y = \begin{pmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix}

(NB: need to scale by $\frac{1}{8}$ to get the right gradient value, but is irrelevant for edge detection)

The Robert’s Cross Operator (for diagonal edges):

\Delta_1 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \quad \Delta_2 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}

The 4×4 Prewitt operator:

\Delta_x = \begin{pmatrix} -3 & -1 & 1 & 3 \\ -3 & -1 & 1 & 3 \\ -3 & -1 & 1 & 3 \\ -3 & -1 & 1 & 3 \end{pmatrix} \quad \Delta_y = \begin{pmatrix} 3 & 3 & 3 & 3 \\ 1 & 1 & 1 & 1 \\ -1 & -1 & -1 & -1 \\ -3 & -3 & -3 & -3 \\ \end{pmatrix}

Can use trigonometry to determine direction of the gradient using the horizontal and vertical Sobel/Prewitt operators.

Looking at only the adjacent pixel may be useless in situations with large amounts of noise. Hence, use a Gaussian kernel to smooth the image before applying the gradient operator. However, this can be done more efficiently by applying the derivative function to the Gaussian kernel, then applying it to the signal (the derivative theorem of convolution).

Then, the point of maximum difference - the second differential, can be found to detect the edge.

The sums of the second partial derivatives is called the Laplacian.

Canny Edge Detection

An optimal edge detection algorithm should have:

Good detection: responds to edges, not noise
Good localization: detected edge is close to the true edge
Single edge: one edge detected per true edge

Under the assumptions of a linear filter and independent and identically distributed Gaussian noise, the optimal detector is approximately the derivative of the Gaussian.

Detection and localization are diametrically opposed to each other: more smoothing leads to better detection but worse localization.

Canny edge detector steps:

Gaussian blur: reduce noise
Use the Sobel kernel to find the gradient in the $x$ and $y$ directions
Find the gradient magnitude and direction (rounded to 45 degree increments) Non-maximum suppression/edge thinning:
- Zero the pixel value if the magnitude not a maximum compared to neighbors in the relevant direction
- Can predict the next edge point by moving along the normal to the gradient
Hysteresis thresholding (through double thresholding):
- Any pixel whose magnitude is larger than some threshold is kept
- Any pixel whose magnitude is smaller than some threshold is removed
- Any pixel in between the thresholds is kept if it is connected to a remaining pixel

A low value of $\sigma$ , the radius of the Gaussian blur, detects fine feature, while a large value detects large-scale edges.

Scale Space

As $\sigma$ - the scale of the image gets larger:

Smoothing/blurring increases
Noise edges disappear
Edges become smoother
Fine/high-frequency detail is removed

Multiple representations of the image at different scales can be generated.

If an edge detection algorithm is applied to all the images, edges can be matched between images:

Edges may disappear or merge as scale increases - this can be used to determine how ‘strong’ an edge is
Detected edge positions may change with scale
Edges will never split as scale increases

Edge Detection by Subtraction

Subtract Gaussian-blurred image from the original, then scale and add an offset. This works as low-frequency information mostly remains in the blurred image and hence gets removed in the subtraction.This set of operations approximates the Laplacian of Gaussian.

Hough Line Detection

Finds straight lines from binary image (e.g. output of Canny algorithm).

Uses a voting scheme instead of naively searching for lines at every single position and orientation.

The Hough space is a transform of the $x$ - $y$ coordinate space to $r$ - $\theta$ space where:

$r$ is the shortest distance of the line to the origin
$\theta$ is the angle of the line
$y = mx + c$ transformed to $r = x \cos{\theta} + y \sin{\theta}$

For each point (that is an edge pixel) and for every angle, find $r$ and increment the value in the Hough space by one - the point votes for every line it could be part of.

The points with the largest vote counts are the straight lines we are most confident in.

The Hough circle transform does the same, except using the $x$ - $y$ - $r$ space.

The same technique works for any curve that can be expressed in parametric form, although the parameter space can get huge.

OpenCV Tutorial:

Corners

Doors and corners, kid. That’s where they get you

Detective Miller

What is the gradient at the corner? Near the corner, edges have gradients going in two different directions and at the corner, the gradient is ill-defined. Hence, edge detectors tend to fail at corners.

Corners, however, are useful for tracking movement between frames.

Harris Corner Detection

Over a small window, a corner is likely to have high intensity variation in all directions. This uses the sum-squared difference.

Given:

$I$ gives the intensity of a pixel
$w$ is a window function that determines the weight of each pixel (e.g. Gaussian) relative to the target pixel
An offset $(\Delta x, \Delta y)$ that separates two windows

For all pixels $(x, y)$ that are part of the window:

E(\Delta x, \Delta y) = \sum_{x, y}{w(x, y) \quad \left[I(x + \Delta x, y + \Delta y) - I(x, y)\right]^2}

That is, given two windows, calculate the difference between each pair of pixels, square them, and sum them.

Using the Taylor expansion, this can be approximated to:

\begin{aligned} I(x + \Delta x, y + \Delta y) &= I(x, y) + \Delta x \frac{\partial I}{\partial x}\left(x, y\right) + \Delta y \frac{\partial I}{\partial y}\left(x, y\right) \\ &= I(x, y) + \Delta x I_x(x, y) + \Delta y I_y(x, y) \end{aligned}

Giving the approximation:

\begin{aligned} E(\Delta x, \Delta y) &\approx \sum_{x, \Delta y}{\left[ \Delta x I_x(x, y) + \Delta y I_y(x, y) \right]^2} \\ &\approx \sum_{x, \Delta y}{ \Delta x^2 I_x(x, y)^2 + 2 \Delta x \Delta y I_x(x,y) I_y(x,y) + \Delta y^2 I_y(x, y)^2 } \\ &\approx \begin{pmatrix} \Delta x & \Delta y \end{pmatrix} M \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} \end{aligned}

Where:

M = \sum_{x, y} \begin{bmatrix} \left(I_x\right)^2 & I_xI_y \\ I_xI_y & \left(I_y\right)^2 \end{bmatrix}

Given $\lambda_1$ and $\lambda_2$ are the eigenvalues of $M$ :

\begin{aligned} R &= \det(M) - k\left(\text{trace}(M)\right)^2 \\ &= \lambda_1 \lambda_2 - k\left(\lambda_1 + \lambda_2 \right)^2 \end{aligned}

Corners have large values of $R$ ; edges occur where $R < 0$ .