04. Filters

Filtering modifies the pixels of an image based on a function that takes its input from the pixels in the target pixel's local neighborhood.

Linear Functions

When a filter is linear, the value of each pixel is a linear combination of its neighbors.

Where $I$ is the image, $g$ is the kernel, and $g[k, l]$ is the value of the kernel matrix at offset $(k, l)$, with $g[0, 0]$ the center element:

$$f[m, n] = I \otimes g = \sum_{k, l} I[m - k, n - l] \, g[k, l]$$

That is, each output pixel is the dot product of its local neighborhood with the kernel (flipped in both axes, as the minus signs in the indices show). This process is called a convolution.
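The sum above can be written out directly. The following is a minimal sketch, assuming NumPy, an odd-sized kernel, and zero padding at the image border (the notes do not specify border handling); the function name `convolve2d` is made up for illustration.

```python
import numpy as np

def convolve2d(image, kernel):
    """Direct 2D convolution; the output has the same shape as the image."""
    kh, kw = kernel.shape
    rh, rw = kh // 2, kw // 2  # kernel radii (odd-sized kernel assumed)
    # Assumption: pixels outside the image are treated as zero.
    padded = np.pad(image.astype(float), ((rh, rh), (rw, rw)))
    out = np.zeros(image.shape, dtype=float)
    # k and l are offsets relative to the kernel's center element g[0, 0].
    for k in range(-rh, rh + 1):
        for l in range(-rw, rw + 1):
            # I[m - k, n - l] for every (m, n) at once, read from the padded image.
            shifted = padded[rh - k : rh - k + image.shape[0],
                             rw - l : rw - l + image.shape[1]]
            out += kernel[k + rh, l + rw] * shifted
    return out
```

Convolving an image that is zero everywhere except for a single 1 reproduces the kernel, unflipped, around that pixel, which is a quick way to check that a routine performs convolution rather than correlation.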

A convolution kernel sampled from a Gaussian gives a Gaussian filter, i.e. a blur. Blurring (smoothing) the image reduces noise, which is mostly high-frequency information.

$\sigma$, the standard deviation of the Gaussian, which sets the effective radius of the kernel, is called the scale.
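A Gaussian kernel can be built by sampling the Gaussian on an integer grid and normalizing. This is a sketch only: the function name `gaussian_kernel` and the choice to truncate the kernel at three standard deviations are assumptions, not something the notes specify.

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """Return a normalized (2r + 1) x (2r + 1) Gaussian kernel."""
    if radius is None:
        radius = int(np.ceil(3 * sigma))  # assumption: truncate at 3 sigma
    ax = np.arange(-radius, radius + 1)
    g1d = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g1d, g1d)  # the 2D Gaussian is separable
    return kernel / kernel.sum()  # weights sum to 1, so brightness is preserved
```

A larger `sigma` spreads the weights over more neighbors, so it removes more high-frequency detail.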

Multiple linear operations can be stacked: in addition to convolutions (multiplication by a kernel), addition and subtraction are also valid operations.

For example, to sharpen an image, you can multiply the pixel magnitudes (and hence both low- and high-frequency information) by 2, then subtract a blurred (i.e. low-pass filtered) version of the image. The low-frequency content is left at its original strength while the high-frequency content is doubled, giving an image with over-accentuated edges. In pseudo-code: 2 * img(x, y, 1) - dot(gaussian(sigma), img(x, y, sigma)). The added detail term (image minus blurred image) approximates, up to sign and scale, a Laplacian (sum of second partial derivatives) of Gaussian filter.
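The sharpening recipe can be sketched as follows, assuming NumPy. The helper names `gaussian_blur` and `sharpen`, the 3-sigma kernel truncation, and the replicated-edge border handling are all assumptions made for illustration.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Separable Gaussian blur; border pixels are replicated (an assumption)."""
    radius = int(np.ceil(3 * sigma))
    ax = np.arange(-radius, radius + 1)
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    padded = np.pad(image.astype(float), radius, mode="edge")
    # Filter along the rows, then along the columns.
    rows = np.apply_along_axis(np.convolve, 1, padded, g, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, g, mode="valid")

def sharpen(image, sigma):
    """Twice the image minus its blurred version."""
    return 2.0 * image - gaussian_blur(image, sigma)
```

On a flat region the blurred image equals the original, so the output is unchanged. Next to a step edge the output overshoots on the bright side and undershoots on the dark side, which is what makes the edge look accentuated.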

Gradients and Edges

An edge is a single point: a series of edge points is a line.

An edge is a point of sharp change (reflectance, object, illumination, noise) in an image.

The general strategy is to use linear filters to estimate the image gradient, then mark points where the gradient magnitude is large in comparison to that of their neighbors.
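A simplified sketch of this strategy, assuming NumPy: central differences (a linear filter) estimate the gradient, and a fixed threshold stands in for the comparison against neighbors. The function names and the threshold are illustrative assumptions; in practice the image would usually be smoothed first.

```python
import numpy as np

def gradient_magnitude(image):
    """Estimate the gradient with central differences and return its magnitude."""
    gy, gx = np.gradient(image.astype(float))  # derivatives along rows, then columns
    return np.hypot(gx, gy)

def edge_points(image, threshold):
    """Boolean mask of pixels whose gradient magnitude exceeds the threshold."""
    return gradient_magnitude(image) > threshold
```

For an image with a vertical step between two columns, the mask marks the pixels on either side of the step and nothing else.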

Fourier Transforms

The Fourier transform of a real function is complex, with a magnitude and a phase component. In this course, the phase component is ignored: we only care about magnitude.

All natural images have a similar magnitude transform: running the inverse transform with the magnitude from another image (while keeping the original phase) returns similar results.
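The magnitude swap can be tried with NumPy's FFT routines. This sketch assumes two greyscale images of the same shape; the function name `swap_magnitude` is hypothetical.

```python
import numpy as np

def swap_magnitude(phase_source, magnitude_source):
    """Combine the phase of one image with the magnitude of another."""
    f_phase = np.fft.fft2(phase_source)
    f_mag = np.fft.fft2(magnitude_source)
    hybrid = np.abs(f_mag) * np.exp(1j * np.angle(f_phase))
    # For real inputs the imaginary part is only rounding error, so drop it.
    return np.real(np.fft.ifft2(hybrid))
```

Passing the same image for both arguments returns that image unchanged. With two different natural images, the claim is that the result still resembles `phase_source`.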

In the magnitude image generated from a Fourier transform (with the zero-frequency term shifted to the center), the center of the image corresponds to zero frequency, and frequency increases with distance from the center. Hence, by masking the transform with a circle and then applying the inverse transform, either a low-pass (outside masked out) or high-pass (inside masked out) filter can be generated.
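The circular mask can be sketched with NumPy as below. The function name `frequency_filter` and its `keep` parameter are assumptions for illustration; `np.fft.fftshift` is used to move the zero-frequency term to the center before masking.

```python
import numpy as np

def frequency_filter(image, radius, keep="low"):
    """Low- or high-pass filter an image with a circular mask on its spectrum."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))  # zero frequency at the center
    rows, cols = image.shape
    y, x = np.ogrid[:rows, :cols]
    dist = np.hypot(y - rows // 2, x - cols // 2)  # distance from zero frequency
    # "low" keeps the inside of the circle, anything else keeps the outside.
    mask = dist <= radius if keep == "low" else dist > radius
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return np.real(filtered)
```

Because the two masks are complementary, the low-pass and high-pass outputs for the same radius add back up to the original image. A hard-edged circle is the simplest choice; it can introduce ringing, which a smoother mask would reduce.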

In the transform, the input image is tiled to infinity: this may cause discontinuities to occur at the edges. However, the effects of this can be mitigated by fading the image to grey near the edges (e.g. in a circle with a Gaussian).
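One way to fade an image towards grey near its borders is to blend it with a constant using a circular Gaussian window. In this sketch the function name `fade_to_grey`, the use of the image mean as "grey", and the window width are all assumptions.

```python
import numpy as np

def fade_to_grey(image, sigma_fraction=0.25):
    """Blend the image towards its mean value away from the center."""
    rows, cols = image.shape
    y, x = np.ogrid[:rows, :cols]
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
    sigma = sigma_fraction * min(rows, cols)  # assumption: width tied to image size
    window = np.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
    grey = image.mean()  # assumption: "grey" is the mean intensity
    # window is 1 at the center (image kept) and near 0 at the corners (grey).
    return grey + window * (image - grey)
```

After fading, opposite borders of the image have nearly the same value, so tiling it no longer produces sharp jumps at the seams.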

https://homepages.inf.ed.ac.uk/rbf/HIPR2/fourier.htm