What is a good representation for image analysis?
- Fourier transform tells you ‘what’ (textural properties) but not ‘where’
- Pixel domain gives you ‘where’ but not ‘what’
We want an image representation that gives a local description of image events, both what and where, and that naturally represents objects across varying scales.
Image pyramids: apply filters of fixed size to images of different sizes. Typically, the edge length changes by a factor of 2 or the square root of 2 between levels.
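As a minimal sketch of this idea, the code below repeatedly applies one fixed-size filter and halves the image, so the same filter covers a larger part of the scene at each level. It assumes NumPy, a factor of 2 between levels, and a 5-tap binomial kernel as the fixed filter. The kernel, the reflect padding, and the names `smooth` and `build_pyramid` are illustrative choices, not something these notes prescribe.

```python
import numpy as np

# Fixed 5-tap binomial kernel, a common approximation to a Gaussian (assumed choice).
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0


def smooth(image):
    """Blur with the fixed kernel, applied separably along rows and then columns."""
    pad = len(KERNEL) // 2
    padded = np.pad(image, pad, mode="reflect")
    # 'valid' convolution removes exactly the padding added above.
    rows = np.apply_along_axis(np.convolve, 1, padded, KERNEL, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, KERNEL, mode="valid")


def build_pyramid(image, levels):
    """Return a list of images; each level has half the edge length of the previous."""
    pyramid = [np.asarray(image, dtype=float)]
    for _ in range(levels - 1):
        blurred = smooth(pyramid[-1])
        pyramid.append(blurred[::2, ::2])  # keep every second row and column
    return pyramid


# Stand-in image: random values, used only to show the level sizes.
image = np.random.default_rng(0).random((64, 64))
pyramid = build_pyramid(image, levels=4)
shapes = [level.shape for level in pyramid]
print(shapes)  # [(64, 64), (32, 32), (16, 16), (8, 8)]
```

The filter never changes size. Because the image shrinks, the 5-pixel kernel at the coarsest level spans roughly 40 pixels of the original.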
There are many types of image pyramids:
- Gaussian
    - Acts as a low-pass filter
    - Convolving a Gaussian with a Gaussian gives another Gaussian, so each level can be computed recursively from the previous one
    - Construction: smooth, then subsample
- Laplacian
    - Construction: for each level of a Gaussian pyramid, subtract the up-sampled version of the next coarser (lower-resolution) level from the image at that level
    - Acts as a band-pass filter: each level represents spatial frequencies largely unrepresented in other levels
    - Multi-scale, band-pass and over-complete (more coefficients than image pixels)
- Wavelets/QMFs:
    - Apply 1D filters separably in two spatial dimensions
        - Wavelet function: a 1D function $\psi(x)$ with a total integral of zero (it oscillates about the x axis) and limited support (it is non-zero only over a small part of its domain)
            - Haar wavelet: a step function equal to -1 on one half of its support, 1 on the other half, and 0 elsewhere
            - Parameters: a scale $s$ and a horizontal translation $\tau$, commonly written $\psi_{s,\tau}(x) = \frac{1}{\sqrt{s}}\,\psi\left(\frac{x-\tau}{s}\right)$
        - Continuous wavelet transform: the integral of the product of the scaled, translated wavelet function with a 1D signal; it measures the correlation of the wavelet with the signal
        - Discrete wavelet transform: pick scales that are powers of two
        - 2D wavelet transform: use a high-pass and a low-pass filter. Filter along one dimension and downsample by 2, then do the same along the other dimension. After both passes (one per dimension) you get four sub-band images, each with a quarter of the original pixels (half the width and half the height)
    - Multi-scale, band-pass and complete
    - Some aliasing
- Steerable:
    - Can pick the angle (orientation) of interest
    - Image corners must be removed
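The 2D wavelet transform step in the list above can be sketched with the Haar filters, where the low-pass output is a scaled sum of neighbouring samples and the high-pass output is a scaled difference. This is an illustrative single level only, assuming NumPy and an image with even width and height. The function names, the sub-band labels, and the sign convention of the high-pass filter are assumptions.

```python
import numpy as np


def haar_step(x):
    """One Haar analysis step along the last axis: filter and downsample by 2."""
    even, odd = x[..., 0::2], x[..., 1::2]
    low = (even + odd) / np.sqrt(2.0)   # low-pass: scaled local average
    high = (even - odd) / np.sqrt(2.0)  # high-pass: scaled local difference
    return low, high


def inverse_haar_step(low, high):
    """Undo haar_step by interleaving the recovered even and odd samples."""
    out = np.empty(low.shape[:-1] + (2 * low.shape[-1],))
    out[..., 0::2] = (low + high) / np.sqrt(2.0)
    out[..., 1::2] = (low - high) / np.sqrt(2.0)
    return out


def haar_2d(image):
    """Filter along rows, then along columns, giving four sub-bands."""
    low, high = haar_step(image)                     # horizontal pass
    low_low, low_high = (b.T for b in haar_step(low.T))     # vertical pass on low
    high_low, high_high = (b.T for b in haar_step(high.T))  # vertical pass on high
    return low_low, low_high, high_low, high_high


def inverse_haar_2d(low_low, low_high, high_low, high_high):
    """Invert the vertical pass, then the horizontal pass."""
    low = inverse_haar_step(low_low.T, low_high.T).T
    high = inverse_haar_step(high_low.T, high_high.T).T
    return inverse_haar_step(low, high)


# Stand-in image: random values, used only to check sizes and invertibility.
image = np.random.default_rng(0).random((8, 8))
subbands = haar_2d(image)
restored = inverse_haar_2d(*subbands)
print([band.shape for band in subbands])  # four (4, 4) sub-bands
print(np.allclose(restored, image))       # True
```

The four sub-bands together hold exactly as many coefficients as the image has pixels, which is what "complete" means here. Applying `haar_2d` again to the low-low band would give the next, coarser scale.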
Uses:
- Gaussian: scale-invariance
- Laplacian: difference between pyramid levels - useful for noise reduction and coding
- Wavelet/QMF: band-passed, complete representation of the image
- Steerable pyramid: shows components at each scale/orientation - useful for texture/feature analysis
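To make the Laplacian entry concrete, the sketch below builds a Laplacian pyramid as differences between Gaussian pyramid levels and then rebuilds the image from it. It assumes NumPy, image sizes that stay even at every level, a 5-tap binomial kernel for smoothing, and up-sampling by pixel repetition followed by the same blur. These are simple stand-ins, and all function names are hypothetical.

```python
import numpy as np

# Assumed smoothing filter: 5-tap binomial approximation to a Gaussian.
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0


def blur(image):
    """Separable blur with reflect padding, so the output keeps the input shape."""
    padded = np.pad(image, 2, mode="reflect")
    rows = np.apply_along_axis(np.convolve, 1, padded, KERNEL, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, KERNEL, mode="valid")


def downsample(image):
    """Smooth, then keep every second row and column."""
    return blur(image)[::2, ::2]


def upsample(image):
    """Double the size by pixel repetition, then smooth (an assumed interpolation)."""
    return blur(np.repeat(np.repeat(image, 2, axis=0), 2, axis=1))


def laplacian_pyramid(image, levels):
    """Band-pass levels (fine to coarse) followed by the low-pass residual."""
    gaussian = [np.asarray(image, dtype=float)]
    for _ in range(levels - 1):
        gaussian.append(downsample(gaussian[-1]))
    bands = [fine - upsample(coarse)
             for fine, coarse in zip(gaussian[:-1], gaussian[1:])]
    return bands + [gaussian[-1]]


def reconstruct(pyramid):
    """Start from the residual and add back each band, coarse to fine."""
    image = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        image = upsample(image) + band
    return image


# Stand-in image: random values, used only to check the round trip.
image = np.random.default_rng(0).random((64, 64))
pyramid = laplacian_pyramid(image, levels=4)
n_coefficients = sum(level.size for level in pyramid)
print(n_coefficients, image.size)               # 5440 4096: over-complete
print(np.allclose(reconstruct(pyramid), image))  # True
```

Coding or noise-reduction schemes would typically modify the band-pass levels (for example by quantizing or thresholding them) before calling `reconstruct`; that step is left out here.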