By Bart M. ter Haar Romeny
Publisher: Springer; 2003
Updated: November 2024
Language: English
Hardcover: 484 pages
ISBN-10: 1402015038
ISBN-13: 978-1402015038
Dimensions: 15.88 x 2.54 x 24.13 cm
The book provides a unique perspective on multi-scale computer vision and medical image analysis, inspired by modern visual neuroscience. Through the use of interactive Mathematica notebooks, a dynamic and exciting learning environment is created.
Every chapter is an interactive Mathematica notebook, and can be conveniently downloaded for free. The book offers a comprehensive exploration of modern computer vision, inspired by insights from human vision, the retina, and brain functionality.
Scale is not merely an important parameter in computer vision research; it is an essential one. It is an immediate consequence of the process of observation, of measurement. This book is about scale and its fundamental role in computer vision as well as in human vision.
Observations are always done by integrating some physical property with a measurement device. Integration can be done over a spatial area, over an amount of time, over wavelengths etc. depending on the task of the physical measurement.
To compute any type of representation from the image data, information must be extracted using certain operators interacting with the data. Basic questions then are: Which operators to apply? Where to apply them? What should they look like? How large should they be?
The Gaussian (better Gaußian) kernel is named after Carl Friedrich Gauß (1777-1855), a brilliant German mathematician. This chapter discusses many of the attractive and special properties of the Gaussian kernel.
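As a small illustration of the kernel discussed in this chapter, the following Python sketch samples a 1D Gaussian on integer positions and renormalizes it to unit sum. It is not taken from the book's Mathematica notebooks; the function name `gaussian_kernel_1d` and the `truncate` parameter (the cut-off in units of sigma) are assumptions made for this example.

```python
import math

def gaussian_kernel_1d(sigma, truncate=4.0):
    """Sampled 1D Gaussian on integer positions, renormalized to unit sum.

    `truncate` (hypothetical parameter) sets the kernel radius in units of sigma.
    """
    radius = int(math.ceil(truncate * sigma))
    positions = range(-radius, radius + 1)
    weights = [math.exp(-x * x / (2.0 * sigma * sigma)) for x in positions]
    total = sum(weights)
    return [w / total for w in weights]

kernel = gaussian_kernel_1d(sigma=2.0)
radius = len(kernel) // 2

# The second moment of the sampled kernel should be close to sigma^2.
variance = sum(w * (i - radius) ** 2 for i, w in enumerate(kernel))
```

The kernel is symmetric, sums to one, and its variance stays close to sigma squared as long as the truncation radius is a few sigma wide.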
The Gaussian derivative function has many interesting properties. We will discuss them in one dimension first. We study its shape and algebraic structure, its Fourier transform, and its close relation to other functions like the Hermite functions, the Gabor functions and the generalized functions.
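For orientation, the relation to the Hermite functions and the Fourier transform can be written compactly as below. The notation is an assumption made for this summary: $G(x;\sigma)$ is the normalized 1D Gaussian, $H_n$ are the physicists' Hermite polynomials, and the Fourier transform uses the convention $\mathcal{F}\{f\}(\omega)=\int f(x)\,e^{-i\omega x}\,dx$; the book's notebooks may use different conventions.

```latex
G(x;\sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}}

\frac{\partial^n G(x;\sigma)}{\partial x^n}
  = (-1)^n \frac{1}{(\sigma\sqrt{2})^n}\,
    H_n\!\left(\frac{x}{\sigma\sqrt{2}}\right) G(x;\sigma)

\mathcal{F}\!\left\{\frac{\partial^n G}{\partial x^n}\right\}(\omega)
  = (i\omega)^n\, e^{-\frac{\sigma^2\omega^2}{2}}
```

For $n=1$, with $H_1(u)=2u$, the second line reduces to the familiar $-\frac{x}{\sigma^2}\,G(x;\sigma)$.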
In order to get a good feeling for the interactive use of Mathematica, we discuss in this section three implementations of convolution with a Gaussian derivative kernel (in 2D) in detail: 1. implementation in the spatial domain with a 2D kernel; 2. through two sequential 1D kernel convolutions (exploiting the separability property); 3. implementation in the Fourier domain.
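The equivalence of the first two implementations can be checked with a short sketch. The book works in Mathematica; the Python below is only an illustration, and all function names, the zero-padding boundary rule and the test image are assumptions made for this example. The Fourier-domain variant is not shown.

```python
import math

def gauss_1d(sigma, radius):
    """Sampled Gaussian on -radius..radius, renormalized to unit sum."""
    weights = [math.exp(-x * x / (2.0 * sigma ** 2)) for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def gauss_dx_1d(sigma, radius):
    """Sampled first derivative of the Gaussian: -x / sigma^2 * G(x)."""
    g = gauss_1d(sigma, radius)
    return [-x / sigma ** 2 * g[x + radius] for x in range(-radius, radius + 1)]

def convolve_2d(image, kernel_y, kernel_x):
    """Direct 2D convolution with the outer-product kernel, zero outside the image."""
    ry, rx = len(kernel_y) // 2, len(kernel_x) // 2
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for a in range(-ry, ry + 1):
                if not 0 <= i - a < rows:
                    continue
                for b in range(-rx, rx + 1):
                    if 0 <= j - b < cols:
                        acc += kernel_y[a + ry] * kernel_x[b + rx] * image[i - a][j - b]
            out[i][j] = acc
    return out

def convolve_rows(image, kernel):
    """Convolve every row with a 1D kernel, zero outside the image."""
    radius = len(kernel) // 2
    cols = len(image[0])
    out = []
    for row in image:
        out.append([
            sum(kernel[b + radius] * row[j - b]
                for b in range(-radius, radius + 1) if 0 <= j - b < cols)
            for j in range(cols)
        ])
    return out

def transpose(image):
    return [list(column) for column in zip(*image)]

def convolve_separable(image, kernel_y, kernel_x):
    """Two sequential 1D convolutions: first along x, then along y."""
    along_x = convolve_rows(image, kernel_x)
    return transpose(convolve_rows(transpose(along_x), kernel_y))

# Test image (assumed for this example): a vertical step edge at column 6.
image = [[1.0 if col >= 6 else 0.0 for col in range(12)] for _ in range(12)]
smooth = gauss_1d(1.0, 3)      # Gaussian along y
deriv = gauss_dx_1d(1.0, 3)    # first-order Gaussian derivative along x

direct = convolve_2d(image, smooth, deriv)
separable = convolve_separable(image, smooth, deriv)
max_diff = max(abs(a - b) for ra, rb in zip(direct, separable) for a, b in zip(ra, rb))
```

Both routes give the same x-derivative image up to rounding, while the separable version needs a number of multiplications per pixel proportional to the kernel width rather than to its area.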
In this chapter we will study the differential structure of discrete images in detail. This is the structure described by the local multi-scale derivatives of the image. We start with the development of a toolkit for the definitions of height lines, local coordinate systems and independence of our choice of coordinates.
There is a fundamental relation between the order of differentiation, the scale of the operator and the accuracy required. In this chapter we will derive this relation.
Regularization is the technique to make data behave well when an operator is applied to them. Such data could, for example, be functions that are impossible or difficult to differentiate, or discrete data where a derivative seems not to be defined at all. In scale-space theory, we realize that we do physics: a small variation of the input data should lead to a small change in the output data.
We will see that the front-end visual system measures simultaneously at multiple resolutions; that it measures directly (in the scale-space model) derivatives of the image in all directions at least up to fourth order; and that it measures temporal changes of intensity, the motion and disparity parameters, and the color differential structure.
Why do we have all these different sizes? Smaller receptive fields are useful for a sharp high resolution measurement, while the larger receptive fields measure a blurred picture of the world. We call the size of a receptive field its scale. We seem to sample the incoming image with our retina at many scales simultaneously.
From the retina, the optic nerve runs into the central brain area and makes a first monosynaptic connection in the Lateral Geniculate Nucleus (LGN), a specialized area of the thalamus. The thalamus is an essential structure of the diencephalon. Here, among others, all incoming perceptual information comes together, not only visual, but also tactile, auditory and balance.
Hubel and Wiesel were the first to find the regularity of the orientation sensitivity tuning. They recorded a regular change of the orientation sensitivity of receptive fields when the electrode followed a track tangential to the cortex surface. A hypercolumn is a functional unit of cortical structure.
The key point is that we need not only to connect observations at different locations but also to link observations at different scales. In the words of Koenderink, we must study the family of scale-space images as a family, and define the 'deep' structure. 'Deep' refers to the extra dimension of scale in a scale-space.
Linking segments over scale is not trivial. The field of catastrophe theory is quite extensive and complicated. This introduction focuses on giving an intuitive understanding of the concepts most related to computer vision and image processing.
In this chapter we discuss and implement the so-called winding numbers, elegant constructs to find singularity points, such as extrema and saddle points.
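The idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the book's Mathematica implementation: the gradient is estimated by central differences on an analytic test function, its direction is followed along a small circle around the point, and the net number of turns is counted. The names `gradient` and `winding_number` and the parameters `radius` and `samples` are hypothetical.

```python
import math

def gradient(f, x, y, h=1e-5):
    """Central-difference estimate of the gradient of f at (x, y)."""
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))

def winding_number(f, x0, y0, radius=0.5, samples=32):
    """Net number of turns of the gradient direction along a circle around (x0, y0)."""
    angles = []
    for k in range(samples):
        phi = 2 * math.pi * k / samples
        gx, gy = gradient(f, x0 + radius * math.cos(phi), y0 + radius * math.sin(phi))
        angles.append(math.atan2(gy, gx))
    total = 0.0
    for k in range(samples):
        step = angles[(k + 1) % samples] - angles[k]
        # Wrap each angle increment into [-pi, pi) before summing.
        step = (step + math.pi) % (2 * math.pi) - math.pi
        total += step
    return round(total / (2 * math.pi))

def minimum(x, y):
    return x * x + y * y

def maximum(x, y):
    return -(x * x + y * y)

def saddle(x, y):
    return x * x - y * y

def slope(x, y):
    return x + 2 * y

results = {
    "minimum": winding_number(minimum, 0.0, 0.0),
    "maximum": winding_number(maximum, 0.0, 0.0),
    "saddle": winding_number(saddle, 0.0, 0.0),
    "regular": winding_number(slope, 0.0, 0.0),
}
```

In two dimensions the winding number is +1 around an extremum, -1 around a saddle point and 0 around a regular point, so a nonzero value flags a singularity inside the path.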
To discuss an application where really high order Gaussian derivatives are applied, we study the deblurring of Gaussian blur by inverting the action of the diffusion equation, as originally described by Florack et al. [Florack et al. 1994b, TterHaarRomeny1994a].
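The reason high-order derivatives appear can be seen from a Taylor expansion in scale. The notation below is an assumption made for this sketch: the image $L(\mathbf{x},t)$ satisfies the linear diffusion equation with $t=\sigma^2/2$, and $\tau>0$ is the amount of scale to be undone.

```latex
\frac{\partial L}{\partial t} = \Delta L,
\qquad
\frac{\partial^n L}{\partial t^n} = \Delta^n L

L(\mathbf{x}, t-\tau)
  \approx \sum_{n=0}^{N} \frac{(-\tau)^n}{n!}\, \Delta^n L(\mathbf{x}, t)
  = L - \tau\,\Delta L + \frac{\tau^2}{2}\,\Delta^2 L - \dots
```

Each term $\Delta^n L$ contains spatial derivatives of order $2n$, which are measured with Gaussian derivative kernels, so a series truncated at $N$ terms needs derivatives up to order $2N$.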
In this chapter we focus on the quantitative extraction of small differences in an image sequence caused by motion, and in an image pair by differences in depth. We extract the local motion parameters as a small local shift over time or space. The resulting vector field is the optic flow for the image sequence, a spatio-temporal feature, and the disparity map for the stereo pair.
Color is an important extra dimension. Information extracted from color is useful for almost any computer vision task, like segmentation, surface characterization, etc. We are especially interested in the extraction of multi-scale differential structure in the spatial and the color domain of color images.
Orientation plays an important role as a parameter in establishing similarity relations between neighboring points. As such, it is an essential ingredient of methods for perceptual grouping. For example, edge pixels could be grouped as belonging to the same contour using similarity in the orientation of the edges, i.e. of their respective gradient vectors.
We need an aperture in time, integrating for some time, to perform the measurement. This is the integration time. Systems with a short or long integration time are said to have a fast or slow response, respectively. Because this integration time necessarily has a finite duration (temporal width), a scale-space construct is again a physical necessity.
Linear, isotropic diffusion cannot preserve the position of the differential invariant features over scale. A solution is to make the diffusion, i.e. the amount of blurring, locally adaptive to the structure of the image.
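One well-known way to make diffusion locally adaptive is the Perona-Malik scheme, in which the conductivity drops where the local gradient is large. The 1D Python sketch below is an illustration only and is not claimed to be the formulation used in the book; the function name, the exponential conductivity, the contrast parameter `k`, the step size `dt` and the test signal are all assumptions made for this example.

```python
import math

def perona_malik_1d(signal, k, steps, dt=0.2):
    """Explicit 1D gradient-dependent diffusion with reflecting boundaries.

    Conductivity c = exp(-(d / k)^2) for a neighbor difference d, so
    differences much larger than k (edges) are diffused very little.
    """
    u = list(signal)
    for _ in range(steps):
        # Flux between sample i and sample i + 1.
        flux = []
        for i in range(len(u) - 1):
            d = u[i + 1] - u[i]
            flux.append(math.exp(-(d / k) ** 2) * d)
        updated = []
        for i in range(len(u)):
            right = flux[i] if i < len(u) - 1 else 0.0
            left = flux[i - 1] if i > 0 else 0.0
            updated.append(u[i] + dt * (right - left))
        u = updated
    return u

# Assumed test signal: a unit step with a small alternating disturbance.
noisy_step = ([0.05 * (-1) ** i for i in range(10)]
              + [1.0 + 0.05 * (-1) ** i for i in range(10)])

adaptive = perona_malik_1d(noisy_step, k=0.3, steps=200)
# A very large k makes the conductivity ~1 everywhere: ordinary linear diffusion.
linear = perona_malik_1d(noisy_step, k=1e9, steps=200)

adaptive_edge = adaptive[10] - adaptive[9]
linear_edge = linear[10] - linear[9]
```

With these settings the adaptive version flattens the disturbance on both sides while keeping the step almost intact, whereas the linear version smears the step out over many samples.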
Computer vision is a huge field, and this book could only touch upon a small section of it. First of all, the emphasis has been on the introduction of the notion of observing the physical phenomena, which makes the incorporation of scale unavoidable. Secondly, scale-space theory nicely starts from an axiomatic basis, and incorporates the full mathematical toolbox. It has become a mature branch in modern computer vision research.
All 790 references in the book
Mathematica is a fully integrated environment for technical computing. It has been developed by Prof. Stephen Wolfram and is now being developed and distributed by Wolfram Research Inc. Mathematica comes with an excellent on-board help facility, which includes the full text of the handbook (over 1400 pages).
In this appendix a careful and visual explanation is given of the concepts of convolution and correlation.
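The difference between the two operations is easy to see on a unit impulse. The Python sketch below is an illustration for real-valued 1D signals, not the appendix's Mathematica code; the function names and the "full" output length are assumptions made for this example.

```python
def convolve_full(signal, kernel):
    """Full discrete convolution: out[n] = sum_i signal[i] * kernel[n - i]."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, w in enumerate(kernel):
            out[i + j] += s * w
    return out

def correlate_full(signal, kernel):
    """Full correlation of real data: convolution with the mirrored kernel."""
    return convolve_full(signal, kernel[::-1])

impulse = [0, 0, 1, 0, 0]
asymmetric = [1, 2, 3]
symmetric = [1, 2, 1]

conv_result = convolve_full(impulse, asymmetric)   # copy of the kernel at the impulse
corr_result = correlate_full(impulse, asymmetric)  # mirrored copy of the kernel
```

Convolution reproduces the kernel at the position of the impulse, correlation reproduces its mirror image, and for a symmetric kernel such as the Gaussian the two coincide.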
21 chapters, 470 pages.
The complete index to all chapters of the book, with hyperlinks to the relevant pages. This works when the notebook file Index.nb is installed with all other notebooks in the same directory.