Leonardo Digital Reviews

The Geometry of Multiple Images

by Olivier Faugeras and Quang-Tuan Luong
The MIT Press, Cambridge, 2004
646 pp., illus. 230 b/w. Paper, $35.00
ISBN: 0-262-56204-9.

Reviewed by Stefaan Van Ryssen
Hogeschool Gent
Jan Delvinlaan 115, 9000 Gent, Belgium

stefaan.vanryssen@pandora.be

The laws that govern vision in animals, humans and machines have been researched intensively for ages. Telescopes and lenses have enhanced and broadened our view of the world, and to improve them, to see more and further, scientists needed a better understanding of the fundamental laws of optics. Once the camera obscura and, later on, the modern photographic camera were developed and images of our surroundings could be preserved and manipulated, interest widened from optics to the geometry of cameras and pictures. With the emergence of the field of computer vision and the large-scale production of digital images, a framework for stating and solving the problems of that field is required.

Two basic questions are at the root of this book: how to make sure that the multitude of images we produce of the same scene are consistent, and how to make sure that a series of images taken by different cameras from different angles is interpreted in the same way. In other words: If you've got three images of the Aya Sofia, how can you be sure it's the same mosque? Or: If you've got a few vague pictures of a boulder on Titan, how big can this object be? And even better: Can we derive a full and veritable 3-D model from the information we have in a few digital 2-D representations, even if we don't know exactly where the cameras were in relation to the object?

To answer these questions, a solid mathematical foundation is needed, and geometry is the obvious branch of math that serves this purpose. Euclidean geometry, however, the variety that most of us became acquainted with in high school, isn't really up to the task, or at least not in a way that makes algorithmic calculations easy. Instead, projective and affine geometries offer a better framework for solving the complex problems involved in the field. Olivier Faugeras and Quang-Tuan Luong show how the three types of geometry are related, and when and where projective geometry and the algebra that goes with it serve best, both as a formal way of describing three-dimensional objects and their representations and as a toolbox for the professional computer vision expert who wants to develop an application. In an introductory chapter, which is already quite advanced in its formalism, an intuitive approach to the field is outlined. Chapters two and three build the math from the ground up, from basic definitions of projective spaces to Grassmann-Cayley algebras. The book then continues with analyses of the one-camera case, the two-camera case, stratification, multiple views, and moving cameras. In each chapter, a number of real-world applications are thoroughly and clearly described.

For non-mathematicians, and even for those with a solid high school background in math, the book is far too specialised, except for the first chapter. The examples and applications are not described in a way that gives the layman or laywoman an intuitive grasp of what happens in the calculations, but it is an excellent reference book for machine vision and computer graphics specialists.