
Zoom Lens Calibration

This project is no longer active.

To navigate and operate in the real world, autonomous systems need sensors to learn about the state of the world around them. One of the richest sensing modalities is vision. Conventionally, machine vision systems use cameras and lenses to produce 2D images of the 3D scene. To interpret the images from a camera and to plan its sensing strategy, we need models of the relationship between image and scene geometry.

Why Adjustable Lenses?


Automated zoom lenses are useful for two tasks:

Adaptation: Matching the camera’s sensing characteristics (e.g. radiometric sensitivity, spatial resolution or focussed distance) to the requirements of a given task.

Measurement: Inferring properties of the scene by noting how the scene’s image changes as the camera’s parameters are varied (e.g. range from focus).

Whether the goal is adaptation or measurement, using adjustable lenses effectively requires models of the camera's image formation process that remain valid across ranges of lens settings.

The Modelling and Calibration Problem


For fixed parameter lenses the image formation process is static, so the terms in the camera model are constants. In variable parameter lenses the image formation process is a dynamic function of the lens control parameters, so the terms in the camera model must also be variable. The question is: how do the terms vary with the control parameters? This is a difficult question to answer for two reasons. First, the two traditional models of the image formation process, the pinhole camera and the thin lens, are idealized high-level abstractions of the real image formation process, so the connection between the lens' physical configuration and the model terms is not direct. Second, the relationship between the lens' physical configuration and the control parameters is complex and typically unknown. Thus, we have no good theoretical basis for the relationships between the terms of our camera models and the lens control parameters. As illustrated by a plot of effective focal length versus focus and zoom motor settings, every model term is potentially a function of every lens control parameter. The actual relationships must be determined empirically.
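
For reference, the two idealized models mentioned above take their standard textbook forms (the symbols below are the conventional ones, not notation from this project):

  x = f X / Z,   y = f Y / Z     (pinhole projection of a scene point (X, Y, Z) with focal length f)
  1/u + 1/v = 1/f                (thin lens: object distance u, image distance v, focal length f)

Both reduce the optics to a single constant focal length f, which is precisely the assumption that breaks down once the lens settings can change.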

Unlike the calibration of fixed parameter lenses, the calibration of variable parameter lenses requires that measurements be made over ranges of hardware configurations for the lens. This raises several challenges. First, the dimensionality of the data equals the number of control parameters that are to be modeled concurrently. Second, it can be difficult to take measurements across the wide range of imaging conditions (e.g. defocus and magnification changes) that occur over the range of some control parameters.
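
As a concrete illustration of the dimensionality point, modeling just two control parameters already means collecting calibration data over a 2D grid of lens settings. A minimal sketch, with invented motor-count ranges (the real values depend on the lens hardware):

  import itertools
  import numpy as np

  # Hypothetical motor-count ranges for this sketch.
  focus_samples = np.linspace(0, 2500, 11)   # sampled focus motor counts
  zoom_samples = np.linspace(0, 4000, 16)    # sampled zoom motor counts

  # Two concurrently modeled parameters -> a 2D grid of settings;
  # each additional parameter (e.g. aperture) adds a dimension.
  for focus, zoom in itertools.product(focus_samples, zoom_samples):
      pass  # drive the lens to (focus, zoom) and image the calibration target

Note that the imaging conditions vary drastically across such a grid: the calibration target may be sharply focused and large in one corner of the grid and blurred and small in another.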

Dynamic Camera Models


We have developed new algorithms and techniques to build models for cameras with automated zoom lenses. The objective is camera models that "hold calibration" across continuous ranges of lens settings. Our approach first calibrates a conventional static camera model at a number of lens settings spanning the lens' control space. We then model how the terms of the static camera model vary with lens setting by alternately fitting polynomials to individual model terms and re-estimating the unfitted terms from the calibration data. The process repeats until every term of the static camera model has been replaced with a polynomial function of the lens control parameters. The result is a predictive camera model that interpolates between the original sampled lens settings to produce values for the static model's terms at any lens setting. We have used these techniques to produce dynamic camera models based on Tsai's static camera model for two different automated camera systems. The models operate across continuous ranges of focus and zoom with an average error of less than 0.14 pixels between the predicted and measured positions of features in the image plane.
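
A minimal sketch of that alternating fit/re-estimate loop, assuming two control parameters (focus and zoom motor counts). The function names and the recalibrate callback are hypothetical stand-ins, not the project's code; the actual system fits the terms of Tsai's model (e.g. effective focal length, radial distortion):

  import numpy as np

  def poly_features(settings, degree=3):
      """Bivariate polynomial basis in (focus, zoom) motor counts."""
      f, z = settings[:, 0], settings[:, 1]
      cols = [f**(d - i) * z**i for d in range(degree + 1) for i in range(d + 1)]
      return np.stack(cols, axis=1)

  def fit_dynamic_model(settings, terms, recalibrate, term_names):
      """settings: (N, 2) array of sampled (focus, zoom) motor counts.
      terms: dict mapping term name -> (N,) calibrated values per setting.
      recalibrate: hypothetical callback that re-estimates the still-unfitted
      terms from the calibration data, holding fitted polynomials fixed."""
      A = poly_features(settings)
      fitted = {}
      for name in term_names:                       # replace one term per pass
          coeffs, *_ = np.linalg.lstsq(A, terms[name], rcond=None)
          fitted[name] = coeffs                     # term is now a polynomial
          terms = recalibrate(fitted, settings, terms)
      return fitted

  def evaluate_term(fitted, name, focus, zoom):
      """Interpolate one model term at an arbitrary lens setting."""
      s = np.array([[focus, zoom]], dtype=float)
      return float(poly_features(s) @ fitted[name])

Fitting one term at a time and re-estimating the rest keeps each polynomial consistent with the terms already frozen, which is the point of the alternating scheme described above.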

Past Head

  • Steven Shafer

Past Contact

  • Steven Shafer