mrcal lens models

Table of Contents

mrcal supports a wide range of projection models. Some are intended to represent physical lenses, while others are idealized, useful for processing. All are referred to as lens models in the code and in the documentation. The representation details and projection behaviors are described here.

Representation

A mrcal lens model represents a lens independent of its pose in space. A lens model is fully specified by

  • A model family (or type). This is something like LENSMODEL_PINHOLE or LENSMODEL_SPLINED_STEREOGRAPHIC
  • Configuration parameters. This is a set of key/value pairs, which is required only by some model families. These values are not subject to optimization, and may affect how many optimization parameters are needed.
  • Optimization parameters. These are the parameters that the optimization routine controls as it runs

Each model family also has some metadata key/value pairs associated with it. These are inherent properties of a model family, and are not settable. At the time of this writing there are 3 metadata keys:

  • has_core: True if the first 4 optimization values are the "core": \(f_x\), \(f_y\), \(c_x\), \(c_y\)
  • can_project_behind_camera: True if this model is able to project vectors from behind the camera. If it cannot, then mrcal.unproject() will never report z < 0
  • has_gradients: True if this model has gradients implemented

In Python, the models are identified with a string LENSMODEL_XXX where the XXX selects the specific model family and the configuration, if needed. A sample model string with a configuration: LENSMODEL_SPLINED_STEREOGRAPHIC_order=3_Nx=30_Ny=20_fov_x_deg=170. The configuration is the pairs order=3, Nx=30 and so on. At this time, model families that accept a configuration require it to be specified fully, although optional configuration keys will be added soon. Today, calling Python functions with LENSMODEL_SPLINED_STEREOGRAPHIC or LENSMODEL_SPLINED_STEREOGRAPHIC_order=3 will fail due to an incomplete configuration. The mrcal.lensmodel_metadata_and_config() function returns a dict containing the metadata and configuration for a particular model string.

In C, the model family is selected with the mrcal_lensmodel_type_t enum. The elements are the same as the Python model names, but with MRCAL_ prepended. So sample model from above has type MRCAL_LENSMODEL_SPLINED_STEREOGRAPHIC. In C the mrcal_lensmodel_t structure contains the type and configuration. This structure is thus an analogue the the model strings, as Python sees them. So a number of C functions accepting mrcal_lensmodel_t arguments are analogous to Python functions taking model strings. For instance, the number of parameters needed to fully describe a given model can be obtained by calling mrcal.lensmodel_num_params() in Python or mrcal_lensmodel_num_params() in C. Given a mrcal_lensmodel_t lensmodel structure of type XXX (i.e. if lensmodel.type = MRCAL_LENSMODEL_XXX=) then the configuration is available in lensmodel.LENSMODEL_XXX__config, which has type mrcal_LENSMODEL_XXX__config_t. The metadata is requestable by calling this function:

mrcal_lensmodel_metadata_t mrcal_lensmodel_metadata( const mrcal_lensmodel_t* lensmodel );

Intrinsics core

Most models contain an "intrinsics core". These are 4 values that appear at the start of the parameter vector:

  • \(f_x\): the focal length in the horizontal direction, in pixels
  • \(f_y\): the focal length in the vertical direction, in pixels
  • \(c_x\): the horizontal projection center, in pixels
  • \(c_y\): the vertical projection center, in pixels

At this time all models contain a core.

Models

Currently all models represent a central projection: all observation rays intersect at a single point (the camera origin). So \(k \vec v\) projects to the same \(\vec q\) for all \(k\). This isn't strictly true for real-world lenses, so non-central projections will be supported in a future release of mrcal.

LENSMODEL_PINHOLE

This is the basic "pinhole" model with 4 parameters: the core. Projection of a point \(\vec p\) is defined as

\[\vec q = \left[ \begin{aligned} f_x \frac{p_x}{p_z} + c_x \\ f_y \frac{p_y}{p_z} + c_y \end{aligned} \right] \]

This model is defined only in front of the camera, and projects to infinity as we approach 90 degrees off the optical axis (\(p_z \rightarrow 0\)). Straight lines in space remain straight under this projection, and observations of the same plane by two pinhole cameras define a homography. This model can be used for stereo rectification, although it only works well with long lenses. Longer lenses tend to have roughly pinhole behavior, but no real-world lens follows this projection, so this exists for data processing only.

LENSMODEL_STEREOGRAPHIC

This is another trivial model that exists for data processing, and not to represent real lenses. Like the pinhole model, this has just the 4 core parameters.

To define the projection of a point \(\vec p\), let's define the angle off the optical axis:

\[ \theta \equiv \tan^{-1} \frac{\left| \vec p_{xy} \right|}{p_z} \]

then

\[ \vec u \equiv \frac{\vec p_{xy}}{\left| \vec p_{xy} \right|} 2 \tan\frac{\theta}{2} \]

and

\[\vec q = \left[ \begin{aligned} f_x u_x + c_x \\ f_y u_y + c_y \end{aligned} \right] \]

This model is able to project behind the camera, and has a single singularity: directly opposite the optical axis. mrcal refers to \(\vec u\) as the normalized stereographic projection; we get the projection \(\vec q = \vec u\) when \(f_x = f_y = 1\) and \(c_x = c_y = 0\)

Note that the pinhole model can be defined in the same way, except the pinhole model has \(\vec u \equiv \frac{\vec p_{xy}} {\left| \vec p_{xy} \right|} \tan \theta\). And we can thus see that for long lenses the pinhole model and the stereographic model function similarly: \(\tan \theta \approx 2 \tan \frac{\theta}{2}\) as \(\theta \rightarrow 0\)

LENSMODEL_LONLAT

This is a standard equirectangular projection. It's a trivial model useful not for representing lenses, but for describing the projection function of wide panoramic images. This works just like latitude an longitude on a globe, with a linear angular map on latitude and longitude. The 4 intrinsics core parameters are used to linearly map latitude, longitude to pixel coordinates. The full projection expression to map a camera-coordinate point \(\vec p\) to an image pixel \(\vec q\):

\[ \vec q = \left[ \begin{aligned} f_x \, \mathrm{lon} + c_x \\ f_y \, \mathrm{lat} + c_y \end{aligned} \right] = \left[ \begin{aligned} f_x \tan^{-1}\left(\frac{p_x}{p_z}\right) + c_x \\ f_y \sin^{-1}\left(\frac{p_y}{\left|\vec p\right|}\right) + c_y \end{aligned} \right] \]

So \(f_x\) and \(f_y\) specify the angular resolution, in pixels/radian.

For normal lens models the optical axis is at \(\vec p = \left[ \begin{aligned} 0 \\ 0 \\ 1 \end{aligned} \right]\), and projects to roughly the center of the image, roughly at \(\vec q = \left[ \begin{aligned} c_x \\ c_y \end{aligned} \right]\). This model has \(\mathrm{lon} = \mathrm{lat} = 0\) at the optical axis, which produces the same, usual \(\vec q\). However, this projection doesn't represent a lens and there is no "camera" or an "optical axis". The view may be centered anywhere, so \(c_x\) and \(c_y\) could be anything, even negative.

The special case of \(f_x = f_y = 1\) and \(c_x = c_y = 0\) (the default values in mrcal.project_lonlat()) produces a normalized equirectangular projection:

\[ \vec q_\mathrm{normalized} = \left[ \begin{aligned} \mathrm{lon} \\\mathrm{lat} \end{aligned} \right] \]

This projection has a singularity at the poles, approached as \(x \rightarrow 0\) and \(z \rightarrow 0\).

LENSMODEL_LATLON

This is a "transverse equirectangular projection". It works just like LENSMODEL_LONLAT, but rotated 90 degrees. So instead of a globe oriented as usual with a vertical North-South axis, this projection has a horizontal North-South axis. The projected \(x\) coordinate corresponds to the latitude, and the projected \(y\) coordinate corresponds to the longitude.

As with LENSMODEL_LONLAT, lenses do not follow this model. It is useful as the core of a rectified view used in stereo processing. The full projection expression to map a camera-coordinate point \(\vec p\) to an image pixel \(\vec q\):

\[ \vec q = \left[ \begin{aligned} f_x \, \mathrm{lat} + c_x \\ f_y \, \mathrm{lon} + c_y \end{aligned} \right] = \left[ \begin{aligned} f_x \sin^{-1}\left(\frac{p_x}{\left|\vec p\right|}\right) + c_x \\ f_y \tan^{-1}\left(\frac{p_y}{p_z}\right) + c_y \end{aligned} \right] \]

As with LENSMODEL_LONLAT, \(f_x\) and \(f_y\) specify the angular resolution, in pixels/radian. And \(c_x\) and \(c_y\) specify the projection at the optical axis \(\vec p = \left[ \begin{aligned} 0 \\ 0 \\ 1 \end{aligned} \right]\).

The special case of \(f_x = f_y = 1\) and \(c_x = c_y = 0\) (the default values in mrcal.project_latlon()) produces a normalized transverse equirectangular projection:

\[ \vec q_\mathrm{normalized} = \left[ \begin{aligned} \mathrm{lat} \\\mathrm{lon} \end{aligned} \right] \]

This projection has a singularity at the poles, approached as \(y \rightarrow 0\) and \(z \rightarrow 0\).

LENSMODEL_OPENCV4, LENSMODEL_OPENCV5, LENSMODEL_OPENCV8, LENSMODEL_OPENCV12

These are simple parametric models that have the given number of "distortion" parameters in addition to the 4 core parameters. The projection behavior is described in the OpenCV documentation. These do a reasonable job in representing real-world lenses, and they're compatible with many other tools. The projection function is

\begin{align*} \vec P &\equiv \frac{\vec p_{xy}}{p_z} \\ r &\equiv \left|\vec P\right| \\ \vec P_\mathrm{radial} &\equiv \frac{ 1 + k_0 r^2 + k_1 r^4 + k_4 r^6}{ 1 + k_5 r^2 + k_6 r^4 + k_7 r^6} \vec P \\ \vec P_\mathrm{tangential} &\equiv \left[ \begin{aligned} 2 k_2 P_0 P_1 &+ k_3 \left(r^2 + 2 P_0^2 \right) \\ 2 k_3 P_0 P_1 &+ k_2 \left(r^2 + 2 P_1^2 \right) \end{aligned}\right] \\ \vec P_\mathrm{thinprism} &\equiv \left[ \begin{aligned} k_8 r^2 + k_9 r^4 \\ k_{10} r^2 + k_{11} r^4 \end{aligned}\right] \\ \vec q &= \vec f_{xy} \left( \vec P_\mathrm{radial} + \vec P_\mathrm{tangential} + \vec P_\mathrm{thinprism} \right) + \vec c_{xy} \end{align*}

The parameters are \(k_i\). For any N-parameter OpenCV model the higher-order terms \(k_i\) for \(i \geq N\) are all 0. So the tangential distortion terms exist for all the models, but the thin-prism terms exist only for LENSMODEL_OPENCV12. The radial distortion is a polynomial in LENSMODEL_OPENCV4 and LENSMODEL_OPENCV5, but a rational for the higher-order models. Practically-speaking LENSMODEL_OPENCV8 works decently well for wide lenses. For non-fisheye lenses, LENSMODEL_OPENCV4 and LENSMODEL_OPENCV5 work ok. I'm sure scenarios where LENSMODEL_OPENCV12 is beneficial exist, but I haven't come across them.

LENSMODEL_CAHVOR

mrcal supports LENSMODEL_CAHVOR, a lens model used in a number of tools at JPL. The LENSMODEL_CAHVOR model has 5 "distortion" parameters in addition to the 4 core parameters. This support exists only for compatibility, and there's no reason to use it otherwise. If you don't know what this is, you don't need it.

LENSMODEL_CAHVORE

This is an extended flavor of LENSMODEL_CAHVOR to support wider lenses. The LENSMODEL_CAHVORE model has 8 "distortion" parameters in addition to the 4 core parameters. CAHVORE is only partially supported:

  • the parameter gradients aren't implemented, so it isn't currently possible to solve for a CAHVORE model
  • there're questions about whether CAHVORE projections are invariant to scaling and whether they should be invariant to scaling. These need to be answered conclusively before using the CAHVORE implementation in mrcal. Talk to Dima.

LENSMODEL_SPLINED_STEREOGRAPHIC_...

This is a stereographic model with correction factors. It is mrcal's attempt to model real-world lens behavior with more fidelity than the usual parametric models make possible.

To compute a projection using this model, we first compute the normalized stereographic projection \(\vec u\) as in the LENSMODEL_STEREOGRAPHIC definition above:

\[ \theta \equiv \tan^{-1} \frac{\left| \vec p_{xy} \right|}{p_z} \]

\[ \vec u \equiv \frac{\vec p_{xy}}{\left| \vec p_{xy} \right|} 2 \tan\frac{\theta}{2} \]

Then we use \(\vec u\) to look-up a \(\Delta \vec u\) using two separate splined surfaces:

\[ \Delta \vec u \equiv \left[ \begin{aligned} \Delta u_x \left( \vec u \right) \\ \Delta u_y \left( \vec u \right) \end{aligned} \right] \]

and we then define the rest of the projection function:

\[\vec q = \left[ \begin{aligned} f_x \left( u_x + \Delta u_x \right) + c_x \\ f_y \left( u_y + \Delta u_y \right) + c_y \end{aligned} \right] \]

The \(\Delta \vec u\) are the off-stereographic terms. If \(\Delta \vec u = 0\), we get a plain stereographic projection.

Much more detail about this model is available on the splined models page.