# How to run a calibration

## Table of Contents

Calibrating cameras is one of the main applications of mrcal, so I describe this process in some detail here.

## Capturing images

### Calibration object

We need to get images of a calibration object, a board containing an observable
grid of points. mrcal doesn't care where these observations come from: any tool
that can produce a table of corner observations can be utilized. Currently the
recommended method is to use a classical *chessboard* target, and to employ the
mrgingham detector. The mrgingham documentation has a .pdf of a chessboard
pattern that works well. See the recipes to use a non-mrgingham detector
and/or a non-chessboard object.

#### Chessboard shape

The pattern should be printed and mounted onto a *rigid* and *flat* surface to
produce the calibration object. The optimization problem assumes the board shape
to be constant over all the calibration images, and any instability in the board
shape breaks that assumption. Depending on the resolution of the camera and the
amount of calibration accuracy required, even a slightly floppy chessboard can
break things. Don't take this for granted. A method for quantifying this effect
appears below.

Similarly, the optimization problem assumes that the calibration object is
mostly flat. Some amount of deformation is inevitable, so mrcal *does* support a
simple non-planar board model, with the board shape being computed as part of
the calibration solve. In my experience, if you try to make sure the board is
flat, this model is enough to fix most of the unavoidable small deflections. If
care isn't taken to ensure the board is flat, the current simple deformation
model is often insufficient to fully model the shape. Don't take this for
granted either. A method for quantifying this effect appears below as well.

I usually use a square chessboard roughly 1m per side to calibrate my fisheye
lenses. Such a large chessboard often has issues with its shape. To mitigate
these, my chessboard is backed by about 1 inch of aluminum honeycomb. This is
expensive, but works well to stabilize the shape. The previous board I was using
is backed with the same amount of *foam* instead. This is significantly cheaper,
and still pretty good, but noticeably less flat and less rigid. This foam-backed
board does warp with variations in temperature and humidity, but this drift is
slow, and the shape is *usually* stable long-enough for a chessboard dance. If
you can make one, the aluminum honeycomb-backed boards are strongly recommended.

#### Chessboard size

There's a tradeoff to consider between large and small chessboards. According to the dance study, we want close-ups that fill the imager, so we're choosing between small chessboards placed near the camera and large chessboards placed away from the camera.

- Small chessboards placed close to the camera produce two issues:
- the chessboard would appear out of focus. This is OK up to a point (out-of-focus blur is mostly isotropic, and this noise averages out nicely), but extreme out-of-focus images should probably be avoided
- Extreme close-ups exhibit noncentral projection behavior: observation rays don't all intersect at the same point. Today's mrcal cannot model this behavior, so extreme close-ups should be avoided to allow the central-only models to fit. The cross-validation page shows how to detect errors caused by noncentral projection

- Large chessboards are better, but are difficult to manufacture, to keep them flat and stiff, and to move them around. Currently I use a 1m x 1m square board, which seems to be near the limit of what is reasonable.

For wide lenses especially, you want a larger chessboard. When calibrating an unfamiliar system I usually try running a calibration with whatever chessboard I already have, and then analyze the results to see if anything needs to be changed. It's also useful to rerun the dance study for the specific geometry being calibrated.

### Image-capturing details

Now that we have a calibration object, this object needs to be shown to the camera(s). It is important that the images contain clear features. Motion blur or exposure issues will all cause bias in the resulting calibration.

If calibrating multiple cameras, mrcal will solve for all the intrinsics *and*
extrinsics. There is one strong requirement on the captured images in this case:
the images must be synchronized across all the cameras. This allows mrcal to
assume that if camera A and camera B observed a chessboard at time T, then this
chessboard was at *exactly* the same location when the two cameras saw it.
Generally this means that either

- The cameras are wired to respond to a physical trigger signal
- The chessboard and cameras were physically fixed (on a tripod, say) at each time of capture

Some capture systems have a "software" trigger mode, but this is usually too loose to produce usable results. A similar consideration exists when using cameras with a rolling shutter. With such cameras it is imperative that everything remain stationary during image capture, even when only one camera is involved. See the residual analysis of synchronization errors and of rolling shutter for details.

### Lens settings

Lens intrinsics are usually very sensitive to any mechanical manipulation of the lens. It is thus strongly recommended to set up the lens it its final working configuration (focal length, focus distance, aperture), then to mechanically lock down the settings, and then to gather the calibration data. This often creates dancing challenges, described in the next section.

If you *really* need to change the lens settings after calibrating or if you're
concerned about drifting lens mechanics influencing the calibration (you should
be!), see the recipes for a discussion.

### Dancing

As shown in the dance study, the most useful observations to gather are

- close-ups: the chessboard should fill the whole frame as much as possible.
- oblique views: tilt the board forward/back and left/right. I generally tilt by ~ 45 degrees. At a certain point the corners become indistinct and the detector starts having trouble, but depending on the lens, that point could come with quite a bit of tilt. A less dense chessboard eases this also, at the cost of requiring more board observations to get the same number of points.
- If calibrating multiple cameras, it is impossible to place a calibration
object at a location where it's seen by all the cameras
*and*where it's a close-up for all the cameras. So you should get close-ups for each camera individually, and also get observations common to multiple cameras, that aren't necessarily close-ups. The former will serve to define your camera intrinsics, and the latter will serve to define your extrinsics (geometry). Get just far-enough out to create the joint views. If usable joint views are missing, the extrinsics will be undefined, and the solver will complain about a "not positive definite" (singular in this case) Hessian.

A dataset composed primarily of tilted closeups produces good results.

As described above, you want a large chessboard placed as close to the camera as possible, before the out-of-focus and noncentral effects kick in.

If the model will be used to look at far-away objects, care must be taken to
produce a reliable calibration *at long ranges*, not just at the *short* ranges
where the chessboards were. The primary way to do that is to get close-up
chessboard views. If the close-up range is very different from the working range
(infinity, possibly), the close-up images could be very out-of-focus. The
current thought is that the best thing to do is to get close-up images even if
they're out of focus. The blurry images will have a high uncertainty in the
corner observations (hopefully without bias), but the uncertainty improvement
that comes from the near-range chessboard observations more than makes up for
it. In these cases you usually need to get more observations than you normally
would to bring down the uncertainties to an acceptable level. In challenging
situations it's useful to re-run the dance study for the specific use case to
get a sense of what kind of observations are required and what kind of
uncertainties can be expected. See the dance study for detail.

Another issue that could be caused by extreme close-ups is a noncentral projection behavior: observation rays don't all intersect at the same point. Today's mrcal cannot model this behavior, and the cross-validation would show higher-than-expected errors. In this case it is recommended to gather calibration data from further out.

It is better to have more chessboard data rather than less. mrgingham will throw away frames where no chessboard can be found, so it is perfectly reasonable to grab too many images with the expectation that they won't all end up being used in the computation. I usually aim for about 100 usable frames, but you may get away with fewer, depending on your specific scenario. The mrcal uncertainty feedback will tell you if you need more data.

Naturally, intrinsics are accurate only in areas where chessboards were
observed: chessboard observations on the left side of the image tell us little
about lens behavior on the right side. Thus it is imperative to cover the whole
field of view during the chessboard dance. It is often tricky to get good data
at the edges and corners of the imager, so care must be taken. Some chessboard
detectors (mrgingham in particular) only report complete chessboards. This makes
it extra-challenging to obtain good data at the edges: a small motion that
pushes one chessboard corner barely out of bounds causes the whole observation
to be discarded. It is thus *very* helpful to be able to see a live feed of the
camera as the images are being captured. In either case, checking the coverage
is a great thing to do. The usual way to do this is indirectly: by visualizing
the projection uncertainty. Or by visualizing the obtained chessboard
detections directly.

### Image file-naming convention

With monocular calibrations, there're no requirements on image filenames: use
whatever you like. If calibrating multiple synchronized cameras, however, the
image filenames need to indicate which camera captured each image and at which
time. I generally use `frameFFF-cameraCCC.jpg`

. Images with the same `FFF`

are
assumed to have been captured at the same instant in time, and `CCC`

identifies
the camera. Naming images in this way is sufficient to communicate these
mappings to mrcal.

## Detecting corners

A number of tools are available to detect chessboard corners (OpenCV is commonly
used, and different organizations have their own in-house versions). None of the
available ones worked well for my use cases, so I had to write my own, and I
recommend it for all corner detections: `mrgingham`

. mrgingham is fast and is
able to find chessboard corners subject to very un-pinhole-like projections. At
this time it has two limitations that will be lifted eventually:

- It more or less assumes a regular grid of N-by-N corners (i.e. N+1-by-N+1 squares)
- It requires
*all*the corners to be observed in order to report the detections from an image. Incomplete chessboard observations aren't supported

If these are unacceptable, any other detector may be used instead. See the recipes.

### Using mrgingham

Once mrgingham is installed or built from source, it can be run by calling the
`mrgingham`

executable. The sample in the tour of mrcal processes these images
to produce these chessboard corners like this:

```
mrgingham --jobs 4 --gridn 14 '*.JPG' > corners.vnl
```

mrgingham tries to handle a variety of lighting conditions, including varying illumination across the image, but the corners must exist in the image in some form.

At this time mrgingham returns *only* complete chessboard views: if even one
corner of the chessboard couldn't be found, mrgingham will discard the entire
image. Thus it takes care to get data at the edges and in the corners of the
imager. A live preview of the captured images is essential.

Another requirement due to the design of mrgingham is that the board should be held with a flat edge parallel to the camera xz plane (parallel to the ground, usually). mrgingham looks for vertical and horizontal sequences of corners, but if the board is rotated diagonally, then none of these sequences are clearly "horizontal" or "vertical".

### Choice of calibration object

When given an image of a *chessboard*, the detector is directly observing the
feature we actually care about: the corner. Another common calibration board
style is a grid of circles, where the feature of interest is the center of each
circle. When given an image of such a grid of circles, the detector either

- detects the contour at the edge of each circle
- finds the pixel blob comprising each circle observation

and from either of these, the detector infers the circle center. This can work
when looking at head-on images, but when given tilted images subjected to
non-pinhole lens behaviors, getting accurate circle centers from outer contours
or blobs is *hard*. The resulting inaccuracies in the detections of circle
centers will introduce biases into the solve that aren't modeled by the
projection uncertainty routine, so chessboards are *strongly* recommended in
favor of circle grids.

Some other calibration objects use AprilTags. mrcal assumes independent noise on
each point observation, so correlated sources of point observations are also not
appropriate sources of data. Thus using individual corners of AprilTags will be
un-ideal: their noise is correlated. Using the *center* of an AprilTag would
eliminate this correlated noise, but maybe would have a similar inaccuracy
problem that a grid of circles would have.

If using a possibly-problematic calibration object such as a grid of circles or AprilTags, double-check the detections by reprojecting to calibration-object space after a solve has completed.

### Visualization

Once we have a `corners.vnl`

from some chessboard detector, we can visualize the
coverage. From the tour of mrcal:

$ < corners.vnl \ vnl-filter -p x,y | \ feedgnuplot --domain --square --set 'xrange [0:6000] noextend' --set 'yrange [3376:0] noextend'

Doing this is usually unnecessary since the projection uncertainty reporting shows the coverage (and more!), but it's good to be able to do this for debugging.

## Model choice

Before calibrating we need to choose the model for representing the lenses. Use
`LENSMODEL_SPLINED_STEREOGRAPHIC`

. This model works very well, and is able to
represent real-world lenses better than the parametric models (all the other
ones). This is true even for long, near-pinhole lenses. Depending on the
specific lens and the camera resolution this accuracy improvement may not be
noteworthy. But even in those cases, the splined model is flexible enough to get
truthful projection uncertainty estimates, so it's *still* worth using. Today I
use other models only if I'm running quick experiments: splined models have many
more parameters, so things are slower.

`LENSMODEL_SPLINED_STEREOGRAPHIC`

has several configuration variables that need
to be set. The full implications of these choices still need to be studied, but
the results appear fairly insensitive to these. I generally choose `order=3`

to
select cubic splines. I generally choose a rich model with fairly dense spline
spacing. For instance the splined model used in the tour of mrcal has
`Nx=30_Ny=18`

. This has 30 spline knots horizontally and 18 vertically. You
generally want `Ny`

/ `Nx`

to roughly match the aspect ratio of the imager. The
`Nx=30_Ny=18`

arrangement is probably denser than it needs to be, but it works
OK. The cost of such a dense spline is a bit of extra computation time and more
stringent requirements on calibration data to fully and densely cover the
imager.

The last configuration parameter is `fov_x_deg`

: the horizontal field-of-view of
the lens. The splined model is defined by *knots* spread out across space, the
arrangement of these knots defined by the `fov_x_deg`

parameter. We want the
region in space defined by the knots to roughly match the region visible to the
lens. A too-large `fov_x_deg`

would waste some knots by placing them beyond
where the lens can see. And a too-small `fov_x_deg`

would restrict the
projection representation on the edge of the image.

An initial estimate of `fov_x_deg`

can be computed from the datasheet of the lens.
Then a test calibration should be computed using that value, and the
`mrcal-show-splined-model-correction`

tool can then be used to validate that
`fov_x_deg`

parameter. In the tour of mrcal we get something like this:

mrcal-show-splined-model-correction \ --set 'cbrange [0:0.1]' \ --unset grid \ --set 'xrange [:] noextend' \ --set 'yrange [:] noextend reverse' \ --set 'key opaque box' \ splined.cameramodel

This is about what we want. The valid-intrinsics region covers most of the spline-in-bounds region without going out-of-bounds anywhere. In the tour of mrcal we followed this procedure to end up with

`LENSMODEL_SPLINED_STEREOGRAPHIC_order=3_Nx=30_Ny=18_fov_x_deg=150`

Getting this perfect isn't important, so don't spent a ton of time working on it. See the lensmodel documentation for more detail.

## Computing the calibration

We have data; we have a lens model; we're ready. Let's compute the calibration
using the `mrcal-calibrate-cameras`

tool. The invocation should look something
like this:

mrcal-calibrate-cameras \ --corners-cache corners.vnl \ --lensmodel LENSMODEL_SPLINED_STEREOGRAPHIC_order=3_Nx=30_Ny=18_fov_x_deg=150 \ --focal 1900 \ --object-spacing 0.0588 \ --object-width-n 14 \ '*.JPG'

`--corners-cache corners.vnl`

says that the chessboard corner coordinates live in a file called`corners.vnl`

. This is the output of the corner detector. If this argument is omitted, or a non-existent file is given,`mrcal-calibrate-cameras`

will run mrgingham, and write the results into the given path. Thus the same command would be used to both compute the corners initially, and to reuse the pre-computed corners in subsequent runs.As described above, the

`corners.vnl`

file can come from any chessboard detector. If it's a detector that produces a 4th column of*weights*instead of a decimation level, pass in`--corners-cache-has-weights`

`--lensmodel`

specifies which lens model we're using for*all*the cameras. See the "Model choice" section above. Not being able to select different models for different cameras is a current limitation of mrcal. If some very different lenses are present in the same calibration, I use a densely-spaced (high`Nx`

,`Ny`

) splined model with the`fov_x_deg`

of the widest lens in the set. This wastes knots in the too-wide areas of the narrow lenses, but if the spline was dense-enough, there're enough knots remaining to fit the narrower lenses.`--focal 1900`

provides the initial estimate for the camera focal lengths, in pixels. This doesn't need to be extremely precise, but do try to get this close if possible. The focal length value to pass to`--focal`

(\(f_\mathrm{pixels}\)) can be derived using the stereographic model definition:

\[ f_\mathrm{pixels} \approx \frac{\mathrm{imager\_width\_pixels}}{4 \tan \frac{\mathrm{field\_of\_view\_horizontal}}{4}} \]

This expression is a good initial estimate for both long and wide lenses. Note that the manufacturer-specified "field of view" and "focal length" values are usually poorly-defined: the former is different in all directions, and the latter is meaningless in wide lenses that are nowhere near the pinhole model. With a longer lens, we can assume pinhole behavior to get

\[ f_\mathrm{pixels} = f_\mathrm{mm} \frac{\mathrm{imager\_width\_pixels}}{\mathrm{imager\_width\_mm}} \]

As with the `fov_x_deg`

parameter described in the "Model
choice" section above, running a test calibration with a rough estimate, and
then passing in the much-closer optimized value is a good strategy. The
optimized focal length is the first two values of the `intrinsics`

vector in
the result. Those two values should be similar, and anything around there
should work well for `--focal`

.

`--object-spacing`

is the distance between neighboring corners in the chessboard. Even spacing, identical in both directions is assumed`--object-width-n`

is the horizontal corner count of the calibration object. In the example invocation above there is no`--object-height-n`

, so`mrcal-calibrate-cameras`

assumes a square chessboard

After the options, `mrcal-calibrate-cameras`

takes globs describing the images.
One glob per camera is expected, and in the above example *one* glob was given:
`'*.JPG'`

. Thus this is a monocular solve. More cameras would imply more globs.
For instance a 2-camera calibration might take arguments

'frame*-camera0.png' 'frame*-camera1.png'

Note that these are *globs*, not *filenames*. So they need to be quoted or
escaped to prevent the shell from expanding them: hence `'*.JPG'`

and not
`*.JPG`

.

Finally, to compute cross-validation diffs (see below) it's necessary to run
several independent calibrations. I generally split the dataset into even/odd
chunks, and calibrate them independently. With this `zsh`

snippet, for instance:

for oddeven (odd even) { cmd=(mrcal-calibrate-cameras \ --corners-cache corners.vnl \ --lensmodel LENSMODEL_SPLINED_STEREOGRAPHIC_order=3_Nx=30_Ny=18_fov_x_deg=150 \ --focal 1900 \ --object-spacing 0.0588 \ --object-width-n 14) if [[ $oddeven = "even" ]] { globs=('frame*[02468]-camera'{0,1}'.png') } else { globs=('frame*[13579]-camera'{0,1}'.png') } cmd=($cmd $globs) $cmd for i (0 1) { mv camera-$i.cameramodel camera-$i-$oddeven.cameramodel } }

### Troubleshooting broken solves

Usually the solver converges, and produces a result. Then we look at the diagnostics to evaluate the quality of this result (next section). We can do that if the errors are small, and the optimization completed successfully. If the data issues are too large, however (pervasive sync errors, completely insufficient lens model, etc), the solver will have trouble. It could

- Take a very long time to converge to
*some*solution Produce tool many outliers, possibly incrementally, producing output such as this:

mrcal.c(5413): Threw out some outliers. New count = 174/35280 (0.5%). Going again mrcal.c(5413): Threw out some outliers. New count = 252/35280 (0.7%). Going again mrcal.c(5413): Threw out some outliers. New count = 303/35280 (0.9%). Going again

This will slow down the solve dramatically. Unless the data is known to be funky, more than ~ 3% outliers should raise questions

If the solver has trouble like this, it is usually helpful to turn off the
outlier rejection by running ```
mrcal-calibrate-cameras --skip-outlier-rejection
....
```

, and then examine the residuals as described below. Hopefully that would
provide a hint about the issues.

It also helps to simplify the problem, which is effective at isolating certain issues. For instance, a nonconverging multi-camera solve should be attempted monocularly, one camera at a time. If some cameras converge and some don't, that points to the issue. If individually the solves converge, but together they don't, there could be an issue with the camera synchronization, or the chessboard corner indices aren't consistent. Similarly solving with a subset of images is often enlightening.

Another common problem is getting messages like this:

mrcal.c(3758): WARNING: Board observation 157 (icam_intrinsics=0, icam_extrinsics=-1, iframe=104) had almost all of its points thrown out as outliers: only 0/100 remain. CHOLMOD is about to complain about a non-positive-definite JtJ. Something is wrong with this observation mrcal.c(3758): WARNING: Board observation 158 (icam_intrinsics=1, icam_extrinsics=0, iframe=104) had almost all of its points thrown out as outliers: only 0/100 remain. CHOLMOD is about to complain about a non-positive-definite JtJ. Something is wrong with this observation mrcal.c(5412): Threw out some outliers (have a total of 861 now); going again libdogleg at dogleg.c:1115: CHOLMOD warning: libdogleg at dogleg.c:1115: not positive definite. libdogleg at dogleg.c:1115: file: ../Cholesky/t_cholmod_rowfac.c libdogleg at dogleg.c:1115: line: 430 libdogleg at dogleg.c:1115:

The complaint is about a singular Hessian matrix. Usually this happens if some variable in the optimization has no effect at all on the solution, and the optimizer thus doesn't know what to do with that variable. Usually heavy outlier rejection precedes this, and the missing data is causing the problem. Example: all chessboard observations for a given frame were thrown out; thus moving the chessboard pose at that time has no effect, and we get a singular Hessian. The diagnostic technique is the same: disable the outlier rejection and examine the residuals.

## Interpreting the results

Once we have a calibration, we should evaluate how well it represents reality.
The tour of mrcal shows a very detailed process: a real-world fit using
`LENSMODEL_OPENCV8`

and a real-world fit using
`LENSMODEL_SPLINED_STEREOGRAPHIC_...`

. The analysis sequence outlined there is
too thorough for everyday use, but it's good to look through those pages to get
a sense. A high-level process is

- Examine the projection uncertainty to make sure we have enough good data in the right places
- Examine the cross-validation diffs to confirm that the model fits and that the reported uncertainty is trustworthy
- If these diffs are too high, examine the residuals to find out why

This is a *lot*, but you don't need to do all this every time. As noted in the
noise model description, we want homoscedastic noise in our observations of
the chessboard corners. We will get that if and only if our models faithfully
describe the world; so the available diagnostics serve to pinpoint areas where
the models don't fit, so that those issues could be addressed.

I now describe each step.

### Projection uncertainty

As described in detail in the projection-uncertainty page, the projection
uncertainty computed by mrcal gauges the effect of sampling error. Since we
*always* have noise in our chessboard observations, it's important that the
solution be insensitive to this noise. Otherwise a recalibration of the same
system would produce very different results due to new chessboard observations
containing new noise.

Projection uncertainty can be visualized with the
`mrcal-show-projection-uncertainty`

tool. From the tour of mrcal:

mrcal-show-projection-uncertainty splined.cameramodel --cbmax 1 --unset key

This is projection uncertainty at infinity, which is what I'm usually interested
in. If we care *only* about the performance at some particular distance, that
can be requested with `mrcal-show-projection-uncertainty --distance ...`

. That
uncertainty will usually be better than the uncertainty at infinity. Making sure
things work at infinity will make things work at other ranges also.

The projection uncertainty measures the quality of the chessboard dance. If the guidelines noted above were followed, you'll get good uncertainties. If the uncertainty is poor in some region, you need more chessboard observations in that area. To improve it everywhere, follow the guidelines: more observations, more closeups, more tilt.

The projection uncertainties will be overly-optimistic if model errors are present or if a too-lean lens model is selected. So we now look at the cross-validation diffs to confirm that no model errors are present. If we can confirm that, the projection uncertainties can be used as the authoritative gauge of the quality of our calibration.

Since the uncertainties are largely a function of the chessboard dance, I usually don't bother looking at them if I'm recalibrating a system that I have calibrated before, with a similar dance. Since the system and the dance didn't change, neither would the uncertainty.

### Cross-validation diffs

If we have an acceptable projection uncertainty, we need to decide if it's a good gauge of calibration quality: if we have model errors or not.

A good way to do that is to compute a cross-validation: we calibrate the camera system twice with independent input data, and we compare the resulting projections. If the models fit, then we only have sampling error affecting the solves, and the resulting differences will be in-line with what the uncertainties predict: \(\mathrm{difference} \approx \mathrm{uncertainty}_0 + \mathrm{uncertainty}_1\). Otherwise, we have an extra source of error not present in the uncertainty estimates, which would cause the cross-validation diffs to be significantly higher. That would suggest a deeper look is necessary.

To get the data, we can do two separate dances, or we can split the dataset into
odd/even images, as described above. Note: if the chessboard is moved very
slowly, the even and odd datasets will be very qualitatively similar, and it's
possible to get a low cross-validation diff even in the presence of model
errors: the two samples of the model error distribution would be similar to each
other. This usually doesn't happen, but keep in mind that it *could*, so you
should move at a reasonable speed to make the odd/even datasets non-identical.

In the tour of mrcal we computed a cross-validation, and discovered that there indeed exists a model error. The cross-validation diff looked like this:

mrcal-show-projection-diff \ --cbmax 2 \ --unset key \ 2-f22-infinity.splined.cameramodel \ 3-f22-infinity.splined.cameramodel

And the two uncertainties looked roughly like this:

The diff is more than roughly 2x the uncertainty, so something wasn't fitting: the lens had a non-negligible noncentral behavior at the chessboard distance, which wasn't fixable with today's mrcal. So we could either

- Accept the results as is, using the diffs as a guideline to how trustworthy the solves are
- Gather more calibration images from further out, minimizing the unmodeled noncentral effect

Cross-validation diffs are usually *very* effective at detecting issues, and I
usually compute these every time I calibrate a lens. In my experience, these are
the single most important diagnostic output.

While these are very good at *detecting* issues, they're less good at
pinpointing the root cause. To do that usually requires thinking about the
solve residuals (next section).

### Residuals

We usually have a *lot* of images and a *lot* of residuals. Usually I only look
at the residuals if

- I'm calibrating an unfamiliar system
- I don't trust something about the way the data was collected; if I have little faith in the camera time-sync for instance
- Something unknown is causing issues (we're seeing too-high cross-validation diffs), and we need to debug

The residuals in the whole problem can be visualized with the
`mrcal-show-residuals`

tool. And the residuals of specific chessboard
observations can be visualized with the `mrcal-show-residuals-board-observation`

tool.

Once again. As noted in the noise model description, we want homoscedastic noise
in our observations of the chessboard corners. So the residuals should all be
independent and should all have the same dispersion: each residual vector should
look random, and unrelated to any other residual vector. There should be no
discernible patterns to the residuals. If model errors are present, the
residuals will *not* appear random, and *will* exhibit patterns. And we will
then see those patterns in the residual plots.

The single most useful invocation to run is

mrcal-show-residuals-board-observation \ --from-worst \ --vectorscale 100 \ solved.cameramodel \ 0-5

This displays the residuals of a few worst-fitting images, with the error vectors scaled up 100x for legibility (the "100" often needs to be adjusted for each specific case). The most significant issues usually show up in these few worst-fitting chessboard observations.

The residual plots are interactive, so it's useful to then zoom in on the worst fitting points (easily identifiable by the color) in the worst-fitting observations to make sure the observed chessboard corners we're fitting to are in the right place. If something was wrong with the chessboard corner detection, a zoomed-in residual image would tell us this.

Next the residual plots should be examined for patterns. Let's look at some common issues, and the characteristic residual patterns they produce.

For most of the below I'm going to show the usual calibration residuals we would
expect to see if we had some particular kind of issue. We do this by simulating
*perfect* calibration data, corrupting it with the kind of problem we're
demonstrating, recalibrating, and visualizing the result. Below I do this in
Python using the `mrcal.make_perfect_observations()`

function. These could
equivalently have been made using the `mrcal-reoptimize`

tool:

$ mrcal/analyses/mrcal-reoptimize \ --explore \ --perfect \ splined.cameramodel ... Have REPL looking at perfect data. Add flaw, reoptimize, visualize here.

This technique is powerful and is useful for more than just making plots for this documentation. We can also use it to quantify the errors that would result from problems that we suspect exist. We can compare those to the cross-validation errors that we observe to get a hint about what is causing the problem.

For instance, you might have a system where you're not confident that your
chessboard is flat or that your lens model fits or that your camera sync works
right. You probably see high cross-validation diffs, but don't have a sense of
which factor is most responsible for the issues. We can simulate this, to see if
the suspected issues would cause errors as high as what is observed. Let's say
you placed a straight edge across your chessboard, and you see a deflection at
the center of ~ 3mm. mrcal doesn't assume a flat chessboard, but its deformation
model is simple, and even in this case it can be off by 1mm. Let's hypothesize
that the board deflection is 0.5mm horizontally and 0.5mm vertically, in the
other direction. We can then use the `mrcal-reoptimize`

tool to quantify the
resulting error, and we can compare the simulated cross-validation diff to the
one we observe. We might discover that the expected chessboard deformation
produces cross-validation errors that are smaller than what we observe. In that
case, we should keep looking. But if we see the simulated cross-validation diff
of the same order of magnitude that we observe, then we know it's time to fix
the chessboard.

#### Poorly-fitting lens model

We saw this in the tour of mrcal: we tried to calibrate a very wide lens with
`LENSMODEL_OPENCV8`

, and it showed clear signs of the model not fitting. Read
that page to get a sense of what that looks like in the various diagnostics.
Broadly speaking, lens modeling errors increase as you move towards the edges of
the imager, so we would see higher errors at the edges. This often looks similar
to an unmodeled deformation in the chessboard shape.

#### Errors in the chessboard detector

What if the chessboard detector gets a small number of corners localized incorrectly? If the shift is large, those corner observations will be thrown out as outliers, and will not affect the solve. But if they're small, they may cause a bias in the solution. What does that look like? Let's simulate it. We

- take the
`LENSMODEL_SPLINED_STEREOGRAPHIC`

solve from the tour of mrcal - produce perfect observations at the computed optimal geometry
- add perfect independent gaussian noise to the observations

This data now follows the noise model perfectly. Then I add the flaw: I bump all the observed corners on the chessboard diagonal in observation 10 to the right by 0.8 pixels, and I look at the residuals. This requires a bit of code:

#!/usr/bin/python3 import sys import mrcal import numpy as np import numpysane as nps def add_imperfection(optimization_inputs): for i in range(14): optimization_inputs['observations_board'][10,i,i,0] += 0.8 model = mrcal.cameramodel('splined.cameramodel') optimization_inputs = model.optimization_inputs() mrcal.make_perfect_observations(optimization_inputs) add_imperfection(optimization_inputs) mrcal.optimize(**optimization_inputs) mrcal.show_residuals_board_observation(optimization_inputs, 10, vectorscale = 200, circlescale = 0.5, title = "Effect of corner detection error at the center", paths = optimization_inputs['imagepaths'])

With a model on disk, the same could be produced with

mrcal-show-residuals-board-observation \ --vectorscale 200 \ --circlescale 0.5 \ --title "Effect of corner detection error at the center" \ splined.cameramodel \ 10

Here we see the residuals on the top-left/bottom-right diagonal be consistently
larger than the others, and we see them point in a consistent non-random
direction. This isn't easily noticeable without knowing what to look for. *But*
the usual method of zooming in to the worst-few points will make the error in
the detected corners visibly apparent.

#### Rolling shutter

Many cameras employ a rolling shutter: the images aren't captured all at once,
but are built up over time, capturing different parts of the image at different
times. If the scene or the camera are moving, this would produce images that
violate mrcal's view of the world: that at an instant in time I can describe the
*full* chessboard pose and its observed corners. mrcal does not model rolling
shutter effects, so non-static calibration images would cause problems. Today
the only way to calibrate rolling-shutter cameras with mrcal is to make sure the
cameras and the chessboard are stationary during each image capture.

How do we make sure that no rolling-shutter effects ended up in our data? We look at the residuals. Prior to the dataset used in the tour of mrcal I captured images that used the "silent mode" of the Sony Alpha 7 III camera. I didn't realize that this mode implied a rolling shutter, so that dataset wasn't useful to demo anything other than the rolling shutter residuals. Looking at the worst few observations:

mrcal-show-residuals-board-observation \ --from-worst \ --vectorscale 20 \ --circlescale 0.5 \ splined.cameramodel \ 0-3

I'm omitting the 2nd image because it qualitatively looks very similar to the first. Note the patterns. Clearly something is happening, but it varies from image to image, and a warped chessboard isn't likely to explain it. In addition to that, the magnitude of all the errors is dramatically higher than before: the vectors are scaled up 5x less than those in the tour of mrcal.

#### Synchronization

What if we're calibrating a multi-camera system, and the image synchronization
is broken? You should have a hardware sync: a physical trigger wire connected to
each camera, with a pulse on that line telling each camera to begin the image
capture. If this is missing or doesn't work properly, then a similar issue
occurs as with a rolling-shutter: images captured at allegedly a particular
instant in time haven't actually been captured completely at *that* time.

This can be confirmed by recalibrating monocularly, to see if individually the solves converge. And as expected, this can be readily seen in the residuals. A time sync problem means that synced images A and B were supposed to capture a chessboard at some pose \(T\). But since the sync was broken, the chessboard moved, so the chessboard was really at two different poses: \(T_\mathrm{A}\) and \(T_\mathrm{B}\). mrcal was told that there was no motion, so it ends up solving for some compromise transform \(T_\mathrm{mid}\) that splits the difference. So the residuals for image A would point from the observation at \(T_\mathrm{A}\) towards the observation at the compromise pose \(T_\mathrm{mid}\). And the residuals for image B would point in the opposite direction: from the observation at \(T_\mathrm{B}\) towards the observation at the compromise pose \(T_\mathrm{mid}\).

Let's demo this with a little code. I grab the monocular solve from the tour of mrcal, add a second camera with perfect observations and noise, and add the flaw: one of the chessboard observations was erroneously reported for two images in a row. This happened in only one camera of the pair, so a time-sync error resulted. This is a simulation, but as with everything else on this page, I've seen such problems in real life. The incorrect observation is observation 40, frame 20, camera 0.

#!/usr/bin/python3 import sys import mrcal import numpy as np import numpysane as nps def binocular_from_monocular(optimization_inputs): # Assuming monocular calibration. Add second identical camera to the right # of the first optimization_inputs['intrinsics'] = \ nps.glue( optimization_inputs['intrinsics'], optimization_inputs['intrinsics'], axis = -2 ) optimization_inputs['imagersizes'] = \ nps.glue( optimization_inputs['imagersizes'], optimization_inputs['imagersizes'], axis = -2 ) optimization_inputs['extrinsics_rt_fromref'] = np.array(((0,0,0, -0.1,0,0),), dtype=float) i = optimization_inputs['indices_frame_camintrinsics_camextrinsics'] optimization_inputs['indices_frame_camintrinsics_camextrinsics'] = \ np.ascontiguousarray( \ nps.clump(nps.xchg(nps.cat(i, i + np.array((0,1,1),dtype=np.int32)), 0, 1), n=2)) # optimization_inputs['observations_board'] needs to have the right shape # and the right weights. So I just duplicate it optimization_inputs['observations_board'] = \ np.ascontiguousarray( \ nps.clump(nps.xchg(nps.cat(optimization_inputs['observations_board'], optimization_inputs['observations_board']), 0,1), n=2)) del optimization_inputs['imagepaths'] def add_imperfection(optimization_inputs): iobservation = 40 iframe,icam_intrinsics,icam_extrinsics = \ optimization_inputs['indices_frame_camintrinsics_camextrinsics'][iobservation] print(f"observation {iobservation} iframe {iframe} icam_intrinsics {icam_intrinsics} is stuck to the previous observation") optimization_inputs ['observations_board'][iobservation, ...] = \ optimization_inputs['observations_board'][iobservation-2,...] model = mrcal.cameramodel('splined.cameramodel') optimization_inputs = model.optimization_inputs() binocular_from_monocular(optimization_inputs) mrcal.make_perfect_observations(optimization_inputs, observed_pixel_uncertainty = 0) add_imperfection(optimization_inputs) optimization_inputs['do_apply_outlier_rejection'] = False mrcal.optimize(**optimization_inputs) mrcal.cameramodel(optimization_inputs = optimization_inputs, icam_intrinsics = 0).write('/tmp/sync-error.cameramodel') for i in range(2): mrcal.show_residuals_board_observation(optimization_inputs, i, from_worst = True, vectorscale = 2, circlescale = 0.5)

We're plotting the two worst-residual observations, and we clearly see the tell-tale sign of a time-sync error.

As before, we can make these plots from the commandline:

mrcal-show-residuals-board-observation \ --from-worst \ --vectorscale 2 \ --circlescale 0.5 \ /tmp/sync-error.cameramodel \ --explore \ 0-1

Here we passed `--explore`

to drop into a REPL to investigate further, which
gives us another clear indication that we have a time-sync error. The tool
reports some useful diagnostics at the top of the REPL:

The first 10 worst-fitting observations (i_observations_sorted_from_worst[:10]) [41, 40, 97, 94, 95, 38, 99, 96, 98, 39] The corresponding 10 (iframe, icamintrinsics, icamextrinsics) tuples (indices_frame_camintrinsics_camextrinsics[i_observations_sorted_from_worst[:10] ]): [[20 1 0] [20 0 -1] [48 1 0] [47 0 -1] [47 1 0] [19 0 -1] [49 1 0] [48 0 -1] [49 0 -1] [19 1 0]]

Since a time-sync error affects multiple images at the same time, we should see
multiple chessboard observations from the same frame (instant in time) in the
high-residuals list, and we clearly see that above. We corrupted frame 20 camera
0, so it's no longer syncronized with frame 20 camera 1. We expect both of those
observations to fit badly, and we do see that: the
`mrcal-show-residuals-board-observation`

tool says that the worst residual is
from frame 20 camera 1 and the second-worst is frame 20 camera 0. High errors in
multiple cameras in the same frame like this are another tell-tale sign of sync
errors. We can also query the errors themselves in the REPL:

In [1]: err_per_observation[i_observations_sorted_from_worst][:10] Out[1]: array([965407.29301702, 864665.95442895, 11840.10373732, 7762.88344105, 7759.5509593 , 7618.23632488, 7190.35449443, 6621.34798572, 6499.23173619, 6039.45522539])

So the two frame-20 sum-of-squares residuals are on the order of \(900,000 \mathrm{pixels}^2\), and the next worst one is ~ 100 times smaller.

#### Chessboard shape

Since flat chessboards don't exist, mrcal doesn't assume that the observed
chessboard is flat. Today it uses a simple parabolic model. It *does* assume the
board shape is constant throughout the whole calibration sequence. So let's run
more simulations to test two scenarios:

- The chessboard is slightly non-flat, but in a way not modeled by the solver
- The chessboard shape changes slightly over the course of the chessboard dance

- Unmodeled chessboard shape

As before, we rerun the scenario from the tour of mrcal with perfect observations and perfect, but with a flaw: I use the very slightly parabolic chessboard, but tell mrcal to assume the chessboard is flat. The script is very similar:

#!/usr/bin/python3 import sys import mrcal import numpy as np import numpysane as nps def add_imperfection(optimization_inputs): H,W = optimization_inputs['observations_board'].shape[1:3] s = optimization_inputs['calibration_object_spacing'] print(f"Chessboard dimensions are {s*(W-1)}m x {s*(H-1)}m") print(f"Chessboard deflection at the center is {optimization_inputs['calobject_warp'][0]*1000:.1f}mm , {optimization_inputs['calobject_warp'][1]*1000:.1f}mm in the x,y direction") optimization_inputs['calobject_warp'] *= 0. optimization_inputs['do_optimize_calobject_warp'] = False model = mrcal.cameramodel('splined.cameramodel') optimization_inputs = model.optimization_inputs() optimization_inputs['calobject_warp'] = np.array((1e-3, -0.5e-3)) mrcal.make_perfect_observations(optimization_inputs) add_imperfection(optimization_inputs) mrcal.optimize(**optimization_inputs) mrcal.show_residuals_board_observation(optimization_inputs, 0, from_worst = True, vectorscale = 200, circlescale = 0.5)

It says:

Chessboard dimensions are 0.7644m x 0.7644m Chessboard deflection at the center is 1.0mm , -0.5mm in the x,y direction

And the worst residual image looks like this:

It usually looks similar to the residuals from a poorly-modeled lens (chessboard edges tend to be observed at the edges of the lens). In

*this*solve the residuals all look low-ish at the center, much bigger at the edges, and consistent at the edges. - Unstable chessboard shape

What if the chessboard flexes a little bit? I redo my solve using a small parabolic deflection at the first half of the sequence and twice as much deformation during the second half.

#!/usr/bin/python3 import sys import mrcal import numpy as np import numpysane as nps def add_imperfection(optimization_inputs, observed_pixel_uncertainty): Nobservations = len(optimization_inputs['observations_board']) observations_half = np.array(optimization_inputs['observations_board'][:Nobservations//2,...]) optimization_inputs['calobject_warp'] *= 2. mrcal.make_perfect_observations(optimization_inputs, observed_pixel_uncertainty = observed_pixel_uncertainty) optimization_inputs['observations_board'][:Nobservations//2,...] = observations_half H,W = optimization_inputs['observations_board'].shape[1:3] s = optimization_inputs['calibration_object_spacing'] print(f"Chessboard dimensions are {s*(W-1)}m x {s*(H-1)}m") print(f"Chessboard deflection at the center (second half of dataset) is {optimization_inputs['calobject_warp'][0]*1000:.2f}mm , {optimization_inputs['calobject_warp'][1]*1000:.2f}mm in the x,y direction") print(f"Chessboard deflection at the center at the first half of the dataset is half of that") model = mrcal.cameramodel('splined.cameramodel') optimization_inputs = model.optimization_inputs() observed_pixel_uncertainty = np.std(mrcal.residuals_board(optimization_inputs).ravel()) optimization_inputs['calobject_warp'] = np.array((1e-3, -0.5e-3)) mrcal.make_perfect_observations(optimization_inputs, observed_pixel_uncertainty = observed_pixel_uncertainty) add_imperfection(optimization_inputs, observed_pixel_uncertainty) mrcal.optimize(**optimization_inputs) print(f"Optimized chessboard deflection at the center is {optimization_inputs['calobject_warp'][0]*1000:.2f}mm , {optimization_inputs['calobject_warp'][1]*1000:.2f}mm in the x,y direction") mrcal.cameramodel(optimization_inputs = optimization_inputs, icam_intrinsics = 0).write('/tmp/unstable-chessboard-shape.cameramodel') mrcal.show_residuals_board_observation(optimization_inputs, 0, from_worst = True, vectorscale = 200, circlescale = 0.5)

It says:

Chessboard dimensions are 0.7644m x 0.7644m Chessboard deflection at the center (second half of dataset) is 2.00mm , -1.00mm in the x,y direction Chessboard deflection at the center at the first half of the dataset is half of that Optimized chessboard deflection at the center is 1.66mm , -0.79mm in the x,y direction

It looks a bit like the unmodeled chessboard residuals above, but smaller, and more subtle. But lest you think that these small and subtle residuals imply that this doesn't affect the solution, here's the resulting difference in projection:

mrcal-show-projection-diff \ --no-uncertainties \ --radius 500 \ --unset key \ /tmp/unstable-chessboard-shape.cameramodel \ splined.cameramodel

So take care to make your chessboard rigid. I can think of better visualizations to more clearly identify these kinds of issues, so this might improve in the future.

For both of these chessboard deformation cases it's helpful to look at more than just 1 or 2 worst-case residuals. Look at the worst 10. In both cases the most tilted chessboard observations usually show very consistent residual vectors along the far edge of the chessboard.