mrcal roadmap

New, big features being considered for a future release

Triangulation in the optimization loop. This will allow efficient SFM since the coordinates of each observed 3D point don't need to be explicitly optimized as part of the optimization vector. This should also allow calibrating extrinsics separately from intrinsics, while propagating all the sources of uncertainty through to the eventual triangulation. This is being developed in the 2022-06--triangulated-solve branch.
Non-central projection support. At this time, mrcal assumes that all projections are central: all rays of light are assumed to intersect at a single point (the origin of the camera coordinate system). So \(k \vec v\) projects to the same \(\vec q\) for any \(k\). This is very convenient, but not completely realistic. Support for non-central lenses will make possible more precise calibrations of all lenses, but especially wide ones. This is being developed in the noncentral branch.
Improved projection uncertainty quantification. The current projection uncertainty method, while functional, has some issues. A new approach in the 2022-04--cross-uncertainty branch aims to resolve them.
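
To make the central-projection assumption from the non-central item above concrete, here is a minimal sketch using a toy pinhole model. The function name and the focal length and center values are made up for illustration; this is not mrcal's projection code. It only shows that every point \(k \vec v\) along a ray lands on the same pixel:

```python
import numpy as np

def project_pinhole(v, fxy=(1000.0, 1000.0), cxy=(500.0, 400.0)):
    """Central pinhole projection of (..., 3) camera-frame points to pixels."""
    v = np.asarray(v, dtype=float)
    return v[..., :2] / v[..., 2:] * np.array(fxy) + np.array(cxy)

v = np.array([0.3, -0.2, 1.0])

# Scale the same ray by several k: a central model cannot tell these apart
q = np.array([project_pinhole(k * v) for k in (0.5, 1.0, 10.0, 1000.0)])
print(q)
```

A non-central model would make the pixel depend on \(k\), which is exactly what this model cannot represent.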

The current projection uncertainty method works badly if given chessboards at multiple different ranges from the camera. This is due to the aphysical transform \(T_{\mathrm{r}^+\mathrm{r}}\) computed as part of the uncertainty computation. We can clearly see this in the dance study:
```
dance-study.py                          \
  --scan num_far_constant_Nframes_near  \
  --range 2,10                          \
  --Ncameras 1                          \
  --Nframes-near 100                    \
  --observed-pixel-uncertainty 2        \
  --ymax 2.5                            \
  --uncertainty-at-range-sampled-max 35 \
  opencv8.cameramodel
```
This tells us that adding any observations at 10m to the bulk set at 2m makes the projection uncertainty worse. One could expect no improvement from the far-off observations, but they shouldn't break anything. The issue is the averaging in 3D point space. Observation noise causes the far-off geometry to move much more than the nearby chessboards, and that far-off motion then dominates the average. We can also see it with the much larger ellipse we get when we add --extra-observation-at to
```
test/test-projection-uncertainty.py \
  --fixed cam0                      \
  --model opencv4                   \
  --show-distribution               \
  --range-to-boards 4               \
  --extra-observation-at 40         \
  --do-sample                       \
  --explore
```
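
A toy calculation shows why the far-off geometry takes over a fit done in 3D point space. This is a sketch under strong assumptions (a pinhole camera with a made-up focal length, the same pixel-level perturbation at every range, lateral motion only); it is not what mrcal computes. The same pixel shift moves a point laterally by an amount proportional to its range, and a least-squares fit weighs that shift squared:

```python
import numpy as np

def far_share_of_squared_shift(range_near, n_near, range_far, n_far,
                               dq=2.0, f=1000.0):
    """Fraction of the summed squared 3D shift contributed by the far points.

    dq is the pixel perturbation and f the focal length in pixels. Both are
    hypothetical values; the returned ratio does not depend on them.
    """
    ranges = np.array([range_near] * n_near + [range_far] * n_far)
    shift = ranges * dq / f          # lateral 3D shift grows linearly with range
    sq = shift**2
    return sq[n_near:].sum() / sq.sum()

# 100 boards at 2m plus 5 boards at 10m
share_10m = far_share_of_squared_shift(2.0, 100, 10.0, 5)
# 100 boards at 4m plus a single board at 40m
share_40m = far_share_of_squared_shift(4.0, 100, 40.0, 1)
print(share_10m, share_40m)
```

In this toy model a single observation at 40m carries as much squared weight as 100 observations at 4m, since the weight scales with the square of the range ratio.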

Some experimental fixes are implemented in test/test-projection-uncertainty.py. For instance:

```
test/test-projection-uncertainty.py \
  --fixed cam0                      \
  --model opencv4                   \
  --show-distribution               \
  --explore                         \
  --do-sample                       \
  --reproject-perturbed mean-frames-using-meanq-penalize-big-shifts
```

It is important to solve this to be able to clearly say whether non-closeup observations are useful at all. There was quite a bit of thought and experimentation in this area, but no conclusive solutions yet.

The solution being considered: solve for \(T_{\mathrm{r}^+\mathrm{r}}\) directly. We have a solve that minimizes the reprojection error \(\sum_i \left\Vert\vec q_i - \mathrm{project}\left(T_\mathrm{cr_i} T_\mathrm{rf_i} \vec p_{\mathrm{frame}_i}\right)\right\Vert^2\) and another one that looks at perturbed quantities \(\left\Vert\vec q^+ - \mathrm{project}^+\left(T_{\mathrm{c}^+\mathrm{r}^+} T_{\mathrm{r}^+\mathrm{f}^+} \vec p_{\mathrm{frame}}\right)\right\Vert^2\). Can I cross these to find the \(T_{\mathrm{r}^+\mathrm{r}}\) that minimizes \(\left\Vert\vec q^+ - \mathrm{project}^+\left(T_{\mathrm{c}^+\mathrm{r}^+} T_{\mathrm{r}^+\mathrm{r}} T_\mathrm{rf} \vec p_{\mathrm{frame}}\right)\right\Vert^2\)? A diagram:

ORIGINAL SOLVE                   PERTURBED SOLVE

point in                         point in
chessboard                       chessboard
frame                            frame

  |                                |
  | Trf                            | Tr+f+
  v                                v

point in                         point in
ref frame     <-- Trr+ -->       ref frame

  |                                |
  | Tcr                            | Tc+r+
  v                                v

point in                         point in
cam frame                        cam frame

  |                                |
  | project                        | project
  v                                v

pixel                            pixel

Some experiments along those lines are implemented in mrcal-show-projection-diff --same-dance and in test/test-projection-uncertainty.py --reproject-perturbed ...
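
The crossed solve can be sketched on synthetic data. This is a toy under loud assumptions: a pinhole model stands in for \(\mathrm{project}^+\), the data is noise-free, the points are already in the unperturbed reference frame (\(T_\mathrm{rf} \vec p_{\mathrm{frame}}\) has been applied), and the solver is a plain Gauss-Newton loop with a numerical Jacobian. All names are hypothetical and none of this is the code in the experiments above:

```python
import numpy as np

def rotation_from_r(r):
    """Rodrigues vector -> 3x3 rotation matrix."""
    th = np.linalg.norm(r)
    K = np.array([[0.0, -r[2], r[1]],
                  [r[2], 0.0, -r[0]],
                  [-r[1], r[0], 0.0]])
    if th < 1e-12:
        return np.eye(3) + K
    return np.eye(3) + np.sin(th) / th * K + (1 - np.cos(th)) / th**2 * (K @ K)

def transform(rt, p):
    """Apply the 6-vector rt = (r, t) to an (N,3) array of points."""
    return p @ rotation_from_r(rt[:3]).T + rt[3:]

def project(p, f=1000.0, c=500.0):
    """Toy pinhole projection standing in for project+()."""
    return f * p[:, :2] / p[:, 2:] + c

def residuals(rt_rr, q_plus, rt_cr_plus, p_ref):
    """q+ - project+(T_c+r+ T_r+r p_ref), flattened."""
    p_cam = transform(rt_cr_plus, transform(rt_rr, p_ref))
    return (q_plus - project(p_cam)).ravel()

def solve_rt_rr(q_plus, rt_cr_plus, p_ref, iterations=10, h=1e-6):
    """Gauss-Newton with a central-difference Jacobian, seeded at identity."""
    rt = np.zeros(6)
    for _ in range(iterations):
        x = residuals(rt, q_plus, rt_cr_plus, p_ref)
        J = np.empty((x.size, 6))
        for i in range(6):
            d = np.zeros(6)
            d[i] = h
            J[:, i] = (residuals(rt + d, q_plus, rt_cr_plus, p_ref) -
                       residuals(rt - d, q_plus, rt_cr_plus, p_ref)) / (2 * h)
        rt -= np.linalg.lstsq(J, x, rcond=None)[0]
    return rt

# Synthetic scene: points in the reference frame at two ranges
rng = np.random.default_rng(0)
N = 50
p_ref = np.column_stack((rng.uniform(-1, 1, N),
                         rng.uniform(-1, 1, N),
                         rng.choice([2.0, 10.0], N)))
rt_rr_true = np.array([0.01, -0.02, 0.005, 0.03, -0.01, 0.02])
rt_cr_plus = np.array([0.001, 0.002, -0.001, 0.01, 0.0, -0.01])
q_plus = project(transform(rt_cr_plus, transform(rt_rr_true, p_ref)))

rt_rr = solve_rt_rr(q_plus, rt_cr_plus, p_ref)
print(np.abs(rt_rr - rt_rr_true).max())
```

Because the cost is evaluated in pixel space, points at all ranges contribute on the same footing, unlike an average taken in 3D point space.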

When asked to compute the uncertainty of many pixels at once (as the mrcal-show-projection-uncertainty tool does), mrcal currently computes a separate \(T_{\mathrm{r}^+\mathrm{r}}\) for each pixel. But there exists only one \(T_{\mathrm{r}^+\mathrm{r}}\), so it should be computed once and then applied to all the pixels.

Currently we are able to compute projection uncertainties only when given a vanilla calibration problem: stationary cameras observing a moving chessboard. We should support more cases, for instance structure-from-motion coupled with intrinsics optimization. And computing uncertainty from a points-only, chessboard-less solve should be possible.

Richer board-shape model. Currently mrcal can solve for an axis-aligned paraboloid board shape. This is better than nothing, but experiments indicate that real-world board warping is more complex than that. A richer board-shape model will make mrcal less sensitive to imperfect chessboards, and will reduce that source of bias. This is being developed in the richer-board-shape branch, but this has the lowest priority of any ongoing work.
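
For reference, an axis-aligned paraboloid warp can be sketched as follows. This is an illustration only: the function, the parameter names kx and ky, and the exact normalization are assumptions, and mrcal's own parametrization may differ. The deflection is 0 at the board edges and largest at the center along each axis:

```python
import numpy as np

def paraboloid_board(nx, ny, spacing, kx, ky):
    """(ny, nx, 3) grid of board corners with an axis-aligned paraboloid warp.

    kx, ky are the out-of-plane deflections (same units as spacing) at the
    middle of the board along each axis. Hypothetical parametrization.
    """
    x = np.linspace(-1, 1, nx)       # normalized horizontal coordinate
    y = np.linspace(-1, 1, ny)       # normalized vertical coordinate
    xx, yy = np.meshgrid(x, y)
    board = np.zeros((ny, nx, 3))
    board[..., 0] = np.arange(nx) * spacing
    board[..., 1] = (np.arange(ny) * spacing)[:, None]
    board[..., 2] = kx * (1 - xx**2) + ky * (1 - yy**2)
    return board

board = paraboloid_board(nx=11, ny=11, spacing=0.1, kx=0.002, ky=-0.001)
print(board[5, 5, 2], board[0, 0, 2])
```

With only two parameters this cannot represent a twist or a localized dent, which is the kind of shape a richer model would need to capture.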

Things that should be fixed, but that I'm not actively thinking about today

Algorithmic

Uncertainty quantification

The input noise should be characterized better. Currently we use the distribution from the optimal residuals. This feels right, but the empirical distribution isn't entirely gaussian. Why? There's an attempt to quantify the input noise directly in mrgingham. Does it work? Does that estimate agree with what the residuals tell us? If not, which is right? If a better method is found, the observed_pixel_uncertainty should come back as something the user passes in.
Can I quantify heteroscedasticity to detect model errors? In the tour of mrcal the human observer can clearly see patterns in the residuals. Can these patterns be detected automatically to flag these issues, especially when they're small and not entirely obvious? Do I want a "White test"?
As desired, we currently report high uncertainties in imager regions with no chessboards. When using a splined model, the projection in those regions is controlled entirely by the regularization terms, so we report high uncertainties there only because of the moving extrinsics. This isn't a great thing to rely on, and could break if I have some kind of surveyed calibration (known chessboard and/or camera poses).
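
As a starting point for the heteroscedasticity question, here is a sketch of a White-style test on synthetic residuals. The assumptions: the regressors are the pixel coordinates of each observation (normalized to [-1,1] here), the residuals are scalar, and the data is made up. The statistic is n*R^2 from regressing the squared residuals on x, y, their squares and their cross product. Under homoscedasticity it is asymptotically chi-square with 5 degrees of freedom, so values far above roughly 11.07 (the 95% point) suggest that the noise level varies across the imager:

```python
import numpy as np

def white_statistic(residuals, x, y):
    """n*R^2 of the auxiliary regression of residuals^2 on 1,x,y,x^2,xy,y^2."""
    e2 = residuals**2
    A = np.column_stack((np.ones_like(x), x, y, x * x, x * y, y * y))
    coeffs = np.linalg.lstsq(A, e2, rcond=None)[0]
    ss_res = np.sum((e2 - A @ coeffs)**2)
    ss_tot = np.sum((e2 - e2.mean())**2)
    return e2.size * (1 - ss_res / ss_tot)

rng = np.random.default_rng(0)
n = 2000
x, y = rng.uniform(-1, 1, (2, n))

# Same noise level everywhere
res_uniform = rng.normal(0, 0.3, n)
# Noise level growing away from the imager center
res_radial = rng.normal(0, 1, n) * (0.1 + x * x + y * y)

stat_uniform = white_statistic(res_uniform, x, y)
stat_radial = white_statistic(res_radial, x, y)
print(stat_uniform, stat_radial)
```

Whether something this simple catches the subtle patterns seen in real solves is exactly the open question.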

Differencing

Fitting of the implied transformation is key to computing a diff, and various details about how this is done could be improved. Currently mrcal computes this from a fit. The default behavior of mrcal-show-projection-diff is to use the whole imager, using the uncertainties as weights. This has two problems:

If using a splined model, this is slow
If using a lean model, the overly-optimistic uncertainties you get from lean models tend to poison the fit, as seen in the documentation.
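
To illustrate what "using the uncertainties as weights" means in such a fit, here is a sketch of a rotation-only fit between two sets of observation vectors, solved with a weighted Kabsch/SVD step. This is a stand-in technique on synthetic data with hypothetical names, not mrcal's implied-transformation code:

```python
import numpy as np

def fit_rotation_weighted(v0, v1, weights):
    """Rotation R minimizing sum_i w_i ||R v0_i - v1_i||^2 (weighted Kabsch)."""
    H = (v0 * weights[:, None]).T @ v1
    U, _, Vt = np.linalg.svd(H)
    # Force det(R) = +1 so that we get a rotation and not a reflection
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    return Vt.T @ D @ U.T

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

rng = np.random.default_rng(0)
N = 200
v0 = np.column_stack((rng.uniform(-1, 1, (N, 2)), np.ones(N)))
v0 /= np.linalg.norm(v0, axis=1, keepdims=True)
R_true = rot_x(0.01) @ rot_y(-0.02)

# Half of the observations are precise, half are 100x noisier
sigma = np.where(np.arange(N) < N // 2, 1e-4, 1e-2)
v1 = v0 @ R_true.T + rng.normal(size=(N, 3)) * sigma[:, None]

R_weighted = fit_rotation_weighted(v0, v1, 1.0 / sigma**2)
R_unweighted = fit_rotation_weighted(v0, v1, np.ones(N))
err_weighted = np.linalg.norm(R_weighted - R_true)
err_unweighted = np.linalg.norm(R_unweighted - R_true)
print(err_weighted, err_unweighted)
```

The weights help only if the reported uncertainties are honest. If a lean model reports a too-small sigma for regions where it is actually wrong, those regions get large weights and pull the fit, which is the poisoning described above.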

Triangulation

Currently I have a routine to compute projection uncertainty. And a separate routine to compute triangulation uncertainty. It would be nice to have a generic monocular uncertainty routine that is applicable to those and more cases. Should I be computing the uncertainty of a stabilized, normalized stereographic projection of \(\mathrm{unproject}\left(\vec q\right)\)? Then I could do monocular tracking with uncertainties. Can I derive the existing uncertainty methods from that one?
As noted on the triangulation page, some distributions become non-gaussian when looking at infinity. Is this a problem? When is it a problem? Should it be fixed? How?
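
For reference, the normalized stereographic projection of an observation vector can be sketched as below. The function name is made up, and the "stabilized" part of the idea is not addressed here. With \(\theta\) the angle off the optical axis, the projection is \(2 \tan(\theta/2)\) along the direction of \((x,y)\), which can be written without any trigonometry:

```python
import numpy as np

def stereographic_normalized(v):
    """Normalized stereographic projection of (..., 3) direction vectors.

    u = 2 tan(theta/2) (x, y) / |(x, y)|, evaluated as 2 (x, y) / (|v| + z)
    so that it stays well-defined on the optical axis.
    """
    v = np.asarray(v, dtype=float)
    mag = np.linalg.norm(v, axis=-1, keepdims=True)
    return 2.0 * v[..., :2] / (mag + v[..., 2:])

v = np.array([0.3, -0.2, 1.0])
print(stereographic_normalized(v))
print(stereographic_normalized(3.0 * v))       # same ray, same result
print(stereographic_normalized([1.0, 0.0, 0.0]))  # 90 degrees off-axis
```

The result depends only on the direction of \(\vec v\), and it remains finite for rays at 90 degrees off-axis and beyond, which a pinhole projection cannot handle.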

Splined models

It's currently not clear how to choose the spline order (the order configuration parameter) and the spline density (the Nx and Ny parameters). There's some trade-off here: a quadratic spline needs denser knots. An initial study of the effects of spline spacings appears here. Can this be used to select the best spline configuration?
In the tour of mrcal we saw that uncertainty oscillates, with peaks at the knots. The causes and implications of this need to be understood better
The current regularization scheme is iffy. More or less mrcal is using simple L2 regularization. Something is required to tell the solver what to do in regions of no data. The transition between "data" and "no-data" regions is currently aphysical, as described in the documentation. Changing the regularization scheme to pull towards the mean, and not towards 0, could possibly fix this. An earlier attempt to do that was reverted because any planar splined surface would have "perfect" regularization, and that was breaking things (crazy focal lengths would be picked). But now that I'm locking down the intrinsics core when optimizing splined models, this isn't a problem anymore, so maybe that approach should be revisited.

Outlier rejection

The current outlier-rejection scheme is simplistic. A smarter approach is available in libdogleg (Cook's D and Dima's variations on that). Bringing those in could be good
Outlier rejection is currently only enabled for chessboard observations. It should be enabled for discrete points as well
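
As a reminder of what Cook's D measures, here is the textbook version for a linear least-squares fit. This is a generic sketch on synthetic data, not the libdogleg implementation and not Dima's variations. Each measurement gets a score combining its residual with its leverage:

```python
import numpy as np

def cooks_distance(A, b):
    """Cook's distance of each measurement in the linear fit min ||A x - b||^2."""
    n, p = A.shape
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    e = b - A @ x
    # Leverage: diagonal of the hat matrix A (A^T A)^-1 A^T, via a thin QR
    Q, _ = np.linalg.qr(A)
    h = np.sum(Q**2, axis=1)
    s2 = e @ e / (n - p)             # residual variance estimate
    return e**2 / (p * s2) * h / (1 - h)**2

# Line fit with one gross outlier injected at index 40
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 50)
A = np.column_stack((t, np.ones_like(t)))
b = 2.0 * t + 1.0 + rng.normal(0, 0.01, t.size)
b[40] += 1.0

D = cooks_distance(A, b)
print(D.argmax(), D.max())
```

Unlike a plain residual threshold, this accounts for how strongly each measurement pulls the solution towards itself.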

Stereo

A pre-filter should be added to the mrcal-stereo tool to enhance the edges prior to stereo matching. A patch to add an early, untested prototype:

diff --git a/mrcal/stereo.py b/mrcal/stereo.py
index 6ba3549..7a6eabc 100644
--- a/mrcal/stereo.py
+++ b/mrcal/stereo.py
@@ -1276,5 +1276,22 @@ data_tuples, plot_options. The plot can then be made with gp.plot(*data_tuples,
                q0[ 0,-1],
                q0[-1,-1] )

+    image1 = image1.astype(np.float32)
+    image1 -= \
+        cv2.boxFilter(image1,
+                      ddepth     = -1,
+                      ksize      = tuple(template_size1),
+                      normalize  = True,
+                      borderType = cv2.BORDER_REPLICATE)
+    template_size0 = (round(np.max(q0[...,1]) - np.min(q0[...,1])),
+                      round(np.max(q0[...,0]) - np.min(q0[...,0])))
+    # I don't need to mean-0 the entire image0. Just the template will do
+    image0 = image0.astype(np.float32)
+    image0 -= \
+        cv2.boxFilter(image0,
+                      ddepth     = -1,
+                      ksize      = template_size0,
+                      normalize  = True,
+                      borderType = cv2.BORDER_REPLICATE)
     image0_template = mrcal.transform_image(image0, q0)

Currently a stereo pair arranged axially (one camera in front of the other) causes mrcal to fail. But it could work: the rectified images are similar to a polar transform of the input.

`mrcal.estimate_monocular_calobject_poses_Rt_tocam()`

An early stage of a calibration run generates a rough estimate of the chessboard geometry. Internally this is currently assuming a pinhole model, which is wrong, and currently requires an ugly hack. This does appear to work fairly well, but it should be fixed

Software

Stereo

The mrcal-stereo tool should be able to estimate the field of view automatically: the user should not be required to pass --az-fov-deg and --el-fov-deg

Uncertainty

Currently mrcal.triangulate() broadcasts nicely, while mrcal.projection_uncertainty() does not. It would be nice if it did and if its API resembled that of mrcal.triangulate()

Misc

mrcal-show-geometry tool: the mrcal-stereo tool produces a field-of-view visualization. This should be made available in the Python API and in the mrcal-show-geometry tool
dance-study.py: if asked for chessboards that are too close, the tool goes into an infinite loop as it searches for chessboard poses that are fully visible by the camera. Something smarter than an infinite loop should happen
Warnings in mrcal.c: there are a number of warnings in mrcal.c tagged with // WARNING that should eventually be addressed. This has never been urgent enough to deal with. But someday
viz tools should accept --vectorfield and --vector-field