# Projection uncertainty: mean-frames

A key part of the projection uncertainty propagation method is to compute a function \(\vec q^+\left(\vec b\right)\) to represent the change in projected pixel \(\vec q\) as the optimization vector \(\vec b\) moves around.

mrcal has several methods of doing this, and the legacy *mean-frames* method is
described here. It is accessible by calling
`mrcal.projection_uncertainty(method = "mean-frames")`. This method is simple,
but has some issues that are resolved by newer formulations: starting with mrcal
3.0 the improved *cross-reprojection* uncertainty method is recommended.

## Mean-frames uncertainty

The state vector \(\vec b\) is a random variable, and we know its distribution. To
evaluate the projection uncertainty we want to project a *fixed* point, to see
how this projection \(\vec q\) moves around as the chessboards and cameras and
intrinsics shift due to the uncertainty in \(\vec b\). In other words, we want to
project a point defined in the coordinate system of the camera housing, as the
origin of the mathematical camera moves around inside this housing.

How do we operate on points in a fixed coordinate system when all the coordinate
systems we have are floating random variables? We use the most fixed thing we
have: chessboards. As with the camera housing, the chessboards themselves are
fixed in space. We have noisy camera observations of the chessboards that
implicitly produce estimates of the fixed transformation \(T_{\mathrm{cf}_i}\) for
each chessboard \(i\). The explicit transformations that we *actually* have in
\(\vec b\) all relate to a floating reference coordinate system: \(T_\mathrm{cr}\)
and \(T_\mathrm{rf}\). *That* coordinate system doesn't have any physical meaning,
and it's useless in producing our fixed point.

Thus points defined in a chessboard frame are unaffected by the untethered reference coordinate system, and if we project such points, so are the projections. Points in a chessboard frame are therefore somewhat "fixed" for our purposes.

To begin, let's focus on just *one* chessboard frame: frame 0. We want to know
the uncertainty at a pixel coordinate \(\vec q\), so let's unproject and transform
\(\vec q\) out to frame 0:

\[ \vec p_{\mathrm{frame}_0} = T_{\mathrm{f}_0\mathrm{r}} T_\mathrm{rc} \mathrm{unproject}\left( \vec q \right) \]
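This transformation can be sketched numerically. The following is a minimal numpy sketch, not mrcal's implementation: the transforms \(T_\mathrm{cr}\) and \(T_{\mathrm{f}_0\mathrm{r}}\) are made-up values for illustration, and a simple pinhole model stands in for `mrcal.unproject()` with a real lens model.

```python
import numpy as np

def invert_Rt(R, t):
    # Invert a rigid transform x' = R x + t
    return R.T, -R.T @ t

def apply_Rt(R, t, p):
    return R @ p + t

# Hypothetical camera-from-reference transform T_cr (NOT from a real
# calibration): a small rotation about z, and a small translation
th = np.deg2rad(5.)
R_cr = np.array([[ np.cos(th), -np.sin(th), 0.],
                 [ np.sin(th),  np.cos(th), 0.],
                 [ 0.,          0.,         1.]])
t_cr = np.array([0.1, -0.05, 0.2])

# Hypothetical frame0-from-reference transform T_{f_0 r}
R_f0r = np.eye(3)
t_f0r = np.array([0., 0., -2.])

# Stand-in unprojection: pinhole with focal length f, center c
f = 1000.
c = np.array([1500., 1000.])
q = np.array([1700., 1100.])               # pixel of interest
v = np.array([*((q - c) / f), 1.])         # observation direction
p_cam = v * 2.                             # pick a range: 2m out

# p_frame0 = T_{f_0 r} T_rc p_cam
R_rc, t_rc = invert_Rt(R_cr, t_cr)
p_frame0 = apply_Rt(R_f0r, t_f0r, apply_Rt(R_rc, t_rc, p_cam))
print(p_frame0)
```

Transforming \(\vec p_{\mathrm{frame}_0}\) back through \(T_\mathrm{cr} T_{\mathrm{rf}_0}\) recovers \(\vec p_\mathrm{camera}\) exactly, which is what the unperturbed path of the data flow below does.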

We then transform and project \(\vec p_{\mathrm{frame}_0}\) back to the imager to get \(\vec q^+\). But here we take into account the uncertainties of each transformation to get the desired projection uncertainty \(\mathrm{Var}\left(\vec q^+ - \vec q\right)\). The full data flow looks like this, with all the perturbed quantities marked with a \(+\) superscript.

\[ \xymatrix{ \vec q^+ & & \vec p^+_\mathrm{camera} \ar[ll]_-{\vec b_\mathrm{intrinsics}^+} & \vec p^+_\mathrm{reference} \ar[l]_-{T^+_\mathrm{cr}} & \vec p_{\mathrm{frame}_0} \ar[l]_-{T^+_{\mathrm{rf}_0}} & \vec p_\mathrm{reference} \ar[l]_-{T_{\mathrm{f}_0\mathrm{r}}} & \vec p_\mathrm{camera} \ar[l]_-{T_\mathrm{rc}} & & \vec q \ar[ll]_-{\vec b_\mathrm{intrinsics}} } \]

This works, but it depends on \(\vec p_{\mathrm{frame}_0}\) being "fixed". We can
do better. We're observing more than one chessboard, and *in aggregate* all the
chessboard frames can represent an even-more "fixed" frame. Currently we take a
very simple approach towards combining the frames: we compute the mean of all
the \(\vec p^+_\mathrm{reference}\) estimates from each frame. The full data flow
then looks like this:

\[ \xymatrix{ & & & & \vec p^+_{\mathrm{reference}_0} \ar[dl]_{\mathrm{mean}} & \vec p^+_{\mathrm{frame}_0} \ar[l]_-{T^+_{\mathrm{rf}_0}} \\ \vec q^+ & & \vec p^+_\mathrm{camera} \ar[ll]_-{\vec b_\mathrm{intrinsics}^+} & \vec p^+_{\mathrm{reference}} \ar[l]_-{T^+_\mathrm{cr}} & \vec p^+_{\mathrm{reference}_1} \ar[l]_{\mathrm{mean}} & \vec p^+_{\mathrm{frame}_1} \ar[l]_-{T^+_{\mathrm{rf}_1}} & \vec p_\mathrm{reference} \ar[l]_-{T_\mathrm{f_1 r}} \ar[lu]_-{T_\mathrm{f_0 r}} \ar[ld]^-{T_\mathrm{f_2 r}} & \vec p_\mathrm{camera} \ar[l]_-{T_\mathrm{rc}} & & \vec q \ar[ll]_-{\vec b_\mathrm{intrinsics}} \\ & & & & \vec p^+_{\mathrm{reference}_2} \ar[ul]^{\mathrm{mean}} & \vec p^+_{\mathrm{frame}_2} \ar[l]_-{T^+_{\mathrm{rf}_2}} } \]

So to summarize, to compute the projection uncertainty at a pixel \(\vec q\) we

- Unproject \(\vec q\) and transform to *each* chessboard coordinate system to obtain \(\vec p_{\mathrm{frame}_i}\)
- Transform and project back to \(\vec q^+\), using the mean of all the \(\vec p_{\mathrm{reference}_i}\) and taking into account uncertainties

We have \(\vec q^+\left(\vec b\right) = \mathrm{project}\left( T_\mathrm{cr} \, \mathrm{mean}_i \left( T_{\mathrm{rf}_i} \vec p_{\mathrm{frame}_i} \right) \right)\) where the transformations \(T\) and the intrinsics used in \(\mathrm{project}()\) come directly from the optimization state vector \(\vec b\). This function can be used in the higher-level projection uncertainty propagation method to compute \(\mathrm{Var}\left( \vec q \right)\).
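This composition can be sketched in a few lines of numpy. This is a hypothetical illustration, not mrcal's implementation: a pinhole model stands in for \(\mathrm{project}()\), and the transforms are passed in as explicit \((R, t)\) pairs rather than being pulled out of \(\vec b\).

```python
import numpy as np

def project_pinhole(p, f, c):
    # Stand-in for mrcal.project() with a real lens model
    return p[:2] / p[2] * f + c

def q_plus(R_cr, t_cr, Rt_rf, p_frame, f, c):
    # q+(b) = project( T_cr mean_i( T_rf_i p_frame_i ) )
    # All transforms and intrinsics come from the (possibly perturbed)
    # state vector b
    p_ref = np.mean([R @ p + t for (R, t), p in zip(Rt_rf, p_frame)],
                    axis=0)
    return project_pinhole(R_cr @ p_ref + t_cr, f, c)

# Sanity check: with identity transforms and identical per-frame points,
# q+ is just the pinhole projection of the point itself
I, z = np.eye(3), np.zeros(3)
p = np.array([0., 0., 2.])
q = q_plus(I, z, [(I, z), (I, z)], [p, p], 1000., np.array([1500., 1000.]))
print(q)   # [1500. 1000.]
```

Evaluating this function at perturbed transforms \(T^+_\mathrm{cr}\), \(T^+_{\mathrm{rf}_i}\) and perturbed intrinsics yields the \(\vec q^+\) whose variance we are after.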

## Problems with "mean-frames" uncertainty

This "mean-frames" uncertainty method works well, but has several issues:

### Aphysical \(T_{\mathrm{r}^+\mathrm{r}}\) transform

The computation above indirectly computes the transform that relates the unperturbed and perturbed reference coordinate systems:

\[ T_{\mathrm{r}^+\mathrm{r}} = \mathrm{mean}_i \left( T_{\mathrm{r}^+\mathrm{f}_i} T_{\mathrm{f}_i\mathrm{r}} \right) \]

Each transformation \(T\) includes a rotation matrix \(R\), so the above constructs a new rotation as a mean of multiple rotation matrices. This is aphysical: the resulting matrix is not orthonormal, and thus is not a valid rotation. In practice the perturbations are tiny, so this mean is usually, but not always, close enough.
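A quick numerical demonstration of this: averaging two valid rotation matrices produces a matrix that is not orthonormal, with the error shrinking only as the rotations approach each other.

```python
import numpy as np

def Rz(th):
    c, s = np.cos(th), np.sin(th)
    return np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])

def Rx(th):
    c, s = np.cos(th), np.sin(th)
    return np.array([[1., 0., 0.], [0., c, -s], [0., s, c]])

# Mean of two valid rotations with a large relative rotation: the
# orthonormality residual |R R^T - I| is far from 0
R_mean = (Rz(np.deg2rad(30.)) + Rx(np.deg2rad(40.))) / 2.
err_large = np.linalg.norm(R_mean @ R_mean.T - np.eye(3))

# Mean of two nearly-identical rotations: the residual is tiny
R_mean_small = (Rz(np.deg2rad(0.1)) + Rx(np.deg2rad(0.1))) / 2.
err_small = np.linalg.norm(R_mean_small @ R_mean_small.T - np.eye(3))

print(err_large)   # clearly nonzero: not a valid rotation
print(err_small)   # nearly zero: close enough in practice
```

The residual is second-order in the relative rotation between the averaged matrices, which is why the approximation holds for tiny perturbations but degrades in the disparate-range scenario described next.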

### Pessimistic response to disparate observed chessboard ranges

Because of this aphysical transform, the mean-frames method produces
fictitiously high uncertainties when given a mix of low-range and high-range
observations. Far-away chessboard observations don't contain much information,
so adding some far-away chessboards to a dataset shouldn't improve the
uncertainty much at long ranges, but it also shouldn't make it any worse.
However, with the mean-frames method, far-away observations *do* make the
uncertainty worse. We can clearly see this in the dance study:

```
analyses/dancing/dance-study.py                 \
    --scan num_far_constant_Nframes_near        \
    --range 2,10                                \
    --Ncameras 1                                \
    --Nframes-near 100                          \
    --observed-pixel-uncertainty 2              \
    --ymax 4                                    \
    --uncertainty-at-range-sampled-max 35       \
    --Nscan-samples 4                           \
    --method mean-frames                        \
    opencv8.cameramodel
```

This is a one-camera calibration computed from 100 chessboard observations at 2m out, with a few observations added at a longer range of 10m. Each curve represents the projection uncertainty at the center of the image, evaluated at different distances. The purple curve is the uncertainty with no 10m chessboards at all. As we add observations at 10m, the uncertainty gets worse.

The issue is the averaging in 3D point space. Observation noise causes the
far-off geometry to move much more than the nearby chessboards, and that far-off
motion then dominates the average. If we use the newer
`cross-reprojection--rrp-Jfp` method, this issue goes away:

```
analyses/dancing/dance-study.py                 \
    --scan num_far_constant_Nframes_near        \
    --range 2,10                                \
    --Ncameras 1                                \
    --Nframes-near 100                          \
    --observed-pixel-uncertainty 2              \
    --ymax 4                                    \
    --uncertainty-at-range-sampled-max 35       \
    --Nscan-samples 4                           \
    --method cross-reprojection--rrp-Jfp        \
    opencv8.cameramodel
```

As expected, the low-range uncertainty is unaffected by the 10m observations, but the far-range uncertainty is improved.

### Chessboards are a hard requirement

The "mean-frames" method has a hard requirement on chessboards being used in the
solve. In fact, the assumption of stationary cameras observing a moving
chessboard is baked into the formulation. So any other case (moving cameras or
calibrating off discrete points for instance) is not supported. The newer
`cross-reprojection--rrp-Jfp` method lifts this restriction.