A tour of mrcal: stereo processing
This is an overview of a more detailed discussion about dense stereo.
Stereo
We computed intrinsics from chessboard observations earlier, so let's use these for stereo processing.
I took several images off a catwalk over Figueroa Street in downtown Los Angeles. This is the view south along Figueroa Street. There are tall buildings ahead and on either side, which makes for an interesting stereo scene.
The two images out of the camera look like this:
All the full-size images are available by clicking on an image.
The cameras are 7ft (2.1m) apart. In order to compute stereo images we need an accurate estimate of the relative geometry of the cameras, which we usually get as an output of the calibration process. Here I only had one physical camera, so I did something different:
- I calibrated the one camera monocularly, to get its intrinsics. This is what we computed so far
- I used this camera to take a pair of images from slightly different locations. This is the "stereo pair"
- I used a separate tool to estimate the extrinsics from corresponding feature detections in the two images. This tool isn't a part of mrcal today, but will be made available in the future
Generating a stereo pair in this way works well to demo the stereo processing tools. However, it does break the triangulation uncertainty reporting, since we lose the uncertainty information in the extrinsics. Preserving this uncertainty, and propagating it through the recomputed extrinsics to triangulation, is on the mrcal roadmap. The resulting models we use for stereo processing:
I then use the mrcal APIs to compute rectification maps, rectify the images, compute disparities and convert them to ranges. This is all done with the mrcal-stereo tool:
mrcal-stereo                        \
  --az-fov-deg 145                  \
  --el-fov-deg 135                  \
  --el0-deg 5                       \
  --disparity-range 0 400           \
  --sgbm-p1 600                     \
  --sgbm-p2 2400                    \
  --sgbm-uniqueness-ratio 5         \
  --sgbm-disp12-max-diff 1          \
  --sgbm-speckle-window-size 200    \
  --sgbm-speckle-range 2            \
  --valid-intrinsics-region         \
  splined-[01].cameramodel          \
  [01].jpg
The --sgbm-... options configure the OpenCV SGBM stereo matcher. Not specifying them uses the OpenCV defaults, which usually produce poor results.
The rectified images look like this:
And the disparity and range images look like this:
On a basic level, this is clearly working well.
If you've used other stereo libraries previously, these rectified images may look odd. This is due to using a transverse equirectangular projection for rectification instead of the more common pinhole projection (see the detailed stereo-processing documentation). This samples the azimuth and elevation angles evenly, which minimizes the visual distortion inside each image row. A side effect is the vertical expansion in the rectified image at the azimuth extremes, but since stereo matching works by correlating the rows independently, this is a good trade-off.
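To make the even angular sampling concrete, here is a sketch of how a transverse equirectangular rectified pixel could map to angles and then to a direction vector. The axis convention and the focal-length value below are assumptions for illustration, not necessarily mrcal's exact convention:

```python
import math

def latlon_unproject(x, y, fx, fy, cx, cy):
    """Map a rectified pixel to (azimuth, elevation) in radians, then to
    a unit direction vector. In a transverse equirectangular (latlon)
    rectification, pixel coordinates are LINEAR in the angles, so each
    row sweeps azimuth at a constant angular rate. The axis convention
    here is an assumption for illustration."""
    az = (x - cx) / fx      # azimuth: rotation toward the baseline axis
    el = (y - cy) / fy      # elevation: rotation toward "down"
    v = (math.sin(az) * math.cos(el),   # x: along the baseline
         math.sin(el),                  # y: down
         math.cos(az) * math.cos(el))   # z: forward
    return az, el, v

# Even angular sampling: stepping one pixel along a row advances the
# azimuth by exactly 1/fx radians, regardless of where we are in the
# image. A pinhole projection does not have this property.
fx = fy = 500.0           # pixels per radian (hypothetical)
cx = cy = 400.0
az0, _, _ = latlon_unproject(400, 400, fx, fy, cx, cy)
az1, _, _ = latlon_unproject(401, 400, fx, fy, cx, cy)
az2, _, _ = latlon_unproject(800, 400, fx, fy, cx, cy)
az3, _, _ = latlon_unproject(801, 400, fx, fy, cx, cy)
print(az1 - az0, az3 - az2)  # identical angular steps at center and edge
```

This constant angle-per-pixel rate along each row is what keeps the visual distortion within a row low, at the cost of the vertical stretching visible at the azimuth extremes.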
The generated rectified models are here and here. These define a rectified system with axes
- \(x\): the baseline; from the origin of the left camera to the origin of the right camera
- \(y\): down
- \(z\): forward
mrcal-stereo selects the \(y\) and \(z\) axes to line up as much as possible with the geometry of the input models. Since the geometry of the actual cameras may not match the idealized rectified geometry (the "forward" view direction may not be orthogonal to the baseline), the usual expectations that \(c_x \approx \frac{\mathrm{imagerwidth}}{2}\) and \(c_y \approx \frac{\mathrm{imagerheight}}{2}\) are not necessarily true in the rectified system. Thus mrcal-stereo can visualize the models and the rectified field of view, for verification:
mrcal-stereo               \
  --az-fov-deg 145         \
  --el-fov-deg 135         \
  --el0-deg 5              \
  --set 'view 70,5'        \
  --viz geometry           \
  splined-[01].cameramodel
Here, the geometry is mostly nominal and the rectified view (indicated by the purple lines) does mostly lie along the \(z\) axis.
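The axis-selection logic described above can be sketched as a Gram-Schmidt-style orthogonalization: the \(x\) axis is forced along the baseline, and the \(z\) axis is the cameras' "forward" direction with its baseline component removed. This is an illustration, not mrcal's actual implementation, and the camera positions and "forward" hint below are hypothetical:

```python
import math

def rectified_rotation(t_left, t_right, forward_hint):
    """Sketch of choosing rectified axes. x lies along the baseline; z is
    the input "forward" direction orthogonalized against the baseline; y
    completes the right-handed system (pointing down when forward_hint
    and the baseline are set up that way). Pure-Python 3-vectors."""
    def sub(a, b):   return [ai - bi for ai, bi in zip(a, b)]
    def dot(a, b):   return sum(ai * bi for ai, bi in zip(a, b))
    def scale(a, s): return [ai * s for ai in a]
    def norm(a):     return scale(a, 1.0 / math.sqrt(dot(a, a)))
    def cross(a, b):
        return [a[1]*b[2] - a[2]*b[1],
                a[2]*b[0] - a[0]*b[2],
                a[0]*b[1] - a[1]*b[0]]

    x = norm(sub(t_right, t_left))                 # baseline direction
    z = norm(sub(forward_hint,                     # forward, with its
                 scale(x, dot(forward_hint, x))))  # baseline component removed
    y = cross(z, x)                                # completes the system
    return x, y, z

# Hypothetical geometry: cameras 2.1m apart along x, looking mostly
# forward, with the "forward" hint slightly skewed, as real cameras are.
x, y, z = rectified_rotation([0, 0, 0], [2.1, 0, 0], [0.05, 0.0, 1.0])
print(x, y, z)
```

Because \(z\) is built from the cameras' actual pointing direction rather than assumed orthogonal to the baseline, the rectified center of projection can land away from the image center, which is why the \(c_x, c_y\) expectations above don't necessarily hold.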
Stereo rectification outside of mrcal
As a toolkit, mrcal is fairly flexible, so I want to show how one could use other tools for parts of the stereo-processing pipeline, rather than letting mrcal-stereo do all the work.
What if we want to do our stereo processing with some other tool, and what if that tool doesn't support the splined model we want to use? We can use mrcal to reproject the image to whatever model we like, and then hand off the processed image and new models to that tool. Let's demonstrate with a pinhole model.
We can use the mrcal-reproject-image tool to reproject the images. Mapping
fisheye images to a pinhole model introduces an unwinnable trade-off: the
angular resolution changes dramatically as you move towards the edges of the
image. At the edges the angular resolution becomes extreme, and you need far
more pixels to represent the same arc in space as you do in the center. So you
usually need to throw out high-information pixels in the center, and gain
low-information pixels at the edges. The original image doesn't have more
resolution at the edges, so we interpolate. Cutting off the edges (i.e. using a
narrower lens) helps bring this back into balance.
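The trade-off can be quantified. For a pinhole projection \(x = f \tan\theta\), the pixel density along a row grows as \(f / \cos^2\theta\), diverging as the viewing angle approaches 90 degrees. A small sketch with a hypothetical focal length:

```python
import math

def pixels_per_degree_pinhole(f_pixels, theta_deg):
    """For a pinhole projection x = f*tan(theta), the pixel density
    along a row is dx/dtheta = f / cos(theta)^2, converted here from
    per-radian to per-degree."""
    theta = math.radians(theta_deg)
    return f_pixels / math.cos(theta)**2 * math.radians(1.0)

f = 500.0  # hypothetical focal length, in pixels
for theta in (0.0, 30.0, 60.0, 72.5):   # 72.5 deg: half of a 145-deg FOV
    print(f"{theta:5.1f} deg off-axis: "
          f"{pixels_per_degree_pinhole(f, theta):6.1f} pixels/deg")
```

At 72.5 degrees off-axis (the edge of a 145-degree field of view) the pinhole image spends roughly 11x as many pixels per degree as it does at the center: exactly the imbalance described above.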
So let's do this using two different focal lengths to illustrate the trade-off:
- --scale-focal 0.35: fairly wide. Looks extreme in a pinhole projection
- --scale-focal 0.6: not as wide. Looks more reasonable in a pinhole projection, but cuts off big chunks of the image at the edges
for scale in 0.35 0.6; do
  for c in 0 1; do
    mrcal-reproject-image           \
      --valid-intrinsics-region     \
      --to-pinhole                  \
      --scale-focal $scale          \
      splined-$c.cameramodel        \
      $c.jpg                        \
    | mrcal-to-cahvor               \
    > splined-$c.scale$scale.cahvor;
    mv $c-reprojected{,-scale$scale}.jpg;
  done
done
The wider pinhole resampling of the two images:
The narrower resampling of the two images:
We will use jplv (a stereo library used at NASA/JPL) to process these pinhole images into a disparity map, so I converted the models to the .cahvor file format, as that tool expects. The models:
- camera 0, wider scaling
- camera 1, wider scaling
- camera 0, narrower scaling
- camera 1, narrower scaling
Both clearly show the uneven resolution described above, with the wider image being far more extreme. I can now use these images to compute stereo with jplv:
for scale in 0.35 0.6; do
  stereo                                \
    --no-ran                            \
    --no-disp                           \
    --no-pre                            \
    --corr-width 17                     \
    --corr-height 5                     \
    --blob-area 10                      \
    --disp-min 0                        \
    --disp-max 400                      \
    splined-[01].scale$scale.cahvor     \
    [01]-reprojected-scale$scale.jpg;
  for f in rect-left rect-right diag-left; do
    mv 00-$f.png jplv-stereo-$f-scale$scale.png;
  done
done
The wide rectified images:
The narrow rectified images:
Like most non-mrcal tools, jplv uses a pinhole model for rectification. So even if we gave it wide-angle images and a wide-angle-friendly camera model, its rectified images would still suffer from the uneven-resolution issues described above.
The disparities computed by jplv look like this for wide images:
and for narrow images:
On a basic level, this is clearly working well also.
If jplv output its rectified pinhole models, we could do an apples-to-apples comparison using the SGBM correlator, as before. We would ask mrcal-stereo to accept jplv's rectification by passing mrcal-stereo --already-rectified.
Splitting a wide view into multiple narrow views
We just showed one way to use jplv to handle mrcal lenses, but we had to pay a price of degraded feature-matching accuracy due to unevenly-scaled rectified images. A way to do pinhole-rectified stereo while handling the geometric challenges of wide-angle lenses is to subdivide the wide field of view into multiple narrower virtual views. Then we'd have several narrow-angle stereo pairs instead of a single wide stereo pair, and each narrow pair can be processed with pinhole rectification. mrcal-stereo can do all the work. Let's look 45 degrees to the left:
mrcal-stereo                          \
  --rectification LENSMODEL_PINHOLE   \
  --az-fov-deg 80                     \
  --el-fov-deg 80                     \
  --az0-deg -45                       \
  --disparity-range 0 200             \
  --sgbm-p1 600                       \
  --sgbm-p2 2400                      \
  --sgbm-uniqueness-ratio 5           \
  --sgbm-disp12-max-diff 1            \
  --sgbm-speckle-window-size 200      \
  --sgbm-speckle-range 2              \
  --valid-intrinsics-region           \
  splined-[01].cameramodel            \
  [01].jpg
The pinhole rectified images:
And the disparity:
This looks much better than the pinhole-rectified stereo from the full image. The rectified pinhole models and images could be passed to a different tool to complete the processing, if desired.
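For the pinhole-rectified case, the disparity-to-range conversion follows from similar triangles: range = f B / d, where f is the rectified focal length in pixels, B the baseline, and d the disparity in pixels. A sketch with hypothetical numbers (the 2.1m baseline matches this scene; the focal length is made up):

```python
def range_from_disparity_pinhole(disparity_px, fx_px, baseline_m):
    """Pinhole-rectified stereo: similar triangles give
    range = fx * baseline / disparity. This is a sketch of the
    conversion only; the transverse equirectangular rectification
    used earlier needs a different (angular) formula."""
    if disparity_px <= 0:
        return float('inf')  # no match, or infinitely far away
    return fx_px * baseline_m / disparity_px

baseline = 2.1   # meters: the 7ft baseline of this scene
fx = 1000.0      # pixels: hypothetical rectified focal length
for d in (200, 20, 2):
    r = range_from_disparity_pinhole(d, fx, baseline)
    print(f"disparity {d:3d} px -> range {r:7.1f} m")
```

Note the inverse relationship: for a fixed disparity resolution, range resolution degrades quadratically with range.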
And we can see the rotated field of view when we visualize the rectified system:
mrcal-stereo                          \
  --rectification LENSMODEL_PINHOLE   \
  --az-fov-deg 80                     \
  --el-fov-deg 80                     \
  --az0-deg -45                       \
  --set 'view 70,5'                   \
  --viz geometry                      \
  splined-[01].cameramodel
Range accuracy
So far we have looked at the results qualitatively by having a human eyeball the disparity maps to verify that they looked reasonable. But the goal here is to get ranges. So let's talk about triangulation routines.
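To give a flavor of what a triangulation routine does, here is a sketch of the simplest method (the midpoint method): find the point on each observation ray closest to the other ray, and average them. This is an illustration with hypothetical geometry, not one of mrcal's triangulation routines, which are more careful than this:

```python
def triangulate_midpoint(o0, d0, o1, d1):
    """Midpoint triangulation: given two rays (origin o, direction d),
    solve for the parameters t0, t1 minimizing the distance between
    points on the rays, and return the midpoint of the closest pair.
    Pure-Python 3-vectors; degenerate (parallel) rays are not handled."""
    def dot(a, b): return sum(x * y for x, y in zip(a, b))
    e   = [b - a for a, b in zip(o0, o1)]   # baseline vector o1 - o0
    a00 = dot(d0, d0)
    a01 = dot(d0, d1)
    a11 = dot(d1, d1)
    det = a00 * a11 - a01 * a01             # 0 if the rays are parallel
    t0  = (a11 * dot(d0, e) - a01 * dot(d1, e)) / det
    t1  = (a01 * dot(d0, e) - a00 * dot(d1, e)) / det
    p0  = [o + t0 * d for o, d in zip(o0, d0)]
    p1  = [o + t1 * d for o, d in zip(o1, d1)]
    return [(a + b) / 2.0 for a, b in zip(p0, p1)]

# Hypothetical scene: cameras 2.1m apart on the x axis, both observing a
# point at (1, 0, 10). With perfect rays the midpoint recovers it exactly.
p = triangulate_midpoint([0, 0, 0], [1, 0, 10],
                         [2.1, 0, 0], [-1.1, 0, 10])
print(p)  # ~ [1.0, 0.0, 10.0]
```

With noisy feature detections the two rays don't intersect, and the way a triangulation routine resolves that disagreement drives the range accuracy.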
Next
We're ready to talk about triangulation routines