mrcal-stereo - Stereo processing


$ mrcal-stereo                        \
   --az-fov-deg               90      \
   --el-fov-deg               90      \
   --sgbm-block-size          5       \
   --sgbm-p1                  600     \
   --sgbm-p2                  2400    \
   --sgbm-uniqueness-ratio    5       \
   --sgbm-disp12-max-diff     1       \
   --sgbm-speckle-window-size 200     \
   --sgbm-speckle-range       2       \
   --outdir /tmp                      \
   left.cameramodel right.cameramodel \
   left.jpg         right.jpg

Processing left.jpg and right.jpg
Wrote '/tmp/rectified0.cameramodel'
Wrote '/tmp/rectified1.cameramodel'
Wrote '/tmp/left-rectified.png'
Wrote '/tmp/right-rectified.png'
Wrote '/tmp/left-disparity.png'
Wrote '/tmp/left-range.png'
Wrote '/tmp/points-cam0.vnl'


Given a pair of calibrated cameras and pairs of images captured by these cameras, this tool runs the whole stereo processing sequence to produce disparity and range images and a point cloud array.

mrcal functions are used to construct the rectified system. Stereo matching uses the OpenCV SGBM routine by default; libelas (ELAS) can be selected with --stereo-matcher if support for it was enabled at build time.

The commandline arguments to configure the SGBM matcher (--sgbm-...) map to the corresponding OpenCV APIs. Omitting an --sgbm-... argument will result in the defaults being used in the cv2.StereoSGBM_create() call. The cv2.StereoSGBM_create() defaults usually produce a poor disparity map. The --sgbm-... arguments in the synopsis above are a good starting point for usable stereo.
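As an illustration of that mapping, here is a minimal sketch that turns a set of --sgbm-... values into keyword arguments for cv2.StereoSGBM_create(). The helper sgbm_kwargs() and the dictionary SGBM_OPTION_TO_KWARG are hypothetical names made up for this example and are not part of mrcal; only the OpenCV keyword names on the right-hand side are real. Options that were not given are left out, so OpenCV would fall back to its own defaults for them.

```python
# Hypothetical mapping for illustration; this is not mrcal's own code.
# The values are keyword names accepted by cv2.StereoSGBM_create().
SGBM_OPTION_TO_KWARG = {
    "sgbm_block_size":          "blockSize",
    "sgbm_p1":                  "P1",
    "sgbm_p2":                  "P2",
    "sgbm_disp12_max_diff":     "disp12MaxDiff",
    "sgbm_pre_filter_cap":      "preFilterCap",
    "sgbm_uniqueness_ratio":    "uniquenessRatio",
    "sgbm_speckle_window_size": "speckleWindowSize",
    "sgbm_speckle_range":       "speckleRange",
}


def sgbm_kwargs(options):
    """Build keyword arguments from the options that were actually given.

    Options that are absent or None are skipped, so the OpenCV default
    would apply to them.
    """
    return {kwarg: options[name]
            for name, kwarg in SGBM_OPTION_TO_KWARG.items()
            if options.get(name) is not None}


# The values used in the synopsis above
kwargs = sgbm_kwargs({
    "sgbm_block_size":          5,
    "sgbm_p1":                  600,
    "sgbm_p2":                  2400,
    "sgbm_uniqueness_ratio":    5,
    "sgbm_disp12_max_diff":     1,
    "sgbm_speckle_window_size": 200,
    "sgbm_speckle_range":       2,
    "sgbm_pre_filter_cap":      None,   # not given: OpenCV default applies
})
print(kwargs)

# With OpenCV installed, a matcher could then be created with
#   import cv2
#   matcher = cv2.StereoSGBM_create(**kwargs)
```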

The rectified system is constructed with the axes

- x: from the origin of the first camera to the origin of the second camera (the baseline direction)

- y: completes the system from x,z

- z: the mean "forward" direction of the two input cameras, with the component parallel to the baseline subtracted off

The active window in this system is specified using a few parameters. These refer to

- the "azimuth" (or "az"): the direction along the baseline: rectified x axis

- the "elevation" (or "el"): the direction across the baseline: rectified y axis

The rectified field of view is given by the arguments --az-fov-deg and --el-fov-deg. At this time there's no auto-detection logic, and these must be given. Changing these is a "zoom" operation.

To pan the stereo system, pass --az0-deg and/or --el0-deg. These specify the center of the rectified images, and are optional.

Finally, the resolution of the rectified images is given with --pixels-per-deg. This is optional, and defaults to the resolution of the first input image. To scale the input resolution instead, pass a value <0: its magnitude is used as the scale factor. For instance, to generate rectified images at half the first-input-image resolution, pass --pixels-per-deg=-0.5. Note that Python's argparse has a problem with negative numbers, so "--pixels-per-deg -0.5" does not work; use the "=" form.
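The sign convention for --pixels-per-deg can be sketched as follows. The function resolve_pixels_per_deg() and the input resolution of 20 pixels per degree are hypothetical, chosen only for this example; the parser below is a stand-in that declares a single option, not the tool's real argument parser.

```python
import argparse


def resolve_pixels_per_deg(requested, input_pixels_per_deg):
    """Interpret a --pixels-per-deg value as described in the text.

    requested > 0: used as is
    requested < 0: magnitude is a scale factor on the input resolution
    """
    if requested > 0:
        return requested
    if requested < 0:
        return -requested * input_pixels_per_deg
    raise ValueError("--pixels-per-deg must be nonzero")


# Stand-in parser with one option; the value is attached with "=",
# which is the form the text recommends for negative values
parser = argparse.ArgumentParser()
parser.add_argument("--pixels-per-deg", type=float, default=-1.0)
args = parser.parse_args(["--pixels-per-deg=-0.5"])

# Assumed resolution of the first input image, in pixels per degree
input_resolution = 20.0

half = resolve_pixels_per_deg(args.pixels_per_deg, input_resolution)
print(half)
```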

The input images are specified by a pair of globs, so we can process many images with a single call. Each glob is expanded, and the filenames are sorted. The resulting lists of files are assumed to match up.
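The glob handling described above can be sketched like this. The function paired_images() is a hypothetical helper written for this example, and the length check is an addition of the sketch: the text only says the two lists are assumed to match up.

```python
import glob
import os
import tempfile


def paired_images(glob0, glob1):
    """Expand two globs, sort each list, and pair the files by position."""
    files0 = sorted(glob.glob(glob0))
    files1 = sorted(glob.glob(glob1))
    # Extra safety check added by this sketch
    if len(files0) != len(files1):
        raise ValueError(f"globs matched {len(files0)} and {len(files1)} files")
    return list(zip(files0, files1))


# Demonstration with empty placeholder files in a temporary directory
with tempfile.TemporaryDirectory() as d:
    for name in ("left-001.jpg", "left-000.jpg",
                 "right-000.jpg", "right-001.jpg"):
        open(os.path.join(d, name), "w").close()

    # glob.escape() protects any special characters in the directory name
    base = glob.escape(d)
    pairs = paired_images(os.path.join(base, "left-*.jpg"),
                          os.path.join(base, "right-*.jpg"))
    pair_names = [(os.path.basename(a), os.path.basename(b))
                  for a, b in pairs]

print(pair_names)
```

Because the pairing is purely positional, the filenames on both sides should sort in the same order, for instance by sharing a zero-padded frame number.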

There are several modes of operation:

- No images given: we compute the rectified system only, writing the models to disk

- No --viz argument given: we compute the rectified system and the disparity, and we write all output as images on disk

- --viz geometry: we compute the rectified system, and display its geometry as a plot. No rectification is computed; the images aren't used, and don't need to be passed in

- --viz stereo: we compute the rectified system and the disparity. We don't write anything to disk initially, but we invoke an interactive visualization tool to display the results. Requires pyFLTK and GL_image_display
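The four modes above amount to a small decision on two inputs: the --viz value and whether images were given. A minimal sketch of that decision follows; select_mode() and the returned labels are hypothetical names for this example. The text does not say what happens with --viz stereo and no images, so treating that case as "models only" is an assumption of the sketch.

```python
def select_mode(viz=None, have_images=False):
    """Pick the mode of operation from the --viz value and image presence."""
    if viz not in (None, "geometry", "stereo"):
        raise ValueError(f"unknown --viz value: {viz!r}")
    if viz == "geometry":
        # Images are ignored in this mode
        return "plot-geometry"
    if not have_images:
        # Assumption: applies to --viz stereo without images as well
        return "write-models-only"
    if viz == "stereo":
        return "interactive-stereo"
    return "write-images-to-disk"


for viz, have_images in ((None, False), (None, True),
                         ("geometry", False), ("stereo", True)):
    print(viz, have_images, "->", select_mode(viz, have_images))
```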

It is often desirable to compute dense stereo for many images in bulk. To make this go faster, this tool supports the -j JOBS option. This works just like in Make: the work will be parallelized among JOBS simultaneous processes. Unlike Make, the JOBS value must be specified.



models                Camera models representing cameras used to capture the
                      images. Both intrinsics and extrinsics are used
images                The image globs to use for the stereo. If omitted, we
                      only write out the rectified models. If given, exactly
                      two image globs must be given


-h, --help            show this help message and exit
--az-fov-deg AZ_FOV_DEG
                      The field of view in the azimuth direction, in
                      degrees. There's no auto-detection at this time, so
                      this argument is required (unless --already-rectified)
--el-fov-deg EL_FOV_DEG
                      The field of view in the elevation direction, in
                      degrees. There's no auto-detection at this time, so
                      this argument is required (unless --already-rectified)
--az0-deg AZ0_DEG     The azimuth center of the rectified images. "0" means
                      "the horizontal center of the rectified system is the
                      mean forward direction of the two cameras projected to
                      lie perpendicular to the baseline". If omitted, we
                      align the center of the rectified system with the
                      center of the two cameras' views
--el0-deg EL0_DEG     The elevation center of the rectified system. "0"
                      means "the vertical center of the rectified system
                      lies along the mean forward direction of the two
                      cameras". Defaults to 0.
--pixels-per-deg PIXELS_PER_DEG
                      The resolution of the rectified images. This is either
                      a whitespace-less, comma-separated list of two values
                      (az,el) or a single value to be applied to both axes.
                      If a resolution of >0 is requested, the value is used
                      as is. If a resolution of <0 is requested, we use this
                      as a scale factor on the resolution of the first input
                      image. For instance, to downsample by a factor of 2,
                      pass -0.5. By default, we use -1 for both axes: the
                      resolution of the input image at the center of the
                      rectified system.
                      The lens model to use for rectification. Currently two
                      models are supported: LENSMODEL_LATLON (the default)
                      and LENSMODEL_PINHOLE. Pinhole stereo works badly for
                      wide lenses and suffers from varying angular
                      resolution across the image. LENSMODEL_LATLON
                      rectification uses a transverse equirectangular
                      projection, and does not suffer from these effects. It
                      is thus the recommended model
--already-rectified   If given, assume the given models and images already
                      represent a rectified system. This will be checked,
                      and the models will be used as-is if the checks pass
--clahe               If given, apply CLAHE equalization to the images prior
                      to the stereo matching. If --already-rectified, we
                      still apply this equalization, if requested. Requires
                      --force-grayscale
--force-grayscale     If given, convert the images to grayscale prior to
                      doing anything else with them. By default, read the
                      images in their default format, and pass those
                      possibly-color images to all the processing steps.
                      Required if --clahe
--viz {geometry,stereo}
                      If given, we visualize either the rectified geometry
                      or the stereo results. If --viz geometry: we construct
                      the rectified stereo system, but instead of continuing
                      with the stereo processing, we render the geometry of
                      the stereo world; the images are ignored in this mode.
                      If --viz stereo: we launch an interactive graphical
                      tool to examine the rectification and stereo matching
                      results; the Fl_Gl_Image_Widget Python library must be
                      available
--axis-scale AXIS_SCALE
                      Used if --viz geometry. Scale for the camera axes. By
                      default a reasonable value is chosen (see
                      mrcal.show_geometry() for the logic)
--title TITLE         Used if --viz geometry. Title string for the plot
--hardcopy HARDCOPY   Used if --viz geometry. Write the output to disk,
                      instead of making an interactive plot. The output
                      filename is given in the option
--terminal TERMINAL   Used if --viz geometry. The gnuplotlib terminal. The
                      default is almost always right, so most people don't
                      need this option
--set SET             Used if --viz geometry. Extra 'set' directives to pass
                      to gnuplotlib. May be given multiple times
--unset UNSET         Used if --viz geometry. Extra 'unset' directives to
                      pass to gnuplotlib. May be given multiple times
--force, -f           By default existing files are not overwritten. Pass
                      --force to overwrite them without complaint
--outdir OUTDIR       Directory to write the output into. If omitted, we
                      use the current directory
--tag TAG             String to use in the output filenames. Non-specific
                      output filenames if omitted
                      The disparity limits to use in the search, in pixels.
                      Two integers are expected: MIN_DISPARITY
                      MAX_DISPARITY. Completely arbitrarily, we default to
                      MIN_DISPARITY=0 and MAX_DISPARITY=100
                      If given, annotate the image with its valid-intrinsics
                      region. This will end up in the rectified images, and
                      make it clear where successful matching shouldn't be
                      expected
                      The nearest,furthest range to encode in the range
                      image. Defaults to 1,1000, arbitrarily
--stereo-matcher {SGBM,ELAS}
                      The stereo-matching method. By default we use the
                      "SGBM" method from OpenCV. libelas isn't always
                      available, and must be enabled at compile-time by
                      setting USE_LIBELAS=1 during the build
--sgbm-block-size SGBM_BLOCK_SIZE
                      A parameter for the OpenCV SGBM matcher. If omitted, 5
                      is used
--sgbm-p1 SGBM_P1     A parameter for the OpenCV SGBM matcher. If omitted,
                      the OpenCV default is used
--sgbm-p2 SGBM_P2     A parameter for the OpenCV SGBM matcher. If omitted,
                      the OpenCV default is used
--sgbm-disp12-max-diff SGBM_DISP12_MAX_DIFF
                      A parameter for the OpenCV SGBM matcher. If omitted,
                      the OpenCV default is used
--sgbm-pre-filter-cap SGBM_PRE_FILTER_CAP
                      A parameter for the OpenCV SGBM matcher. If omitted,
                      the OpenCV default is used
--sgbm-uniqueness-ratio SGBM_UNIQUENESS_RATIO
                      A parameter for the OpenCV SGBM matcher. If omitted,
                      the OpenCV default is used
--sgbm-speckle-window-size SGBM_SPECKLE_WINDOW_SIZE
                      A parameter for the OpenCV SGBM matcher. If omitted,
                      the OpenCV default is used
--sgbm-speckle-range SGBM_SPECKLE_RANGE
                      A parameter for the OpenCV SGBM matcher. If omitted,
                      the OpenCV default is used
--sgbm-mode {SGBM,HH,HH4,SGBM_3WAY}
                      A parameter for the OpenCV SGBM matcher. Must be one
                      of ('SGBM','HH','HH4','SGBM_3WAY'). If omitted, the
                      OpenCV default (SGBM) is used
--write-point-cloud   If given, we write out the point cloud as a .ply file.
                      Each point is reported in the reference coordinate
                      system, colored with the nearest-neighbor color of the
                      camera0 image. This is disabled by default because
                      this is potentially a very large file
--jobs JOBS, -j JOBS  parallelize the processing JOBS-ways. This is like
                      Make, except you're required to explicitly specify a
                      job count. This applies when processing multiple sets
                      of images with the same set of models



Dima Kogan


Copyright (c) 2017-2021 California Institute of Technology ("Caltech"). U.S. Government sponsorship acknowledged. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); You may obtain a copy of the License at