mrcal C API


Internally, the Python functions use the mrcal C API. Only core functionality is available in the C API (the Python API can do some things that the C API cannot), but over time more and more functionality will be transitioned to a C-internal representation. Today, end-to-end dense stereo processing in C is possible.

The C API consists of several headers:

Most usages would simply #include <mrcal.h>, and this would include all the headers. This is a C (not C++) library, so X macros are used in several places for templating.

mrcal is a research project, so the capabilities and focus are still evolving. Thus, the interfaces, especially those in the C API, are not yet stable. I do try to maintain stability, but this is not fully possible, especially in the higher-level APIs (mrcal_optimize() and mrcal_optimizer_callback()). For now, assume that each major release breaks both the API and the ABI. The migration notes for each release are described in the relevant release notes.

If you use the C APIs, shoot me an email to let me know, and I'll keep you in mind when making API updates.

The best documentation for the C interfaces is the comments in the headers. I don't want to write up anything complete and detailed until I'm sure the interfaces are stable. The available functions are broken down into categories, and described in a bit more detail here.

Geometry structures

We have 3 structures in basic-geometry.h:

  • mrcal_point2_t: a vector containing 2 double-precision floating-point values. The elements can be accessed individually as .x and .y or as an array .xy[]
  • mrcal_point3_t: exactly like mrcal_point2_t, but 3-dimensional. A vector containing 3 double-precision floating-point values. The elements can be accessed individually as .x, .y and .z or as an array .xyz[]
  • mrcal_pose_t: an unconstrained 6-DOF pose. Contains two sub-structures:
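
To illustrate the dual named/array access of the point types above, here is a minimal sketch. sketch_point3_t and sketch_sum3() are stand-ins made up for this example, not copied from basic-geometry.h; the sketch assumes a compiler that supports C11 anonymous structs.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a 3D point type; NOT copied from basic-geometry.h. The
   anonymous struct and the array share the same storage, so p.y and p.xyz[1]
   name the same double */
typedef union
{
    struct { double x, y, z; };
    double xyz[3];
} sketch_point3_t;

/* Sum of the elements, written against the array view */
static double sketch_sum3(const sketch_point3_t* p)
{
    assert(p != NULL);
    double sum = 0.0;
    for (int i = 0; i < 3; i++)
        sum += p->xyz[i];
    return sum;
}
```

The named fields read well in geometry code, while the array view is convenient for loops and for passing the point to routines that take a double*.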

Geometry functions

A number of utility functions are defined in poseutils.h. Each routine has two forms:

  • A mrcal_..._full() function that supports a non-contiguous memory layout for each input and output
  • A convenience mrcal_...() macro that wraps mrcal_..._full(), and expects contiguous data. This has many fewer arguments, and is easier to call

Each data argument (input or output) has several items in the argument list:

  • double* xxx: a pointer to the first element in the array
  • int xxx_stride0, int xxx_stride1, …: the strides, one per dimension

The strides are given in bytes, and work as expected. For a (for instance) 3-dimensional xxx, the element at xxx[i,j,k] would be accessible as

*(double*) &((char*)xxx)[ i*xxx_stride0 +
                          j*xxx_stride1 +
                          k*xxx_stride2 ]

There are convenience macros in strides.h, so the above can be equivalently expressed as

P3(xxx, i,j,k)

These all have direct Python bindings. For instance the Python mrcal.rt_from_Rt() function wraps the mrcal_rt_from_Rt() C function.

The poseutils.h header serves as the listing of available functions.
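
The byte-stride addressing described above can be sketched in a few lines. sketch_at2() and sketch_sum_column() are hypothetical helpers written for this example; they are not part of strides.h or poseutils.h. The point is that the same routine can read a contiguous matrix, a single column of it, or its transpose, simply by being handed different strides.

```c
#include <assert.h>
#include <stddef.h>

/* Address element (i,j) of a 2D array of doubles whose strides are given in
   bytes. Hypothetical helper, not part of strides.h */
static double* sketch_at2(double* base, int stride0, int stride1, int i, int j)
{
    assert(base != NULL);
    return (double*)((char*)base +
                     (ptrdiff_t)i * stride0 +
                     (ptrdiff_t)j * stride1);
}

/* Sum one column of a contiguous row-major Nrows x Ncols matrix. Moving down a
   column means stepping by one full row in memory */
static double sketch_sum_column(double* matrix, int Nrows, int Ncols, int col)
{
    assert(col >= 0 && col < Ncols);
    const int stride0 = Ncols * (int)sizeof(double); /* step to the next row    */
    const int stride1 = (int)sizeof(double);         /* step to the next column */
    double sum = 0.0;
    for (int i = 0; i < Nrows; i++)
        sum += *sketch_at2(matrix, stride0, stride1, i, col);
    return sum;
}
```

Swapping stride0 and stride1 in a call to sketch_at2() addresses the transposed matrix without copying any data.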

Lens models

The lens model structures are defined here:

  • mrcal_lensmodel_type_t: an enum describing the lens model type. No configuration is stored here.
  • mrcal_lensmodel_t: a lens model type and the configuration parameters. The configuration lives in a union supporting all the known lens models
  • mrcal_lensmodel_metadata_t: some metadata that describes a model type. These are inherent properties of a particular model type; answers questions like: Can this model project behind the camera? Does it have an intrinsics core? Does it have gradients implemented?

The Python API describes a lens model with a string that contains the model type and the configuration, while the C API stores the same information in a mrcal_lensmodel_t.
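
The "type plus configuration in a union" arrangement can be sketched as follows. Everything here is hypothetical: the two model types, the splined configuration fields and the parameter counts are invented for illustration and are not the mrcal definitions. The sketch only shows why the configuration has to travel with the type: for some models the size of the intrinsics vector cannot be known from the type alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model types; these are NOT the mrcal enum values */
typedef enum { SKETCH_MODEL_PINHOLE, SKETCH_MODEL_SPLINED } sketch_model_type_t;

/* Configuration needed only by the hypothetical splined model */
typedef struct { int Nx, Ny; } sketch_splined_config_t;

/* A type, plus a union holding the configuration of the selected type */
typedef struct
{
    sketch_model_type_t type;
    union
    {
        sketch_splined_config_t splined; /* valid only for SKETCH_MODEL_SPLINED */
    };
} sketch_lensmodel_t;

/* Number of intrinsics values. Fixed for some types, dependent on the
   configuration for others */
static int sketch_num_params(const sketch_lensmodel_t* m)
{
    assert(m != NULL);
    switch (m->type)
    {
    case SKETCH_MODEL_PINHOLE:
        return 4; /* fx, fy, cx, cy */
    case SKETCH_MODEL_SPLINED:
        /* assumed for this sketch: 2 values per knot on an Nx-by-Ny grid */
        return 2 * m->splined.Nx * m->splined.Ny;
    }
    return -1; /* unknown type */
}
```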

Camera models

We can also represent a full camera model in C. This is a lens model with a pose and imager dimension: the full set of things in a .cameramodel file. The definitions appear in mrcal-types.h:

typedef struct
{
    double            rt_cam_ref[6];
    unsigned int      imagersize[2];
    mrcal_lensmodel_t lensmodel;
    double            intrinsics[0];
} mrcal_cameramodel_t;

typedef union
{
    mrcal_cameramodel_t m;
    struct
    {
        double            rt_cam_ref[6];
        unsigned int      imagersize[2];
        mrcal_lensmodel_t lensmodel;
        double intrinsics[4];
    };
} mrcal_cameramodel_LENSMODEL_LATLON_t;

...

Note that mrcal_cameramodel_t.intrinsics has size 0 because the size of this array depends on the specific lens model being used, and is unknown at compile time.

So it is an error to define this on the stack. Do not do this:

void f(void)
{
    mrcal_cameramodel_t model; // ERROR
}

If you need to define a known-at-compile-time model on the stack you can use the lensmodel-specific cameramodel types:

void f(void)
{
    mrcal_cameramodel_LENSMODEL_OPENCV8_t model; // OK
}

This only exists for models that have a constant number of parameters; notably there is no mrcal_cameramodel_LENSMODEL_SPLINED_STEREOGRAPHIC_t. When reading a model from disk, mrcal dynamically allocates the right amount of memory, and returns a mrcal_cameramodel_t*.
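
For reference, this is roughly what dynamic allocation of a structure with a run-time-sized trailing array looks like. It is a sketch under stated assumptions: sketch_cameramodel_t is a stand-in (it uses a C99 flexible array member and carries a made-up Nintrinsics field), and sketch_cameramodel_alloc() is not a mrcal function. When reading models from disk there's no need for any of this, since mrcal does the allocation itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for a model whose trailing array has a run-time length. This is NOT
   the mrcal_cameramodel_t definition */
typedef struct
{
    double       rt_cam_ref[6];
    unsigned int imagersize[2];
    int          Nintrinsics;  /* hypothetical field, added for this sketch */
    double       intrinsics[]; /* length known only at run time             */
} sketch_cameramodel_t;

/* Allocate the fixed-size part and the trailing array in one zeroed block.
   Returns NULL on failure; release the result with free() */
static sketch_cameramodel_t* sketch_cameramodel_alloc(int Nintrinsics)
{
    assert(Nintrinsics >= 0);
    sketch_cameramodel_t* m =
        calloc(1, sizeof(*m) + (size_t)Nintrinsics * sizeof(m->intrinsics[0]));
    if (m != NULL)
        m->Nintrinsics = Nintrinsics;
    return m;
}
```

A stack variable of this type would reserve no room for the trailing array at all, which is why such a definition cannot work.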

Projections

The fundamental functions for projection and unprojection are defined here. mrcal_project() is the main routine that implements the "forward" direction, and is available for every camera model. This function can return gradients with respect to the coordinates of the point being projected and/or with respect to the intrinsics vector.

mrcal_unproject() is the reverse direction, and is implemented as a numerical optimization to reverse the projection operation. Naturally, this is much slower than mrcal_project(). Since mrcal_unproject() is implemented with a nonlinear optimization, it has no gradient reporting. The Python mrcal.unproject() routine is higher-level, and it does report gradients.

The gradients of the forward mrcal_project() operation are used in this nonlinear optimization, so models that have no projection gradients defined do not support mrcal_unproject(). The Python mrcal.unproject() routine still makes this work, using numerical differences for the projection gradients.

Simple, special-case lens models have their own projection and unprojection functions defined:

void mrcal_project_pinhole(...);
void mrcal_unproject_pinhole(...);
void mrcal_project_stereographic(...);
void mrcal_unproject_stereographic(...);
void mrcal_project_lonlat(...);
void mrcal_unproject_lonlat(...);
void mrcal_project_latlon(...);
void mrcal_unproject_latlon(...);

These functions do the same thing as the general mrcal_project() and mrcal_unproject() functions, but work much faster.
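
To show why a special-case path can be so much faster, here is a sketch of the textbook pinhole equations. Both directions are closed-form, so no iterative solve is needed. The function names and argument lists are hypothetical and do NOT match the mrcal prototypes (those are elided above and take different arguments); the intrinsics order fx, fy, cx, cy is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Pinhole projection of a camera-frame point p = (x,y,z) to a pixel q.
   fxycxy holds (fx, fy, cx, cy). Hypothetical signature */
static void sketch_project_pinhole(double q[2],
                                   const double p[3],
                                   const double fxycxy[4])
{
    assert(q != NULL && p != NULL && fxycxy != NULL);
    assert(p[2] != 0.0);
    q[0] = fxycxy[0] * p[0] / p[2] + fxycxy[2];
    q[1] = fxycxy[1] * p[1] / p[2] + fxycxy[3];
}

/* Closed-form inverse. Returns an observation vector with z = 1; it is not
   normalized, since the range to the point cannot be recovered from one pixel */
static void sketch_unproject_pinhole(double v[3],
                                     const double q[2],
                                     const double fxycxy[4])
{
    assert(v != NULL && q != NULL && fxycxy != NULL);
    v[0] = (q[0] - fxycxy[2]) / fxycxy[0];
    v[1] = (q[1] - fxycxy[3]) / fxycxy[1];
    v[2] = 1.0;
}
```

Projecting a point and unprojecting the result gives back the original direction, up to scale.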

Layout of the measurement and state vectors

The optimization routine tries to minimize the 2-norm of the measurement vector \(\vec x\) by moving around the state vector \(\vec b\).

We select which parts of the optimization problem we're solving by setting bits in the mrcal_problem_selections_t structure. This defines

  • Which elements of the optimization vector are locked-down, and which are given to the optimizer to adjust
  • Whether we apply regularization to stabilize the solution
  • Whether the chessboard should be assumed flat, or if we should optimize deformation factors

Thus the state vector may contain any of

  • The lens parameters
  • The geometry of the cameras
  • The geometry of the observed chessboards and discrete points
  • The chessboard shape

The measurement vector may contain

  • The errors in observations of the chessboards
  • The errors in observations of discrete points
  • The penalties in the solved point positions
  • The regularization terms

Given mrcal_problem_selections_t and a vector \(\vec b\) or \(\vec x\), it is useful to know where specific quantities lie inside those vectors. Here we have 4 sets of functions to answer such questions:

  • int mrcal_state_index_THING(): Returns the index in the state vector \(\vec b\) where the contiguous block of values describing the THING begins. THING is any of

    • intrinsics
    • extrinsics
    • frames
    • points
    • calobject_warp

    If we're not optimizing the THING, a value <0 is returned

  • int mrcal_num_states_THING(): Returns the number of values in the contiguous block in the state vector \(\vec b\) that describe the given THING. THING is any of
    • intrinsics
    • extrinsics
    • frames
    • points
    • calobject_warp
  • int mrcal_measurement_index_THING(): Returns the index in the measurement vector \(\vec x\) where the contiguous block of values describing the THING begins. THING is any of
    • boards
    • points
    • regularization
  • int mrcal_num_measurements_THING(): Returns the number of values in the contiguous block in the measurement vector \(\vec x\) that describe the given THING. THING is any of
    • boards
    • points
    • regularization
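
The idea behind the index and count functions can be sketched like this. The sketch assumes that the blocks appear in the state vector in the order listed above and that a locked-down block contributes no state at all; both are assumptions of this example, and the real layout is whatever the mrcal functions report. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Blocks of the state vector, in the order ASSUMED by this sketch */
enum { SKETCH_INTRINSICS, SKETCH_EXTRINSICS, SKETCH_FRAMES, SKETCH_POINTS,
       SKETCH_CALOBJECT_WARP, SKETCH_NBLOCKS };

typedef struct
{
    int  Nstates [SKETCH_NBLOCKS]; /* number of values in each block        */
    bool optimize[SKETCH_NBLOCKS]; /* false: the block is locked down       */
} sketch_layout_t;

/* Number of state values contributed by a block: 0 if it is locked down */
static int sketch_num_states(const sketch_layout_t* l, int block)
{
    assert(l != NULL && block >= 0 && block < SKETCH_NBLOCKS);
    return l->optimize[block] ? l->Nstates[block] : 0;
}

/* Index of the first value of a block in the state vector, or -1 if the block
   is not being optimized */
static int sketch_state_index(const sketch_layout_t* l, int block)
{
    assert(l != NULL && block >= 0 && block < SKETCH_NBLOCKS);
    if (!l->optimize[block])
        return -1;
    int index = 0;
    for (int i = 0; i < block; i++)
        index += sketch_num_states(l, i);
    return index;
}
```

With intrinsics, frames and the board warp being optimized, and extrinsics and points locked down, the frames block starts right after the intrinsics block.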

State packing

The optimization routine works in the space of scaled parameters, and several functions are available to pack/unpack the state vector \(\vec b\):

void mrcal_pack_solver_state_vector(...);
void mrcal_unpack_solver_state_vector(...);
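
Conceptually, packing divides each state value by a scale and unpacking multiplies it back. The sketch below shows only that idea: the function names, the argument lists and the per-element scale array are assumptions made for this example, and the scale values mrcal actually uses are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Pack in place: convert from physical units to the scaled values the solver
   works with. The caller supplies one scale per element (hypothetical API) */
static void sketch_pack(double* b, const double* scale, int N)
{
    assert(b != NULL && scale != NULL && N >= 0);
    for (int i = 0; i < N; i++)
        b[i] /= scale[i];
}

/* Unpack in place: the inverse of sketch_pack() */
static void sketch_unpack(double* b, const double* scale, int N)
{
    assert(b != NULL && scale != NULL && N >= 0);
    for (int i = 0; i < N; i++)
        b[i] *= scale[i];
}
```

Any value read out of a packed state vector has to be unpacked before it is interpreted as a focal length, a rotation or a position.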

Optimization

The mrcal optimization routines are defined in mrcal.h. There are two primary functions, each accessing a lot of functionality, and taking many arguments. At this time, the prototypes will likely change in each release of mrcal, so try not to rely on these being stable.

  • mrcal_optimize() is the entry point to the optimization routine. This function ingests the state, runs the optimization, and returns the optimal state in the same variables. The optimization routine tries out different values of the state vector by calling an optimization callback function to evaluate each one.
  • mrcal_optimizer_callback() provides access to the optimization callback function standalone, without being wrapped into the optimization loop

Helper structures

This is correct as of mrcal 2.1. It may change in future releases.

We define some structures to organize the input to these functions. Each observation has a mrcal_camera_index_t to identify the observing camera:

// Used to specify which camera is making an observation. The "intrinsics" index
// is used to identify a specific camera, while the "extrinsics" index is used
// to locate a camera in space. If I have a camera that is moving over time, the
// intrinsics index will remain the same, while the extrinsics index will change
typedef struct
{
    // indexes the intrinsics array
    int  intrinsics;
    // indexes the extrinsics array. -1 means "at coordinate system reference"
    int  extrinsics;
} mrcal_camera_index_t;

When solving a vanilla calibration problem, we have a set of stationary cameras observing a moving scene. By convention, in such a problem we set the reference coordinate system to camera 0, so that camera has no extrinsics. So in a vanilla calibration problem mrcal_camera_index_t.intrinsics will be in \([0, N_\mathrm{cameras})\) and mrcal_camera_index_t.extrinsics will always be mrcal_camera_index_t.intrinsics - 1.

When solving a vanilla structure-from-motion problem, we have a set of moving cameras observing a stationary scene. Here mrcal_camera_index_t.intrinsics would be in \([0, N_\mathrm{cameras})\) and mrcal_camera_index_t.extrinsics would specify the camera pose, unrelated to mrcal_camera_index_t.intrinsics.

These are the limiting cases; anything in-between is allowed.
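
The two conventions above can be written down directly. sketch_camera_index_t has the same two fields as the mrcal_camera_index_t shown earlier, but is defined locally under a stand-in name so that the example is self-contained; the two helper functions are hypothetical.

```c
#include <assert.h>

/* Same two fields as mrcal_camera_index_t, under a stand-in name */
typedef struct
{
    int intrinsics;
    int extrinsics; /* -1 means "at coordinate system reference" */
} sketch_camera_index_t;

/* Vanilla calibration: stationary cameras, camera 0 defines the reference, so
   extrinsics is always intrinsics - 1 */
static sketch_camera_index_t sketch_index_calibration(int icam)
{
    assert(icam >= 0);
    sketch_camera_index_t idx = { .intrinsics = icam, .extrinsics = icam - 1 };
    return idx;
}

/* A camera moving through a sequence of poses: the intrinsics index stays
   fixed, while the extrinsics index selects the pose */
static sketch_camera_index_t sketch_index_moving(int icam, int ipose)
{
    assert(icam >= 0 && ipose >= -1);
    sketch_camera_index_t idx = { .intrinsics = icam, .extrinsics = ipose };
    return idx;
}
```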

A board observation is defined by a mrcal_observation_board_t:

// An observation of a calibration board. Each "observation" is ONE camera
// observing a board
typedef struct
{
    // which camera is making this observation
    mrcal_camera_index_t icam;

    // indexes the "frames" array to select the pose of the calibration object
    // being observed
    int                  iframe;
} mrcal_observation_board_t;

And an observation of a discrete point is defined by a mrcal_observation_point_t:

// An observation of a discrete point. Each "observation" is ONE camera
// observing a single point in space
typedef struct
{
    // which camera is making this observation
    mrcal_camera_index_t icam;

    // indexes the "points" array to select the position of the point being
    // observed
    int                  i_point;

    // Observed pixel coordinates. This works just like elements of
    // observations_board_pool:
    //
    // .x, .y are the pixel observations
    // .z is the weight of the observation. Most of the weights are expected to
    // be 1.0. Less precise observations have lower weights.
    // .z<0 indicates that this is an outlier. This is respected on
    // input
    //
    // Unlike observations_board_pool, outlier rejection is NOT YET IMPLEMENTED
    // for points, so outlier points will NOT be found and reported on output in
    // .z<0
    mrcal_point3_t px;
} mrcal_observation_point_t;

Note that the details of the handling of discrete points may change in the future.

We have mrcal_problem_constants_t to define some details of the optimization problem. These are similar to mrcal_problem_selections_t, but consist of numerical values, rather than just bits. Currently this structure contains valid ranges for interpretation of discrete points. These may change in the future.

// Constants used in a mrcal optimization. This is similar to
// mrcal_problem_selections_t, but contains numerical values rather than just
// bits
typedef struct
{
    // The minimum distance of an observed discrete point from its observing
    // camera. Any observation of a point below this range will be penalized to
    // encourage the optimizer to move the point further away from the camera
    double  point_min_range;


    // The maximum distance of an observed discrete point from its observing
    // camera. Any observation of a point above this range will be penalized to
    // encourage the optimizer to move the point closer to the camera
    double  point_max_range;
} mrcal_problem_constants_t;

The optimization function returns most of its output in the same memory as its input variables. A few metrics that don't belong there are returned in a separate mrcal_stats_t structure:

// This structure is returned by the optimizer, and contains some statistics
// about the optimization
typedef struct
{
    // generated by an X-macro

    /* The RMS error of the optimized fit at the optimum. Generally the residual */
    /* vector x contains error values for each element of q, so N observed pixels */
    /* produce 2N measurements: len(x) = 2*N. And the RMS error is */
    /*   sqrt( norm2(x) / N ) */
    double rms_reproj_error__pixels;

    /* How many pixel observations were thrown out as outliers. Each pixel */
    /* observation produces two measurements. Note that this INCLUDES any */
    /* outliers that were passed-in at the start */
    int Noutliers;
} mrcal_stats_t;

This contains some statistics describing the discovered optimal solution.

Camera model reading/writing

A simple interface for reading/writing .cameramodel data from C is available:

// if len>0, the string doesn't need to be 0-terminated. If len<=0, the end of
// the buffer IS indicated by a '\0' byte
mrcal_cameramodel_t* mrcal_read_cameramodel_string(const char* string, int len);
mrcal_cameramodel_t* mrcal_read_cameramodel_file  (const char* filename);
void                 mrcal_free_cameramodel(mrcal_cameramodel_t** cameramodel);

bool mrcal_write_cameramodel_file(const char* filename,
                                  const mrcal_cameramodel_t* cameramodel);

These functions read and write the mrcal_cameramodel_t structures. Only the .cameramodel file format is supported by these C functions. The Python API supports more formats.

Images

mrcal defines simple image types in mrcal-image.h:

  • mrcal_image_int8_t
  • mrcal_image_uint8_t
  • mrcal_image_int16_t
  • mrcal_image_uint16_t
  • mrcal_image_int32_t
  • mrcal_image_uint32_t
  • mrcal_image_int64_t
  • mrcal_image_uint64_t
  • mrcal_image_float_t
  • mrcal_image_double_t
  • mrcal_image_bgr_t

These are the basic not-necessarily-contiguous arrays. The bgr type is used for color images:

typedef struct { uint8_t bgr[3]; } mrcal_bgr_t;

Simple accessor and manipulation functions are available for each of these types (replacing each T below):

T* mrcal_image_T_at(mrcal_image_T_t* image, int x, int y);

const T* mrcal_image_T_at_const(const mrcal_image_T_t* image, int x, int y);

mrcal_image_T_t mrcal_image_T_crop(mrcal_image_T_t* image,
                                   int x0, int y0,
                                   int w,  int h);

And for uint8_t, uint16_t and mrcal_bgr_t we can also read and write image files:

bool mrcal_image_T_save(const char* filename, const mrcal_image_T_t* image);

bool mrcal_image_T_load(mrcal_image_T_t* image, const char* filename);

These use the freeimage library. These functions aren't interesting, or better than any other functions you may have already. The declarations are in mrcal-image.h, and the documentation lives there.
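
As an illustration of what "not-necessarily-contiguous" buys us, here is a sketch of a strided 8-bit image with an accessor and a crop. The structure layout (width, height, a row stride in bytes, a data pointer) and the treatment of a crop as a view into the same pixels are assumptions of this sketch; the real definitions are in mrcal-image.h. All names are stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an 8-bit image type. The field names are assumptions */
typedef struct
{
    int      width, height;
    int      stride; /* bytes from the start of one row to the next */
    uint8_t* data;
} sketch_image_uint8_t;

/* Pointer to the pixel at column x, row y */
static uint8_t* sketch_image_at(const sketch_image_uint8_t* image, int x, int y)
{
    assert(image != NULL && image->data != NULL);
    assert(x >= 0 && x < image->width && y >= 0 && y < image->height);
    return &image->data[(size_t)y * (size_t)image->stride + (size_t)x];
}

/* A crop is a view: the origin and size change, the stride is kept, and no
   pixel data is copied */
static sketch_image_uint8_t sketch_image_crop(const sketch_image_uint8_t* image,
                                              int x0, int y0, int w, int h)
{
    assert(x0 >= 0 && y0 >= 0 && w > 0 && h > 0);
    assert(x0 + w <= image->width && y0 + h <= image->height);
    sketch_image_uint8_t out = { .width  = w,
                                 .height = h,
                                 .stride = image->stride,
                                 .data   = sketch_image_at(image, x0, y0) };
    return out;
}
```

Because the cropped image keeps the stride of its parent, its rows are no longer adjacent in memory, and writes through the crop are visible in the original.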

Heat maps

mrcal can produce a colored visualization of any of the image types defined above:

bool
mrcal_apply_color_map_T(
        mrcal_image_bgr_t*    out,
        const mrcal_image_T_t* in,

        /* If true, I set in_min/in_max from the */
        /* min/max of the input data */
        const bool auto_min,
        const bool auto_max,

        /* If true, I implement gnuplot's default 7,5,15 mapping. */
        /* This is a reasonable default choice. */
        /* function_red/green/blue are ignored if true */
        const bool auto_function,

        /* min/max input values to use if not */
        /* auto_min/auto_max */
        T in_min, /* will map to 0 */
        T in_max, /* will map to 255 */

        /* The color mappings to use. If !auto_function */
        int function_red,
        int function_green,
        int function_blue);

Dense stereo

A number of dense stereo routines are available. These make it possible to implement a full mrcal dense stereo pipeline in C; an example is provided. The available functions are declared in stereo.h:

  • mrcal_rectified_resolution() computes the resolution of the rectified system from the resolution of the input. Usually mrcal_rectified_system() does this internally, and there's no reason to call it directly. The Python wrapper is mrcal.rectified_resolution(), and further documentation is in its docstring
  • mrcal_rectified_system() computes the geometry of the rectified system. The Python wrapper is mrcal.rectified_system(), and further documentation is in its docstring.
  • mrcal_rectification_maps() computes the image transformation maps used to compute the rectified images. To apply the maps, and actually remap the images, the OpenCV cv::remap() function can be used. The Python wrapper is mrcal.rectification_maps(), and further documentation is in its docstring
  • mrcal_stereo_range_sparse(), mrcal_stereo_range_dense() compute ranges from disparities. The former function converts a set of discrete disparity values, while the latter function processes a whole disparity image

Triangulation

A number of triangulation routines are available in triangulation.h. These estimate the position of the 3D point that produced a given pair of observations.
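
As a concrete example of the problem being solved, here is a sketch of the textbook midpoint method: find the closest points on the two observation rays, and report the point halfway between them. This is one simple technique shown for illustration; it is not claimed to match any particular routine in triangulation.h, and the function name, argument list and conventions (both rays expressed in camera-0 coordinates, t01 being the position of camera 1 in that frame) are assumptions of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

static double sketch_dot3(const double a[3], const double b[3])
{
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2];
}

/* Midpoint triangulation. Camera 0 sits at the origin and observes along v0.
   Camera 1 sits at t01 and observes along v1. The observation vectors need not
   be normalized. Returns false if the rays are (nearly) parallel or if the
   closest points lie behind either camera; p is written only on success */
static bool sketch_triangulate_midpoint(double p[3],
                                        const double v0[3],
                                        const double v1[3],
                                        const double t01[3])
{
    const double a = sketch_dot3(v0, v0);
    const double b = sketch_dot3(v0, v1);
    const double c = sketch_dot3(v1, v1);
    const double d = sketch_dot3(v0, t01);
    const double e = sketch_dot3(v1, t01);
    assert(a > 0.0 && c > 0.0);

    /* a*c - b*b is >= 0 by Cauchy-Schwarz, and is 0 for parallel rays */
    const double det = a*c - b*b;
    if (det <= 1e-12 * a * c)
        return false;

    /* Scale factors along each ray that minimize
       | k0*v0 - (t01 + k1*v1) |^2 */
    const double k0 = (c*d - b*e) / det;
    const double k1 = (b*d - a*e) / det;
    if (k0 <= 0.0 || k1 <= 0.0)
        return false;

    for (int i = 0; i < 3; i++)
        p[i] = 0.5 * (k0*v0[i] + t01[i] + k1*v1[i]);
    return true;
}
```

With perfect observations the two rays intersect and the midpoint is the intersection; with noisy observations the rays are skew, and different triangulation methods trade off accuracy, bias and speed differently.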