mrcal C API
Table of Contents
Internally, the Python functions use the mrcal C API. Only core functionality is available in the C API (the Python API can do some stuff that the C API cannot), but with time more and more stuff will be transitioned to a C-internal representation. Today, end-to-end dense stereo processing in C is possible.
The C API consists of several headers:
basic-geometry.h
: very simple geometry structuresposeutils.h
: pose and geometry functionstriangulation.h
: triangulation routinesmrcal.h
: lens models, projections, optimization
Most usages would simply #include <mrcal.h>
, and this would include all the
headers. This is a C (not C++) library, so X macros are used in several places
for templating.
mrcal is a research project, so the capabilities and focus are still evolving.
Thus, the interfaces, especially those in the C API are not yet stable. I do
try to maintain stability, but this is not fully possible, especially in the
higher-level APIs (mrcal_optimize()
and mrcal_optimizer_callback()
). For
now, assume that each major release breaks both the API and the ABI. The
migration notes for each release are described in the relevant release notes.
If you use the C APIs, shoot me an email to let me know, and I'll keep you in mind when making API updates.
The best documentation for the C interfaces is the comments in the headers. I don't want to write up anything complete and detailed until I'm sure the interfaces are stable. The available functions are broken down into categories, and described in a bit more detail here.
Geometry structures
We have 3 structures in basic-geometry.h
:
mrcal_point2_t
: a vector containing 2 double-precision floating-point values. The elements can be accessed individually as.x
and.y
or as an array.xy[]
mrcal_point3_t
: exactly likemrcal_point2_t
, but 3-dimensional. A vector containing 3 double-precision floating-point values. The elements can be accessed individually as.x
and.y
and.z
or as an array.xyz[]
mrcal_pose_t
: an unconstrained 6-DOF pose. Contains two sub-structures:mrcal_point3_t r
: a Rodrigues rotationmrcal_point3_t t
: a translation
Geometry functions
A number of utility functions are defined in poseutils.h
. Each routine has two
forms:
- A
mrcal_..._full()
function that supports a non-contiguous memory layout for each input and output - A convenience
mrcal_...()
macro that wrapsmrcal_..._full()
, and expects contiguous data. This has many fewer arguments, and is easier to call
Each data argument (input or output) has several items in the argument list:
double* xxx
: a pointer to the first element in the arrayint xxx_stride0
,int xxx_stride1
, …: the strides, one per dimension
The strides are given in bytes, and work as expected. For a (for instance)
3-dimensional xxx
, the element at xxx[i,j,k]
would be accessible as
*(double*) &((char*)xxx)[ i*xxx_stride0 + j*xxx_stride1 + k*xxx_stride2 ]
There are convenience macros in strides.h
, so the above can be equivalently
expressed as
P3(xxx, i,j,k)
These all have direct Python bindings. For instance the Python
mrcal.rt_from_Rt()
function wraps mrcal_rt_from_Rt()
C function.
The poseutils.h
header serves as the listing of available functions.
Lens models
The lens model structures are defined here:
mrcal_lensmodel_type_t
: an enum decribing the lens model type. No configuration is stored here.mrcal_lensmodel_t
: a lens model type and the configuration parameters. The configuration lives in aunion
supporting all the known lens modelsmrcal_lensmodel_metadata_t
: some metadata that decribes a model type. These are inherent properties of a particular model type; answers questions like: Can this model project behind the camera? Does it have an intrinsics core? Does it have gradients implemented?
The Python API describes a lens model with a string that contains the model type
and the configuration, while the C API stores the same information in a
mrcal_lensmodel_t
.
Camera models
We can also represent a full camera model in C. This is a lens model with a pose
and imager dimension: the full set of things in a .cameramodel
file. The
definitions appear in mrcal-types.h
:
typedef struct { double rt_cam_ref[6]; unsigned int imagersize[2]; mrcal_lensmodel_t lensmodel; double intrinsics[0]; } mrcal_cameramodel_t; typedef union { mrcal_cameramodel_t m; struct { double rt_cam_ref[6]; unsigned int imagersize[2]; mrcal_lensmodel_t lensmodel; double intrinsics[4]; }; } mrcal_cameramodel_LENSMODEL_LATLON_t; ...
Note that mrcal_cameramodel_t.intrinsics
has size 0 because the size of this
array depends on the specific lens model being used, and is unknown at compile
time.
So it is an error to define this on the stack. Do not do this:
void f(void) { mrcal_cameramodel_t model; // ERROR }
If you need to define a known-at-compile-time model on the stack you can use the lensmodel-specific cameramodel types:
void f(void) { mrcal_cameramodel_LENSMODEL_OPENCV8_t model; // OK }
This only exists for models that have a constant number of parameters; notably
there is no mrcal_cameramodel_LENSMODEL_SPLINED_STEREOGRAPHIC_t
. When reading
a model from disk, mrcal dynamically allocates the right amount of memory, and
returns a mrcal_cameramodel_t*
.
Projections
The fundamental functions for projection and unprojection are defined here.
mrcal_project()
is the main routine that implements the "forward" direction,
and is available for every camera model. This function can return gradients in
respect to the coordinates of the point being projected and/or in respect to the
intrinsics vector.
mrcal_unproject()
is the reverse direction, and is implemented as a numerical
optimization to reverse the projection operation. Naturally, this is much slower
than mrcal_project()
. Since mrcal_unproject()
is implemented with a
nonlinear optimization, it has no gradient reporting. The Python
mrcal.unproject()
routine is higher-level, and it does report gradients.
The gradients of the forward mrcal_project()
operation are used in this
nonlinear optimization, so models that have no projection gradients defined do
not support mrcal_unproject()
. The Python mrcal.unproject()
routine still
makes this work, using numerical differences for the projection gradients.
Simple, special-case lens models have their own projection and unprojection functions defined:
void mrcal_project_pinhole(...); void mrcal_unproject_pinhole(...); void mrcal_project_stereographic(...); void mrcal_unproject_stereographic(...); void mrcal_project_lonlat(...); void mrcal_unproject_lonlat(...); void mrcal_project_latlon(...); void mrcal_unproject_latlon(...);
These functions do the same thing as the general mrcal_project()
and
mrcal_unproject()
functions, but work much faster.
Layout of the measurement and state vectors
The optimization routine tries to minimize the 2-norm of the measurement vector \(\vec x\) by moving around the state vector \(\vec b\).
We select which parts of the optimization problem we're solving by setting bits
in the mrcal_problem_selections_t
structure. This defines
- Which elements of the optimization vector are locked-down, and which are given to the optimizer to adjust
- Whether we apply regularization to stabilize the solution
- Whether the chessboard should be assumed flat, or if we should optimize deformation factors
Thus the state vector may contain any of
- The lens parameters
- The geometry of the cameras
- The geometry of the observed chessboards and discrete points
- The chessboard shape
The measurement vector may contain
- The errors in observations of the chessboards
- The errors in observations of discrete points
- The penalties in the solved point positions
- The regularization terms
Given mrcal_problem_selections_t
and a vector \(\vec b\) or \(\vec x\), it is
useful to know where specific quantities lie inside those vectors. Here we have
4 sets of functions to answer such questions:
int mrcal_state_index_THING()
: Returns the index in the state vector \(\vec b\) where the contiguous block of values describing the THING begins. THING is any of- intrinsics
- extrinsics
- frames
- points
- calobject_warp
If we're not optimizing the THING, return <0
int mrcal_num_states_THING()
: Returns the number of values in the contiguous block in the state vector \(\vec b\) that describe the given THING. THING is any of- intrinsics
- extrinsics
- frames
- points
- calobject_warp
int mrcal_measurement_index_THING()
: Returns the index in the measurement vector \(\vec x\) where the contiguous block of values describing the THING begins. THING is any of- boards
- points
- regularization
int mrcal_num_measurements_THING()
: Returns the number of values in the contiguous block in the measurement vector \(\vec x\) that describe the given THING. THING is any of- boards
- points
- regularization
State packing
The optimization routine works in the space of scaled parameters, and several functions are available to pack/unpack the state vector \(\vec b\):
void mrcal_pack_solver_state_vector(...); void mrcal_unpack_solver_state_vector(...);
Optimization
The mrcal optimization routines are defined in mrcal.h
. There are two primary
functions, each accessing a lot of functionality, and taking many arguments.
At this time, the prototypes will likely change in each release of mrcal, so try
not to rely on these being stable.
mrcal_optimize()
is the entry point to the optimization routine. This function ingests the state, runs the optimization, and returns the optimal state in the same variables. The optimization routine tries out different values of the state vector by calling an optimization callback function to evaluate each one.mrcal_optimizer_callback()
provides access to the optimization callback function standalone, without being wrapped into the optimization loop
Helper structures
This is correct as of mrcal 2.1. It may change in future releases.
We define some structures to organize the input to these functions. Each
observation has a mrcal_camera_index_t
to identify the observing camera:
// Used to specify which camera is making an observation. The "intrinsics" index // is used to identify a specific camera, while the "extrinsics" index is used // to locate a camera in space. If I have a camera that is moving over time, the // intrinsics index will remain the same, while the extrinsics index will change typedef struct { // indexes the intrinsics array int intrinsics; // indexes the extrinsics array. -1 means "at coordinate system reference" int extrinsics; } mrcal_camera_index_t;
When solving a vanilla calibration problem, we have a set of stationary cameras
observing a moving scene. By convention, in such a problem we set the reference
coordinate system to camera 0, so that camera has no extrinsics. So in a vanilla
calibration problem mrcal_camera_index_t.intrinsics
will be in \([0,
N_\mathrm{cameras})\) and mrcal_camera_index_t.extrinsics
will always be
mrcal_camera_index_t.intrinsics - 1
.
When solving a vanilla structure-from-motion problem, we have a set of moving
cameras observing a stationary scene. Here mrcal_camera_index_t.intrinsics
would be in \([0, N_\mathrm{cameras})\) and mrcal_camera_index_t.extrinsics
would be specify the camera pose, unrelated to
mrcal_camera_index_t.intrinsics
.
These are the limiting cases; anything in-between is allowed.
A board observation is defined by a mrcal_observation_board_t
:
// An observation of a calibration board. Each "observation" is ONE camera // observing a board typedef struct { // which camera is making this observation mrcal_camera_index_t icam; // indexes the "frames" array to select the pose of the calibration object // being observed int iframe; } mrcal_observation_board_t;
And an observation of a discrete point is defined by a
mrcal_observation_point_t
:
// An observation of a discrete point. Each "observation" is ONE camera // observing a single point in space typedef struct { // which camera is making this observation mrcal_camera_index_t icam; // indexes the "points" array to select the position of the point being // observed int i_point; // Observed pixel coordinates. This works just like elements of // observations_board_pool: // // .x, .y are the pixel observations // .z is the weight of the observation. Most of the weights are expected to // be 1.0. Less precise observations have lower weights. // .z<0 indicates that this is an outlier. This is respected on // input // // Unlike observations_board_pool, outlier rejection is NOT YET IMPLEMENTED // for points, so outlier points will NOT be found and reported on output in // .z<0 mrcal_point3_t px; } mrcal_observation_point_t;
Note that the details of the handling of discrete points may change in the future.
We have mrcal_problem_constants_t
to define some details of the optimization
problem. These are similar to mrcal_problem_selections_t
, but consist of
numerical values, rather than just bits. Currently this structure contains valid
ranges for interpretation of discrete points. These may change in the future.
// Constants used in a mrcal optimization. This is similar to // mrcal_problem_selections_t, but contains numerical values rather than just // bits typedef struct { // The minimum distance of an observed discrete point from its observing // camera. Any observation of a point below this range will be penalized to // encourage the optimizer to move the point further away from the camera double point_min_range; // The maximum distance of an observed discrete point from its observing // camera. Any observation of a point abive this range will be penalized to // encourage the optimizer to move the point closer to the camera double point_max_range; } mrcal_problem_constants_t;
The optimization function returns most of its output in the same memory as its
input variables. A few metrics that don't belong there are returned in a
separate mrcal_stats_t
structure:
// This structure is returned by the optimizer, and contains some statistics // about the optimization typedef struct { // generated by an X-macro /* The RMS error of the optimized fit at the optimum. Generally the residual */ /* vector x contains error values for each element of q, so N observed pixels */ /* produce 2N measurements: len(x) = 2*N. And the RMS error is */ /* sqrt( norm2(x) / N ) */ double rms_reproj_error__pixels; /* How many pixel observations were thrown out as outliers. Each pixel */ /* observation produces two measurements. Note that this INCLUDES any */ /* outliers that were passed-in at the start */ int Noutliers; } mrcal_stats_t;
This contains some statistics describing the discovered optimal solution.
Camera model reading/writing
A simple interface for reading/writing .cameramodel
data from C is available:
// if len>0, the string doesn't need to be 0-terminated. If len<=0, the end of // the buffer IS indicated by a '\0' byte mrcal_cameramodel_t* mrcal_read_cameramodel_string(const char* string, int len); mrcal_cameramodel_t* mrcal_read_cameramodel_file (const char* filename); void mrcal_free_cameramodel(mrcal_cameramodel_t** cameramodel); bool mrcal_write_cameramodel_file(const char* filename, const mrcal_cameramodel_t* cameramodel);
This reads and write the mrcal_cameramodel_t
structures. Only the
.cameramodel
file format is supported by these C functions. The Python API
supports more formats.
Image reading/writing
mrcal includes simple functions for reading/writing images. These use the
freeimage library. These functions aren't interesting, or better than any other
functions you may have already. The declarations are in mrcal-image.h
, and the
documentation lives there.
Dense stereo
A number of dense stereo routines are available. These make it possible to
implement a full mrcal dense stereo pipeline in C. The available functions are
declared in stereo.h
:
mrcal_rectified_resolution()
computes the resolution of the rectified system from the resolution of the input. Usuallymrcal_rectified_system()
does this internally, and there's no reason to call it directly. The Python wrapper ismrcal.rectified_resolution()
, and further documentation is in its docstringmrcal_rectified_system()
computes the geometry of the rectified system. The Python wrapper ismrcal.rectified_system()
, and further documentation is in its docstring.mrcal_rectification_maps()
computes the image transformation maps used to compute the rectified images. To apply the maps, and actually remap the images, the OpenCVcv::remap()
function can be used. The Python wrapper ismrcal.rectification_maps()
, and further documentation is in its docstring
Triangulation
A number of triangulation routines are available in triangulation.h
. These
estimate the position of the 3D point that produced a given pair of
observations.