Converting units of measure with Convertinator

(If you're just looking to get started quickly with the library, take a look at the readme. If you're looking for a more in-depth discussion of the library and how it works, read on.)

In the past year I've been writing a lot of components which have to collect, process, and display various sorts of measurement data from many different sources. For example, I might have to retrieve odometer readings from a machine in miles, store them in kilometers, and be able to display them in both. Or retrieve the amount of time an engine was running in a resolution of seconds, then display it to the nearest whole hour.

It seems like a fairly straightforward problem at first - retrieve the data, determine the units, and do some math if necessary to convert the data to the units you actually need. Get the data in miles, store it in kilometers, and convert it back to miles if the user's display settings call for US units. Done, no problem. Unfortunately, there are a couple of snags.

First of all, the sources vary. Some are data feeds from manufacturers written to a specific standard, and some are directly from machines via embedded systems which are using an evolving protocol.

The standard which the data feeds follow has a notion of units (basically metric vs US); any measurements transmitted must be accompanied by a string value that says what the unit of measure is. The acceptable values for these strings are strictly defined, but in practice that doesn't stop the implementers from just using whatever they feel like. Fuel, for instance, is supposed to be in either "liter" or "gallon". Depending on whose data feed you're actually reading, however, you might see "gal" or "gallons" or "liters" or "lt".

For data coming directly from the machines, the units of measure vary even more wildly. Most of these are custom installations, so the units used (and their names and abbreviations) are whatever the customer uses locally. The system used (metric or US) might depend on customer location, but may also end up being mixed because of data standards which are fixed in one system. And sometimes, incorrect names or abbreviations are just hanging around from long-ago installations.

So anything that takes this data as input needs to be pretty liberal in what it accepts. Additionally, the output needs to be flexible. In most situations, standard conventions are fine; converting miles to metric usually means you want kilometers, for example. But what if someone wants to see kilopascals converted to bar instead of psi? Or worse, what if they want all of their units in metric except for one measurement that they want to see in US units for some reason? So ideally, we should be able to configure the mappings from input units to output units.

There also needs to be flexibility in configuring the actual conversion factors. In some cases, the 'correct' conversion factor may be defined by the standard being implemented; in others, customer preference for significant digits and rounding may require use of more or less precise conversions; in still others, we might be creating or consuming data for a legacy system with specific precision requirements. So a mile in kilometers may be 1.60934, or it may just be 1.61. These are admittedly edge cases, but I don't want to be stuck if they come up.

This isn't the first time I've had to solve this problem - years ago I had a similar situation working on some milling software. Back then, and during my first pass at solving the problem this time around, I ended up with a ton of explicit conversions back and forth between various hard-coded units. There were conversions from meters to feet, kilometers to miles, seconds to hours, seconds to minutes, minutes to hours, meters to yards, and more, and of course all of the reverse conversions. After a while it gets tedious writing explicit methods for each of these. And that's not how humans would do these conversions, anyway.

For one, we know enough to reverse conversions. I know that there are 5,280 feet in a mile, so I don't need to memorize how many miles there are in a foot. I have a process for going from miles -> feet (multiplication by 5,280), and because that process is reversible I also have a process for going from feet -> miles (division by 5,280).

If someone asked you to convert some arbitrary metric unit, e.g. decimeters, to feet, how would you do it? My guess is that you don't have a conversion factor for that memorized. You know how to go from decimeters to meters, and you know how to go from meters to feet. You have a path that gets you from one unit to the other, so you simply follow that path, applying each step as you go.

Working through conversions this way has the advantage of requiring minimal "configuration" of your brain. Instead of memorizing 6 conversion formulas to cover the whole system (decimeters to meters and back, meters to feet and back, and decimeters to feet and back), you can get by with 2 conversion formulas. And the return on investment just gets better the more units you add. It's been a long (very long) time since I took a combinatorics class, but if I'm doing the math right, a system with seven units in it (think inches, feet, yards, miles, centimeters, meters, and kilometers) takes 42 conversion formulas to do directly, and only 6 to handle indirectly. Adding one more unit bumps the direct conversions to 56, but requires only one more indirect conversion.
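The pattern behind those numbers: with n units, converting directly between every ordered pair takes n * (n - 1) formulas, while a chain of reversible conversions linking all of the units takes n - 1. Here's a quick sketch to check the arithmetic (the FormulaCounts helper is purely illustrative and has nothing to do with the library itself):

```csharp
using System;

// Print both counts for the system sizes mentioned above.
foreach (var units in new[] { 3, 7, 8 })
{
    Console.WriteLine(
        $"{units} units: {FormulaCounts.Direct(units)} direct, {FormulaCounts.Chained(units)} chained");
}

// Illustrative helper, not part of Convertinator.
static class FormulaCounts
{
    // One formula for every ordered pair of distinct units.
    public static int Direct(int units) => units * (units - 1);

    // One reversible formula per link in a chain that connects all units.
    public static int Chained(int units) => units - 1;
}
```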

That's a lot less work for a lot more benefit. So that's what Convertinator does - it models that process of converting things using intermediate steps.

It does so using a bidirectional directed graph (provided by the excellent QuickGraph library). Each vertex on the graph is a unit, and each edge on the graph is a list of conversion operations. For example, we can define a conversion graph with meters and feet and a conversion between them like this:

var graph = new ConversionGraph();

var meter = new Unit("meter");
var feet = new Unit("foot");

graph.AddConversion(Conversions.One(meter).In(feet).Is(3.28084M));

Here's what the graph we just defined looks like:

meter_feet.png

The first thing to note here is that we only defined one edge of this graph (from meters to feet) in code. The other edge was automatically added for us. The edges are defined with reversible operations:

public interface IConversion
{
    IEnumerable<IConversionStep> Steps { get; }
    void AddStep(IConversionStep step);
    Conversion Reverse();
}

public interface IConversionStep
{
    decimal Apply(decimal input);
    IConversionStep Reverse();
}

So creating the edge going back the other direction is simply a matter of reversing the order of the operations, and using their reverse implementation. Convertinator does this every time you add a conversion to the graph.

We talk about a list of conversion operations; while most unit conversions are simply a matter of multiplying or dividing, some (notably temperature) also require an offset. So the currently defined operations are Multiply and Add, each of which is easily reversible (Divide and Subtract). That covers all of the conversions I've needed in practice so far, though I can imagine situations where I'll need to add operations for exponents and logarithms.
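To make the reversal concrete, here's a minimal sketch of how reversible steps could be written against the IConversionStep interface shown earlier. The MultiplyStep, DivideStep, and AddStep classes and the StepList helper are names I've made up for illustration; they may not match what the library actually does internally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Celsius -> Fahrenheit: multiply by 1.8, then add 32.
var toFahrenheit = new List<IConversionStep> { new MultiplyStep(1.8M), new AddStep(32M) };
// The reverse edge: subtract 32, then divide by 1.8.
var toCelsius = StepList.Reverse(toFahrenheit);

Console.WriteLine(StepList.Apply(toFahrenheit, 100M));
Console.WriteLine(StepList.Apply(toCelsius, 212M));

public interface IConversionStep
{
    decimal Apply(decimal input);
    IConversionStep Reverse();
}

// Hypothetical step implementations, for illustration only.
public sealed class MultiplyStep : IConversionStep
{
    private readonly decimal _factor;
    public MultiplyStep(decimal factor) => _factor = factor;
    public decimal Apply(decimal input) => input * _factor;
    public IConversionStep Reverse() => new DivideStep(_factor);
}

public sealed class DivideStep : IConversionStep
{
    private readonly decimal _divisor;
    public DivideStep(decimal divisor) => _divisor = divisor;
    public decimal Apply(decimal input) => input / _divisor;
    public IConversionStep Reverse() => new MultiplyStep(_divisor);
}

public sealed class AddStep : IConversionStep
{
    private readonly decimal _amount;
    public AddStep(decimal amount) => _amount = amount;
    public decimal Apply(decimal input) => input + _amount;
    // Adding the negated amount is the same as subtracting it.
    public IConversionStep Reverse() => new AddStep(-_amount);
}

public static class StepList
{
    // Runs the value through each step in order.
    public static decimal Apply(IEnumerable<IConversionStep> steps, decimal value) =>
        steps.Aggregate(value, (current, step) => step.Apply(current));

    // Reverses the order of the steps and swaps each one for its inverse.
    public static List<IConversionStep> Reverse(IEnumerable<IConversionStep> steps) =>
        steps.Reverse().Select(step => step.Reverse()).ToList();
}
```

In this sketch the first call yields 212 and the second yields 100. Using a dedicated divide step (rather than multiplying by a reciprocal) avoids introducing extra rounding on the way back.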

When we call the Convert method to go from one unit to the other, Convertinator finds the source unit in the graph and searches for the shortest path to the destination unit. In this case, the shortest path is obvious; Convertinator takes edge 1 and applies its set of operations to the source value in order to get the destination value.

Here's a more complex example:

var graph = new ConversionGraph();

var meter = new Unit("meter");
var feet = new Unit("foot");
var kilometer = new Unit("kilometer");
var inches = new Unit("inch");

graph.AddConversion(
    Conversions.From(kilometer).To(meter).MultiplyBy(1000M),
    Conversions.From(meter).To(feet).MultiplyBy(3.28084M),
    Conversions.From(feet).To(inches).MultiplyBy(12M));

And the resulting graph:

more_complex.png

To convert from kilometers to inches, Convertinator starts at the kilometer node and finds the shortest path to inches (via meters, then feet). It then applies the operations from edges 1, 3, and 5 in order to get the value for inches.
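If you want to see the idea without the library, the path search can be sketched with a plain breadth-first search, which finds a path with the fewest edges in an unweighted graph. Convertinator builds on QuickGraph for its graph; the sketch below doesn't use it, and the adjacency dictionary and PathFinder helper are simplified stand-ins I'm assuming for illustration.

```csharp
using System;
using System.Collections.Generic;

// Same units as the example above; every conversion is stored in both directions.
var edges = new Dictionary<string, string[]>
{
    ["kilometer"] = new[] { "meter" },
    ["meter"] = new[] { "kilometer", "foot" },
    ["foot"] = new[] { "meter", "inch" },
    ["inch"] = new[] { "foot" },
};

Console.WriteLine(string.Join(" -> ", PathFinder.ShortestPath(edges, "kilometer", "inch")));

// Hypothetical helper, not part of Convertinator.
static class PathFinder
{
    // Returns the units visited from source to target, or an empty list if unreachable.
    public static List<string> ShortestPath(
        IReadOnlyDictionary<string, string[]> edges, string source, string target)
    {
        // Remember how each unit was reached so the path can be rebuilt afterwards.
        var cameFrom = new Dictionary<string, string> { [source] = source };
        var queue = new Queue<string>();
        queue.Enqueue(source);

        while (queue.Count > 0)
        {
            var current = queue.Dequeue();
            if (current == target) break;
            if (!edges.TryGetValue(current, out var neighbors)) continue;
            foreach (var next in neighbors)
            {
                if (cameFrom.ContainsKey(next)) continue;
                cameFrom[next] = current;
                queue.Enqueue(next);
            }
        }

        var path = new List<string>();
        if (!cameFrom.ContainsKey(target)) return path;

        // Walk backwards from the target, then flip the list.
        for (var unit = target; unit != source; unit = cameFrom[unit]) path.Add(unit);
        path.Add(source);
        path.Reverse();
        return path;
    }
}
```

For this graph the sketch prints kilometer -> meter -> foot -> inch. A real conversion would then apply the operations attached to each edge along that path, in order.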

I mentioned earlier that I needed the library to be liberal in what it accepts. When searching for a vertex in the graph, Convertinator will match a vertex on any of the names, plurals, or abbreviations you've defined for a unit. For instance, you might define 'Celsius' like this:

var celsius = new Unit("Celsius")
    .IsAlsoCalled("celsius", "Centigrade", "centigrade", "degrees Celsius", "degrees Centigrade")
    .IsAlsoCalled("celcuis") // typo from legacy system
    .CanBeAbbreviated("°C")
    .PluralizeAs("°C");

When looking for this unit in order to perform a conversion, it would match on "celsius", "centigrade", "°C", "degrees Celsius", "degrees Centigrade", and even the typo, "celcuis", should that have worked its way into a legacy system somewhere.

We can add Fahrenheit to the previous example and set up a conversion:

var temperature = new ConversionGraph();

var fahrenheit = new Unit("degrees Fahrenheit")
    .IsAlsoCalled("Fahrenheit")
    .CanBeAbbreviated("°F")
    .PluralizeAs("°F");

temperature.AddConversion(
    Conversions.From(fahrenheit).To(celsius).Subtract(32).MultiplyBy(5M / 9M));

With that setup, we can use any alternate name for the units when doing the conversions:

var result1 = temperature.Convert(new Measurement("°F", 32M), "°C"); // zero
var result2 = temperature.Convert(new Measurement("Fahrenheit", 212M), "centigrade"); // 100
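One simple way to get this kind of liberal matching is an index from every known name, plural, and abbreviation to the unit it belongs to. The UnitIndex class below is a hypothetical sketch, not Convertinator's implementation, and it assumes exact, case-sensitive matching (I'm not claiming that's how the library compares names).

```csharp
using System;
using System.Collections.Generic;

var index = new UnitIndex();
index.Register("degrees Fahrenheit", "Fahrenheit", "°F");
index.Register("liter", "liters", "lt");

Console.WriteLine(index.Resolve("°F"));
Console.WriteLine(index.Resolve("lt"));
Console.WriteLine(index.Resolve("furlong") ?? "(unknown unit)");

// Hypothetical alias index, for illustration only.
sealed class UnitIndex
{
    private readonly Dictionary<string, string> _canonicalByAlias =
        new Dictionary<string, string>();

    // Registers a unit under its canonical name and every alias supplied.
    public void Register(string canonicalName, params string[] aliases)
    {
        _canonicalByAlias[canonicalName] = canonicalName;
        foreach (var alias in aliases) _canonicalByAlias[alias] = canonicalName;
    }

    // Returns the canonical name, or null when the name is not recognized.
    public string Resolve(string name) =>
        _canonicalByAlias.TryGetValue(name, out var canonical) ? canonical : null;
}
```

Because every alias resolves to a single canonical unit before any path search happens, the graph itself only ever has to deal with one vertex per unit, no matter how many spellings turn up in the input.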

Of course, in many situations we're not really going to care which units a particular measurement is getting converted to; we really only care what measurement system it ends up in. For instance, I have to display all of these measurements I've been working with on a web site. The end user might have a preference for which system they see the data in based on their industry or customers. So I want them to be able to select whether the values they see are in US units or metric units. In that case, I don't want to have to take each measurement and run it through a giant switch statement in order to determine the destination units; I'd rather have Convertinator do that for me automatically.

So units can be tagged with a system, either by calling the SystemIs() method or simply setting the System property. With all of the units in a ConversionGraph tagged with system names, you can then use the ConvertSystem method to convert from your source measurement to a measurement in the destination system.

Here's a quick example:

var graph = new ConversionGraph();

var meter = new Unit("meter").SystemIs("metric");
var mile = new Unit("mile").SystemIs("US");
var feet = new Unit("foot").SystemIs("US");
var kilometer = new Unit("kilometer").SystemIs("metric");

graph.AddConversion(Conversions.One(meter).In(feet).Is(3.28084M));
graph.AddConversion(Conversions.One(kilometer).In(meter).Is(1000M));
graph.AddConversion(Conversions.One(mile).In(feet).Is(5280M));

var result = graph.ConvertSystem(new Measurement("meter", 1M), "US");

length.png

With this conversion graph, calling ConvertSystem on a meter measurement with a destination of "US" results in a measurement of 3.28084 with a unit of "foot". Convertinator starts at the source unit and finds the shortest path it can to any unit tagged with the destination system, which in this case is "foot".

Which is fine, but when we think about converting units between systems we usually have some sort of implied counterpart in mind. For example, if you asked most people to convert kilometers to US units, they'd give you back a value in miles, not feet. Ask them to convert grams and they'd give you ounces; for kilograms they'd give you pounds.

In order to handle this expectation, when we define the units we can also define 'counterparts' in other systems. These counterparts will be the destination units used when converting to another system, even if the path to them isn't the shortest. Here's a version of the previous example with the kilometers -> miles expectation added in:

var graph = new ConversionGraph();

var meter = new Unit("meter").SystemIs("metric");
var mile = new Unit("mile").SystemIs("US");
var feet = new Unit("foot").SystemIs("US");
var kilometer = new Unit("kilometer").SystemIs("metric").HasCounterPart(mile);

graph.AddConversion(Conversions.One(meter).In(feet).Is(3.28084M));
graph.AddConversion(Conversions.One(kilometer).In(meter).Is(1000M));
graph.AddConversion(Conversions.One(mile).In(feet).Is(5280M));

By defining "mile" as a counterpart to "kilometer", we've let Convertinator know that if we call

var result = graph.ConvertSystem(new Measurement("kilometer", 1M), "US");

we expect the result to be in miles.
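Putting the two rules together, choosing the destination unit might look something like the sketch below: use the counterpart when one is defined for the target system, otherwise take the nearest unit tagged with that system. The dictionaries and the DestinationPicker helper are assumptions I've made for illustration; the library's internals may well differ.

```csharp
using System;
using System.Collections.Generic;

var systems = new Dictionary<string, string>
{
    ["meter"] = "metric", ["kilometer"] = "metric", ["foot"] = "US", ["mile"] = "US",
};
var edges = new Dictionary<string, string[]>
{
    ["kilometer"] = new[] { "meter" },
    ["meter"] = new[] { "kilometer", "foot" },
    ["foot"] = new[] { "meter", "mile" },
    ["mile"] = new[] { "foot" },
};
// kilometer explicitly prefers mile when converting to US units.
var counterparts = new Dictionary<(string Unit, string TargetSystem), string>
{
    [("kilometer", "US")] = "mile",
};

Console.WriteLine(DestinationPicker.Pick(systems, edges, counterparts, "meter", "US"));
Console.WriteLine(DestinationPicker.Pick(systems, edges, counterparts, "kilometer", "US"));

// Hypothetical helper, not part of Convertinator.
static class DestinationPicker
{
    public static string Pick(
        IReadOnlyDictionary<string, string> systems,
        IReadOnlyDictionary<string, string[]> edges,
        IReadOnlyDictionary<(string Unit, string TargetSystem), string> counterparts,
        string source,
        string targetSystem)
    {
        // An explicit counterpart wins, even if a closer unit exists.
        if (counterparts.TryGetValue((source, targetSystem), out var counterpart))
            return counterpart;

        // Otherwise search breadth-first for the nearest unit in the target system.
        var visited = new HashSet<string> { source };
        var queue = new Queue<string>();
        queue.Enqueue(source);
        while (queue.Count > 0)
        {
            var current = queue.Dequeue();
            if (systems.TryGetValue(current, out var unitSystem) && unitSystem == targetSystem)
                return current;
            if (!edges.TryGetValue(current, out var neighbors)) continue;
            foreach (var next in neighbors)
                if (visited.Add(next)) queue.Enqueue(next);
        }

        throw new InvalidOperationException(
            $"No unit in system '{targetSystem}' is reachable from '{source}'.");
    }
}
```

With this data the sketch picks "foot" for a meter measurement and "mile" for a kilometer measurement. Once the destination unit is known, the conversion itself is just the ordinary shortest-path conversion between two units.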

Finally, to make debugging complex graphs easier (and to create the graph images above), I added a ToDotFile() method to the ConversionGraph class that outputs .dot files for use with tools like GraphViz. There's also an example program to demonstrate creating .png files in the source.

Give it a try; the source code is available on GitHub and the library is available as a NuGet package. Bug reports and feature requests (via the GitHub issues interface) are always welcome.

PS - If you're curious about the name, it's just that I have a hard time coming up with clever names for things. So I've just started subscribing to the Dr. Heinz Doofenshmirtz method for naming my projects.