Parsing Enum values with Sprache

Sprache is a fantastic little library for creating parsers in C#. It's great for situations where you want to create a fairly simple DSL; something beyond what you can get with String.Split() and regular expressions, but not at the level where you need to learn a language workbench tool. I used it a few years ago to create a very simple language to define rules for an alert system - my clients needed a way for their customers to create custom rules to alert them when a piece of equipment was operating outside of parameters. I'm using it now to write a simple language for defining video game characters and NPCs.

One of the tasks I've run into repeatedly in the current project is parsing text from the language into a C# enumeration. As a very simple example of the problem, let's say I need to parse the following input:

var input = "+2 to strength";

And let's say the goal is to parse that into an instance of the following class:

public class StatChange
{
    public StatChange(Stat stat, int amount)
    {
        Stat = stat;
        Amount = amount;
    }

    public Stat Stat { get; set; }
    public int Amount { get; set; }
}

Stat is an enumeration that looks like this:

public enum Stat
{
    Strength,
    Wisdom,
    Intelligence,
    Charisma,
    Dexterity,
    Constitution
}

The straightforward way to build a parser for this using Sprache looks like this:

using Sprache;

public static class RulesGrammar
{
    // Tries each stat name in turn, ignoring case and surrounding whitespace.
    public static readonly Parser<Stat> StatParser =
        from stat in Parse.IgnoreCase("Strength").Token().Return(Stat.Strength)
            .Or(Parse.IgnoreCase("Wisdom").Token().Return(Stat.Wisdom))
            .Or(Parse.IgnoreCase("Charisma").Token().Return(Stat.Charisma))
            .Or(Parse.IgnoreCase("Dexterity").Token().Return(Stat.Dexterity))
            .Or(Parse.IgnoreCase("Constitution").Token().Return(Stat.Constitution))
            .Or(Parse.IgnoreCase("Intelligence").Token().Return(Stat.Intelligence))
        select stat;

    // Matches input such as "+2 to strength": a signed amount, the word "to", then a stat name.
    // The amount uses \d+ so that multi-digit values like "+10" parse as well.
    public static readonly Parser<StatChange> StatChange =
        from amount in Parse.Regex(@"[+-]\d+").Token()
        from to in Parse.String("to").Token()
        from stat in StatParser.Token()
        select new StatChange(stat, int.Parse(amount));
}

With that in place, I parse the input above with a single call:

var statChange = RulesGrammar.StatChange.Parse(input);

The StatChange parser is pretty straightforward - it parses the amount ("+2" in this example), parses the "to" part of the phrase, and then uses StatParser to determine which statistic is being modified. Then it creates a new StatChange instance with those values.
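To make that concrete, here is a minimal, self-contained sketch of the same idea that also shows what happens when the input doesn't match. A few things in it are my own assumptions for the sake of a short listing: the DemoGrammar class and its Change field are made-up names standing in for RulesGrammar.StatChange, the enum is trimmed to two values, and the amount pattern is `[+-]\d+` so that multi-digit amounts are accepted.

```csharp
using System;
using Sprache;

public enum Stat { Strength, Wisdom }   // trimmed to two values for brevity

public class StatChange
{
    public StatChange(Stat stat, int amount) { Stat = stat; Amount = amount; }
    public Stat Stat { get; }
    public int Amount { get; }
}

// Hypothetical stand-in for RulesGrammar, kept small for the demo.
public static class DemoGrammar
{
    static readonly Parser<Stat> StatParser =
        Parse.IgnoreCase("Strength").Token().Return(Stat.Strength)
            .Or(Parse.IgnoreCase("Wisdom").Token().Return(Stat.Wisdom));

    public static readonly Parser<StatChange> Change =
        from amount in Parse.Regex(@"[+-]\d+").Token()   // assumption: allow multi-digit amounts
        from to in Parse.String("to").Token()
        from stat in StatParser
        select new StatChange(stat, int.Parse(amount));
}

public static class Program
{
    public static void Main()
    {
        // Parse() returns the parsed value, or throws ParseException on bad input.
        var change = DemoGrammar.Change.Parse("+2 to strength");
        Console.WriteLine($"{change.Stat} {change.Amount}");   // Strength 2

        // TryParse() reports failure through IResult<T> instead of throwing.
        var failed = DemoGrammar.Change.TryParse("+2 to luck");
        Console.WriteLine(failed.WasSuccessful);               // False
    }
}
```

For input typed by end users, TryParse() is probably the friendlier choice, since the result's Message property can be shown to them instead of an exception.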

The awkward part of this is StatParser. The code is simple, but repetitive. This pattern requires me to account for every single enum value, and any time I change an enum I need to make a change to the parser for it, too.

If you only have to do this a couple of times when writing your DSL, it's not a big deal. But I've already got half a dozen enumerations in this project that require parsing, and there are likely to be quite a few more by the end. This pattern is also susceptible to other problems, like fat-fingering the strings or simply forgetting to include an enumeration value.

What I'd like to have is an easy way to generate parsers for enumerations, rather than doing the repetitive work of writing them explicitly each time. So I whipped up a helper class to handle it:

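using System;
using System.Linq;
using Sprache;

public static class EnumParser<T> where T : struct
{
    public static Parser<T> Create()
    {
        // One alternative per name declared on the enum.
        var names = Enum.GetNames(typeof(T));

        // Seed the chain with a parser for the first name...
        var parser = Parse.IgnoreCase(names.First()).Token()
            .Return((T)Enum.Parse(typeof(T), names.First()));

        // ...then append a parser for each remaining name with Or().
        foreach (var name in names.Skip(1))
        {
            parser = parser.Or(Parse.IgnoreCase(name).Token().Return((T)Enum.Parse(typeof(T), name)));
        }

        return parser;
    }
}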

This class takes an enum type and generates a parser equivalent to the hand-written one above. With this class, RulesGrammar now looks like this:

public static class RulesGrammar
{
    public static readonly Parser<Stat> StatParser = EnumParser<Stat>.Create();

    // The amount uses \d+ so that multi-digit values like "+10" parse as well.
    public static readonly Parser<StatChange> StatChange =
        from amount in Parse.Regex(@"[+-]\d+").Token()
        from to in Parse.String("to").Token()
        from stat in StatParser.Token()
        select new StatChange(stat, int.Parse(amount));
}

Much simpler, and now I don't have to remember to update the parser every time I change the enum.

A couple of things to note here:

  • I'm using Parse.IgnoreCase() in all of these because my end users will complain if they have to worry about casing things correctly. If you want your language to care about case, use Parse.String() instead.
  • EnumParser<T>.Create() could be written as a one-line LINQ Aggregate(). I'm leaving it in this expanded form because I think it more clearly depicts how the parser is being constructed.
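
For the curious, here is roughly what both of those notes could look like in code. Treat it as a sketch rather than a drop-in replacement: CompactEnumParser<T>, its Keyword() helper, and the ignoreCase parameter are names I've made up for illustration, and the `where T : struct, Enum` constraint assumes C# 7.3 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Sprache;

public enum Stat { Strength, Wisdom, Intelligence, Charisma, Dexterity, Constitution }

// Hypothetical compact variant of EnumParser<T>.
public static class CompactEnumParser<T> where T : struct, Enum
{
    // Builds one parser per enum name, then folds them together with Or().
    // Like the loop version, this throws if the enum declares no names.
    public static Parser<T> Create(bool ignoreCase = true) =>
        Enum.GetNames(typeof(T))
            .Select(name => Keyword(name, ignoreCase).Token()
                .Return((T)Enum.Parse(typeof(T), name)))
            .Aggregate((left, right) => left.Or(right));

    // Picks the case-insensitive or case-sensitive string parser.
    private static Parser<IEnumerable<char>> Keyword(string name, bool ignoreCase) =>
        ignoreCase ? Parse.IgnoreCase(name) : Parse.String(name);
}

public static class Program
{
    public static void Main()
    {
        var relaxed = CompactEnumParser<Stat>.Create();
        Console.WriteLine(relaxed.Parse("dexterity"));                  // Dexterity

        var strict = CompactEnumParser<Stat>.Create(ignoreCase: false);
        Console.WriteLine(strict.TryParse("dexterity").WasSuccessful);  // False
        Console.WriteLine(strict.Parse("Dexterity"));                   // Dexterity
    }
}
```

Aggregate() without a seed uses the first parser as the starting value, which is the same thing the loop version does by hand with names.First().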

Happy parsing!