    Cleaner Test Setup using Builders

    One of the most common causes of messy test code is setup code that lowers
    the signal-to-noise ratio and makes you lose focus on the parts of the test
    that actually matter. I've recently taken a liking to the builder
    pattern as a way to reduce this problem.

    In this post I'm going to compare a few different ways to write your setup
    code, to illustrate the pros and cons of the different styles.

    To start off, let's look at some classic object construction code:

    var comment1 = new Comment();
    comment1.Body = "Comment 1 body";
    comment1.Date = new DateTime(2011, 1, 2, 3, 4, 5);

    var comment2 = new Comment();
    comment2.Body = "Comment 2 body";
    comment2.Date = new DateTime(2011, 1, 2, 3, 4, 5);

    var post = new Post();
    post.Title = "Title";
    post.Body = "Body";
    post.Date = new DateTime(2011, 1, 2, 3, 4, 5);
    post.AddComment(comment1);
    post.AddComment(comment2);

    It doesn't get more basic than that, but there is quite a lot of noise. The
    first step toward reducing that noise is to use the object initializer
    syntax:

    var comment1 = new Comment
    {
        Body = "Comment 1 body",
        Date = new DateTime(2011, 1, 2, 3, 4, 5)
    };

    var comment2 = new Comment
    {
        Body = "Comment 2 body",
        Date = new DateTime(2011, 1, 2, 3, 4, 5)
    };

    var post = new Post
    {
        Title = "Title",
        Body = "Body",
        Date = new DateTime(2011, 1, 2, 3, 4, 5)
    };

    post.AddComment(comment1);
    post.AddComment(comment2);

    I'd say that this syntax makes the important information stand out a bit
    more. Another way is to use a constructor with named parameters and default
    values:

    var comment1 = new Comment(
        body: "comment 1 body",
        date: new DateTime(2011, 1, 2, 3, 4, 5));

    var comment2 = new Comment(
        body: "comment 2 body",
        date: new DateTime(2011, 1, 2, 3, 4, 5));

    var post = new Post(
        title: "Title",
        body: "Body",
        date: new DateTime(2011, 1, 2, 3, 4, 5));

    post.AddComment(comment1);
    post.AddComment(comment2);

    This approach is a bit more compact than the object initializer way. Apart
    from the syntax, a major problem with both object initializers and
    constructors with named arguments is that they force you to modify the
    entities you want to create so that they work well with the test setup
    code. In the case above that is not a real problem, but it becomes one
    if you, for example, want your tests to create instances with certain
    default values that you do not want to use in the production
    code. This is where the builder pattern comes into the picture.

    A builder is a class whose sole purpose is to facilitate creation of instances
    of a specific class. In this case we would probably use a PostBuilder and a
    CommentBuilder. These classes can have all the helper methods you need so that
    you can easily get instances for your test cases.

    Here is an example:

    var post = new PostBuilder()
        .WithTitle("Title")
        .WithBody("Body")
        .WithDate(new DateTime(2011, 1, 2, 3, 4, 5))
        .WithComment(new CommentBuilder()
            .WithBody("comment 1 body")
            .WithDate(new DateTime(2011, 1, 2, 3, 4, 5))
            .Build())
        .WithComment(new CommentBuilder()
            .WithBody("comment 2 body")
            .WithDate(new DateTime(2011, 1, 2, 3, 4, 5))
            .Build())
        .Build();
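
    To make the point about test-only defaults concrete, here is a minimal
    sketch of what a builder behind calls like the ones above might look like.
    The Comment entity and the default values are assumptions made up for
    illustration; the point is only that the defaults live in the builder, so
    the production class stays untouched.

```csharp
using System;

// Hypothetical entity; the real Comment class is not shown in this post.
public class Comment
{
    public string Body { get; set; }
    public DateTime Date { get; set; }
}

public class CommentBuilder
{
    // Assumed test-only defaults. They live in the builder,
    // not in the production entity.
    private string _body = "Default comment body";
    private DateTime _date = new DateTime(2011, 1, 2, 3, 4, 5);

    public CommentBuilder WithBody(string body)
    {
        _body = body;
        return this; // returning the builder enables chaining
    }

    public CommentBuilder WithDate(DateTime date)
    {
        _date = date;
        return this;
    }

    public Comment Build()
    {
        return new Comment { Body = _body, Date = _date };
    }
}
```

    With something like this in place, a test that only cares about the body
    can write new CommentBuilder().WithBody("x").Build() and get a sensible
    date for free.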

    This style of programming, a fluent interface that uses daisy chaining of
    method calls and method names chosen to give a prose-like reading
    experience, has been quite popular in the .NET space for the last two years
    or so. However, this style easily gets quite verbose and has fallen out of
    favor. The reason is simple: all those "With" prefixes in the example above
    clutter up the code rather than making it easier to read. A slightly more
    compact version could look something like this:

    var post = new PostBuilder()
        .Title("Title")
        .Body("Body")
        .Date(new DateTime(2011, 1, 2, 3, 4, 5))
        .Comment(new CommentBuilder()
            .Body("comment 1 body")
            .Date(new DateTime(2011, 1, 2, 3, 4, 5))
            .Build())
        .Comment(new CommentBuilder()
            .Body("comment 2 body")
            .Date(new DateTime(2011, 1, 2, 3, 4, 5))
            .Build())
        .Build();

    Now we're getting somewhere. There is less noise, but there are still a couple
    of builder instantiations scattered around the code. The syntax could be
    cleaned up a bit by introducing a nicer way to create the builders. Below is
    an example with a static class which has properties for the different kinds of
    builders. The factory class is called Build to make the code read a little
    nicer.

    var post = Build.Post
        .Title("Title")
        .Body("Body")
        .Date(new DateTime(2011, 1, 2, 3, 4, 5))
        .Comment(Build.Comment
            .Body("comment 1 body")
            .Date(new DateTime(2011, 1, 2, 3, 4, 5))
            .Build())
        .Comment(Build.Comment
            .Body("comment 2 body")
            .Date(new DateTime(2011, 1, 2, 3, 4, 5))
            .Build())
        .Build();
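
    The Build factory itself is tiny. Below is a rough sketch of how it could
    be put together so that the snippet above compiles. The Post and Comment
    entities (including the Comments list) and the stripped-down builders are
    my own stand-ins for illustration, not the real classes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal entities, assumed for this sketch.
public class Comment
{
    public string Body { get; set; }
    public DateTime Date { get; set; }
}

public class Post
{
    private readonly List<Comment> _comments = new List<Comment>();
    public string Title { get; set; }
    public string Body { get; set; }
    public DateTime Date { get; set; }
    public IList<Comment> Comments { get { return _comments; } }
    public void AddComment(Comment comment) { _comments.Add(comment); }
}

public class CommentBuilder
{
    private readonly Comment _comment = new Comment();
    public CommentBuilder Body(string body) { _comment.Body = body; return this; }
    public CommentBuilder Date(DateTime date) { _comment.Date = date; return this; }
    public Comment Build() { return _comment; }
}

public class PostBuilder
{
    private readonly Post _post = new Post();
    public PostBuilder Title(string title) { _post.Title = title; return this; }
    public PostBuilder Body(string body) { _post.Body = body; return this; }
    public PostBuilder Date(DateTime date) { _post.Date = date; return this; }
    public PostBuilder Comment(Comment comment) { _post.AddComment(comment); return this; }
    public Post Build() { return _post; }
}

// The factory: each property hands out a fresh builder, so state
// never leaks from one test to another.
public static class Build
{
    public static PostBuilder Post { get { return new PostBuilder(); } }
    public static CommentBuilder Comment { get { return new CommentBuilder(); } }
}
```

    Using properties rather than methods is what lets the call site read
    Build.Post without a trailing pair of parentheses.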

    Better. Two problems remaining are the duplication of the word Comment in
    the call to the Comment method, and that annoying call to Build for the
    comments. These problems could be addressed by creating a version of the
    Comment method that takes a lambda operating on a builder as an argument:

    var post = Build.Post
        .Title("Title")
        .Body("Body")
        .Date(new DateTime(2011, 1, 2, 3, 4, 5))
        .Comment(c => c
            .Body("comment 1 body")
            .Date(new DateTime(2011, 1, 2, 3, 4, 5)))
        .Comment(c => c
            .Body("comment 2 body")
            .Date(new DateTime(2011, 1, 2, 3, 4, 5)))
        .Build();

    I'd say this is even better. The only noise remaining is the duplication of
    Date/DateTime when setting the date for a comment or a post and the "c => c" part
    of the lambda. The implementation of the PostBuilder class now looks like
    this:

    public class PostBuilder
    {
        private readonly Post _post = new Post();

        public PostBuilder Title(string title)
        {
            _post.Title = title;
            return this;
        }

        public PostBuilder Body(string body)
        {
            _post.Body = body;
            return this;
        }

        public PostBuilder Date(DateTime date)
        {
            _post.Date = date;
            return this;
        }

        public Post Build()
        {
            return _post;
        }

        public PostBuilder Comment(Action<CommentBuilder> initializer)
        {
            var builder = new CommentBuilder();
            initializer(builder);

            var comment = builder.Build();
            _post.AddComment(comment);

            return this;
        }
    }

    I usually find myself using the same DateTime constructor over and over
    again. This cries out for refactoring. Now we can reap the benefits of using a
    builder class, since we can easily add any helpers we need. In this case, by
    allowing the date to be set using the standard six integer values for year,
    month, day, hour, minute and second:

    var post = Build.Post
        .Title("Title")
        .Body("Body")
        .Date(2011, 1, 2, 3, 4, 5)
        .Comment(c => c
            .Body("comment 1 body")
            .Date(2011, 1, 2, 3, 4, 5))
        .Comment(c => c
            .Body("comment 2 body")
            .Date(2011, 1, 2, 3, 4, 5))
        .Build();

    Much better! The builder now contains the following method:

    public PostBuilder Date(
        int year, int month, int day,
        int hour, int minute, int second)
    {
        _post.Date = new DateTime(year, month, day, hour, minute, second);
        return this;
    }

    Ok, so now the setup code looks nice and tight, but the builder class contains
    a nasty form of duplication. For each property that is to be exposed through
    the builder there is a matching method:

    public PostBuilder Title(string title)
    {
        _post.Title = title;
        return this;
    }

    public PostBuilder Body(string body)
    {
        _post.Body = body;
        return this;
    }

    This code is extremely tedious to write, especially if you have a large
    application with a lot of entities. LISP eats this kind of duplication for
    breakfast, as does Ruby, but it is often quite hard to remove in statically
    typed languages which have no pre-processor or macro facilities.

    Fortunately, C# 4 includes the dynamic keyword which opens up new
    possibilities for the static folks. All the dumb builder methods which set the
    property with the same name as the method on the entity could easily be
    replaced with a method missing hook:

    using System.Dynamic;

    public class DynamicBuilder<T> : DynamicObject
        where T : class, new()
    {
        protected readonly T Entity = new T();

        // This method is called when you invoke a method that does not exist
        public override bool TryInvokeMember(
            InvokeMemberBinder binder, object[] args, out object result)
        {
            // Remember to return self to enable daisy chaining
            result = this;

            // Get the property on the entity that has the same name
            // as the method that was invoked
            var property = typeof(T).GetProperty(binder.Name);

            // Only handle calls that pass exactly one value to set,
            // so that args[0] below is always safe to read
            var canSetProperty = property != null && args.Length == 1;

            if (canSetProperty)
            {
                property.SetValue(Entity, args[0], null);
            }

            return canSetProperty;
        }

        public T Build()
        {
            return Entity;
        }
    }

    Sweet! Now you can throw away most of your boring builder code, except for the
    helpers that are tailor made for the specific entity type that you are building.

    The post builder now looks like this:

    public class DynamicPostBuilder : DynamicBuilder<Post>
    {
        public DynamicPostBuilder Date(
            int year, int month, int day,
            int hour, int minute, int second)
        {
            Entity.Date = new DateTime(year, month, day, hour, minute, second);
            return this;
        }

        public DynamicPostBuilder Comment(Action<dynamic> initializer)
        {
            var builder = new DynamicCommentBuilder();

            initializer(builder);

            var comment = builder.Build();
            Entity.AddComment(comment);

            return this;
        }
    }

    The only downside to this is that you lose refactoring support and
    IntelliSense, which can be a big deal for many .NET developers. However, if
    you use TDD, the refactoring support should not be an issue, since you will
    instantly know what was broken when something is renamed.
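
    To show what "instantly know" means in practice, here is a small sketch of
    what happens when a property has been renamed but the setup code has not.
    The Post class and the Headline name are made up for the example. When the
    method-missing hook returns false, the C# runtime binder gives up and
    throws a RuntimeBinderException, so any test that runs the setup code fails
    right away.

```csharp
using System;
using System.Dynamic;
using Microsoft.CSharp.RuntimeBinder;

// Hypothetical entity with a single property.
public class Post
{
    public string Title { get; set; }
}

public class DynamicBuilder<T> : DynamicObject
    where T : class, new()
{
    protected readonly T Entity = new T();

    public override bool TryInvokeMember(
        InvokeMemberBinder binder, object[] args, out object result)
    {
        result = this;
        var property = typeof(T).GetProperty(binder.Name);
        if (property == null || args.Length != 1)
        {
            return false; // lets the runtime binder report the failure
        }
        property.SetValue(Entity, args[0], null);
        return true;
    }

    public T Build()
    {
        return Entity;
    }
}

public static class RenameCheck
{
    // Returns true when calling a member that does not exist on Post
    // fails at run time instead of being silently ignored.
    public static bool MissingMemberThrows()
    {
        dynamic builder = new DynamicBuilder<Post>();
        try
        {
            builder.Headline("Title"); // Post has no Headline property
            return false;
        }
        catch (RuntimeBinderException)
        {
            return true;
        }
    }
}
```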

    The code for adding a comment looks suspiciously like a bit of code that might
    get repeated in other builders. So there is another chance to, for example,
    introduce a convention that would allow that code to be pushed down and
    handled in the method missing hook of the base class. Only inconsistency and
    lack of imagination set the limits in this case.
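
    As one possible take on such a convention, the sketch below lets the base
    class treat a call like Comment(initializer) as "build a child with a plain
    DynamicBuilder and pass it to AddComment on the entity". The Add prefix
    convention, the entities and the Comments list are all assumptions of mine,
    and the sketch ignores overloaded Add methods and child types without a
    parameterless constructor.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

// Hypothetical minimal entities, assumed for this sketch.
public class Comment
{
    public string Body { get; set; }
}

public class Post
{
    private readonly List<Comment> _comments = new List<Comment>();
    public string Title { get; set; }
    public IList<Comment> Comments { get { return _comments; } }
    public void AddComment(Comment comment) { _comments.Add(comment); }
}

public class DynamicBuilder<T> : DynamicObject
    where T : class, new()
{
    protected readonly T Entity = new T();

    public override bool TryInvokeMember(
        InvokeMemberBinder binder, object[] args, out object result)
    {
        result = this;
        if (args.Length != 1) return false;

        // Case 1: a property with the same name as the method
        var property = typeof(T).GetProperty(binder.Name);
        if (property != null)
        {
            property.SetValue(Entity, args[0], null);
            return true;
        }

        // Case 2 (assumed convention): Foo(initializer) maps to AddFoo(child).
        // Action<dynamic> is Action<object> at run time.
        var adder = typeof(T).GetMethod("Add" + binder.Name);
        var initializer = args[0] as Action<object>;
        if (adder == null || initializer == null) return false;

        // Build the child with a plain DynamicBuilder<ChildType>
        var childType = adder.GetParameters()[0].ParameterType;
        var builderType = typeof(DynamicBuilder<>).MakeGenericType(childType);
        object childBuilder = Activator.CreateInstance(builderType);
        initializer(childBuilder);
        object child = ((dynamic)childBuilder).Build();

        adder.Invoke(Entity, new[] { child });
        return true;
    }

    public T Build()
    {
        return Entity;
    }
}

public static class ConventionUsage
{
    public static Post CreatePost()
    {
        dynamic builder = new DynamicBuilder<Post>();
        return builder
            .Title("Title")
            // A lambda passed to a dynamic call must be cast to a delegate type
            .Comment((Action<dynamic>)(c => c.Body("comment 1 body")))
            .Build();
    }
}
```

    With a convention like this, an entity-specific builder is only needed for
    real helpers, such as the six-argument Date method.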

    To use the builder you have to make sure that the builder instance is typed as
    dynamic, so that the compiler will get out of your way and allow you to call
    the methods you want to call, even though they are not actually declared in
    the builder class.

    In this case, that can be accomplished by modifying the builder factory class:

    public static class DynamicBuild
    {
        public static dynamic Post
        {
            get { return new DynamicPostBuilder(); }
        }

        public static dynamic Comment
        {
            get { return new DynamicCommentBuilder(); }
        }
    }
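
    Here is a compact end-to-end sketch of the dynamic pieces working together.
    The entities are stand-ins of mine, and the builders are trimmed to the
    bare minimum. One caveat worth knowing: the C# compiler does not accept a
    bare lambda as an argument to a dynamically dispatched call, so when the
    builder is typed as dynamic the lambda has to be cast to a delegate type
    first.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

// Hypothetical minimal entities, assumed for this sketch.
public class Comment
{
    public string Body { get; set; }
    public DateTime Date { get; set; }
}

public class Post
{
    private readonly List<Comment> _comments = new List<Comment>();
    public string Title { get; set; }
    public string Body { get; set; }
    public DateTime Date { get; set; }
    public IList<Comment> Comments { get { return _comments; } }
    public void AddComment(Comment comment) { _comments.Add(comment); }
}

public class DynamicBuilder<T> : DynamicObject
    where T : class, new()
{
    protected readonly T Entity = new T();

    public override bool TryInvokeMember(
        InvokeMemberBinder binder, object[] args, out object result)
    {
        result = this;
        var property = typeof(T).GetProperty(binder.Name);
        if (property == null || args.Length != 1) return false;
        property.SetValue(Entity, args[0], null);
        return true;
    }

    public T Build() { return Entity; }
}

public class DynamicCommentBuilder : DynamicBuilder<Comment> { }

public class DynamicPostBuilder : DynamicBuilder<Post>
{
    public DynamicPostBuilder Comment(Action<dynamic> initializer)
    {
        var builder = new DynamicCommentBuilder();
        initializer(builder);
        Entity.AddComment(builder.Build());
        return this;
    }
}

public static class DynamicBuild
{
    public static dynamic Post { get { return new DynamicPostBuilder(); } }
}

public static class DynamicUsage
{
    public static Post CreatePost()
    {
        return DynamicBuild.Post
            .Title("Title")                       // handled by TryInvokeMember
            .Body("Body")                         // handled by TryInvokeMember
            .Date(new DateTime(2011, 1, 2, 3, 4, 5))
            // Real method on DynamicPostBuilder; note the delegate cast
            .Comment((Action<dynamic>)(c => c.Body("comment 1 body")))
            .Build();
    }
}
```

    Methods that are actually declared on the builder, such as Comment and
    Build, are bound first; the method-missing hook only sees the calls that
    have no declared match.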

    To sum up, using the builder pattern allows you to clean up your test code
    significantly and makes it trivial to add helpers when needed.

    Original setup code:

    var comment1 = new Comment();
    comment1.Body = "Comment 1 body";
    comment1.Date = new DateTime(2011, 1, 2, 3, 4, 5);

    var comment2 = new Comment();
    comment2.Body = "Comment 2 body";
    comment2.Date = new DateTime(2011, 1, 2, 3, 4, 5);

    var post = new Post();
    post.Title = "Title";
    post.Body = "Body";
    post.Date = new DateTime(2011, 1, 2, 3, 4, 5);
    post.AddComment(comment1);
    post.AddComment(comment2);

    Builder based setup code:

    var post = Build.Post
        .Title("Title")
        .Body("Body")
        .Date(2011, 1, 2, 3, 4, 5)
        .Comment(c => c
            .Body("comment 1 body")
            .Date(2011, 1, 2, 3, 4, 5))
        .Comment(c => c
            .Body("comment 2 body")
            .Date(2011, 1, 2, 3, 4, 5))
        .Build();

    Happy building!