Code Blog - Erik Öjebo.se

Cleaner Test Setup using Builders

One of the most common causes of messy test code is setup code that reduces
the signal-to-noise ratio and makes you lose focus on the parts of the test
code that actually matter. I've recently taken a liking to the builder
pattern as a way to reduce this problem.

In this post I'm going to compare a few different ways to write your setup
code, to illustrate the pros and cons of the different styles.

To start off, let's look at some classic object construction code:

var comment1 = new Comment();
comment1.Body = "Comment 1 body";
comment1.Date = new DateTime(2011, 1, 2, 3, 4, 5);
var comment2 = new Comment();
comment2.Body = "Comment 2 body";
comment2.Date = new DateTime(2011, 1, 2, 3, 4, 5);
var post = new Post();
post.Title = "Title";
post.Body = "Body";
post.Date = new DateTime(2011, 1, 2, 3, 4, 5);
post.AddComment(comment1);
post.AddComment(comment2);

It doesn't get more basic than that, but there is quite a lot of noise. The
first step toward reducing that noise is to use the object initializer
syntax:

var comment1 = new Comment
    {
        Body = "Comment 1 body",
        Date = new DateTime(2011, 1, 2, 3, 4, 5)
    };
var comment2 = new Comment
    {
        Body = "Comment 2 body",
        Date = new DateTime(2011, 1, 2, 3, 4, 5)
    };
var post = new Post
    {
        Title = "Title",
        Body = "Body",
        Date = new DateTime(2011, 1, 2, 3, 4, 5)
    };
post.AddComment(comment1);
post.AddComment(comment2);

I'd say that this syntax makes the important information stand out a bit
more. Another way is to use a constructor with named parameters and default
values:

var comment1 = new Comment(
    body: "comment 1 body",
    date: new DateTime(2011, 1, 2, 3, 4, 5));
var comment2 = new Comment(
    body: "comment 2 body",
    date: new DateTime(2011, 1, 2, 3, 4, 5));
var post = new Post(
    title: "Title",
    body: "Body",
    date: new DateTime(2011, 1, 2, 3, 4, 5));
post.AddComment(comment1);
post.AddComment(comment2);

This approach is a bit more compact than the object initializer way. Apart
from the syntax, a major problem with both object initializers and
constructors with named arguments is that they force you to modify the
entities you want to create so that they work well with the test setup
code. In the case above that is not a real problem, but it becomes a problem
if you, for example, want to use certain default values when creating
instances for your tests that you do not want to use in the production
code. This is where the builder pattern comes into the picture.

A builder is a class whose sole purpose is to facilitate creation of instances
of a specific class. In this case we would probably use a PostBuilder and a
CommentBuilder. These classes can have all the helper methods you need so that
you can easily get instances for your test cases.
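To make the point about test-only defaults concrete, here is a minimal sketch of a builder that pre-populates an entity. The Comment class shown here and the default values are assumptions for illustration; the real entity may look different.

```csharp
using System;

// Hypothetical entity; its real shape is an assumption.
public class Comment
{
    public string Body { get; set; }
    public DateTime Date { get; set; }
}

public class CommentBuilder
{
    // Test-only defaults live in the builder, so the Comment class
    // itself does not have to change to suit the tests.
    private readonly Comment _comment = new Comment
    {
        Body = "default body",
        Date = new DateTime(2011, 1, 2, 3, 4, 5)
    };

    public CommentBuilder WithBody(string body)
    {
        _comment.Body = body;
        return this;
    }

    public CommentBuilder WithDate(DateTime date)
    {
        _comment.Date = date;
        return this;
    }

    public Comment Build()
    {
        return _comment;
    }
}
```

A test that only cares about the body can then write new CommentBuilder().WithBody("custom body").Build() and still get a comment with a valid date.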

Here is an example:

var post = new PostBuilder()
    .WithTitle("Title")
    .WithBody("Body")
    .WithDate(new DateTime(2011, 1, 2, 3, 4, 5))
    .WithComment(new CommentBuilder()
        .WithBody("comment 1 body")
        .WithDate(new DateTime(2011, 1, 2, 3, 4, 5))
        .Build())
    .WithComment(new CommentBuilder()
        .WithBody("comment 2 body")
        .WithDate(new DateTime(2011, 1, 2, 3, 4, 5))
        .Build())
    .Build();

This style of programming has been quite popular in the .NET space for the
last two years or so: a fluent interface that uses daisy chaining of method
calls, with method names chosen to give a prose-like reading experience.
However, this style easily gets quite verbose and has fallen out of favor.
The reason is simple: all those "With" prefixes in the example above clutter
up the code rather than making it easier to read. A slightly more compact
version could look something like this:

var post = new PostBuilder()
    .Title("Title")
    .Body("Body")
    .Date(new DateTime(2011, 1, 2, 3, 4, 5))
    .Comment(new CommentBuilder()
        .Body("comment 1 body")
        .Date(new DateTime(2011, 1, 2, 3, 4, 5))
        .Build())
    .Comment(new CommentBuilder()
        .Body("comment 2 body")
        .Date(new DateTime(2011, 1, 2, 3, 4, 5))
        .Build())
    .Build();

Now we're getting somewhere. There is less noise, but there are still a couple
of builder instantiations scattered around the code. The syntax could be
cleaned up a bit by introducing a nicer way to create the builders. Below is
an example with a static class which has properties for the different kinds of
builders. The factory class is called Build to make the code read a little
nicer.

var post = Build.Post
    .Title("Title")
    .Body("Body")
    .Date(new DateTime(2011, 1, 2, 3, 4, 5))
    .Comment(Build.Comment
        .Body("comment 1 body")
        .Date(new DateTime(2011, 1, 2, 3, 4, 5))
        .Build())
    .Comment(Build.Comment
        .Body("comment 2 body")
        .Date(new DateTime(2011, 1, 2, 3, 4, 5))
        .Build())
    .Build();
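The Build factory used above can be very small. A minimal sketch, assuming both builders have parameterless constructors; the two builder classes here are empty stand-ins for the real ones:

```csharp
// Empty stand-ins for the real builders, which expose Title, Body and so on.
public class PostBuilder { }
public class CommentBuilder { }

public static class Build
{
    // Every access returns a fresh builder, so no state leaks
    // from one test setup to the next.
    public static PostBuilder Post
    {
        get { return new PostBuilder(); }
    }

    public static CommentBuilder Comment
    {
        get { return new CommentBuilder(); }
    }
}
```

Using properties rather than methods is what lets the call site read Build.Post instead of Build.Post().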

Better. Two problems remaining are the duplication of the word Comment in
the call to the Comment method, and that annoying call to Build for the
comments. These problems could be addressed by creating a version of the
Comment method that takes a lambda operating on a builder as an argument:

var post = Build.Post
    .Title("Title")
    .Body("Body")
    .Date(new DateTime(2011, 1, 2, 3, 4, 5))
    .Comment(c => c
        .Body("comment 1 body")
        .Date(new DateTime(2011, 1, 2, 3, 4, 5)))
    .Comment(c => c
        .Body("comment 2 body")
        .Date(new DateTime(2011, 1, 2, 3, 4, 5)))
    .Build();

I'd say this is even better. The only noise remaining is the duplication of
Date/DateTime when setting the date for a comment or a post and the "c => c" part
of the lambda. The implementation of the PostBuilder class now looks like
this:

public class PostBuilder
{
    private readonly Post _post = new Post();
    
    public PostBuilder Title(string title)
    {
        _post.Title = title;
        return this;
    }
    public PostBuilder Body(string body)
    {
        _post.Body = body;
        return this;
    }
    public PostBuilder Date(DateTime date)
    {
        _post.Date = date;
        return this;
    }
    public Post Build()
    {
        return _post;
    }
    
    public PostBuilder Comment(Action<CommentBuilder> initializer)
    {
        var builder = new CommentBuilder();
        initializer(builder);
        
        var comment = builder.Build();
        _post.AddComment(comment);
        return this;
    }
}
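PostBuilder relies on a few types that are not listed in this post. The sketch below shows one plausible shape for them; the Post and Comment members and the CommentBuilder body are assumptions, written to match how PostBuilder calls them.

```csharp
using System;
using System.Collections.Generic;

// Assumed entity shapes, inferred from the setup code in this post.
public class Comment
{
    public string Body { get; set; }
    public DateTime Date { get; set; }
}

public class Post
{
    private readonly List<Comment> _comments = new List<Comment>();

    public string Title { get; set; }
    public string Body { get; set; }
    public DateTime Date { get; set; }

    public IEnumerable<Comment> Comments
    {
        get { return _comments; }
    }

    public void AddComment(Comment comment)
    {
        _comments.Add(comment);
    }
}

// Follows the same conventions as PostBuilder: one method per property,
// each returning the builder to allow chaining.
public class CommentBuilder
{
    private readonly Comment _comment = new Comment();

    public CommentBuilder Body(string body)
    {
        _comment.Body = body;
        return this;
    }

    public CommentBuilder Date(DateTime date)
    {
        _comment.Date = date;
        return this;
    }

    public Comment Build()
    {
        return _comment;
    }
}
```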

I usually find myself using the same DateTime constructor over and over
again. This cries out for refactoring. Now we can reap the benefits of using a
builder class, since we can easily add any helpers we need, in this case by
allowing the date to be set using the standard six int values for year,
month, day, hour, minute and second:

var post = Build.Post
    .Title("Title")
    .Body("Body")
    .Date(2011, 1, 2, 3, 4, 5)
    .Comment(c => c
        .Body("comment 1 body")
        .Date(2011, 1, 2, 3, 4, 5))
    .Comment(c => c
        .Body("comment 2 body")
        .Date(2011, 1, 2, 3, 4, 5))
    .Build();

Much better! The builder now contains the following method:

public PostBuilder Date(
    int year, int month, int day, 
    int hour, int minute, int second)
{
    _post.Date = new DateTime(year, month, day, hour, minute, second);
    return this;
}

Ok, so now the setup code looks nice and tight, but the builder class contains
a nasty form of duplication. For each property that is to be exposed through
the builder there is a matching method:

public PostBuilder Title(string title)
{
    _post.Title = title;
    return this;
}
public PostBuilder Body(string body)
{
    _post.Body = body;
    return this;
}

This code is extremely tedious to write, especially if you have a large
application with a lot of entities. LISP eats this kind of duplication for
breakfast, as does Ruby, but it is often quite hard to remove in statically
typed languages which have no pre-processor or macro facilities.

Fortunately, C# 4 includes the dynamic keyword which opens up new
possibilities for the static folks. All the dumb builder methods which set the
property with the same name as the method on the entity could easily be
replaced with a method missing hook:

using System.Dynamic;

public class DynamicBuilder<T> : DynamicObject 
    where T : class, new()
{
    protected readonly T Entity = new T();

    // This method is called when you invoke a method that does not exist
    public override bool TryInvokeMember(
        InvokeMemberBinder binder, object[] args, out object result)
    {
        // Remember to return self to enable daisy chaining
        result = this;

        // Get the property on the entity that has the same name
        // as the method that was invoked
        var property = typeof(T).GetProperty(binder.Name);

        // Only handle calls that look like a setter: a known property
        // and exactly one argument
        var canSet = property != null && args.Length == 1;
        if (canSet)
        {
            property.SetValue(Entity, args[0], null);
        }
        return canSet;
    }

    public T Build()
    {
        return Entity;
    }
}

Sweet! Now you can throw away most of your boring builder code, except for the
helpers that are tailor made for the specific entity type that you are building.

The post builder now looks like this:

public class DynamicPostBuilder : DynamicBuilder<Post>
{
    public DynamicPostBuilder Date(
        int year, int month, int day,
        int hour, int minute, int second)
    {
        Entity.Date = new DateTime(year, month, day, hour, minute, second);
        return this;
    }
    public DynamicPostBuilder Comment(Action<dynamic> initializer)
    {
        var builder = new DynamicCommentBuilder();
            
        initializer(builder);
            
        var comment = builder.Build();
        Entity.AddComment(comment);
        return this;
    }
}

The only downside to this is that you lose refactoring support and
IntelliSense, which can be a big deal for many .NET developers. However, if
you use TDD, the loss of refactoring support should not be an issue, since a
failing test will tell you right away what broke when something is renamed.

The code for adding a comment looks suspiciously like a bit of code that might
get repeated in other builders. So there is another chance to, for example,
introduce a convention that would allow that code to be pushed down and
handled in the method missing hook of the base class. Only inconsistency and
lack of imagination set the limits in this case.

To use the builder you have to make sure that the builder instance is typed as
dynamic, so that the compiler will get out of your way and allow you to call
the methods you want to call, even though they are not actually declared in
the builder class.

In this case, that can be accomplished by modifying the builder factory class:

public static class DynamicBuild
{
    public static dynamic Post
    {
        get { return new DynamicPostBuilder(); }
    }
        
    public static dynamic Comment
    {
        get { return new DynamicCommentBuilder(); }
    }
}
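One thing to watch out for when calling the dynamic builders: the C# compiler refuses a bare lambda as an argument to a dynamically dispatched call (error CS1977), so the initializer has to be cast to a delegate type first. The sketch below is a trimmed, self-contained version of the classes above with assumed Post and Comment shapes; the Example class is a hypothetical name used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

// Assumed entity shapes, kept minimal for the example.
public class Comment
{
    public string Body { get; set; }
}

public class Post
{
    public readonly List<Comment> Comments = new List<Comment>();
    public string Title { get; set; }
    public string Body { get; set; }
    public void AddComment(Comment comment) { Comments.Add(comment); }
}

public class DynamicBuilder<T> : DynamicObject where T : class, new()
{
    protected readonly T Entity = new T();

    public override bool TryInvokeMember(
        InvokeMemberBinder binder, object[] args, out object result)
    {
        result = this;
        var property = typeof(T).GetProperty(binder.Name);
        if (property == null || args.Length != 1) return false;
        property.SetValue(Entity, args[0], null);
        return true;
    }

    public T Build() { return Entity; }
}

public class DynamicCommentBuilder : DynamicBuilder<Comment> { }

public class DynamicPostBuilder : DynamicBuilder<Post>
{
    public DynamicPostBuilder Comment(Action<dynamic> initializer)
    {
        var builder = new DynamicCommentBuilder();
        initializer(builder);
        Entity.AddComment(builder.Build());
        return this;
    }
}

public static class DynamicBuild
{
    public static dynamic Post
    {
        get { return new DynamicPostBuilder(); }
    }
}

public static class Example
{
    public static Post CreatePost()
    {
        return DynamicBuild.Post
            .Title("Title")
            .Body("Body")
            // The cast is required because the receiver is dynamic.
            .Comment((Action<dynamic>)(c => c.Body("comment 1 body")))
            .Build();
    }
}
```

The cast adds a little noise back into the setup code, which is part of the price for dropping the hand-written builder methods. A misspelled or renamed property is not caught by the compiler; the call fails at runtime instead, because TryInvokeMember returns false.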

So, to sum up: using the builder pattern allows you to clean up your test code
significantly and makes it trivial to add helpers when needed.

Original setup code:

var comment1 = new Comment();
comment1.Body = "Comment 1 body";
comment1.Date = new DateTime(2011, 1, 2, 3, 4, 5);
var comment2 = new Comment();
comment2.Body = "Comment 2 body";
comment2.Date = new DateTime(2011, 1, 2, 3, 4, 5);
var post = new Post();
post.Title = "Title";
post.Body = "Body";
post.Date = new DateTime(2011, 1, 2, 3, 4, 5);
post.AddComment(comment1);
post.AddComment(comment2);

Builder based setup code:

var post = Build.Post
    .Title("Title")
    .Body("Body")
    .Date(2011, 1, 2, 3, 4, 5)
    .Comment(c => c
        .Body("comment 1 body")
        .Date(2011, 1, 2, 3, 4, 5))
    .Comment(c => c
        .Body("comment 2 body")
        .Date(2011, 1, 2, 3, 4, 5))
    .Build();

Happy building!