Date archives "January 2015"

Dependency Injection: are you doing it wrong?

A lot of people are applying dependency injection (DI) in their software designs. Like anything, DI has its proponents and opponents, but I believe that given the right context it can actually help your software design by making dependencies more explicit and easier to test.

A lot of implementations of DI I encounter have a problem though: they have the actual dependency running the wrong way. In the case of business software, for example, this would be the business layer depending on the data layer instead of the other way around. This can cause some pretty significant problems downstream.

In this post I’ll review the basics of dependency injection, dependency inversion and the problems that occur when the dependency is running the wrong way.

Injecting data access code in business apps

One of the contexts in which I’ve had success applying DI is data access in business software. To keep our business rule tests running fast enough we need an in-memory data source instead of a disk-based one, since disk-based sources are usually (still) not fast enough. You usually end up with something like this:

interface DataAccess {
    Record GetRecordById(int id);
    void Save(Record record);
}

class Application {
    DataAccess _access;

    public Application(DataAccess access) {
        _access = access;
    }

    public void UpdateRecordName(int id, string newName) {
        var rec = _access.GetRecordById(id);
        rec.Name = newName;
        _access.Save(rec);
    }
}

In our tests, we then initialize Application with some kind of InMemoryDataAccess implementation, and in our production system we use a PersistentDataAccess. All pretty standard stuff and something I see happening in one form or another in most code bases I encounter.
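As a minimal sketch of what such a test double could look like, the block below backs the DataAccess interface with a dictionary. The Record type with its Id and Name properties is an assumption (the post only shows that a record has a Name), and a real implementation would probably want to copy records rather than hand out the stored instance.

```csharp
using System.Collections.Generic;

// Hypothetical Record type: the post only shows that it has a Name.
class Record {
    public int Id { get; set; }
    public string Name { get; set; }
}

interface DataAccess {
    Record GetRecordById(int id);
    void Save(Record record);
}

// In-memory stand-in for tests: a dictionary keyed by record id.
class InMemoryDataAccess : DataAccess {
    private readonly Dictionary<int, Record> _records = new Dictionary<int, Record>();

    public Record GetRecordById(int id) {
        // Throws KeyNotFoundException when the id is unknown.
        return _records[id];
    }

    public void Save(Record record) {
        // Inserts a new record or overwrites the one with the same id.
        _records[record.Id] = record;
    }
}

// Same shape as the Application class above, repeated so the sketch compiles on its own.
class Application {
    private readonly DataAccess _access;

    public Application(DataAccess access) {
        _access = access;
    }

    public void UpdateRecordName(int id, string newName) {
        var rec = _access.GetRecordById(id);
        rec.Name = newName;
        _access.Save(rec);
    }
}
```

A test would then save a record into an InMemoryDataAccess, pass it to Application, call UpdateRecordName, and read the record back to check the new name, all without touching a disk.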

Dependency Inversion

Dependency Injection is closely related to the dependency inversion principle (the D of SOLID). This principle states two things:

  1. High-level modules should not depend on low-level modules. Both should depend on abstractions.
  2. Abstractions should not depend on details. Details should depend on abstractions.

Now, this is a principle I don’t see widely applied. Instead, I usually see something like this:

[Diagram: wrong-direction]

Here, we have the business layer depending on the data access code in the data access layer. When using Entity Framework code first on .NET, for example, we would have PersistentDataAccess being the class deriving from DbContext (maybe MyDomainDbContext) and from that we would extract an interface like IMyDomainDbContext representing the DataAccess interface.

This is a clear violation of the dependency inversion principle: in this case the high-level module is the module where the business logic lives (i.e. the business layer) and the low-level module is the module containing the data access code (i.e. the data layer). Adhering to the principle means that our business logic layer should not depend on our data access layer, which it does.

How does this cause us problems?

1. The data layer cannot access our domain classes

Since the business layer depends on the data layer, the data layer cannot access types from our business layer without creating a circular dependency. While some environments actually allow those (and some don’t), in general it’s not considered particularly healthy if there are (a lot of) dependency cycles across modules.

But generally, I do want to be able to create and use types from my business layer in the data layer. For example, in DDD I want to return Domain Entities or Value objects from my DAL. Since those classes are defined in the domain layer, it is impossible to access those types in the DAL. So instead we end up with something like this:

[Diagram: no-domain-types]

As you can see, in general, I won’t be able to use any custom business types in the DAL, including the simplest of value objects. This means all the mapping needs to happen in the business layer, which means we need to change the business layer whenever we add a new way of mapping.
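In code, the wrong-way-round situation might look roughly like the sketch below. The two namespaces stand in for separate assemblies, and the property names and the ToDomain/ToEntity method names are assumptions made for illustration; only the names EntityRecord, EntityMapper and DataAccess come from this post.

```csharp
namespace Dal {
    // Persistence-shaped type: the only kind of type the DAL can hand out.
    public class EntityRecord {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    // The interface is owned by the DAL, so it can only mention DAL types.
    public interface DataAccess {
        EntityRecord GetRecordById(int id);
        void Save(EntityRecord record);
    }
}

namespace Business {
    using Dal; // the business layer depends on the data layer

    // Domain type, invisible to the DAL.
    public class Record {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    // The mapping has nowhere else to live, so it ends up in the business layer.
    public static class EntityMapper {
        public static Record ToDomain(EntityRecord e) {
            return new Record { Id = e.Id, Name = e.Name };
        }

        public static EntityRecord ToEntity(Record r) {
            return new EntityRecord { Id = r.Id, Name = r.Name };
        }
    }
}
```

Every business operation now has to call EntityMapper on the way in and on the way out, and any new persistence shape means another mapper in the business layer.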

By reversing the dependency, we can move all the mapping code to the DAL, keeping our business layer clutter-free and allowing the data layer to change independently:

[Diagram: domain-types-from-dal]

This is the heart of the dependency inversion principle. Note that we moved the definition of the DataAccess interface from the DAL to the business layer. This is another key principle: we let consumers define the interface, and leave it up to others to conform to this interface.
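A rough sketch of the inverted arrangement, under the assumption that the two namespaces stand in for separate assemblies: the business layer owns both the domain type and the interface, and the DAL references the business layer to implement it. The dictionary is only a placeholder for a real database, and the property names are hypothetical.

```csharp
using System.Collections.Generic;

namespace Business {
    // The domain type and the interface both live in the business layer.
    public class Record {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public interface DataAccess {
        Record GetRecordById(int id);
        void Save(Record record);
    }
}

namespace Dal {
    using Business; // the data layer depends on the business layer

    // Persistence-shaped row, now private to the DAL.
    class EntityRecord {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    // The DAL implements the business layer's interface and owns the mapping.
    public class PersistentDataAccess : DataAccess {
        // Placeholder for a real database table.
        private readonly Dictionary<int, EntityRecord> _table = new Dictionary<int, EntityRecord>();

        public Record GetRecordById(int id) {
            var row = _table[id];
            return new Record { Id = row.Id, Name = row.Name };
        }

        public void Save(Record record) {
            _table[record.Id] = new EntityRecord { Id = record.Id, Name = record.Name };
        }
    }
}
```

The business layer never sees EntityRecord, so the DAL is free to reshape it without anything upstream noticing.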

2. It’s hard to keep your business layer clean

As you can see in the previous section, when we have the dependency pointing the wrong way, our business layer gets polluted with all kinds of stuff that doesn’t really belong there, like the mapping code above. This makes the business layer harder to reason about, since we can’t do it in isolation anymore. This hurts productivity, and new entrants to your code base need more time to get going.

3. All implementations depend on our data layer

Let’s say we’re writing tests for our business layer. We’ll end up with something like this:

[Diagram: tests-depend-on-dal]

In this diagram we see that our Tests package depends on the DAL and that means that, since dependencies are transitive, our Tests package depends on everything the DAL depends on. This will usually mean that our Tests package gets to depend on the implementation chosen in the DAL. So if our production implementation of DataAccess is backed by something like Entity Framework, we also need to take on that dependency in the Tests package. Since we’re probably not using anything from Entity Framework for our tests, being forced to take this dependency doesn’t really feel right.

It’s even worse if we’d like to switch to a completely different DAL implementation (which, agreed, doesn’t happen as much as we’re led to believe). But let’s say we’re writing a new DAL based on NHibernate instead of the current Entity Framework. Changing ORMs is something I have actually seen happen. In this case our NHibernate implementation will have a dependency on Entity Framework. It’s pretty obvious that just cannot be right.

As before, we can solve this problem by moving DataAccess to the business layer:

[Diagram: look-no-dal]

Here we’ve dropped the dependency, and the design is much simpler than the one before.

4. The data layer can force changes to the business layer

To see how this is possible, let’s again consider this situation:

[Diagram: no-domain-types]

If we now change a mapped property of our EntityRecord class, this will break our EntityMapper, which lives in the business layer. This is obviously something we shouldn’t want, since an implementation detail of a low-level component (data access) can now break, and therefore force a change to, a high-level component (business). In general, we just don’t want something as low-level as data access changing the most important part of our software, that is, the business.

Conclusion

In this post I’ve shown some problems that can occur when you’re applying dependency injection without also taking into account dependency inversion. In my experience these problems can become pretty big the further you get into a project.

Therefore, next time you’re using dependency injection, please also consider the direction of your dependencies and put them in a direction where low-level components depend on the high-level ones. It will save you a lot of trouble later on.

Disclaimer: In this post I mostly focused on the situation of business apps, which I’m most familiar with. I am, however, pretty sure these principles apply in other contexts as well.

Classless entities in DDD

In Domain-Driven Design (DDD) we try to create a shared model between customers, developers and code. The value in such a model is that everyone speaks the same language, which reduces the chance of miscommunication and thereby, hopefully, improves the quality of the software.

One of the implications of this strategy is that any model we create must actually be representable in code. That places some requirements on the actual programming language we’re using. For that reason, a lot of DDD examples use an object-oriented language, since those languages provide constructs, especially objects, that seem to map pretty well to the structure of a lot of domains. But doing DDD definitely doesn’t constrain you to using an OO language: I see people doing DDD in F# and Eric Evans himself refers to using Prolog for some specific domains in the blue book.

When doing DDD in an OO language, however, there seem to be some established patterns and practices on how to represent specific domain concepts in code, the most common probably being to always map each entity or value object in the model to a class in code. I’ve seen this pattern cause some problems from time to time, and I personally don’t always follow it as strictly anymore.

Let’s take the Purchase Order example from the blue book. The rule is that a given purchase order (a domain entity) has a maximum total value that it may not exceed. The total order value is determined by the PO’s purchase items (also entities), each consisting of a quantity, a price and a reference to the part. We model this using a Purchase Order aggregate which is responsible for maintaining this invariant. So how do we represent this in code?

A common way of doing this is providing the following API:

[TestMethod]
public void when_changing_the_quantity_of_an_item_we_cannot_exceed_the_max_value() {
    Order order = CreateOrder(ofMaximumValue:10, ofCurrentValue:9);
    Part part = CreatePartOfValue(1);

    PurchaseItem item = order.NewItem(part, quantity:1);

    try {
        order.ChangeQuantity(item,newQuantity:2);
    }
    catch(MaximumOrderValueExceeded) {
        return;
    }

    Assert.Fail("MaximumOrderValueExceeded was not thrown when exceeding the maximum order value");
}

Here we have represented the order and its items as classes in our code. There is a problem with this particular solution, however: as an API consumer we get a reference to the PurchaseItem, which allows us to break the aggregate’s invariant by directly modifying the Quantity property on the PurchaseItem. This is undesirable, because someone less familiar with the code might actually do this. Of course, there are technical ways of preventing this, but in general those ways tend to clutter the code and it’s just not really necessary.
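To make the loophole concrete, here is a small sketch of how such an aggregate might be implemented and how a consumer can sidestep it. The class shapes are assumptions: the post only shows the test, so the Price property, the TotalValue property, the simplified NewItem signature (a price instead of a Part) and the Demo class are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class MaximumOrderValueExceeded : Exception { }

// Hypothetical item class with a publicly settable Quantity: the loophole.
class PurchaseItem {
    public decimal Price { get; set; }
    public int Quantity { get; set; }
}

class Order {
    private readonly decimal _maximumValue;
    private readonly List<PurchaseItem> _items = new List<PurchaseItem>();

    public Order(decimal maximumValue) {
        _maximumValue = maximumValue;
    }

    public decimal TotalValue => _items.Sum(i => i.Price * i.Quantity);

    public PurchaseItem NewItem(decimal price, int quantity) {
        if (TotalValue + price * quantity > _maximumValue) throw new MaximumOrderValueExceeded();
        var item = new PurchaseItem { Price = price, Quantity = quantity };
        _items.Add(item);
        return item; // hands a live reference to the consumer
    }

    public void ChangeQuantity(PurchaseItem item, int newQuantity) {
        var newTotal = TotalValue - item.Price * item.Quantity + item.Price * newQuantity;
        if (newTotal > _maximumValue) throw new MaximumOrderValueExceeded();
        item.Quantity = newQuantity;
    }
}

static class Demo {
    // Returns the order total after a consumer bypasses the aggregate root.
    public static decimal BypassInvariant() {
        var order = new Order(maximumValue: 10);
        var item = order.NewItem(price: 1, quantity: 9);
        item.Quantity = 100;     // no exception: Order never sees this change
        return order.TotalValue; // 100, well above the maximum of 10
    }
}
```

Nothing in the type system stops the assignment in BypassInvariant, which is exactly the kind of mistake someone new to the code base could make.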

We can do better by changing the API as follows:

[TestMethod]
public void when_changing_the_quantity_of_an_item_we_cannot_exceed_the_max_value() {
    Order order = CreateOrder(ofMaximumValue:10, ofCurrentValue:9);
    Part part = CreatePartOfValue(1);

    PurchaseItemId itemId = order.NewItem(part, quantity:1);

    try {
        order.ChangeQuantity(itemId,newQuantity:2);
    }
    catch(MaximumOrderValueExceeded) {
        return;
    }

    Assert.Fail("MaximumOrderValueExceeded was not thrown when exceeding the maximum order value");
}

Here we changed from passing around entities to passing around entity Ids, which means that we cannot invoke operations on PurchaseItems directly anymore (besides not handing out references, we also made its class internal to be sure). Now, what’s interesting is that from a consumer point of view, you don’t care whether the Item is implemented as a class or something else anymore. I think this is a nice property in itself, because it hides implementation details, but besides that it actually frees up alternative implementations for the entire aggregate, meaning we’re not tied to a class per entity anymore (though we can still implement it this way, of course).

Now, would there be a reason why we’d want to do that? The answer is: it depends. Like I said in the intro, the goal of DDD is to create a shared model of the domain and we need to be able to represent that model in code. Now, since we have multiple options for representing a concept in code, it becomes a matter of picking the representation that models the domain most clearly in code, and is therefore the more desirable model.

Does an object fit our model of a purchase item satisfactorily? I think in this case, the purchase item is more like a data structure than an object: it has data, but doesn’t really have any behavior (since that has to go through the aggregate root). So no, looking at it from the (OO) perspective where an object should have behavior working on its private data, I don’t think an object matches the domain concept really well in this case. Of course, in practice, a data structure is usually represented as a class as well (unfortunately), which then does actually fit the model pretty well.

An alternative implementation might use a simple Tuple, which would go as follows:

using PurchaseItem = System.Tuple<PurchaseItemId,PartNumber,int>;
class Order {
    Dictionary<PurchaseItemId,PurchaseItem> _items = new Dictionary<PurchaseItemId,PurchaseItem>();

    public void ChangeQuantity(PurchaseItemId itemId,int newQuantity) {
        if(!CheckOrderValueWithNewQuantity(itemId,newQuantity)) {
            throw new MaximumOrderValueExceeded();
        }
        _items[itemId] = UpdateQuantityForItem(itemId,newQuantity);
    }
    ...
}

This model conveys pretty clearly that there is no behavior associated with the PurchaseItems themselves, which is what we actually want to model. Also, you won’t be inclined to actually put any behavior on it, since you can’t.
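For readers who want to see the whole thing compile, below is one possible way to fill in the pieces, offered as a sketch rather than the implementation behind the snippet above. Several things are assumptions: the tuple carries a unit price instead of a PartNumber so the total can be computed without a part catalogue, PurchaseItemId is a small struct wrapping an int, NewItem takes a price directly, and the bodies of CheckOrderValueWithNewQuantity and UpdateQuantityForItem are guesses at what those helpers would do.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
// Assumed layout: (item id, unit price, quantity). The unit price replaces PartNumber here.
using PurchaseItem = System.Tuple<PurchaseItemId, decimal, int>;

class MaximumOrderValueExceeded : Exception { }

// Hypothetical id type; a struct so dictionary lookups compare by value.
struct PurchaseItemId {
    public readonly int Value;
    public PurchaseItemId(int value) { Value = value; }
}

class Order {
    private readonly decimal _maximumValue;
    private readonly Dictionary<PurchaseItemId, PurchaseItem> _items =
        new Dictionary<PurchaseItemId, PurchaseItem>();
    private int _nextId = 1;

    public Order(decimal maximumValue) {
        _maximumValue = maximumValue;
    }

    // Sum of unit price times quantity over all items.
    public decimal TotalValue => _items.Values.Sum(i => i.Item2 * i.Item3);

    public PurchaseItemId NewItem(decimal unitPrice, int quantity) {
        if (TotalValue + unitPrice * quantity > _maximumValue) throw new MaximumOrderValueExceeded();
        var id = new PurchaseItemId(_nextId++);
        _items[id] = Tuple.Create(id, unitPrice, quantity);
        return id; // only the id leaves the aggregate
    }

    public void ChangeQuantity(PurchaseItemId itemId, int newQuantity) {
        if (!CheckOrderValueWithNewQuantity(itemId, newQuantity)) {
            throw new MaximumOrderValueExceeded();
        }
        _items[itemId] = UpdateQuantityForItem(itemId, newQuantity);
    }

    // True when the order total stays within the maximum after the change.
    private bool CheckOrderValueWithNewQuantity(PurchaseItemId itemId, int newQuantity) {
        var item = _items[itemId];
        var newTotal = TotalValue - item.Item2 * item.Item3 + item.Item2 * newQuantity;
        return newTotal <= _maximumValue;
    }

    // Tuples are immutable, so an update builds a replacement tuple.
    private PurchaseItem UpdateQuantityForItem(PurchaseItemId itemId, int newQuantity) {
        var item = _items[itemId];
        return Tuple.Create(item.Item1, item.Item2, newQuantity);
    }
}
```

Because System.Tuple is immutable and never leaves the Order, every change to an item has to pass through ChangeQuantity, so the invariant check cannot be skipped.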

In the end, working with tuples is probably gonna make the code a little less readable in C# (where are my record types?) so I might actually factor it out into a class, but in my opinion that’s a programming language driven compromise in model accuracy.

Conclusion

The primary reason I wrote this post is that I see the one-class-per-entity solution being applied in a lot of situations where it can actually cause harm, such as in the example where you could violate the aggregate’s invariant. In any other situation, we would think twice before exposing an object in that way to its consumers. But since we think this is the way that things are done, we end up having these kinds of problems. We shouldn’t be doing that, and we should always stay critical of our own and other people’s solutions; they might be wrong or simply not apply to your context. I’d be happy to hear what others are thinking about this, so please use the comments.

BTW. It often also works the other way around: people trying to cram all of an entity’s logic into one class. This is a way bigger problem, because it really clutters the aggregate’s code and you’re missing opportunities to abstract. Thus it is perfectly valid to have classes in your domain layer that are not in your model. I’ll try to write about that later.