Posts in "DDD"

DDD Layered architecture in Clojure: A first try

The first step in my effort to freshen up our time tracker using DDD & Clojure has been finding a way to structure my code. Since I don’t have that much Clojure experience yet, I decided to take the DDD layered architecture and port it as directly as possible. This probably isn’t really idiomatic Clojure, but it gives me a familiar start. This post should be regarded as such, my first try. If you know better ways, don’t hesitate to let me know.

The architecture

As a picture is worth a thousand words:
[image: layered]
This architecture is mostly the same as the one advocated in the DDD Blue Book, except that the Domain Layer does not depend on any data-related infrastructure and there’s a little CQRS mixed in. I think this is mostly standard these days. In this design, the application layer is responsible for transaction management. The ‘Setup’ part of the UI layer means setting up things like dependency injection.

In this post I’ll focus on the interaction between application services, domain objects, repositories and the data layer. I’ll blog about other parts (such as validation) in later posts.

The domain

Unfortunately, I’m not able to release the code for the time tracker just yet (due to some issues with the legacy code). So for this post I’ll use an example domain with, for now, just one entity… Cargo 🙂 The Cargo currently has one operation: being booked onto a Voyage.

The approach

Let’s start with the Domain Layer. Here, we need to define an “object” and an “interface”: the Cargo and CargoRepository respectively.

Cargo entity

The Cargo entity is implemented as a simple record containing the fields cargo-id, size and voyage-id. I’ve defined a constructor create-new-voyage which does its input validation using pre-conditions.

There’s one domain operation, book-onto-voyage which books the cargo on a voyage. For now, the requirement is that it can’t already be booked on another Voyage. (Remember this post is about overall architecture, not the domain logic itself, which is for a next post).

Furthermore, there is a method for setting the cargo-id, since we rely on the data store to generate it for us, which means we don’t have it yet when creating a new cargo.

Here’s the code:
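A minimal sketch, based purely on the description above (the record fields and function names come from the post; the exact validations are my assumption):

```clojure
(ns domain.cargo)

(defrecord Cargo [cargo-id size voyage-id])

(defn create-new-voyage
  "Constructor; input validation via pre-conditions.
   (Validating `size` like this is an assumption.)"
  [size]
  {:pre [(integer? size) (pos? size)]}
  (->Cargo nil size nil))

(defn assign-cargo-id
  "The data store generates the id, so we set it after insertion."
  [cargo cargo-id]
  (assoc cargo :cargo-id cargo-id))

(defn book-onto-voyage
  "Books the cargo onto a voyage; it may not already be booked on one."
  [cargo voyage-id]
  {:pre [(nil? (:voyage-id cargo))]}
  (assoc cargo :voyage-id voyage-id))
```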

Cargo Repository

The Cargo Repository consists of two parts: the interface, which lives in the domain layer, and the implementation, which lives in the data layer. The interface is very simple and implemented using a Clojure protocol. It has three functions: -find, -add! and -update!.

A note about concurrency: -find returns both the cargo entity and the version as it exists in the database, in a map: {:version a-version :cargo the-cargo}. When doing an -update! you need to pass in the version, so the implementation can do an optimistic concurrency check. (I’m thinking of returning a vector [version cargo] instead of a map, since destructuring the map every time hurts readability in client code.)

Furthermore, I’ve defined convenience methods find, add! and update!, which are globally reachable and rely on a call to set-implementation! when setting up the application. This is to avoid needing to pass (read: dependency inject) the correct repository implementation along the stack. This is probably a bit controversial (global state, pure functions, etc), and I look forward to exploring and hearing about alternatives.
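A sketch of how the protocol and the globally reachable convenience functions might fit together (the function names are from the post; the atom-based wiring is my assumption):

```clojure
(ns domain.cargo-repository
  (:refer-clojure :exclude [find]))  ; `find` shadows clojure.core/find

(defprotocol CargoRepository
  (-find    [repo cargo-id])         ; => {:version .. :cargo ..}
  (-add!    [repo cargo])
  (-update! [repo cargo version]))   ; version enables the optimistic concurrency check

;; Globally reachable implementation, set once during application setup.
(def ^:private implementation (atom nil))

(defn set-implementation! [impl]
  (reset! implementation impl))

(defn find    [cargo-id]      (-find @implementation cargo-id))
(defn add!    [cargo]         (-add! @implementation cargo))
(defn update! [cargo version] (-update! @implementation cargo version))
```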

Cargo repository MySQL implementation

I’m using MySQL as the data store, and clojure.java.jdbc for interacting with it. The cargoes are mapped to one table, surprisingly called cargoes. I don’t think there’s anything remarkable about the implementation, so here goes:
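A sketch of what the implementation might look like with clojure.java.jdbc (the column names and generated-key handling are assumptions):

```clojure
(ns data.mysql-cargo-repository
  (:require [clojure.java.jdbc :as jdbc]
            [domain.cargo :as cargo]
            [domain.cargo-repository :refer [CargoRepository]]))

(defrecord MySQLCargoRepository [db]
  CargoRepository
  (-find [_ cargo-id]
    (when-let [row (first (jdbc/query db ["SELECT * FROM cargoes WHERE id = ?" cargo-id]))]
      {:version (:version row)
       :cargo   (cargo/map->Cargo {:cargo-id  (:id row)
                                   :size      (:size row)
                                   :voyage-id (:voyage_id row)})}))
  (-add! [_ cargo]
    ;; insert! returns the generated keys; on MySQL something like {:generated_key n}
    (-> (jdbc/insert! db :cargoes {:size      (:size cargo)
                                   :voyage_id (:voyage-id cargo)
                                   :version   0})
        first vals first))
  (-update! [_ cargo version]
    ;; optimistic concurrency: only update if nobody bumped the version meanwhile
    (jdbc/update! db :cargoes
                  {:voyage_id (:voyage-id cargo)
                   :version   (inc version)}
                  ["id = ? AND version = ?" (:cargo-id cargo) version])))
```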

The final parts are the Application Services and the UI.

The Application Service

I never have good naming conventions (or, almost equivalently, partitioning criteria) for application services. So I’ve just put everything in a namespace called application-service, containing functions for all domain operations. The operations can be taken directly from the Cargo entity: creating a new one, and booking it onto a voyage. I use the apply construct to invoke the entity functions, to avoid repeating all parameters.

Code:
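A sketch of the application service (function names are assumptions, and the transaction management that this layer is responsible for is omitted here):

```clojure
(ns application.application-service
  (:require [domain.cargo :as cargo]
            [domain.cargo-repository :as repo]))

(defn create-new-cargo! [& args]
  ;; `apply` forwards all parameters to the entity constructor without repeating them
  (repo/add! (apply cargo/create-new-voyage args)))

(defn book-onto-voyage! [cargo-id & args]
  ;; fetch entity + version, invoke the domain operation, save with the version
  (let [{:keys [version cargo]} (repo/find cargo-id)
        booked (apply cargo/book-onto-voyage cargo args)]
    (repo/update! booked version)))
```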

The UI & the tests

To not make this post any longer than it already is I’m not going to show a full UI, but a couple of tests exercising the Application Service instead. This won’t show how to do queries for screens, but for now just assume that I more or less directly query the database for those.

There isn’t much to tell about the tests. If you are not that familiar with Clojure, look for the lines starting with deftest, they define the actual tests. The tests show how to use the application service API to handle commands. They test the end result of the commands by fetching the cargo from the repository and checking its state. I use the MySQL implementation for the database, since I already have it and it performs fine (for now).
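A sketch of what such a test might look like (it assumes create-new-cargo! returns the generated id, and that a fixture has called set-implementation! with the MySQL repository):

```clojure
(ns application-service-test
  (:require [clojure.test :refer [deftest is]]
            [application.application-service :as app]
            [domain.cargo-repository :as repo]))

(deftest booking-a-cargo-onto-a-voyage
  (let [cargo-id (app/create-new-cargo! 20)]
    (app/book-onto-voyage! cargo-id 42)
    ;; verify the end result by fetching the cargo and checking its state
    (is (= 42 (get-in (repo/find cargo-id) [:cargo :voyage-id])))))
```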

Conclusion

The code in this post is pretty much a one-to-one mapping from an OO kind of language to Clojure, which is probably not ideal. Yet, I haven’t been able to find any good resources on how you would structure a business application in a more idiomatic Clojure way, so this will have to do. Nevertheless, I still like the structure I have now. I think it’s pretty clean and I don’t see any big problems (yet). I look forward to exploring more alternatives in the next couple of months, and I’ll keep you updated.

All code (including tests) is available on GitHub.

Moving away from legacy, DDD style

One of the things I like about DDD is that it has solutions for a wide variety of problems. One of those is how to handle legacy software, and specifically how to move away from those. There’s a passage in the book about this subject, as well as some additional material online.

This year I’m planning to apply some of these principles and techniques to a project we’ve developed and use internally at Infi: our time tracker. This tool has been under development for 8+ years now, and throughout the years it’s become ever harder to add new functionality. There are various reasons for this, such as outdated technology, a missing vision for the design, and the software trying to solve many separate problems with just one model. So there’s been pressure to replace this system for a while now, and doing so via DDD practices seems both natural and fun.

This is going to be more of a journey than a project, so I’ll try to keep you updated during the year.

DDD style legacy replacement

The DDD style approach to moving away from legacy is to first and foremost focus on the core domain. We shouldn’t try to redesign the whole system at once, or try to refactor ourselves out of the mess, since that hardly ever works. Besides, there is probably a lot of value hidden in the current non-core legacy systems, and it doesn’t make sense to rewrite that since it’s been working more or less fine for years and we don’t actually need new features in these areas.

Instead, we should identify the actual reasons why we want to move away from the legacy system, and what value it’s going to bring us when doing so. More often than not, the reason will be deeply rooted in the core domain: maybe we’re having problems delivering new features due to an inappropriate model, maybe the code is just really bad, etc. Whatever the reason, the current system is holding back development in the core domain, and that’s hurting the business.

So how do we approach this? The aforementioned resources provide a couple of strategies, and they all revolve around one basic idea: create a clean, isolated environment for developing a new bounded context. This new bounded context won’t be encumbered by existing software or models, and we can develop a new model that addresses the problems we’d like to solve in our core domain.

The goal

So what are our reasons for wanting to replace our current application? Well, you can probably imagine that time tracking is very important to us, since this is what we use to bill our clients. Also, we use it internally to measure all sorts of things and make management decisions based on that data. All this makes time tracking a key process in our organization. To be fair, it’s not mission-critical, but it’s still important enough to consider it Core Domain.

For our goals, time tracking is only effective if time is entered both accurately and timely. I think the number one way to stimulate this is to make tracking your time as easy and convenient as possible. We can improve here by creating a model of the way people in our company actually spend their time. With deep knowledge about how time is spent, I envision the model being able to, for example, provide context-sensitive suggestions or notify people at sensible moments that time entry is due. These kinds of features would make tracking your time a little less of a burden.

The plan

Let’s look at our current context map:

[image: initial-context-map]

There are currently four Bounded Contexts:

  • Time database. This is the current application, referred to below as the TTA. I’ve declared it a big ball of mud, since I don’t think there’s a consistent model hidden in there, and frankly I don’t care. This application is currently used for entering your times, generating exports, managing users, etc.
  • Client reporting. Client reporting is concerned with regularly giving updates to our clients about how we spend our time. It gets its data from TTA, but uses a different model for the actual reporting step, which is why it’s a separate BC. Most of the work with this model is manual, in Excel.
  • Invoicing. While the TTA currently has functionality for generating invoices, we don’t directly use that for sending invoices to our customers. We use data from the TTA, but then model that differently in this context. Again, this is mostly manual work.
  • Management reporting. This is what we use to make week-to-week operational decisions, and uses yet another model. This is actually an API that directly queries the TTA database.

I’m not planning on replacing the entire existing application for now, just the parts that have to do with time entry. Reporting, for example, is out of scope.

We consider all BCs partners, because all of these functions are required to successfully run the company. It’s probably possible to unify the “satellite” models, but we don’t care about that now, since we want to focus on the actual core domain: doing the time tracking.

For the new system, we’re going to try the “Bubble Context with an ACL-backed repository” strategy, and hope we can later evolve it to one of the other strategies. The destination context map will look like this:
[image: new-context-map]
The new BC will contain all the new code: an implementation of the model as well as a new GUI. For lack of a better name I’ve called it Time-entry for now.

A final twist

Just to make things more interesting, I’m planning on doing the new code in Clojure. There are a couple of reasons for this:

  • I just like learning new stuff, and Clojure is new for me.
  • I’ve been encountering Lisps more and more over the last couple of years, and people I greatly respect often speak highly of them. So it’s about time I figure out what all the fuss is about.
  • I’d like to try something outside of .NET, for numerous reasons.
  • Lisps are known for their ability to nicely do DSLs, and that seems a good fit for DDD.
  • I want to see how DDD patterns map to a more functional language, and specifically what impact that has on modeling.
  • I wonder how interactive programming (with a REPL) works in real life.

My experience with Clojure thus far amounts to some toy projects and reading The Joy of Clojure, but that’s about it. So expect me to make a lot of rookie mistakes, and please tell me when I do 🙂

Next steps

All the new code will be open source and on GitHub. I probably won’t be able to open source the code for the original application, but I hope I can publish enough to be able to run the ACL. This should be enough to get the entire application running. I hope to get the first code out in a couple of weeks.

My 16 definitions of Microservices

If you’re in software development these days it’s almost impossible to have missed all the buzz about microservices. The first time I heard about it was in a presentation by Fred George, which I really enjoyed and which challenged my thinking. After that, I heard the term popping up more and more, and noticed that there isn’t really any consensus on what it means. This really confused me, since a lot of people are talking about different things but calling them the same. However, they all do seem to revolve around the idea of partitioning a large body of code into smaller parts. The question then becomes: on what criteria do you partition? Or: what defines a service boundary?

I think that’s the essence of my confusion: people partition their code across different dimensions. So, to clear up my own mind, I’ve decided to compile a list of partitioning dimensions/definitions I’ve come across and share them here with you:

#1 Deployment unit

A microservice is a unit of deployment. By this definition every part of your system that you individually deploy is a microservice. For example: a web site, a web service, a SPA, background tasks, etc. This definition is usually related to the scalability properties of microservices: the idea that you can individually scale different parts of your system, something that’s useful if different parts are under different loads. The services are not necessarily independently deployable.

#2 Independent deployments

By this definition a microservice is a piece of your system that you can (and should) deploy independently. This consists of at least one deployment unit. You are independently deployable if you can deploy your service without other services needing to deploy as well. A service will consist of more than one deployment unit if the individual deployment units must be deployed together for the service to keep functioning correctly. In this approach services are autonomous in the sense that they can be updated independently without depending on other services.

#3 Business function

Each service is responsible for a specific business function: billing, sales, pricing, recommendations, etc. This is mostly useful if it’s combined with Independent teams (#5), Separate code base (#15) or Independent deployments (#2). With Independent teams, it’s clear to the business owner who to talk to when there’s a problem or when they need a new feature. Separate code bases help because the discipline of not using raw data from a different business function is enforced, since that code is simply not nearby.

#4 Technical function

Here, a service is defined by its technical function such as front-end, back-end, database or messaging infrastructure. As with Business function (#3) this would usually not be considered a microservice without Independent Teams (#5), Separate code base (#15) or Independent deployments (#2). Yet, I’ve seen people calling a good old 3-tier architecture a microservices architecture without there being any other criteria.

#5 Independent teams

Each team is responsible for one service. In essence the code a team is working on defines the service boundary. The team develops and runs the service completely by themselves (2 pizza team). They can work autonomously in the sense that they can make all the decisions with regard to the service they are responsible for. When they have a dependency on another service they agree on a contract with the team responsible for that service. Teams are usually aligned to Business function (#3) or Technical function (#4) and sometimes also have their Separate code base (#15).

#6 Private memory space

Services are defined by their ability to run in their own memory space. If an application runs in its own memory space, it’s a service. Such services cannot be crashed by other services running in the same process (since there aren’t any). Also, the in-memory data is completely private to the service, so it can’t be used by other services. Each service can potentially be built on a different platform or programming language (#10).

#7 Independent database

Each service has its own private database. Services can’t access databases of other services. There are as many services as there are databases in the system: a database defines the service. The services are completely autonomous in their choice of data storage, schema refactorings, etc. They can also be held completely responsible for the conceptual integrity of their data. In general, a service will only have a few tables (or data schemas, if you like). This is important, because a big problem in large monoliths is the ease of access to data that you conceptually have no business touching. If you’re providing a CRUD-y interface on your service, this doesn’t count.

#8 Temporally decoupled

A service is a piece of code that’s temporally decoupled from other pieces of code. This means a service can keep operating (for a finite amount of time) even if the services it interacts with or depends on are down. This usually implies some form of async messaging using queues or service buses. RPC between services is out of the question, because you cease to be functional whenever the other service is down.

#9 Communicating via REST/JSON over HTTP

A microservice is any application that communicates with other microservices via REST/JSON over HTTP. This specifically discounts the possibility of multiple services running in the same process, or of using some form of binary protocol. This definition is mostly about interoperability: a lot of platforms speak REST/JSON over HTTP, so such services are highly interoperable.

#10 Independent choice of platform/programming language

Two microservices are not really two microservices if it isn’t possible to write them in different languages. In this sense, each service can “pick the right tool for the job”. Service boundaries are defined by the ability to do this.

#11 Objects

Microservices are no different from “real objects”. With real objects being the way Alan Kay originally thought of them: objects providing a goal-based interface to provide some sort of service to the user. Interactions between objects will generally occur through messaging with a synchronous in-proc call being a specific kind of message, but not the only way objects can communicate (though that’s the only way that ended up in OOP languages).

#12 Containers/Docker

A microservice is any application that runs inside a container such as Docker. When people define it this way there are usually other constraints as well (specifically Independent deployments (#2) or Communicating via REST/JSON over HTTP (#9)), but I’ve come across definitions that seemingly don’t require anything else but containerization.

#13 SOA done right

SOA was originally intended to do a lot of stuff that’s now attributed to microservices. However, SOA somehow ended up being largely synonymous with web services, and people that were once big on SOA are now happy to have a new name for the same ideas with microservices. The definition of a service in SOA is also not that clear, but I think (correct me if I’m wrong) it was mostly supposed to be about interoperability and IT/business alignment.

#14 Shared-nothing architecture

Services are pieces of code that “share nothing”. That means they’re very well encapsulated: no conceptual or physical data sharing. This combines Private memory space (#6) and Independent database (#7). For some people it also means no sharing of the virtual or physical machine the services run on.

#15 Separate code base

A service is defined by the code base it’s in. Or equivalently: each service has its own code base. This is absolutely required for Independent choice of platform/programming language (#10) but it’s also implied in many of the other definitions.

#16 Bounded context

A service is isomorphic to a DDD Bounded Context. This means services and BCs map one-to-one. It will usually involve the strong code and runtime isolation found in some of the other definitions, to enforce the models staying clean. Eric Evans also did a talk about this at DDDX.

Conclusion

I think of the above list more as axes than definitions. In fact, I find it highly unlikely that any architecture that calls itself a microservices architecture will conform to only one of the points in the above list. Two or three will be much more likely.

So what is the right definition? I really think it doesn’t matter. All of the above definitions have pros and cons in given contexts. The question is not that much which definition to use, but what context you’re in. You are probably not Amazon or Netflix, so things that apply to them might not apply to you. Quite the contrary, using their definitions and playing by their rules will probably hurt you more than it will help you. Therefore, pick the definition that helps you with the problems you’re facing, be it scalability, time to market, quality or whatever, but don’t get distracted by anything else just to be on the microservices bandwagon.

Closing remarks

I’m sorry I haven’t provided a list of sources with regard to the definitions. Most are straight from the top of my head, and come from blogs I read, talks I’ve seen or people I’ve talked to. If you disagree and think microservices are well defined, please let me know. Also, if I missed any definitions, don’t hesitate to comment.

 

Event Sourcing NerdDinner – Coding Dojo

We had an Event Sourcing coding dojo at Infi this Tuesday. Turnout was 18 people, half of them being Infi employees, the others enthusiastic community members. The dojo was a big success: we had a lot of fun, learned new stuff, met new people and had a great discussion afterwards. I’m really happy about this 🙂

For this dojo we prepared a custom version of NerdDinner, the canonical ASP.NET MVC example app, so it could act as a playground for experimenting with event sourcing. To support event sourcing we made these modifications:

  • Added basic infrastructure for raising events (the Event, Event<T>, IEventData and EventScope types)
  • Load events when hydrating a Dinner entity (DinnerRepository.Find)
  • Added DbSet<Event> to the NerdDinners DbContext to add a table for storing Events to disk
  • Have a hook for the NerdDinners DbContext intercepting published Events (NerdDinners.OnEventsPublished)
  • Removed mocking from the unit tests so that you could run the tests against the DB
  • Implemented RSVPing using Event Sourcing (RSVPController.Register)

All event handling is done synchronously, so you don’t have to bother with eventual consistency and all that stuff; instead you can just focus on the core event sourcing concepts. Also, the result is actually a hybrid mutable-state / ES solution, since there’s still some data stored in table rows. We didn’t feel it was worth the effort to completely move to event sourcing for this dojo, though it might be a fun exercise. You can have a look at the code on GitHub. Please note that this infrastructure is neither production-ready nor production-tested, so keep that in mind if you want to base your own infra on this code.

The exercises

With the above code available at the start of the dojo, we set up a couple of exercises for pairs to tackle during the evening.

Spoiler alert: the following description of the exercises might contain partial answers to the exercises; if you want to do the dojo please consider doing that first.

1. Cancel your RSVP

With RSVPing already in place, canceling is the next logical step. In this exercise you can explore how event sourcing works in the RSVP case, and try to apply that to canceling. It ends up being more or less a copy-paste exercise, but one that lets you quickly explore the various steps involved.

2. Show a dinner activity feed

In the dinner details screen we’d like to show an activity history, with entries for RSVPs, cancelled RSVPs, location changes, etc. This is very easy to implement using event sourcing, but hard using mutable-state-based persistence. In fact, in the latter case you’d probably end up persisting event-like things anyway. The goal of the exercise is to show how event sourcing helps you do useful things that are otherwise hard to implement.

3. Change the address for a Dinner

In the original application, updating details for the dinner was very CRUD-y. You would have one form in which all details could be edited and then saved. This approach doesn’t work particularly well with event sourcing since you’d usually want the events to express more intent than just “Dinner Changed”. To do so, you usually build a task-based UI, with explicit commands for certain (domain) operations. In this exercise that domain operation was changing the address of the Dinner (in hindsight the operation maybe should’ve been called ChangeVenue). The goal was showing how using event sourcing might end up requiring changes to the UI.

4. Optimize the popular dinners query

The NerdDinner UI has a list of popular dinners:

[image: popular-dinners]

It’s a list of dinners sorted by RSVP count descending. Producing this list efficiently with event sourcing is harder than in a mutable state based scenario, because we don’t have the current state readily available to query on. Since we now need to sort globally on all dinners, we would have to fetch all dinners and events and then do the sort entirely in memory. That could become a problem if we have a lot of dinners in our system.

So in this exercise the goal was to listen to the RSVPed and RSVPCanceled events and build a separate read model for keeping the count of RSVPs per dinner, and use that to sort the list. We expected this exercise to take most time, but a few pairs managed to finish the exercise within the allotted time (about 2.5 hours for all exercises).
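The dojo code is C#, but the essence of such a read model can be sketched (in Clojure, keeping with the rest of this blog) as a fold over the events. The event shape here is hypothetical; in the dojo the counts would be persisted to a table by the event handlers rather than computed in memory:

```clojure
;; Fold RSVP events into a per-dinner count, then sort on it.
(defn apply-event [counts {:keys [type dinner-id]}]
  (case type
    :rsvped        (update counts dinner-id (fnil inc 0))
    :rsvp-canceled (update counts dinner-id (fnil dec 0))
    counts))

(defn popular-dinners
  "Dinner ids, most RSVPs first."
  [events]
  (->> events
       (reduce apply-event {})
       (sort-by val >)
       (map key)))
```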

Please do try this at home!

All in all, I think doing the above exercises could be a good introduction to event sourcing, and most of the people that attended agreed.

So, if you want to try out event sourcing at your company or just want to experiment with it yourself, doing this dojo might be a great way to start. Just clone the code and try doing the exercises (all exercises have accompanying tests to get you started). Please let me know about your results 🙂

CQRS without ES and Eventual Consistency

Command Query Responsibility Segregation (CQRS) is one of the concepts that profoundly changed the way I develop software. Unfortunately, most of the information available online conflates CQRS with event sourcing (ES) and asynchronous read-model population (introducing eventual consistency). While those two techniques can be very useful, they are orthogonal to CQRS. In this post I’ll focus on the essence of CQRS, and on how it helps create simpler, more maintainable software.

A little history

In 2012 I was working on a project that had some fairly involved business logic, so we tried applying DDD to control the complexity. All started out fairly well, and we actually created a pretty rich, well-factored model that handled all the incoming transactions in a way that reflected how the business actually thought about solving the problem.

There was just one problem with our solution: the (entity) classes in our domain model were littered with public get methods. Now, those getters weren’t used to make decisions in other objects (which would break encapsulation), but were only there for display purposes. So whenever I needed to display an entity on a page, I would fetch the entity from a repository and use the model’s get methods to populate the page with data (maybe via some DTO/ViewModel mapping ceremony).

While I think getters on entities are a smell in general, if the team has enough discipline to not use the getters to make any business decisions, this could actually be workable. There were, however, bigger problems. In a lot of scenarios we needed to display information from different aggregates on one page: let’s say we have Customer and Order aggregates and want to display a screen with information about customers and their orders. We were now forced to include an Orders property on the Customer, like this:
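The original entities were C# classes, but the shape was roughly this (sketched in Clojure, with hypothetical fields):

```clojure
(defrecord Order    [order-id customer-id total])

;; The read-side requirement forces a display-only `orders` field onto the
;; entity, which is only sometimes populated, depending on the caller.
(defrecord Customer [customer-id name orders])
```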

From now on, whenever we retrieved a Customer entity from the database, we needed to decide whether to also populate the Orders property. You can imagine that things got more complex as the number of ‘joins’ increased, or when we just needed a subset of the information, like only the order count for a given customer. Rather quickly, our carefully crafted model, made for processing transactions, got buried under countless properties and the supporting code for populating them when necessary. Do I even need to mention what this meant for the adaptability of our model?

The worst thing was, I had actually seen this problem grind other projects to a halt before, but I hadn’t acted on it back then, and I almost didn’t this time. I thought it was just the way things were done™. But this time I wasn’t satisfied, and I thought I’d do a little googling and see whether other people were experiencing the same problem.

Enter CQRS

I ended up finding a thing or two about a concept called Command Query Responsibility Segregation, or CQRS. The basic premise of CQRS is that you shouldn’t use the code that does the reading to also do the writing. The idea is that a model optimized for handling transactions (writing) cannot simultaneously be good at handling reads, and vice versa. This holds for both performance and code structure. Instead, we should treat reading and writing as separate responsibilities, and reflect that in the way we design our solution:

[image: cqrs]

By applying this separation we can design both of them in a way that’s optimal for fulfilling their specific responsibility.

In most of the domains I’ve worked in, the complexity lies in the transaction-handling part of our system: there are complex business rules that determine the outcome of a given command. Those rules benefit from a rich domain model, so we should choose that for our write side.

On the read side, things in these systems are much simpler and usually just involve gathering some data and assembling it into a view. For that, we don’t need a full-fledged domain model; we can probably map our database directly to DTOs/ViewModels and then send those to the view/client. We can easily accommodate new views by just introducing new queries and mappings, without changing anything on the command-processing side of things.
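As a sketch (in Clojure, with a hypothetical schema; the project in question was .NET), such a read-side query maps straight from SQL to a view model, with no domain objects in between:

```clojure
(ns read-side.customer-queries
  (:require [clojure.java.jdbc :as jdbc]))

(defn customers-with-order-counts
  "Returns rows shaped like the view needs them: no entities, no mapping
   ceremony, just the data the screen asks for."
  [db]
  (jdbc/query db
    ["SELECT c.name, COUNT(o.id) AS order_count
        FROM customers c
        LEFT JOIN orders o ON o.customer_id = c.id
       GROUP BY c.id, c.name"]))
```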

I’ve recently worked on a project that actually had all the complexity on the read side, by the way. In that case the structure is actually inverted. I’ll try to blog about that soon.

Note that in the above description, the data model is shared between the read and write model. This is not a strict necessity and, in fact, CQRS naturally allows you to also create separate data models. You would typically do that for performance reasons, which is not the focus of this post.

But what if I change the data model!?

Obviously, in the above description both the read and write model are tightly coupled to the data model, so if we change something in the data model we’ll need changes on both sides. This is true, but in my experience this has not yet been a big problem. I think this is due to a couple of reasons:

  • Most changes are actually additions on the write side. This usually means reading code keeps working as it is.
  • There is conceptual coupling between the read and write side anyway. In a lot of cases, if we add something on the write side, it’s of interest to the user, and we need to display it somewhere anyway.

The above isn’t true if you’re refactoring your data model for technical reasons. In this case, it might make sense to create a separate data model for reading and writing.

Conclusion

For me, the idea of CQRS was mind-blowing, since everyone I had ever worked with, talked to or read a book by used the same model for reading and writing. I had simply never considered splitting the two. Now that I do, it helps me create better-focused models on both sides, and I consider my designs to be clearer, and thereby more maintainable and adaptable.

The value in CQRS for me is mostly in the fact that I can use it to create cleaner software. It’s one of the examples where you can see that the Single Responsibility Principle really matters. I know a lot of people also use it for more technical reasons, specifically that things can be made more performant, but for me that’s just a happy coincidence.

Acknowledgements

Greg Young first coined the term CQRS, and since I learned a lot about CQRS from his work, it would be strange not to acknowledge that in this post. The same goes for Udi Dahan. So if you’re reading this and would like to know more (and probably better explained), please check out their blogs.