To become a software crafter…Or die in the attempt.

In defense of Event Sourcing

Having practiced and taught Test Driven Development (TDD) for many years now, I am starting to see where the tipping point for accepting this practice lies: it comes when you accept that the problem is not the method but the way you are coding, and that the method merely reveals bad practices.
Indeed, testing code that ignores dependency inversion or the single responsibility principle is really painful. Hence lots of people conclude, in the name of pragmatism, that the problem is TDD, not their code. But the people who are ready to challenge their way of thinking and coding will learn a lot, and usually come to accept TDD, or at least unit testing, as a good practice, because it can prevent bad habits in code and design (after years of practice, I agree).

In recent years, I have also actively practiced and taught CQRS/ES, mainly implementing it in C# or F#, or both. And I’m convinced that it has the same power as TDD in this respect: this method is also a revealer of bad practices. Why does it matter? Because most of the criticism I hear sounds like “I had this huge error in the conception of my system, my traditional way of coding doesn’t tell me that, but your Event Sourcing stuff put my head in my ####, so I guess the problem is this Event Sourcing stuff, not my way of coding right?”
 
So let’s talk about Event Sourcing and some of the usual criticisms we can find on the internet, or hear during trainings.

Functional Event Sourcing in a nutshell by Jeremie Chassaing

How can I manage my very complex entity with billions of events??! 

This question almost always arises, especially from people who tried an implementation and ended up in this situation.
As I heard for the first time from my co-worker Florent Pellet, an event stream is nothing but the representation of the lifespan and responsibility of an entity. So the question should not be how to handle it, but why should we handle it? If this situation happens, it’s your domain model shouting at you “I’M WROOOOOOOOONG, I’M a MONSTEEEEEEER! PLEASE CHANGE ME OR KILL ME BUT DON’T LEAVE ME LIKE THAT !!”. 

Too many events in a stream means an error in design. And the thing is that this error exists independently of the way you’re implementing your domain. We could easily detect it in a stateful/relational database implementation, if we cared about bug tracking and about which region of the code we need to change at each release. It would quickly reveal this monster aggregate (also called a god object).
The problem here is not Event Sourcing but the design of the domain, and ignoring it for too long will be much more painful than Event Sourcing by itself.

That being said, we do have a technical solution for this problem: it’s called snapshots. Basically, the idea is to store an intermediate state to avoid rebuilding from scratch each time you need to load this monster aggregate. But when you have to use snapshots, you can consider it a design failure. It’s like code comments: they can be useful, but they are often just used to hide bad coding habits.
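To make the snapshot idea concrete, here is a minimal sketch in Python. The account domain, the event names and the `Snapshot` shape are all hypothetical, chosen only for illustration; a real event store would persist the snapshot alongside the stream rather than keep it in memory.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical events for a toy account aggregate.
@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


def apply(balance: int, event) -> int:
    """Fold a single event into the current state (here, a balance)."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    return balance


@dataclass(frozen=True)
class Snapshot:
    version: int  # number of events already folded into `state`
    state: int


def load(events: list, snapshot: Optional[Snapshot] = None) -> int:
    """Rebuild the state, replaying only the events after the snapshot."""
    start, state = (snapshot.version, snapshot.state) if snapshot else (0, 0)
    for event in events[start:]:
        state = apply(state, event)
    return state


def take_snapshot(events: list) -> Snapshot:
    """Store the intermediate state reached after all given events."""
    return Snapshot(version=len(events), state=load(events))


events = [Deposited(100), Withdrawn(30), Deposited(5)]
snapshot = take_snapshot(events[:2])
```

Loading with or without the snapshot gives the same state; the snapshot only shortens the replay. That is exactly why it treats the symptom (a long stream) and not the cause (an aggregate with too much responsibility).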

I can no longer access and change data in the database??! 

Have you ever heard something close to this: “You know, in my data-intensive applications issues are often caused by data anomalies rather than code-based bugs.”  
It’s a more elegant version of “The problem is not the software, it’s the users”. 
Indeed, quite a widespread bad habit among developers is to fix problems directly in production data (i.e. the consequences) rather than at the root cause (i.e. the code or the process).
This can be painful in Event Sourcing because event contents are usually stored as JSON (when human readable), or even as a blob that is not human readable at all.

So let’s go back to the magic question: instead of wondering how to do it, we can ask ourselves why we need to do it. The need can often be traced back to UX, design or process errors.

So handle these problems at their cause, and then fix the consequences. You can’t just modify the database by hand because you have too many events to update? Good, write a script then. That should have been done anyway for maintenance, no matter whether you’re using an event store or a relational database. And yes, this script might be harder to write than for your classical relational database, hence the importance of fixing the root cause.

But the business people do not understand it??! 

Oh yes they do. If you think they don’t, ask them whether a user with an empty cart because he just logged in is the same thing as a user with an empty cart because he added and removed something 3 times.
A dev might think it is because an empty cart is just an empty cart after all (and most of the time this is how it will be designed). The business though will understand the learning opportunities in this add/remove behavior, and would like to track it. 
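The cart example can be sketched in a few lines of Python. The event names (`ItemAdded`, `ItemRemoved`) and both helper functions are hypothetical, picked only to show that two streams can project to the same state while carrying very different information.

```python
from dataclasses import dataclass


# Hypothetical cart events.
@dataclass(frozen=True)
class ItemAdded:
    sku: str


@dataclass(frozen=True)
class ItemRemoved:
    sku: str


def cart_contents(events: list) -> list:
    """What a state-based model keeps: the current content of the cart."""
    items = []
    for event in events:
        if isinstance(event, ItemAdded):
            items.append(event.sku)
        elif isinstance(event, ItemRemoved) and event.sku in items:
            items.remove(event.sku)
    return items


def removal_count(events: list) -> int:
    """What the business may care about: how often the user backed out."""
    return sum(1 for event in events if isinstance(event, ItemRemoved))


just_logged_in = []
hesitant = [ItemAdded("book"), ItemRemoved("book")] * 3
```

Both users end up with `cart_contents(...) == []`, but only the event stream lets you tell them apart: the hesitant user has three removals, the other one has none.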

Also, have you ever had to “add logging” to an existing system? It’s painful because it adds dependencies, and it’s not always easy to know what to log. Event Sourcing, at least when coupled with Domain Driven Design (DDD), answers this question. And this need for “logging” coming from the business is a sign that they understand the concept of Event Sourcing well, along with the value it can bring.

But the devs are not trained for that and they don’t understand how to use it??! 

This is one of my favorites. Most devs use Git. So basically most devs already understand, and rely on, the value of logging every past change. And they also already understand that a log of small changes is much easier to exploit over time than a log of big changes.

It’s true though that they’re not trained to build such an implementation by themselves, because they have spent years doing Object-Oriented programming (even if it’s done in a procedural way), ORMs and relational databases. At some point, some people do not even know that alternatives exist.
Compared to this way of coding, it does indeed require a mind shift, but I can’t believe that someone smart enough to code, and courageous enough to use an ORM, will not be smart enough to learn about Event Sourcing.
I do believe though that most employers do not want to invest in their own employees training, but that’s another topic. 

And I can no longer easily change my schema??! 

Finally, an (almost) valid point. Yes, changing the schema (i.e. the serialization of events, because you change, add or remove properties) is not the fun part. Schema migration was never straightforward anyway, again no matter which implementation you choose.
I agree though it will require a more complex process in an event sourced system because each time you want to update the present, you need to care about the past. It might be unusual in a relational database, but it is actually a good idea. 

You have 3 solutions: 
1- you can pretend the past hasn’t existed, and use a script to update old events to become valid events 
2- you can pretend the past hasn’t existed, and fix the invalid events in the event repository using default values in the code 
3- you can care about the past because the version of the events might impact the way you want to handle it in your business model, in this case you can use event versioning and different paths in code 

In other words: you have to explicitly choose an update strategy for each change that could affect the past. Ignoring the past or not depends on the business’s needs. 
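As an illustration of these strategies, here is a minimal Python sketch. The `PricePaid` event, its `version` field and the `"EUR"` default are assumptions made up for the example; they do not come from any particular event store.

```python
def upcast(event: dict) -> dict:
    """Strategy 2: read old events as if they had always had the new shape.

    Events without a version are treated as version 1; the missing
    `currency` property is filled with an assumed default.
    """
    if event.get("version", 1) == 1:
        return {**event, "version": 2, "currency": "EUR"}
    return event


def describe(event: dict) -> str:
    """Strategy 3: keep the version and take a different path in code."""
    if event.get("version", 1) == 1:
        return f"{event['amount']} (currency not recorded)"
    return f"{event['amount']} {event['currency']}"


old = {"type": "PricePaid", "amount": 10}
new = {"type": "PricePaid", "version": 2, "amount": 10, "currency": "USD"}
```

Strategy 1 would amount to running something like `upcast` once over the stored events and writing the results back, instead of applying it on every read. With strategy 3, the old event keeps its meaning ("we did not know the currency at the time"), which is the whole point when the business cares about the difference.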

 
So you say it’s a silver bullet? 

Of course not, but I would like to avoid criticism that basically amounts to complaining about having to change some bad habits.

Event Sourcing gives you more options, hence more responsibility. It gives you the opportunity to think with the business in mind (especially when coupled with DDD and CQRS). I believe that it’s this new world of “many options” that can frighten people who prefer the prescriptions of a rigid framework. The very fact that it does not have a proper standard, and that everybody can come up with their own implementation, is partly what makes it so powerful for me.
 
Event Sourcing isn’t trivial. As we already saw, it makes design errors even more painful than usual (I’d prefer to say: harder to ignore or postpone). It means that using it without knowing about DDD, for example, might be a good way to shoot yourself in the foot. It also means that if you’re still discovering a domain (a proof of concept for a startup?), it won’t fit.
 
But if you’re looking for a way to build a robust and scalable system in a domain that you know (even if it will change), I still haven’t found a better approach so far.
 
Surprisingly enough, context is king. The power of Event Sourcing is that your implementation can greatly change depending on your context. 
 
 
