Today is another day where a data model leaking into my business model ruins my productivity. So let’s talk about impedance mismatch again, please.
Relational data model
We are all used to relational databases. Back in school we learned how to design 1-n and n-n relationships, and how to use primary and foreign keys to enforce consistency. We also learned how to normalize and denormalize the model, and how to use views for performance.
What we did not learn in school is that this is only one way to store data, and that automatically translating this storage model into memory is not a great idea.
What the fuck does ORM mean?
According to Wikipedia, object-relational mapping (ORM) is a programming technique for converting data between incompatible type systems in object-oriented programming (OOP) languages.
The problem lies in the definition: we have incompatible type systems. Yet we somehow believe that the storage data model and the in-memory object-oriented model are the same thing.
The mismatch
The relational and the object-oriented models are two different representations of the same model. One uses keys and relations to achieve data integrity, based on relational algebra and tuple relational calculus. The other defines bounded objects with defined responsibilities to simulate business processes in order to make decisions, inspired by how cells work in biology.
Loading a relational model into memory as an object graph leads to well-known problems. Lack of encapsulation (public get/set) is one of them. An anaemic model (data objects without any behaviour) is another. Both violate OOP principles.
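To make these two symptoms concrete, here is a minimal Python sketch. The account classes, their fields and the "no overdraft" rule are hypothetical, invented only for illustration.

```python
from dataclasses import dataclass


@dataclass
class AnemicAccount:
    """Mirror of a table row: public fields, no behaviour (hypothetical)."""
    id: int
    balance: int  # in cents; any caller can set any value


class Account:
    """Same data, but the business rule lives next to the state it protects."""

    def __init__(self, account_id: int, balance: int = 0) -> None:
        self._id = account_id
        self._balance = balance

    @property
    def balance(self) -> int:
        return self._balance

    def withdraw(self, amount: int) -> None:
        # Assumed rule for this sketch: no overdraft allowed.
        if amount <= 0:
            raise ValueError("amount must be positive")
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount


anemic = AnemicAccount(id=1, balance=100)
anemic.balance -= 500  # nothing stops the invalid state

rich = Account(1, balance=100)
try:
    rich.withdraw(500)
except ValueError as error:
    refused = str(error)  # the object refused the invalid operation
```

With the anaemic version, the overdraft rule has to be repeated (or forgotten) in every caller. With the second version, there is a single place where it can be enforced.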
There are also technical problems: when do you stop loading the graph? How do you manage circular references?
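The circular reference problem can be sketched in a few lines of Python. The `AUTHORS` and `BOOKS` dictionaries stand in for two hypothetical tables that reference each other. The second loader uses an identity map, which is one common way to break the cycle, not the only one.

```python
# Hypothetical rows: each "table" points at the other through a key.
AUTHORS = {1: {"name": "Ada", "book_ids": [10]}}
BOOKS = {10: {"title": "Notes", "author_id": 1}}


def load_author_naive(author_id):
    """Eagerly follows every relation; never terminates on a cycle."""
    row = AUTHORS[author_id]
    return {"name": row["name"],
            "books": [load_book_naive(b) for b in row["book_ids"]]}


def load_book_naive(book_id):
    row = BOOKS[book_id]
    return {"title": row["title"],
            "author": load_author_naive(row["author_id"])}


def load_author(author_id, identity_map=None):
    """Same traversal, but each row is materialised only once."""
    identity_map = {} if identity_map is None else identity_map
    key = ("author", author_id)
    if key in identity_map:
        return identity_map[key]
    row = AUTHORS[author_id]
    author = {"name": row["name"], "books": []}
    identity_map[key] = author  # register before following relations
    for book_id in row["book_ids"]:
        author["books"].append(load_book(book_id, identity_map))
    return author


def load_book(book_id, identity_map):
    key = ("book", book_id)
    if key in identity_map:
        return identity_map[key]
    row = BOOKS[book_id]
    book = {"title": row["title"], "author": None}
    identity_map[key] = book
    book["author"] = load_author(row["author_id"], identity_map)
    return book


try:
    load_author_naive(1)
    naive_result = "loaded"
except RecursionError:
    naive_result = "infinite recursion"

ada = load_author(1)  # terminates: the book points back to the same object
```

Even when the cycle is handled, the first question remains open: the loader still pulls in everything reachable from the starting row, whether the business code needs it or not.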
Complexity increase
In all the projects I have worked on so far, the benefit of using an ORM was offset by the incredible number of hacks we needed to load and save what we wanted. Not to mention the unmanageable complexity as soon as we try to cache something, because it is always a nightmare to know what is in memory and what is not.
Just to be clear: the core problem is not SQL or ORMs themselves (they are just tools), but the fact that most developers consider the data model to be their domain model.
How to escape it?
Keep in mind that a relational model is for storage, and that there is no simple way to automatically convert this model into your object-oriented representation.
Using other kinds of data storage (document, graph, events…) might help to remind us that storing data and running business processes are two different problems with two different purposes.
And if SQL storage is chosen, I encourage you to use a simple library like Dapper to write actual SQL queries in the data layer. It will simplify the code, clarify the responsibilities of each layer and teach the team some SQL, which is a great idea.
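As a rough illustration of what hand-written SQL in the data layer looks like, here is a sketch in Python using the standard `sqlite3` module. It is an analogy, not Dapper, which is a .NET library. The schema, the `OrderSummary` class and the `find_order_summary` function are hypothetical names made up for the example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderSummary:
    """What the business code needs, not a mirror of the tables."""
    order_id: int
    customer_name: str
    total_cents: int


def find_order_summary(conn: sqlite3.Connection, order_id: int):
    # Hand-written SQL: load exactly the columns needed, nothing more.
    row = conn.execute(
        """
        SELECT o.id, c.name, SUM(l.quantity * l.unit_price_cents)
        FROM orders o
        JOIN customers c ON c.id = o.customer_id
        JOIN order_lines l ON l.order_id = o.id
        WHERE o.id = ?
        GROUP BY o.id, c.name
        """,
        (order_id,),
    ).fetchone()
    return None if row is None else OrderSummary(*row)


# Hypothetical schema and data so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER NOT NULL REFERENCES customers(id));
    CREATE TABLE order_lines (order_id INTEGER NOT NULL REFERENCES orders(id),
                              quantity INTEGER NOT NULL,
                              unit_price_cents INTEGER NOT NULL);
    INSERT INTO customers VALUES (1, 'Alice');
    INSERT INTO orders VALUES (7, 1);
    INSERT INTO order_lines VALUES (7, 2, 500), (7, 1, 250);
    """
)
summary = find_order_summary(conn, 7)
```

The mapping from three tables to one object is explicit and lives in a single function. Nothing is loaded lazily behind the scenes, so there is no question about what is in memory.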
Trying to automatically match the data model and the domain model is a failure, because they are two different concepts.
In event sourcing, your data model is your domain model. I don’t think it’s a failure.
Hi Anthony, thanks for the remark.
Still, I don’t agree with you: the domain model is more than the events, imho. It’s more like the result of an event storming session, for instance, i.e. the set of events + aggregates + commands + projections.
In event sourcing, the data model should be domain events, which is not the full domain model.
Again, the stored events are just the “data” part of the system, no behavior involved.
Sure, it’s the events plus the behavior of how they get created. I just disagree with your final statement that they have to be two different concepts.
It’s just an accident that we have objects and a problem mapping them to relational storage.
Yes, they are the same concept, but I think they are two different representations of the same concept. And I don’t see an easy way to automatically map these representations to each other, because on each project, it will be slightly different.