What the fuck ORM means?

Today is another day where a data model lealking into my business model ruins my productivity. So let’s talk again about impedance mismatch please.

Relational data model

We are all used to relational database. Back in school we learned how to design 1-n or n-n relationships and how to manage primary and foreign keys to manage consistency. We also learned how to normalize and denormalize the model, and to use views for performances.
What we did not learned in school is that it is only one way to store data, and that an automatic translation from this data storage model into memory is not a great idea.

What the fuck ORM means?

According to wikipedia: an Object Relational Mapper is a programming technique for converting data between incompatible type systems in object-oriented programming (OOP) languages.
The problem lies in the definition: we have incompatible type systems. But we somehow believe that the storage data model and the in memory object oriented model are the same thing.

The mismatch

Relational and object oriented are two different representation of the same model. One uses keys and relations to achieve data integrity, based on relational algebra and tuple relational calculus . The other defines bounded objects with defined responsibilities to simulate business processes in order to take decision, inspired by how cells work in biology.
Loading a relational model into memory as an object grape leads to well known problems. Lack of encapsulation (public get/set) is one of them. Anaemic model (data object without any behaviours) is another one. Both of them violate OOP principles.
There are also technical problems: when do you stop to load the grape? How do you manage circular references?

Complexity increase

In all the projects I met so far, the benefice of using an ORM was balanced by the incredible number of hacks we used to load and save what we need. Not speaking about the unmanageable complexity as soon as we try to cache something, because it is always a nightmare to know what is in memory or not.
Just to be clear: the core problem is not SQL or ORM by themselves (they are just tools), but the fact that most developers consider the data model to be their domain model.

How to escape it?

Keep in mind that a relational model is for storage, and that there is no simple way to automatically convert this model in your object oriented representation.
Using other data storage (document, graph, events…) mights help to remind that storing data and running business processes are two different problems with two different purposes.
And if a SQL storage is choose, I encourage to use simple library like Dapper to write actual SQL requests in the data layers. It will simplify the code, clarify the layers responsibilities and learn some SQL to the team, which is a great idea.

Trying to match automatically data model and domain model is a failure, because they are two different concepts.

 

It’s all about fast feedback

I’m a big fan of James Coplien.
I appreciate his controversial point of view about Unit Testing, even if I don’t agree with him. Recently he tweeted about this article explaining why integration tests are much better than unit tests.
Let’s explore it together.

Unit test problem

You write code and then run the unit tests, only to have them fail. Upon closer inspection, you realize that you added a collaborator to the production code but forgot to configure a mock object for it in the unit tests.”

In other words he points out that after a refactoring, some tests will fail despite there is no behaviour modification.
But it really depends on how you write your unit tests. When practicing TDD, I’m testing a logical behaviour. Several classes can emerge, I don’t necessarily have one test class for one production class. And when coupled with DDD practices, I can write unit tests without mocking anything, because I test business logic. Also when I strive for applying SRP, I have a few functions per interfaces, meaning one test won’t fail just because I change some interfaces in another place.
That being said, I agree that some refactoring might change some contracts, which requires to change both the production and the test code. My point is: if it is complicated, it is not because of the unit tests.

System tests solution

“While I was enslaved by unit tests, I couldn’t help but notice that the system tests treated me with more respect. They didn’t care about implementation details. As behavior remained intact. they didn’t complain.”
(System tests for the author are almost end to end and include database)

Again, I agree, if it’s a technical refactoring, the tests that use only production code without mocking may be less brittle.
But I also notice that, when a system test goes red after an actual regression, it takes me a while to find out what the actual problem is. Also, when I had some bugs in production, it is harder to reproduce it in my system tests. It encourages me to fix the code without changing the system tests.
And finally, when there is a business change, it hurts much more to change my system tests. I need a few hours just to write the test managing the new behaviour. I have to initialize half of the application just to check 2 lines of code.

 

Pyramid tests is a myth!

“If you’re like me, you were trained to believe in and embrace the testing pyramid.”…”But my experience revealed a different picture altogether. I found that not only were unit tests extremely brittle due to their coupling to volatile implementation details, but they also formed the wider base of the regression pyramid. In other words, they were a pain to maintain, and developers were encouraged to write as many of them as possible.”

I definitely disagree.
I was not trained to believe in the testing pyramid. I was trained to believe in the “get your shit done, faster please”. And using this approach I believed, like the author and many other developers, that system tests have a better ROI.

It’s all about Fast Feedback

So I tried, and it doesn’t work for me because of long feedback (more than a few minutes).
Feedback time is everything when you run your tests hundreds of times per day, on your local machine. And running your tests hundreds of times per day is really important to practice TDD.
Running the tests has to be easy and fast, or it becomes a pain. And when it’s painful, we loose discipline. We start checking Twitter while the tests are running, and that’s the start of the end of our productivity.


Contextual testing

Maybe there are contexts where systems tests have a better ROI than unit tests. But in my context of backend complex application, with lots of business requirements changing every day, it is not the case.
And I’m not speaking theoretically here. I was involved in 6 projects since I started my career. One has no tests at all, two used inverse pyramid, and three used the classic Mike Kohn Pyramid.
No need to talk about the failure of the project without tests.
The two who used the inverse pyramid have commons characteristics: it was really painful to maintain after a few months, and we need one or two developers almost full-time to keep the system tests working. Not mentioning the time we loose to run the tests before pushing our code, meaning that we often pushed without testing and break the build.

The projects I enjoyed the most as a developer, and my customer as well I believe, was when we have a solid unit testing strategy following the classical Mike Kohn Pyramid. We were able to run thousands of unit tests in a few seconds, and keep a few integration tests running only on the server.

To get results in the long run, don’t give up unit tests, at least if you are working in the same context than me.

 
IP Blocking Protection is enabled by IP Address Blocker from LionScripts.com.