OOP, FP and the expression problem

As a software developer, you probably have a discussion about Functional Programming (FP) vs Object-Oriented Programming (OOP) at least once a month. Of course, it’s always about the context, as I have already discussed.
But thanks to Samir Talwar, I had new insights recently on this topic that I would like to share with you.
To give more context: I have been working more and more with F# for three years, hence I have more and more practical feedback about it. And in the last few weeks I came to the conclusion that I feel much more comfortable with FP for mid-term code maintenance, and here is why.

OOP in theory

On my current gig, I had to switch back mainly to C# (even if some parts of the code are in F#). I recently had a challenging refactoring to do on the core domain, and it led me to this tweet.

My thought at that moment was that OOP can indeed be beautiful in theory, but I rarely see it beautiful in practice. Not because of lack of practice or bad code, just because it’s almost impossible to keep it beautiful over time.
At first the design is really clear, a few nouns, a few actions (aka verbs), sometimes a nice inheritance between objects, and it all sounds good. But of course, after a while the specifications evolve, and you need to add new objects and/or behavior.

OOP in practice

What I usually observe is that the evolution of the domain forces you to refine your abstractions. It makes sense, because you improve your knowledge of the domain while you’re working on it. In practice it means that you try to add objects to your inheritance hierarchy, with behaviors that might be slightly different from what you had thought at first.
That’s where the design starts to slide out of control. Maybe we could just add a boolean here instead of multiplying the class implementations? Maybe we’ll keep this inheritance because there are so many shared behaviors in the base class that we don’t want to duplicate across most classes, even though one implementation doesn’t need this behavior. And so on…
That’s why I usually find OOP using composition rather than inheritance easier to maintain in the long term. But even with that, it’s hard to keep data and behavior together and maintainable.
Hence I somehow feel that I prefer the constraints of FP over the constraints of OOP, with its separation of data and behaviors, but it was hard for me to explain clearly why.
That’s where Samir’s wisdom showed me the light…

The expression problem

Here is Samir’s answer to my tweet:

That’s it. You can change behaviors (verbs) easily, or you can change nouns (classes) easily, but you can’t have both. This is known as the expression problem. FP focuses on changing verbs easily, whereas OOP focuses on changing/adding nouns easily. Samir takes the time to write about it in detail.
I have thought a lot about this in the last few years, but I have never seen it explained as concisely and clearly as in this tweet. And to be fair, when you see it explained so simply, you wonder why you needed years to think about it!
Although the expression problem seems to be a well-known issue in the software industry, it is rarely cited in debates about FP vs OOP, as far as I know.
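To make the trade-off concrete, here is a minimal F# sketch (the Shape type and its functions are hypothetical, not from the original discussion):

```fsharp
// With a discriminated union, adding a new verb (function) is easy:
type Shape =
    | Square of side: float
    | Rectangle of width: float * height: float

let area shape =
    match shape with
    | Square s -> s * s
    | Rectangle (w, h) -> w * h

// Adding `perimeter` touches nothing that already exists:
let perimeter shape =
    match shape with
    | Square s -> 4.0 * s
    | Rectangle (w, h) -> 2.0 * (w + h)

// But adding a new noun (say, a Circle case) would force us to edit
// every existing match. In OOP it is the opposite: a new subclass is
// cheap, but a new virtual method must be added to every class.
```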

OOP or FP?

I’ll rather let Samir answer if you don’t mind:

I totally agree with him, and I guess this explains why most people who jump from OOP to FP never look back, even if it can be hard to tell exactly why.
It’s not better by design; it just has different pros and cons, but those pros tend to make the evolution of the software easier, if the verbs of your domain change more often than your nouns.
In Samir’s experience it’s usually the case, and in mine too. What’s your experience?

The context of OOP and FP

If you want lots of reaction when you tweet, a sure recipe is to talk about Object Oriented Programming (OOP) vs Functional Programming (FP).

Let’s throw the cat among the pigeons with yet another blog post on the Internet giving a personal perspective on OOP vs FP. I’m far from being a functional expert. I code professionally mainly in C#, and sometimes in F#. I only play with Haskell and Idris in coding dojos. I’m not trying to convince anyone of anything. I just would like to share my thoughts about these two important paradigms.

And the first step would be to agree on what’s OOP and what’s FP.

What’s OOP?

To really grasp the philosophy, I highly encourage you to follow Alan Kay’s work. Of course, he’s not the only one behind this paradigm, but I find his explanations really clear. This keynote is a good start. In a nutshell, Alan Kay has (among other things) a background in molecular biology. The way complex organisms work is basically what inspired the OOP paradigm. In other words: collaboration through specialized components to achieve a complex task. Each component can be independent and doesn’t need to be aware of the full system. They just collaborate, receiving messages when something is required from them, and sending messages when they have done something meaningful for the system. How these components communicate, and how the orchestration between them is managed, is not imposed by the paradigm, and different languages have used different implementations. But this notion of independent components collaborating is the core concept of OOP. Not really what we see when we type OOP into DuckDuckGo…

What’s FP?

Instead of looking for inspiration in the biological field, Functional Programming is rooted in mathematics. I’ll quote Philip Wadler to give a better definition:
“A program in a pure functional language is written as a set of equations. Explicit data flow ensures that the value of an expression depends only on its free variables. Hence substitution of equals for equals is always valid, making such programs especially easy to reason about. […] Explicit data flow also ensures that the order of computation is irrelevant, making [functional] programs susceptible to lazy evaluation.”
What does he mean? If I write X = A + B, or Y = C * D, I don’t care about the specific values of A and B, or C and D. These are mathematical equations; they represent all the valid values for an addition of A and B, or a multiplication of C and D. Hence I can compose X with Y and do substitution:
X >> Y = (A + B) >> Y = X >> (C * D) = (A + B) >> (C * D)
(>> is the composition operator in F#)
If any of these values can change (mutate) outside of my scope, all this reasoning is no longer true. Even worse if not only the values but also the types can change. That’s why we say that in FP we can write code thinking only locally. I know that my “equations” (or pure functions) are always true; there is no need to think about the rest of the world to ensure that.
Please note that this desire for “local thinking” can also be found in the “independent components” of OOP. But contrary to OOP, FP explains how it can be achieved, thanks to mathematics.
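As a minimal sketch of this substitution property in F# (addOne and double are made-up functions for illustration):

```fsharp
// Pure functions: the same input always gives the same output,
// so any expression can be replaced by its value.
let addOne i = i + 1
let double i = i * 2

// Composition with >>: the output of addOne feeds double.
let pipeline = addOne >> double

// pipeline 3 can be substituted by double (addOne 3), by double 4, by 8.
```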

Which one is the best?

Finally, the answer the whole industry has been looking for for many years. Here it is: none of them, because that is not the right question. The question should be: which one is the best in my context? And now we can start talking. I think the answer is easier than what many believe.
The more constraints you have, the harder it is to write code that works, and the easier it is to maintain that code in the long term. What are code constraints? We have many of them depending on the language: syntax, immutability, no side effects, unit tests, types and dependent types, for example. Is it better to write code faster, or to have long-term maintainable code? In a start-up context, I probably need to produce code quickly, until I find the right product. In a critical context where lives are involved, I probably want something robust, even if I need a year to write it. And of course, your context may sit anywhere on the scale from “quick and dirty” to “mathematically robust”.

Just tools

Functional Programming is by nature more constrained than Object-Oriented Programming, because it forbids mutability and side effects. When the language also uses types, it leads to powerful concepts like Type-Driven Development, where we can, by design, remove invalid states from our system to make it more robust at runtime. And when the language also supports dependent types, we can even encode business rules into the types, so that code violating a business rule does not compile. As a drawback, it is harder to write. But when you manage to write it, you can trust the famous quote “if it compiles, it works”. Maintaining this code is usually easier than maintaining a huge OO codebase, because thanks to pure functions we can use local reasoning to maintain the system. Please note the “usually” in my previous sentence. A “good” OO codebase can in theory have this property of local reasoning, but the fact that side effects and global impact are not forbidden by design makes it almost impossible to scale (at least in my experience).
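As an illustration of removing invalid states by design, here is a small hypothetical sketch (the EmailAddress type is mine, not from any cited codebase):

```fsharp
// A type whose constructor is private: no code outside this module
// can build an unvalidated EmailAddress.
type EmailAddress = private EmailAddress of string

module EmailAddress =
    // The only way in: validation happens exactly once, here.
    let create (raw: string) =
        if raw.Contains "@" then Some (EmailAddress raw) else None
    let value (EmailAddress raw) = raw
```

Any function receiving an EmailAddress can trust it by construction, instead of re-checking validity at every call site.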

FP is hard to write, but easy to maintain (easier to reason about). OOP is easier to write, but harder to maintain (harder to reason about due to side effects).

Explaining monads

If you have grasped the mathematical basics, and if you have understood what a monoid is, you are ready to discover monads. Of course, we’ll continue to have fun with F#. There are tons of approaches trying to explain what monads are. Personally, it clicked once I understood the problem they solve, so that will be the approach of this blog post. Again, the following code can be found online.

First, the composition rule

Remember the function composition rule: “The result of each function is passed as the argument of the next, and the result of the last one is the result of the whole.”

It means that functions must be “pluggable” (the output of the first must be compatible with the input of the second). But it also implies that we need a world free of side effects. If one of the functions can throw an error, there is no more composition, because the result of my function (an exception is thrown) is not passed as the argument of the next, and the result of the last function will not be the result of the whole. Instead it breaks the runtime, and someone at a higher level (the caller) must deal with it.
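The original snippet is not reproduced here, but it might have looked like this sketch (function names assumed from the rest of the post):

```fsharp
let toInt (s: string) = int s
// Throws an exception when the value is not positive:
let isPositive i = if i > 0 then i else failwith "not positive"
let toString (i: int) = string i

// This compiles and composes nicely...
let pipeline = toInt >> isPositive >> toString
// ...but pipeline "-1" would break the composition at runtime.
```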

Note that nothing makes it explicit that the isPositive function can fail in this code. It compiles as if everything were fine, but it can fail at runtime.

Sometimes vs Always

What if, instead of throwing an error sometimes (depending on the i > 0 check), we always return a Result, which is either a success containing the value, or a failure?

I can wrap my result into something that defers the side effects. Of course, it implies changing my signatures. My functions are no longer compatible, because a Result<int> cannot be used as the input of the isPositive function; it expects a simple int.
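A sketch of what this wrapper might look like (note that recent F# versions ship a built-in Result<'T,'TError>; a simplified hand-rolled version keeps the example focused):

```fsharp
type Result<'T> =
    | Success of 'T
    | Failure of string

// The signature changes: isPositive now always returns a Result<int>.
let isPositive i =
    if i > 0 then Success i else Failure "not positive"

// But Success 42 is not an int, so isPositive can no longer be
// plugged directly after a function returning a Result<int>.
```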

Bind to the rescue

What if I write a simple transformation function, which I can call bind? Its roles are:

  • In case of success: to unwrap my Result<int> into a simple int, and to call the next function with the unwrapped value
  • In case of failure: to return the failure, and not to call the next function, because after all, we don’t need to execute the next function if the previous one failed.
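Those two roles translate almost line for line into F# (a sketch, reusing the hand-rolled Result type from above):

```fsharp
type Result<'T> =
    | Success of 'T
    | Failure of string

let bind nextFunction result =
    match result with
    | Success value -> nextFunction value  // success: unwrap, call next
    | Failure reason -> Failure reason     // failure: skip the next function
```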

And now I can call this function between two incompatible functions and… Voilà! They are pluggable again!

My final composition will look like this, and this time it compiles.
Of course, you probably don’t want to create special functions just to apply the bind. F#, like many languages, allows us to define a proper bind operator: >>= (the common symbol for the bind operator).

A more elegant composition will look like this:
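Put together, a sketch of the full pipeline with a >>= operator might read (all names are assumptions consistent with the previous snippets):

```fsharp
type Result<'T> = Success of 'T | Failure of string

// The bind operator: short-circuits on failure.
let (>>=) result next =
    match result with
    | Success v -> next v
    | Failure e -> Failure e

let toInt (s: string) =
    match System.Int32.TryParse s with
    | true, i -> Success i
    | _ -> Failure "not a number"
let isPositive i = if i > 0 then Success i else Failure "not positive"
let toString (i: int) = Success (string i)

let pipeline s = toInt s >>= isPositive >>= toString
```

pipeline "42" yields Success "42", while pipeline "-1" yields Failure "not positive" without ever calling toString.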

I get rid of side effects, thus I can compose my functions.

Dude, you lost me.

No problem, take a breath and think about that:

  • I wanted to compose functions, but they have side effects
  • I wanted to get rid of side effects, thus I introduced Result<T> instead of throwing an exception
  • My functions were no longer “pluggable”, thus I introduced a bind function to connect a function expecting a simple type with the more complex Result<T> type.
  • I defined a new bind operator (>>=) to replace the explicit bind call
  • In the end I can have a nice composition, without the side effects

This can be summarized as: we found a way to compose functions, despite the side effects.

And where’s the monad?

Let’s decrypt the Wikipedia definition of a monad and see if something matches:

“Monads allow developers to build pipelines that process data in a series of steps, in which each action is decorated with the additional processing rules provided by the monad.”

Sounds familiar? In our case we had a pipeline (ToInt -> IsPositive -> ToString), and we decorated each step with the ability to convert from T to Result<T> (and to escape the pipeline in case of failure). You understand that we could have done other things in our bind function, such as logging, exception handling, IO management… We managed side effects in the bind used to compose the functions, instead of in the functions themselves.

“A monad is defined by a return operator that creates values, and a bind operator used to link the actions in the pipeline; this definition must follow a set of axioms called monad laws, which are needed for the composition of actions in the pipeline to work properly. The result of the last one is the result of the whole.”

I think we have talked enough about the bind function. In our case, the return function is as simple as: return x = Success x (it wraps a plain value into our Result type).

Here are the monad’s laws: return acts approximately as a neutral element of bind, in that:

  • (return x) >>= f ≡ f x
  • m >>= return ≡ m

Binding two functions in succession is the same as binding one function that can be determined from them (in other words: bind is associative):

  • (m >>= f) >>= g ≡ m >>= (x -> (f x >>= g))

This is what these tests demonstrate:
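The original tests are not reproduced here, but a sketch of them could look like this (reusing the Result type and >>= operator defined earlier; `ret` stands in for return, which is reserved in F# computation expressions):

```fsharp
type Result<'T> = Success of 'T | Failure of string
let (>>=) result next =
    match result with
    | Success v -> next v
    | Failure e -> Failure e
let ret x = Success x

let f x = Success (x + 1)
let g x = Success (x * 2)
let m = Success 3

let check name ok = if not ok then failwithf "law '%s' failed" name
check "left identity"  ((ret 3 >>= f) = f 3)
check "right identity" ((m >>= ret) = m)
check "associativity"  (((m >>= f) >>= g) = (m >>= fun x -> f x >>= g))
```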

Thus, the type we defined, Result<T>, with its associated bind and return functions, is a monad. The container alone is not a monad. The bind function alone is not a monad. But the container Result<T> together with its bind and return functions, because return acts as a neutral element for bind, and because bind is associative, is a monad.

Last words

Let’s finish the Wikipedia decryption:

“Purely functional programs can use monads to structure procedures that include sequenced operations like those found in structured programming. Many common programming concepts can be described in terms of a monad structure without losing the beneficial property of referential transparency, including side effects such as input/output, variable assignment, exception handling, parsing, nondeterminism, concurrency, continuations, or domain-specific languages. This allows these concepts to be defined in a purely functional manner, without major extensions to the language’s semantics.”

And because input/output, variable assignment, exception handling, parsing, nondeterminism, concurrency, continuations and DSLs are very common IT problems, FP uses monads a lot to manage them. I hope it is now clear how we used a monad to structure our chain of operations, and how we managed exceptions in a purely functional manner.

If so, you now understand what a monad is, and why it is so useful, especially in FP. If not, please let me know and I’ll try to improve my explanations!


References: if you feel frustrated and want to go further, I recommend the excellent Category Theory for Beginners, and the even better (but harder) Category Theory for Programmers.

Explaining Monoids

Let’s have some fun with F#. Through examples, we will discover together what monoids are, and why they can be useful. All the following code can be found online.

Function composition example with integers

We now all agree on what function composition is. We all know what integers are. So, let’s compose some functions using the set of integers.

As shown in this code, add and minus for the set of integers are functions defining a rule for combining two elements of the set. add has two interesting properties: it is associative, and it behaves as an identity function when the neutral element is one of its inputs (minus, by contrast, is not associative, so we’ll focus on add).

As you can guess, the neutral element in this case is 0. You can trust me, or you can just run this code instead:
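The original snippet is not reproduced here, but the checks might look like this sketch:

```fsharp
let add a b = a + b

// Associativity: the grouping doesn't matter.
let associative = add (add 1 2) 3 = add 1 (add 2 3)

// Neutral element: combining with 0 behaves as the identity.
let neutral = add 5 0 = 5 && add 0 5 = 5
```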

Let’s wrap up: the set of integers has a rule (add) for combining two elements of the set; this rule is associative and has a neutral element (0).


Function composition example with hours 

We can also agree on what are hours (positive integers between 1 and 24), so we could compose some functions using the set of hours.

As it happens, we can also define add and minus operations on hours. It’s a bit more interesting now, because these operations use a modulo; it’s no longer a simple operation. We can still prove associativity and the existence of a neutral element for the set of hours with the add operation.
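As a sketch (the encoding is an assumption: hours run from 1 to 24, so the modulo shifts by one), add on hours could be:

```fsharp
// Wraps around the 24-hour clock; with hours in 1..24,
// 24 plays the role that 0 plays for integers.
let addHours a b = ((a + b - 1) % 24) + 1

let wrapped = addHours 23 2                            // 23h + 2h = 1h
let neutral = addHours 24 7 = 7 && addHours 7 24 = 7
let associative =
    addHours (addHours 22 5) 8 = addHours 22 (addHours 5 8)
```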

We can add to our knowledge that the set of hours has a rule (add) for combining two elements of the set; this rule is associative and has a neutral element (24).

Monoid land

Monoids are a collection of things (a set) with a rule for combining two elements of this set together (this rule is called the internal law of composition). This composition law has to be associative and to have a neutral element (otherwise it is not a monoid, but it could be another mathematical structure, like a magma or a semigroup, which we won’t detail here because they are of less interest to a programmer).

Does it remind you of something? You got it:

  • The set of integers with the function add is a monoid, because add is a rule for combining two integers, add is associative, and add behaves as an identity function when used with the neutral element of the set, which is 0.
  • The set of hours with the function add is a monoid, because add is a rule for combining two hours, add is associative, and add behaves as an identity function when used with the neutral element of the set, which is 24.

We use simple examples here, but please understand that any law (function) defining the combination of two elements of a set, that is associative and behaves as an identity function when used with a neutral element, is a valid rule to define a monoid. No matter the name or the complexity of this function. Note also that we call the set together with the function a monoid. Integers are not monoids. But integers with the function add form a monoid. Hours with the function add form another monoid. You can’t define a monoid without its internal law of composition.

Parallel Monoid

The coolest thing about monoids is the associativity law. And the cool thing about associativity in programming is that it allows parallel streams. Why?
Because if (f°g)°h = f°(g°h), you don’t care which combination is evaluated first. How convenient is that for parallel processing?
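A sketch of this idea in F#: because (+) with 0 forms a monoid over integers, we can split the work into chunks, reduce each chunk independently (possibly in parallel), and combine the partial results.

```fsharp
let numbers = [| 1 .. 100 |]

let sequentialSum = Array.fold (+) 0 numbers

let parallelSum =
    numbers
    |> Array.splitInto 4                      // four independent chunks
    |> Array.Parallel.map (Array.fold (+) 0)  // reduce each chunk
    |> Array.fold (+) 0                       // combine partial results

// Associativity guarantees both sums are equal,
// and the neutral element 0 safely seeds each chunk.
```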

Monoids everywhere

Cyrille Martraire has a great talk to explain why you should look for monoids in your domain. By their very nature, you can assume they have a neutral element, they are composable and parallelizable (because they are associative), and they are based on a set of value objects. You say all that when you say that something is a monoid in your domain.

Monoids are powerful and deserve to be known, as you probably use them already without noticing their existence!

What’s composition?

You remember that Functional Programming (FP) is all about function composition? But what’s composition, by the way? Let’s explain it here, along with a few other useful concepts for FP, and go one step further towards understanding monads.

Integer composition

In mathematics, a composition of an integer N is a way of writing N as the sum of a sequence of strictly positive integers. A weak composition allows the terms of the sequence to be zero as well. It means that every positive integer admits infinitely many weak compositions. 0 is also known as the neutral element for the set of integers.

Function composition

A function composition is when we apply a function (say g) to the result of another one (say f) to produce a third function (say h). We write it h = g°f. Intuitively, composing two functions is a chaining process in which the output of the inner function becomes the input of the outer function. This is, in essence, functional programming.

Identity Function

A function that returns exactly its input as its output, without any alteration, is known as the identity function. There is a symmetry between the weak composition of integers and the composition of functions, because the identity function is a neutral element for the set of functions. Like 0 for integers, it can help in several situations to allow more composition. Which is useful, because we want to compose everything!

Which functions are composable?

Two functions are composable if their Domain and Co-Domain are compatible. Let’s imagine two functions:

  • StringToInt: s (string) -> i (integer). StringToInt is a function mapping a string (Domain) to an integer (Co-Domain).
  • TwoTimes: i (integer) -> o (integer). TwoTimes is a function mapping an integer (Domain) to another integer (Co-Domain).

Can I imagine a composition TwoTimes >> StringToInt? No, because the output of TwoTimes is an integer, and the input of StringToInt is a string. The Co-Domain of TwoTimes is not compatible with the Domain of StringToInt. But the Co-Domain of StringToInt is compatible with the Domain of TwoTimes. Thus, we can define a correct composition StringToInt >> TwoTimes. Most FP patterns are just a way to wrap data so that Domains and Co-Domains stay compatible, thus composable, despite side effects.
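In F#, the compiler enforces exactly this Domain/Co-Domain check; a sketch of the two functions above:

```fsharp
let stringToInt (s: string) : int = int s  // Domain: string, Co-Domain: int
let twoTimes (i: int) : int = i * 2        // Domain: int, Co-Domain: int

// Compiles: the int produced by stringToInt feeds twoTimes.
let valid = stringToInt >> twoTimes

// let invalid = twoTimes >> stringToInt   // would not compile:
// twoTimes outputs an int, but stringToInt expects a string.
```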

Composable functions properties

When functions are composable, their composition is associative. It means: h°(g°f) = (h°g)°f.
You know, for example, that addition is associative.
So if g is +5, f is +10 and x is 3, we can write: 5 + (3 + 10) = (5 + 3) + 10 = 18.
Note that if two functions have the same Domain and Co-Domain, they are always composable. Functions with the same Domain and Co-Domain are called transformations, because mathematicians like to give a name to everything (and for once it seems a logical name).

How powerful is function composition?

In a few words: it allows us to express complex things in a simple form. In physics, for example, the Lorentz transformations are very famous. This series of transformations leaves the speed of light the same in all inertial reference frames. The invariance of light speed is one of the postulates of Einstein’s Special Relativity theory.

Of course we’re not Einstein, but expressing complex things in a simple form is definitely something interesting for software design.

Do you know what’s a function?

I’m so used to functions in my daily job as a software programmer that I was sure I knew what one is. As often, I was wrong; let me tell you why.

Function and relation

I intuitively called a function anything in my code with an input and an output. I suspected that functions without input and/or output were a special kind of function. But in truth, f: a->b defines a relation between a and b. The Domain is the set containing a, and the Co-Domain is the set containing b. A relation maps an input from the Domain to an output in the Co-Domain. A relation can be more or less specific, as we’ll see below.

Back to code

If a relation can map an element from the Domain to several elements in the Co-Domain, it is not a function. Do I code a lot of relations with different possible outputs for the same input? Yes: any “function” with side effects can produce different outputs for the same input. Such relations are not functions by the mathematical definition.

It also means that we can’t define a function without defining the Domain and the Co-Domain. If we change the Domain or the Co-Domain, a relation can be a function or not. A famous example is the SquareRoot relation, because:

Sqrt: a (|R) -> b (|R) is not a function (indeed, sqrt(4) can be either 2 or -2).

PositiveSqrt: a (|R+) -> b (|R+) is a function (a subset of SquareRoot that returns only the positive value).

Note as well that constraining the Domain and the Co-Domain can make the function more specific (injective, surjective, or both, i.e. bijective).

So what?

Ok, my coding functions are not mathematical functions; why should I care? Let me ask a different question: why do mathematicians care to manipulate functions instead of relations? Because functions are easier to reason about.

In the same way, the more specific the relation is, the more predictable the code will be. If a relation can produce several outputs for the same input, the code is not predictable. It means that my relation depends on a more global context, and thus I can’t reason at the function scale without taking this context into account. On the other hand, when my relation is a function (usually called a pure function in FP, i.e. without side effects), my code is predictable, and I can think at the function scale without considering the global context.

It is really powerful, because it doesn’t really matter how smart you are if you have the ability to split a complex problem into simple ones.

Why Functional Programming?

In this post, I would like to share why I think the functional paradigm is such a big deal, and why it is worthwhile to learn about it, no matter which technology you are using.

Functional Programming Roots

Functional Programming (FP) is deeply rooted in maths, and is intimidating because of that. But I think we underestimate the value of a mathematical approach to software: it allows us to express complex problems as a succession of simple statements. When we translate this to IT, the point is to compose complexity, to keep software systems predictable no matter how big they are.

How mathematics works

Mathematics always follows the same pattern. It defines something in a general way, like a polygon: “a polygon is a plane figure that is bounded by a finite chain of straight line segments closing in a loop to form a closed polygonal chain”. Then, by applying specific laws, we are able to define more specific things: “A triangle is a polygon with three edges and three vertices”. And by adding more laws, we obtain even more specific things, like an equilateral triangle: “An equilateral triangle has all sides the same length”. As you will see, FP design patterns can be explained in the same way: we start from something general, and by adding some laws, we define a more specific concept. Note that “A monad is just a monoid in the category of endofunctors” follows this pattern as well. It only seems complicated because it uses other concepts we’re not familiar with.

Don’t be scared to learn the symbols

Maths is often intimidating due to complex words and abstract symbols. Learning them is a first step towards demystifying it. Here are some symbols often used to explain FP concepts:

The mathematical notation y = f(x) means: for x, the relation f will produce the output y. f: a->b means: for an input a, the relation f will produce an output b. We usually use this notation in FP. Of course, we need to define the sets for a and b (integers, positive integers, real numbers, strings…). The set of a is called the Domain of f, and the set of b is called the Co-Domain of f.

g°f means the composition of g and f, in other words the application of f, then g.

f ≡ g means that f is equivalent to g.

Software predictability

In a pure FP program, f: a->b is always true. Functions are pure, which means they will always produce the same output for the same input, no matter when the function is called. In our industry, because state can mutate over time, we build software where this is false. If f has side effects, or if anyone can mutate a or b, we can no longer predict the output of f, because it is context dependent. This adds a lot of complexity to our code. Building a program by composing pure functions removes this complexity.

Why Function Composition?

Functional programming is all about function composition. If I have two functions f: a->b and g: b->c, I want the composition g°f: a->c. The composition of simple functions allows us to build services. The composition of services can manage use cases. And a composition of use cases is an application.

Wrap up

Don’t be scared by mathematical symbols, and remember that if we compose pure functions by respecting some mathematical laws, we’ll be able to build predictable, complex software. That is certainly worth something, isn’t it?

The post where you don’t learn what’s a monad, but where you may want to learn more about it

For a while now I have wondered how to write yet another blog post on the internet trying to explain what a monad is. The problem is well known: as soon as you understand what a monad is, you lose the ability to explain it.
The good news is that I still don’t fully understand monads, so maybe I’m still able to explain them.
I don’t claim to be a mathematician or a Functional Programming (FP) expert. To be honest, I am pretty bad at maths, and I use F# when I can, but I still work with C# most of the time.
Thus, if you are an expert and see any errors in the FP-related blog posts, please do me a favor and correct me 🙂


A combination of Failures

To explain monads, I started a first blog post, to explain the maths and stay at a theoretical level. My draft was quickly 3 pages long, and unclear, even to me. How deep should I dig into the maths? Ok then, let’s try the practical approach.
I started a new blog post, focused only on a practical example, to show how we can handle errors with something close to the Maybe monad. I tried to present my work at a local meetup, and to live-code the explanation. By the way, it was a 10-minute talk. Needless to say, it was also a failure; I hope the attendees learned some stuff, but they certainly still don’t understand what a monad is.


What’s the point?

It slowly drove me crazy. I was thinking day and night about which metaphor I could use to explain monads. Something different from Scott Wlaschin’s railway metaphor, if possible. I started two more blog posts, and was still unhappy with them.
Weeks were passing, and I couldn’t explain monads in the simple way I was looking for.
I tried to figure out at which moment in my learning path I finally understood it. I also challenged the real reason for doing this: why should I care? Why should a developer know what a monad is?

Why should we care?

Because monads are a basic building block of functional programming. I think this tweet nailed it.
Semigroups, monoids, applicatives and monads are building blocks, or design patterns, inspired by mathematics. They allow us to combine anything.
And because functional programming is all about function composition, it is important to know these blocks as a functional programmer. But the concepts of FP are helpful to an Object-Oriented (OO) developer in many respects as well, as Uncle Bob recently pointed out.


What’s a monad?

Something hard to understand without the mathematical roots. That’s why I will release several articles explaining both the technical and the mathematical aspects; this way you can choose which part is relevant for you. From basic maths to concrete examples, choose your path to understanding monads!