You don't need a Monad (yet)
Functors, applicatives and why Monads are the last step #
I’ve spent a lot of time talking about Monads as the “programmable semicolon,” but I’ve noticed a recurring problem: developers often treat the Monad as the starting point. In reality, the Monad should be the “final boss.”
If you reach for a Monad when a Functor would do, you’re essentially using a chainsaw to cut a piece of paper. It works, but it’s overkill, and you lose the safety and simplicity of smaller tools.
In my previous post, we defined a Context as a wrapper around a value (like a Box). We rarely work with plain values; we work with values wrapped in some kind of context:
- Failure: the value might not be there (Optional, Maybe, Result)
- Time: the value isn’t there yet (Promise, Future)
- Multiplicity: there are many values (List, Array)
The question is not: “Do I need a monad?” The question is: “How much power do I need?”
These three typeclasses, Functor, Applicative and Monad, form a hierarchy of capabilities. Think of them as a set of nested Russian dolls, where each outer doll contains the functionality of the inner one, plus a bit more.
- Functor “I want to change the value, but the box stays the same.”
- Applicative “I have multiple boxes and I want to combine them.”
- Monad “The result of one box tells me which box to open next.”
Here’s how they relate:
Every Monad is also an Applicative, and every Applicative is also a Functor. A Monad can do everything a Functor or Applicative can, but the reverse is not true.
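This containment is encoded directly in the class declarations. Here is a simplified sketch (the real classes in base carry extra methods and default implementations; this keeps only the operation each level adds):

```haskell
-- Simplified hierarchy; the real classes in base have extra methods.
-- We hide the Prelude versions so these declarations compile standalone.
import Prelude hiding (Functor, Applicative, Monad, fmap, pure, (<*>), (>>=))

class Functor f where
  fmap :: (a -> b) -> f a -> f b

-- Every Applicative must already be a Functor...
class Functor f => Applicative f where
  pure  :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b

-- ...and every Monad must already be an Applicative.
class Applicative m => Monad m where
  (>>=) :: m a -> (a -> m b) -> m b
```

The superclass constraints are the "nested dolls": you cannot declare a `Monad` instance without first providing the weaker capabilities.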
The Functor #
The Functor is the most basic level of “context awareness.” It says: if you give me a pure function \(a \to b\), I can apply it to the value inside my box without breaking the box.
In Haskell, this is fmap:
\[
\text{fmap} :: (a \to b) \to f\space a \to f \space b
\]
fmap takes a function and “lifts” it into the context, applying it to the value inside the functor. The outer f (the context itself) remains unchanged.
addOne :: Int -> Int
addOne x = x + 1
myMaybe :: Maybe Int
myMaybe = Just 5
result :: Maybe Int
result = fmap addOne myMaybe -- Just 6
nothing :: Maybe Int
nothing = fmap addOne Nothing -- Nothing
fmap handles the Just or Nothing context automatically. You don’t have to write something like
case myMaybe of
  Just x  -> Just (addOne x)
  Nothing -> Nothing
If your function is “pure” (it doesn’t create new effects or boxes), Functor is all you need.
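The same call shape works for any functor; only the meaning of the context changes. A couple of standard examples:

```haskell
-- For lists, fmap applies the function to every element
doubled :: [Int]
doubled = fmap (* 2) [1, 2, 3]        -- [2,4,6]

-- For Either, fmap touches only the Right value...
incremented :: Either String Int
incremented = fmap (+ 1) (Right 41)   -- Right 42

-- ...and leaves a Left untouched, just like Nothing above
stillLeft :: Either String Int
stillLeft = fmap (+ 1) (Left "oops")  -- Left "oops"
```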
Functor laws #
Just like Monads, functors must follow specific mathematical laws to ensure they behave predictably. If a type implements the interface but breaks these laws, it will cause subtle bugs when you try to refactor or compose your code.
A Functor must preserve identity and composition. Essentially, mapping should only change the values, never the structure.
Identity If you map the identity function (id) over a functor, the functor should remain exactly the same. Mapping “do nothing” over a box should do nothing to the box.
\[ \text{fmap } id \equiv id \]
Composition Mapping two functions one after the other should be the same as mapping their composition in one go.
\[ \text{fmap } (f \circ g) \equiv \text{fmap } f \circ \text{fmap } g \]
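We can spot-check both laws for Maybe at a couple of points (illustrative only; the laws quantify over all functions and values):

```haskell
-- Identity: fmap id == id
identityHolds :: Bool
identityHolds = fmap id (Just 5) == id (Just 5)

-- Composition: fmap (f . g) == fmap f . fmap g
compositionHolds :: Bool
compositionHolds =
  fmap ((* 2) . (+ 1)) (Just 5) == (fmap (* 2) . fmap (+ 1)) (Just 5)
  -- both sides evaluate to Just 12
```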
When a Functor is not enough #
What if the function itself is wrapped in a context? Or what if you have multiple wrapped values you need to combine?
Consider trying to add two Maybe Int values with fmap:
add :: Int -> Int -> Int
add x y = x + y
maybeX :: Maybe Int
maybeX = Just 6
maybeY :: Maybe Int
maybeY = Just 7
How would we combine maybeX and maybeY using just fmap? fmap add maybeX would give us Maybe (Int -> Int) (a function wrapped in a context) and we can’t apply this to Maybe Int using fmap.
This is a problem fmap can’t solve. We have two independent contexts (maybeX and maybeY), and we want to apply a function (add) that needs both values. We need a way to “lift” add so it can operate on values inside multiple contexts.
This creates the exact motivation for Applicative.
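Checking the types shows exactly where fmap gets stuck:

```haskell
add :: Int -> Int -> Int
add x y = x + y

maybeX, maybeY :: Maybe Int
maybeX = Just 6
maybeY = Just 7

-- Partially applying through fmap is fine...
stuck :: Maybe (Int -> Int)
stuck = fmap add maybeX

-- ...but there is no Functor operation of type
--   Maybe (Int -> Int) -> Maybe Int -> Maybe Int
-- fmap can only lift plain functions, never wrapped ones.
```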
Applicative #
Applicative allows us to apply functions that are wrapped in a context to values that are wrapped in a context. It’s designed for situations where you want to combine several independent effects or wrapped values.
The core functions for Applicative are:
\[
\text{pure} :: a \to f \space a
\]
\[
(\langle * \rangle) :: f \space (a \to b) \to f \space a \to f \space b
\]
pure takes a raw value and “lifts” it into the minimal context. E.g. pure 5 becomes Just 5 for Maybe.
(<*>) (pronounced “apply”) takes a context-wrapped function (f (a -> b)) and applies it to a context-wrapped value (f a), producing a context-wrapped result (f b).
add :: Int -> Int -> Int
add x y = x + y
maybeX :: Maybe Int
maybeX = Just 5
maybeY :: Maybe Int
maybeY = Just 10
-- Lift 'add' into the Maybe context using pure, then apply with <*>
result :: Maybe Int
result = pure add <*> maybeX <*> maybeY
-- result will be Just 15
-- If any part is Nothing, the whole result is Nothing
failX :: Maybe Int
failX = Nothing
failedResult :: Maybe Int
failedResult = pure add <*> failX <*> maybeY
-- failedResult will be Nothing
Here, pure add creates a Maybe (Int -> Int -> Int). Then, (<*>) maybeX applies the first argument, resulting in Maybe (Int -> Int). Finally (<*>) maybeY applies the second argument, yielding Maybe Int.
The key insight is that all effects (maybeX and maybeY) are independent and known upfront. If this feels restrictive, that’s intentional, this is exactly the restriction monads remove, as discussed in the previous post.
The code above can be made more idiomatic. As we will see in a bit, the Applicative laws state that pure f <*> x must be the same as fmap f x. Since the infix form of fmap is <$>, we can rewrite the code to look like regular function application, just with some “decorations” around the arguments:
result :: Maybe Int
result = add <$> maybeX <*> maybeY
add <$> maybeX will create a Maybe (Int -> Int) which we then apply on maybeY using the apply (<*>) operator.
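For the common two-argument case, Control.Applicative also exports liftA2, which packages this exact pattern:

```haskell
import Control.Applicative (liftA2)

add :: Int -> Int -> Int
add x y = x + y

-- liftA2 f x y is equivalent to f <$> x <*> y
result :: Maybe Int
result = liftA2 add (Just 5) (Just 10)  -- Just 15
```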
Applicative laws #
Since applicatives are more powerful, they naturally have more laws (four primary ones) to ensure that pure and <*> play nicely together.
Identity Applying a “pure” identity function to a wrapped value should do nothing. \[ \text{pure } id \space\langle * \rangle\space v \equiv v \]
Homomorphism Applying a “pure” function to a “pure” value should be the same as applying the function to the value normally and then wrapping it \[ \text{pure } f \space\langle * \rangle\space \text{pure } x \equiv \text{pure } (f\space x) \]
Interchange Applying a wrapped function u to a pure value y is the same as applying a “pure” function that supplies y to the wrapped function u.
\[ u \space\langle * \rangle\space \text{pure } y \equiv \text{pure } (\$ \space y) \space\langle * \rangle\space u \]
This ensures that it doesn’t matter whether you treat the value or the function as the “primary” context; the result is consistent.
Composition Applying wrapped functions follows a rule analogous to the Functor composition law.
\[ \text{pure } (\circ) \space \langle * \rangle \space u \space\langle * \rangle\space v \space\langle * \rangle\space w \equiv u \space\langle * \rangle\space (v \space\langle * \rangle\space w) \]
In Category Theory, we say that pure is a natural transformation. This is just a fancy way of saying that pure doesn’t look at the data inside or change how fmap behaves. Given pure and <*>, a lawful Functor instance is uniquely determined and must satisfy:
fmap f x = pure f <*> x
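A few point checks of these laws with Maybe (again illustrative, not a proof):

```haskell
-- Homomorphism: pure f <*> pure x == pure (f x)
homomorphism :: Bool
homomorphism = (pure (+ 1) <*> pure 5 :: Maybe Int) == pure ((+ 1) 5)

-- Interchange: u <*> pure y == pure ($ y) <*> u
interchange :: Bool
interchange = (Just (* 2) <*> pure 5) == (pure ($ 5) <*> Just (* 2))

-- The derived Functor: fmap f x == pure f <*> x
fmapAgrees :: Bool
fmapAgrees = fmap (+ 1) (Just 5) == (pure (+ 1) <*> Just 5)
```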
Practical use case #
Applicative shines in scenarios where multiple things happen independently of each other and the results are collected. For example, form validation where you want to collect all errors rather than stopping at the first one.
-- A simplified Validation context (often implemented with Either or specific types)
-- Success stores a value, Failure stores a list of error messages
data Validation e a = Success a | Failure [e] deriving (Show)

-- Example Functor and Applicative instances for Validation (simplified)
instance Functor (Validation e) where
  fmap f (Success a) = Success (f a)
  fmap _ (Failure e) = Failure e

instance Applicative (Validation e) where
  pure = Success
  (Success f)  <*> (Success x)  = Success (f x)
  (Failure e1) <*> (Failure e2) = Failure (e1 ++ e2) -- Collect all errors
  (Failure e)  <*> _            = Failure e
  _            <*> (Failure e)  = Failure e

-- Validator functions
validateName :: String -> Validation String String
validateName name
  | null name       = Failure ["Name cannot be empty"]
  | length name < 3 = Failure ["Name too short"]
  | otherwise       = Success name

validateEmail :: String -> Validation String String
validateEmail email
  | '@' `notElem` email = Failure ["Invalid email format"]
  | otherwise           = Success email
-- A record constructor
data User = User { userName :: String, userEmail :: String } deriving (Show)
-- Combine validations using Applicative
buildUser :: String -> String -> Validation String User
buildUser name email = User <$> validateName name <*> validateEmail email
-- Example usage
validUser :: Validation String User
validUser = buildUser "Tom" "tom@example.com"
-- Success (User {userName = "Tom", userEmail = "tom@example.com"})
invalidUser :: Validation String User
invalidUser = buildUser "" "invalid-email"
-- Failure ["Name cannot be empty", "Invalid email format"]
NB: in practice, error accumulation should use (<>) and thus requires a Semigroup constraint on the error type
Notice how buildUser collects both errors in invalidUser. A Monad would typically stop at the first failure, which is often undesirable for user input validation.
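For contrast, here is a sketch of the same checks rewritten over Either from base and sequenced monadically (validateName', validateEmail', and buildUserE are hypothetical names for this illustration). The email check never runs once the name check fails, so only the first error surfaces:

```haskell
-- Hypothetical Either-based versions of the validators above
validateName' :: String -> Either [String] String
validateName' name
  | null name       = Left ["Name cannot be empty"]
  | length name < 3 = Left ["Name too short"]
  | otherwise       = Right name

validateEmail' :: String -> Either [String] String
validateEmail' email
  | '@' `notElem` email = Left ["Invalid email format"]
  | otherwise           = Right email

-- Monadic sequencing: each step depends on the previous one having
-- succeeded, so the first Left short-circuits the whole computation.
buildUserE :: String -> String -> Either [String] (String, String)
buildUserE name email = do
  n <- validateName' name
  e <- validateEmail' email
  pure (n, e)
-- buildUserE "" "invalid-email" yields Left ["Name cannot be empty"]:
-- the email error is never even computed.
```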
With Applicative, all effects are known up front. This is the defining distinction from a Monad.
Why applicative is weaker than monad #
The “weakness” of Applicative is its strength. Because it enforces independence, the compiler (and human reader) knows that the order of the independent effects doesn’t matter (or that they can even happen concurrently). This allows for optimizations and makes reasoning about code simpler.
-- Applicative: structure known in advance, effects are independent
applicativeResult :: Maybe (Int, Int)
applicativeResult = (,) <$> Just 3 <*> Just 4
-- Result: Just (3, 4)

-- Monad: structure/flow depends on the value
monadicResult :: Maybe Int
monadicResult = do
  x <- Just 3
  if x > 0 then Just (x * 2) else Nothing
-- Result: Just 6

monadicConditionalResult :: Maybe Int
monadicConditionalResult = do
  x <- Just (-1)
  if x > 0 then Just (x * 2) else Nothing
-- Result: Nothing
In the Applicative example, the (,) function (which creates a tuple) and its arguments (Just 3 and Just 4) are all independent. The computation path is predetermined. Applicative describes a static computation graph; Monad allows the graph itself to depend on values.
In the Monad example, the if x > 0 condition creates a data-dependent branch. What happens next (whether Just (x * 2) or Nothing is produced) depends directly on the value of x from the previous monadic step. This dynamic branching is the exclusive power of Monads.
When you actually need Monads #
You need a Monad when your computations have a sequential dependency, meaning the outcome or even the existence of a later step relies on the result of an earlier step. This is when the “programmable semicolon” truly shines.
Reasons to use a Monad:
- control flow: deciding which computation to run next based on a previous result
- early exit: stopping a sequence of computations at the first failure
- state threading: managing explicit state that changes across computations
- IO sequencing: performing I/O operations in a defined order
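A minimal sketch of such a sequential dependency (the lookup tables and names here are invented for illustration): the second lookup cannot even begin until the first produces a key, which is precisely what (>>=) expresses.

```haskell
import qualified Data.Map as Map

-- Hypothetical data: user names map to a manager id, ids map to emails
managers :: Map.Map String Int
managers = Map.fromList [("ana", 1), ("bo", 2)]

emails :: Map.Map Int String
emails = Map.fromList [(1, "ana.mgr@example.com")]

-- The second Map.lookup needs the Int produced by the first one;
-- this data dependency is beyond Applicative and requires (>>=).
managerEmail :: String -> Maybe String
managerEmail name =
  Map.lookup name managers >>= \mgrId ->
    Map.lookup mgrId emails
```

If any step yields Nothing (an unknown user, or a manager with no email on file, as for "bo" above), the whole chain yields Nothing.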
Monads are not bad; they are just the last resort. They give you the most power, but with that power comes a bit more complexity in reasoning, as the path of execution can be dynamic.
Back to category theory #
Let’s briefly revisit the perspective from category theory, which elegantly formalizes these relationships.
- Functor: a functor is a mapping between categories that preserves their structure while changing the objects. In Haskell, a Functor maps types (objects) and functions (morphisms) in Hask to other types and functions in Hask. It’s a way to “lift” ordinary functions.
- Applicative: in Haskell, Applicative corresponds to a Lax Monoidal Functor with respect to the cartesian product. This means that, in addition to being a Functor, it has a way to combine its contexts (<*>) and a neutral element (pure). The monoidal structure captures the idea of combining independent effects. It essentially says: “if I know how to combine A and B in my starting category, I know how to combine them in my target category.”
- Monad: as we discussed, a Monad is a Monoid in the category of Endofunctors. The “objects” of this monoid are Functors themselves, and the monoid operation (join, or >>=) is a way to compose or flatten these Functor layers.
The beauty is that the mathematical structure precisely mirrors the programming concerns:
- Functor = applying a function to a context
- Applicative = combining independent effects
- Monad = composing effects where the next effect can depend on the previous result
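That flattening operation can be written down directly. For Maybe, a hand-rolled join (base exports the real one from Control.Monad) and its relationship to (>>=) look like this:

```haskell
-- join collapses one layer of context: m (m a) -> m a
joinMaybe :: Maybe (Maybe a) -> Maybe a
joinMaybe (Just inner) = inner
joinMaybe Nothing      = Nothing

-- (>>=) can be recovered from fmap plus join: first map the
-- Maybe-producing function (creating a nested Maybe), then flatten.
bind :: Maybe a -> (a -> Maybe b) -> Maybe b
bind m f = joinMaybe (fmap f m)
```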
Conclusion #
At this point, the pattern should be clear: each step up the ladder buys you more expressive power, but also fewer guarantees. The real question was never “Do I need a Monad?”; it was always “What constraints can I afford to give up?”
Before reaching for Monads everywhere, first think about what suits your use case best.
- If you can use fmap, do it. It’s the simplest and least constrained: it’s for when you only need to apply a pure function to a value inside a context.
- If you need pure and <*>, stop there. Applicative is for when you want to combine several independent effects or values wrapped in contexts. Think parallel operations or collecting multiple validation errors.
- Reach for >>= only when later steps depend on earlier values. Monad is for sequential, data-dependent computations: control flow, early exiting, or threading state/IO.
Or, to put it simply: start with a functor, upgrade to applicative, use monad when the program demands it.