Order of Evaluation

The order of evaluation in Haskell is determined by one and only one thing: Data Dependence.

An expression will only be evaluated when it is needed, and will not be evaluated if it is not needed.

Pure Code

Consider the following code:

f x = v2 + 4
  where v0 = undefined + 1
        v1 = 2 * x
        v2 = v1 / 3 + x

main = putStrLn $ show (f 10)

The order of the variable definitions in the where clause has no effect whatsoever on the evaluation of this function.

f' x = v2 + 4
  where v2 = v1 / 3 + x
        v0 = undefined + 1
        v1 = 2 * x

main = putStrLn $ show (f' 10)

v0 is never evaluated in either f or f' since it is not needed by the result value v2 + 4.

Now that we've shown what won't be evaluated, how do we go about seeing when things are evaluated?

Detour: Tracing in Haskell

The standard Haskell libraries come with a module Debug.Trace which has a trace function

trace :: String -> a -> a

which will output the given string to stderr before returning its second argument.

However we can't use trace by itself to track when things are evaluated because trace, like everything else, is lazy. So evaluating trace str x will cause str to be output without forcing x to be evaluated.

import Debug.Trace

main = putStrLn $ show $ trace "res" $ trace "1" 1 + trace "2" 2

As we can see, res is output before 1 and 2.

However, we can write a function to force evaluation before tracing.

import Debug.Trace

tr msg x = seq x $ trace msg x

main = putStrLn $ show $ tr "res" $ tr "1" 1 + tr "2" 2

Now, as expected, res is output after 1 and 2.

Order of Evaluation

We can now see explicitly when each term is evaluated.

import Debug.Trace
tr msg x = seq x $ trace msg x

f' x = v2 + 4
  where v2 = tr "v2" $ v1 / 3 + x'
        v0 = tr "v0" $ undefined + 1
        v1 = tr "v1" $ 2 * x'
        x' = tr "x" x

main = putStrLn $ show (f' 10)

I've introduced x' so we can also see when x is evaluated.

x is needed by v1 which is needed by v2 which is needed by the resulting value which is printed. Therefore the order of evaluation is: x, v1, v2.

The same principle holds true for expressions anywhere, not just in where clauses.

data E a b = L a | R b

keepRs [] = []
keepRs (R x : xs) = x : keepRs xs
keepRs (L _ : xs) = keepRs xs

g = sum . keepRs

main = putStrLn $ show (g [R 10, L undefined, R 20])

No expressions underneath a L constructor in the input list to g ever gets evaluated since keepRs never uses the expressions underneath L constructors.

To get a more detailed picture of how evaluation is working, we can have a traced version of the above.

import Debug.Trace
tr msg x = seq x $ trace msg x

data E a b = L a | R b

keepRs [] = []
keepRs (R x : xs) = x : keepRs xs
keepRs (L _ : xs) = keepRs xs

g = sum . keepRs

main = putStrLn $ show (g [(tr "R_0" R) (tr "R_0's 10"  10), 
                           (tr "L"   L) (tr "undefined" undefined), 
                           (tr "R_1" R) (tr "R_1's 20"  20)
                          ]
                       )

Notice the outer constructors (L and R) of each list element are evaluated; they are needed by keepRs to figure out which clause to apply. However, only the arguments of the R constructors are evaluated when they are needed by sum.

One thing to mention is that bang patterns can be used to create strict data constructors which can force the evaluation of their arguments. For example, suppose we rewrite the previous code with a strict version of E:

data E a b = L !a | R !b

keepRs [] = []
keepRs (R x : xs) = x : keepRs xs
keepRs (L _ : xs) = keepRs xs

g = sum . keepRs

main = putStrLn $ show (g [R 10, L undefined, R 20])

Now, even though keepRs doesn't use the arguments of L, they still get evaluated since L is strict.

Monadic Code

Even in monadic code, data dependence is the only thing which determines if and when an expression gets evaluated. Although the order of monadic actions affects a program-- it determines the order in which the monadic operations are performed and in which variables are brought into scope-- it does not determine the order in which expressions get evaluated.

Consider the following monadic version of our first example, extended with some extra monadic actions.

import Debug.Trace
tr msg x = seq x $ trace msg x

fM x = do
  x' <- return $ tr "x" x
  v0 <- return $ tr "v0" $ undefined + 1
  v1 <- return $ tr "v1" $ 2 * x'
  v2 <- return $ tr "v2" $ v1 / 3 + x'
  putStrLn "Enter an Int: "
  v3 <- fmap (\x -> tr "v3" $ v0 + 2 + read x) getLine
  return $ v2 + 4

main = putStrLn . show =<< fM 10

Even though the getLine is executed, there is no error during evaluation. v0 does not get evaluated since it is only used in v3, but v3 is not evaluated since it is not used by the result value. In fact, you can enter complete gibberish and there will still be no error.

To understand better what exactly monadic code causes to happen, we just need to look a little closer. do notation is just syntactic sugar and can be expanded into regular Haskell syntax using >>= :

do x <- m
   f

where f is some code which depends upon x (i.e. in which x occurs unbound) gets translated to

m >>= (\x -> f) 

I have added parentheses to make the two arguments to >>= clear.

The type of >>=

(>>=) :: Monad m => m a -> (a -> m b) -> m b

along with the left identity monad law

return x >>= f  ==  f x

tell us that >>= does not look at the expression it takes out of its monad and passes on to its second argument; consider the case where x is undefined and f is const ().

The first argument of >>= is only evaluated enough to get a monadic expression, but anything under the constructor for the monad type is not evaluated. For example, if we are in the Maybe monad

data Maybe a = Nothing | Just a

instance Monad Maybe where
  return x = Just x

  (Just x) >>= f = f x
  Nothing  >>= _  = Nothing

>>= will not cause anything under the Just constructor to be evaluated.