Random numbers in Haskell

As of March 2020, School of Haskell has been switched to read-only mode.

Random sequences in functional languages

In an imperative programming language, the random number generator API is typically a function or method that consumes and returns the next available element in the sequence. There may be type-specific variants, or an argument to indicate the return type, and probably some way set the sequence to a known state.

Some applications require this style interface, so System.Random provides that, but requires running in an appropriate monad. For instance, if you were using a hardware source of randomness, they would run in the IO monad since they were doing IO.

However, functional programming encourages the use of pure interfaces, rather than hiding the state. Many applications just need a sequence of random numbers, and the System.Random API makes that fairly simple.

Monte Carlo method algorithms are particularly suited to this, and often used in introductory programming texts to introduce using random numbers in software. I'm going to follow that lead, and walk through a series of Monte Carlo method tests for problems with known solutions, so we can play with them and see how well they work.

Caveats on System.Random

While System.Random provides a random sequence of reasonably high quality, it is not a source for cryptographic quality random bits. The Crypto.Random module provides an interface similar to System.Random as well as functions better suited to cryptography purposes for such a source.

Second, System.Random has some performance issues. We are currently looking into providing a faster module of equal or better quality that provides instances of the classes in System.Random.

Getting a random number

There is, of course, no such thing as a random number. There are sequences of numbers that are more less unpredictable, and if they are unpredictable enough, we call them random sequences. The fundamental type from System.Random is a RandomGen, a generator for a random sequence. You can use mkStdGen to make one, giving it a starting point you've chosen. If you don't care about where it starts, you can use getStdGen to get a RandomGen that was initialized with a random seed when the system was started.

Generators can be passed to a variety of functions to perform operations on the random sequence they generate. The most basic function is randomR, which a tuple of a minimum and maximum value as a first argument, and the generator as a second, and returns a tuple of the first value from that generators sequence and a generator for the rest of the sequence. random is similar, but elides the first argument, deriving it from the type being returned.

These are pure functions, and return a new generator so the user can use them in a manner appropriate to the context they are running in. There are also convenience functions randomRIO and randomIO that use and modify the standard generator in the IO monad. They can - with a little work in the monad - be used to get a sequence of random numbers:

import System.Random
import Control.Monad (replicateM)

main = replicateM 10 (randomIO :: IO Float) >>= print

These functions can output any type that's an instance of the Random class. The instance determines the range used for random and randomIO. For Integral types, it's the entire range of the type, except that Integer is restricted to the range of Int. For Double and Float, it's 0 (inclusive) to 1 (exclusive). Other types will depend on the instance.

Getting a random sequence

Fortunately, there are much easier ways to get random sequences from a generator. Sequences are generated by randomRs and randoms. The first takes a range tuple and a generator as arguments:

import System.Random

main = do
  g <- getStdGen
  print $ take 10 (randomRs ('a', 'z') g)

randoms doesn't have the range argument, but uses the same range as random:

import System.Random

main = do
  g <- getStdGen
  print $ take 10 (randoms g :: [Double])

If you ran these multiple times, you'll notice that you always get the same sequence. That's because the generator was initialized when the runtime started, and the FP Haskell Center uses a single running interpreter for each tutorial being read (or project being edited). If you run this code in ghci in your desktop, you'll see the same behavior. If you compile the code with ghc (or deploy it from a FP Haskell Center project), you'll get a different sequence with each execution. You can work around this by using newStdGen instead of getStdGen, which will give you a new generator each time it is called:

import System.Random

main = do
  g <- newStdGen
  print . take 10 $ (randomRs ('a', 'z') g)

However, randoms and randomRs are pure, so they will generate the same sequence each time they are called with the same generator:

import System.Random

main = do
  g <- getStdGen
  print . take 10 $ (randomRs ('a', 'z') g)
  print . take 10 $ (randomRs ('a', 'z') g)

Flip you for it

We now have the tools needed to do a simple monte carlo experiment. The simplest possible one is calculating the probability of a coin toss coming up heads. So, let's get a sequence of coin flips:

import System.Random

data Coin = Heads | Tails deriving (Show, Enum, Bounded)

instance Random Coin where
  randomR (a, b) g =
    case randomR (fromEnum a, fromEnum b) g of
      (x, g') -> (toEnum x, g')
  random g = randomR (minBound, maxBound) g

main = do
  g <- newStdGen
  print . take 10 $ (randoms g :: [Coin])

In order for randoms to generate a sequence of Coins, they need to be an instance of Random, which needs random and randomR. Having the Coin type derive Enum and Bounded does that for us. This gives us everything we need to write those functions based on the existing instances. randomR needs to generate a random element of the generated type. We do that by translating the range arguments to integers, and then translating the result back to a Coin value. random just runs randomR with the appropriate default arguments.

If you think that the instance doesn't actually reference the Coin values, so ought to be usable with any instance of both Bounded and Enum, you'd be right. However, there are a number of ways to do this, all with different trade-offs, so it's not done in the library.

Now, we can add the framework needed to run the experiment. To do this, we want to calculate the percentage of Heads we see in the stream. We can just take the length of the list, and the length of the list after we filter for Heads.

process :: [Coins] -> (Int, Int)
process cs = (length cs, length (filter (== Heads) cs))

To display the results, we'll need a function to turn that pair of integers into a string:

display:: (Int, Int) -> String
display (c1, c2) = "We got " ++ (show $ 100.0 * c1 / c2) ++ "% heads."

We now just tie all these things together to run the simulation. However, the process function requires that the Coins type also be an instance of Eq, so we add that to the deriving clause.

-- /show
import System.Random

-- show
data Coin = Heads | Tails deriving ({-hi-}Eq,{-/hi-} Show, Enum, Bounded)
-- /show

instance Random Coin where
  randomR (a, b) g =
    case randomR (fromEnum a, fromEnum b) g of
      (x, g') -> (toEnum x, g')
  random g = randomR (minBound, maxBound) g

-- show
-- Number of samples to take
count = 10000

-- Function to process our random sequence
process :: [Coin] -> (Int, Int)
process cs = (length cs, length (filter (== Heads) cs))

-- Function to display the running value.
display:: (Int, Int) -> String
display (coins, heads) = "We got " ++ (show $ 100.0 * fromIntegral heads / fromIntegral coins) ++ "% heads."

main = do
  g <- newStdGen
  putStrLn . display . process . take count $ randoms g

We've also added variable count which sets how many and times to flip the coin.

Try changing the count, and notice that more flips you do, the closer the values stay to 50%.

Easy as π

Another common example of the monte carlo method is determining the value of π. This can be done by getting pairs of random values between 0 and 1, and seeing if they are inside the unit circle. The area of the unit circle is π, so the area in the square bounded by (0, 0) and (1,1) is π/4. That fraction of the the points taken randomly from that square will be inside the unit circle.

In this case, instead of possibly walking the list twice depening on how well the compiler optimizes things, we're going to explicitly process each element with foldl'. foldl' is a higher order function that repeatedly applies its first argument (which is our new function sumInCircle) to its second argument and the head of the list, then to the results of the previous application of its first argument and the next element of the list, returning the result of the last such application.

sumInCircle takes a pair of Ints and a pair of Doubles, and increments the second Int - thus counting the length of the list - and only increments the second Int if the square of the two Doubles is less than 1.0 - meaning it represents a point inside the unit circle.

In this case, we need to deal with the fact that Haskell is a lazy language. While foldl' will evaluate the arguments as they are generated, it will stop at the tuple returned by sumInCircle, meaning it will generate a list of thunks to be evaluated. In order to force those to be evaluated, we use the GHC extension BangPatterns, and put an exclamation point in front of the two arguments to sumInCircle. That will force them to be evaluated as well, keeping our memory usage constant.

{-# LANGUAGE BangPatterns #-}
-- /show
import System.Random
import Data.List (foldl')

-- show
-- Number of samples to take
count = 10000

-- Function to process our random sequence
process :: [(Double, Double)] -> (Int, Int)
process = foldl' sumInCircle (0, 0)

-- Function to process a running value and a random value, producing a new running value.
sumInCircle :: (Int, Int) -> (Double, Double) -> (Int, Int)
sumInCircle (!ins, !total) (x, y) = (ins + if x*x + y*y < 1.0 then 1 else 0,
                               total + 1)

-- Function to display the running value.
display:: (Int, Int) -> String
display (heads, coins) = "π = "  ++ (show $ 4.0 * fromIntegral heads / fromIntegral coins)

-- function to prepare the random sequence for processing
prep :: [Double] -> [(Double, Double)]
prep (a:b:r) = (a,b):prep r

main = do
  g <- newStdGen
  putStrLn . display . process . take count . prep $ randoms g

We also add prep, which processes the random sequence into the form we need to pass to our processing function.

Feedback

Please discuss this post on the feedback page.

comments powered by Disqus