# Random sequences in functional languages

In an imperative programming language, the random number generator API is typically a function or method that consumes and returns the next available element in the sequence. There may be type-specific variants, or an argument to indicate the return type, and probably some way set the sequence to a known state.

Some applications require this style interface, so `System.Random` provides that, but requires running in an appropriate monad. For instance, if you were using a hardware source of randomness, they would run in the `IO` monad since they were doing IO.

However, functional programming encourages the use of pure interfaces, rather than hiding the state. Many applications just need a sequence of random numbers, and the <hoogle search="System.Random">System.Random</hoogle> API makes that fairly simple.

*Monte Carlo method* algorithms are particularly suited to this, and often used in introductory programming texts to introduce using random numbers in software. I'm going to follow that lead, and walk through a series of Monte Carlo method tests for problems with known solutions, so we can play with them and see how well they work.

## Caveats on `System.Random`

While `System.Random` provides a random sequence of reasonably high quality, it is **not** a source for cryptographic quality random bits. The <hoogle search="Crypto.Random">Crypto.Random</hoogle> module provides an interface similar to `System.Random` as well as functions better suited to cryptography purposes for such a source.

Second, `System.Random` has some performance issues. We are currently looking into providing a faster module of equal or better quality that provides instances of the classes in `System.Random`.

# Getting a random number

There is, of course, no such thing as a random number. There are sequences of numbers that are more less unpredictable, and if they are unpredictable enough, we call them _random sequences_. The fundamental type from <hoogle search="System.Random">System.Random</hoogle> is a <hoogle search="System.Random.RandomGen">RandomGen</hoogle>, a generator for a random sequence.  You can use <hoogle search="System.Random.mkStdGen">mkStdGen</hoogle> to make one, giving it a starting point you've chosen. If you don't care about where it starts, you can use <hoogle search="System.Random.getStdGen">getStdGen</hoogle> to get a `RandomGen` that was initialized with a random seed when the system was started.

Generators can be passed to a variety of functions to perform operations on the random sequence they generate. The most basic function is <hoogle search="System.Random.randomR">randomR</hoogle>, which a tuple of a minimum and maximum value as a first argument, and the generator as a second, and returns a tuple of the first value from that generators sequence and a generator for the rest of the sequence. <hoogle search="System.Random.random">random</hoogle> is similar, but elides the first argument, deriving it from the type being returned.

These are pure functions, and return a new generator so the user can use them in a manner appropriate to the context they are running in. There are also convenience functions <hoogle search="System.Random.randomRIO">randomRIO</hoogle> and <hoogle search="System.Random.randomIO">randomIO</hoogle> that use and modify the standard generator in the IO monad. They can - with a little work in the monad - be used to get a sequence of random numbers:

```active haskell
import System.Random
import Control.Monad (replicateM)

main = replicateM 10 (randomIO :: IO Float) >>= print
```

These functions can output any type that's an instance of the <hoogle search="System.Random.Random">Random</hoogle> class. The instance determines the range used for `random` and `randomIO`. For `Integral` types, it's the entire range of the type, except that `Integer` is restricted to the range of `Int`. For `Double` and `Float`, it's 0 (inclusive) to 1 (exclusive). Other types will depend on the instance.

# Getting a random sequence

Fortunately, there are much easier ways to get random sequences from a generator. Sequences are generated by <hoogle search="System.Random.randomRs">randomRs</hoogle> and <hoogle search="System.Random.randoms">randoms</hoogle>. The first takes a range tuple and a generator as arguments:

``` active haskell
import System.Random

main = do
  g <- getStdGen
  print $ take 10 (randomRs ('a', 'z') g)
```

`randoms` doesn't have the range argument, but uses the same range as `random`:

``` active haskell
import System.Random

main = do
  g <- getStdGen
  print $ take 10 (randoms g :: [Double])
```

If you ran these multiple times, you'll notice that you always get the same sequence. That's because the generator was initialized when the runtime started, and the FP Haskell Center uses a single running interpreter for each tutorial being read (or project being edited). If you run this code in ghci in your desktop, you'll see the same behavior. If you compile the code with ghc (or deploy it from a FP Haskell Center project), you'll get a different sequence with each execution. You can work around this by using <hoogle search="System.Random.newStdGen">newStdGen</hoogle> instead of `getStdGen`, which will give you a new generator each time it is called:

``` active haskell
import System.Random

main = do
  g <- newStdGen
  print . take 10 $ (randomRs ('a', 'z') g)
```

However, `randoms` and `randomRs` are pure, so they will generate the same sequence each time they are called with the same generator:

``` active haskell
import System.Random

main = do
  g <- getStdGen
  print . take 10 $ (randomRs ('a', 'z') g)
  print . take 10 $ (randomRs ('a', 'z') g)
```

# Flip you for it

We now have the tools needed to do a simple monte carlo experiment. The simplest possible one is calculating the probability of a coin toss coming up heads. So, let's get a sequence of coin flips:

``` active haskell
import System.Random

data Coin = Heads | Tails deriving (Show, Enum, Bounded)

instance Random Coin where
  randomR (a, b) g =
    case randomR (fromEnum a, fromEnum b) g of
      (x, g') -> (toEnum x, g')
  random g = randomR (minBound, maxBound) g

main = do
  g <- newStdGen
  print . take 10 $ (randoms g :: [Coin])
```

In order for `randoms` to generate a sequence of `Coins`, they need to be an instance of <hoogle search="System.Random.Random">Random</hoogle>, which needs `random` and `randomR`.  Having the `Coin` type derive `Enum` and `Bounded` does that for us. This gives us everything we need to write those functions based on the existing instances. `randomR` needs to generate a random element of the generated type. We do that by translating the range arguments to integers, and then translating the result back to a `Coin` value. `random` just runs `randomR` with the appropriate default arguments. 

If you think that the instance doesn't actually reference the `Coin` values, so ought to be usable with any instance of both `Bounded` and `Enum`, you'd be right. However, there are a number of ways to do this, all with different trade-offs, so it's not done in the library.

Now, we can add the framework needed to run the experiment. To do this, we want to calculate the percentage of `Heads` we see in the stream. We can just take the length of the list, and the length of the list after we filter for `Heads`. 

``` haskell
process :: [Coins] -> (Int, Int)
process cs = (length cs, length (filter (== Heads) cs))
```

To display the results, we'll need a function to turn that pair of integers into a string:

``` haskell
display:: (Int, Int) -> String
display (c1, c2) = "We got " ++ (show $ 100.0 * c1 / c2) ++ "% heads."
```

We now just tie all these things together to run the simulation. However, the `process` function requires that the `Coins` type also be an instance of `Eq`, so we add that to the `deriving` clause.

``` active haskell
-- /show
import System.Random

-- show
data Coin = Heads | Tails deriving ({-hi-}Eq,{-/hi-} Show, Enum, Bounded)
-- /show

instance Random Coin where
  randomR (a, b) g =
    case randomR (fromEnum a, fromEnum b) g of
      (x, g') -> (toEnum x, g')
  random g = randomR (minBound, maxBound) g

-- show
-- Number of samples to take
count = 10000

-- Function to process our random sequence
process :: [Coin] -> (Int, Int)
process cs = (length cs, length (filter (== Heads) cs))

-- Function to display the running value.
display:: (Int, Int) -> String
display (coins, heads) = "We got " ++ (show $ 100.0 * fromIntegral heads / fromIntegral coins) ++ "% heads."

main = do
  g <- newStdGen
  putStrLn . display . process . take count $ randoms g
```

We've also added variable `count` which sets how many and times to flip the coin.

Try changing the count, and notice that more flips you do, the closer the values stay to 50%.

# Easy as π

Another common example of the monte carlo method is determining the value of π. This can be done by getting pairs of random values between 0 and 1, and seeing if they are inside the unit circle. The area of the unit circle is π, so the area in the square bounded by (0, 0) and (1,1) is π/4. That fraction of the the points taken randomly from that square will be inside the unit circle.

In this case, instead of possibly walking the list twice depening on how well the compiler optimizes things, we're going to explicitly process each element with <hoogle search="Data.List.foldl'">foldl'</hoogle>. `foldl'` is a higher order function that repeatedly applies its first argument (which is our new function `sumInCircle`) to its second argument and the head of the list, then to the results of the previous application of its first argument and the next element of the list, returning the result of the last such application.

`sumInCircle` takes a pair of `Int`s and a pair of `Double`s, and increments the second `Int` - thus counting the length of the list - and only increments the second `Int` if the square of the two `Double`s is less than `1.0` - meaning it represents a point inside the unit circle.

In this case, we need to deal with the fact that Haskell is a lazy language. While `foldl'` will evaluate the arguments as they are generated, it will stop at the tuple returned by `sumInCircle`, meaning it will generate a list of thunks to be evaluated. In order to force those to be evaluated, we use the GHC extension `BangPatterns`, and put an exclamation point in front of the two arguments to sumInCircle. That will force them to be evaluated as well, keeping our memory usage constant.

``` active haskell
{-# LANGUAGE BangPatterns #-}
-- /show
import System.Random
import Data.List (foldl')

-- show
-- Number of samples to take
count = 10000

-- Function to process our random sequence
process :: [(Double, Double)] -> (Int, Int)
process = foldl' sumInCircle (0, 0)

-- Function to process a running value and a random value, producing a new running value.
sumInCircle :: (Int, Int) -> (Double, Double) -> (Int, Int)
sumInCircle (!ins, !total) (x, y) = (ins + if x*x + y*y < 1.0 then 1 else 0,
                               total + 1)

-- Function to display the running value.
display:: (Int, Int) -> String
display (heads, coins) = "π = "  ++ (show $ 4.0 * fromIntegral heads / fromIntegral coins)

-- function to prepare the random sequence for processing
prep :: [Double] -> [(Double, Double)]
prep (a:b:r) = (a,b):prep r

main = do
  g <- newStdGen
  putStrLn . display . process . take count . prep $ randoms g
```

We also add `prep`, which processes the random sequence into the form we need to pass to our processing function.

## Feedback

Please discuss this post on the [feedback page](https://plus.google.com/100162554869434148021/posts/5Geh5qeXjNj).