```haskell newSTRef :: a -> ST s (STRef s a) readSTRef :: STRef s a -> ST s a writeSTRef :: STRef s a -> a -> ST s () ... ``` When you're done, you run the entire `ST s` calculation at once: ```haskell runST :: (forall s. ST s a) -> a ``` Since you are universally quantified over the choice of `s`, you can't use references produced in one `ST` calculation in another, and encapsulation, referential transparency and all the good things about functional programming are preserved. However, we had to run the entire effect at once. If I want to amortize it and pay it down over time, I'm out of luck. In the end I'm interested in building very large vectors but paying for their construction in very small, affordable, chunks. That's where today's hackery comes in. ## Every Step You Take My first thought was of course to use something like ```haskell walkST :: (forall s. Free (ST s) a) -> Free Identity a ``` or ```haskell walkST :: (forall s. Free (ST s) a) -> Partial a ``` But this has the problem that [every step you take](http://www.youtube.com/watch?v=OMOGaugKpzs&t=26s) will be observed and must be paid for, even little administrative actions at the end, such as freezing the result vector costs you an extra step. If I want to do something like work with existing `Stream` fusion out of `vector` and just have it pay once for every `step`, this approach isn't going to cut it. You could bandaid this with a coproduct and use `Free (ST s :+: Identity)` and promise not to use the knowledge that the `ST s` calculations were generated separately for evil, but given the amount of time I recently spent talking about [rightsizing abstractions](https://www.fpcomplete.com/user/edwardk/editorial/procrustean-mathematics), it'd be hypocritical for me not to try to find a better way. ## Capretta's Iterative Monad Transformer We can upgrade Capretta's partiality monad to a monad transformer as done by [Capretta, Altenkirch and Uustalu](http://www.ioc.ee/~tarmo/tday-veskisilla/uustalu-slides.pdf). ```active haskell -- show newtype IterT m a = IterT { runIterT :: m (Either a (IterT m a)) } instance Monad m => Monad (IterT m) where return = IterT . return . Left IterT m >>= k = IterT $ m >>= either (runIterT . k) (return . Right . (>>= k)) fail = IterT . fail -- /show main = putStrLn "It typechecks!" ``` I've added `IterT` and its dual to `free` Here we can still insert explicit steps: ```haskell step :: Monad m => IterT m () step = IterT . return . Right . return ``` If you've been playing with free monads for a while you'll recognize this as a version of `FreeT Identity m` from the `free` package, rather than the simpler `Free m` above. Again, we face the technical distinction that this is based on `νx. ST s (a + x)` not `μx. ST s (a + x)`. `IterT` has been added to the `free` package in version 4.2 as a distinct construction from `FreeT Identity` to help drive this distinction home! Now what we want to do now is generate a slower version of `runST` that takes a properly quantified `ST s` calculation with inserted step markers and walks through it carefully, one step at a time: ```haskell walkST :: (forall s. IterT (ST s) a) -> Partial a ``` ## Walking the Walk We have a few options for how to implement `walkST`. It is possible to do this entirely with `unsafeInterleaveST`. This is actually a non-trivial exercise and it is very easy to accidentally write a version that is too eager and performs effects too soon. My best version so far requires two uses of `unsafeInterleaveST` to get the right semantics. I leave this as an exercise for the reader. You can also rummage through [λpaste](http://lpaste.net) for old versions of this monad for tips. ;) ## Newsflash: `unsafeInterleaveST` Is Unsafe! Even if you get that right though, `unsafeInterleaveST` is a whole lot more unsafe for this use case, than the equivalent `unsafeInterleaveIO` operation! To understand why we need to look down in the guts of each of them. When a thunk is evaluated in GHC there is an ever so tiny race condition. When one thread enters into a thunk there is a tiny 1-2 cycle window between that thread entering and establishing the [greyhole](http://citeseerx.ist.psu.edu/viewdoc/download?rep=rep1&type=pdf&doi=10.1.1.125.857) that will catch other threads and make them block, during which another thread could come along and start evaluating the same thunk at the same time. In the absence of side-effects this is benign. The risk is so low relative to the astronomical costs of synchronization across threads that we, well, just don't bother synchronizing. This of course would be bad if the thunk _did_ have side-effects, _e.g._ if it called `unsafePerformIO`. Internally `unsafePerformIO` calls `noDuplicate` to check to make sure that we're not duplicating effort and effects: ```haskell unsafePerformIO :: IO a -> a unsafePerformIO m = unsafeDupablePerformIO (noDuplicate >> m) ``` Similarly `unsafeInterleaveIO` also checks `noDuplicate`. ```haskell unsafeInterleaveIO :: IO a -> a unsafeInterleaveIO m = unsafeDupableInterleaveIO (noDuplicate >> m) ``` But `unsafeInterleaveST`, on the other hand rather boldly does not. ```haskell unsafeInterleaveST :: ST s a -> ST s a unsafeInterleaveST (ST m) = ST ( \ s -> let r = case m s of (# _, res #) -> res in (# s, r #) ) ``` This means that it may very well wind up duplicating work if the result `r` of the `ST s a` calculation we called `unsafeInterleaveST` on is evaluated via `par`. And since we don't control what users will do with our code, you really do need to allow for that. Roman Leschinskiy's cute [Sparking Imperatives](http://unlines.wordpress.com/2010/04/21/sparking-imperatives/) hack even mixes `par` with `ST`, but he is careful to `noDuplicate` as he goes. Now, there is possibly a very good reason for this distinction. If we look at the haddocks for `noDuplicate` it > Ensures that the suspensions under evaluation by the current thread are unique; that is, the current thread is not evaluating anything that is also under evaluation by another thread that has also executed 'noDuplicate'. So we're faced with a dilemma (trilemma?), we can either: 1. abandon the use of `unsafeInterleaveST` entirely as too risky. 2. reason through whether `noDuplicate` would be legal to use and how to mix it with the existing `unsafeInterleaveST`. 3. or we can require the end user to only ever perform idempotent operations in the `ST` monad! For now I'm largely restricting myself to #1 and #3. For an API I expect to expose to an end user, I'm most likely to choose option #1. Anything they do with the resulting construction can be branded `Trustworthy` and they don't need to know how it is built in too much detail. But when I'm writing code myself that merely exposes a pure façade, the performance benefits of #3 may well outweigh the reasoning difficulties. In principle, if my principal operations are merging elements from two immutable input vectors and generating output in another vector, so long as I'm not bumping a counter stored in an `STRef`, everything I do will have idempotent effects. In the long term, it is probably worth checking to see if `unsafeInterleaveST` should be updated to do `noDuplicate` and thereby close out the concern about #2, effectively merging it performance-wise with option #1. This still leaves option #3 open for constant tuning in the same crazy way as the `inlinePerformIO` hackery gets used down in `bytestring`. ## A Safer Alternative: `unsafePerformIO` You have to love it when `unsafePerformIO` is the safest option. ```active haskell {-# LANGUAGE RankNTypes #-} import Control.Monad.ST import System.IO.Unsafe as Unsafe import Control.Monad.ST.Unsafe as Unsafe -- show walkST :: (forall s. IterT (ST s) a) -> Partial a walkST m = go m where go (IterT m) = case Unsafe.unsafePerformIO $ Unsafe.unsafeSTToIO m of Left a -> Stop a Right m -> Step (go m) -- /show newtype IterT m a = IterT { runIterT :: m (Either a (IterT m a)) } instance Monad m => Monad (IterT m) where return = IterT . return . Left IterT m >>= k = IterT $ m >>= either (runIterT . k) (return . Right . (>>= k)) fail = IterT . fail data Partial a = Stop a | Step (Partial a) instance Monad Partial where return = Stop Stop a >>= f = f a Step m >>= f = Step (m >>= f) main = putStrLn "It typechecks!" ``` Here we're relying on the fact that we perform one step at a time, and that we're evaluating an "entirely sealed" `ST s` calculation. When we're done and have the answer `a` in our `Partial a`, like with conventional `ST s` we can't go back and use any of our references any more. In `walkST` we keep reopening the same ST region, but we only do so after we look around and make sure nobody else is going to catch us and nobody else is doing the same thing. With this, we can go through and do things like calculate an `n`-element unboxed vector in `n` individually worst-case constant time steps! ## Next Time This opens up new opportunities for matching worst case asymptotic bounds of algorithms from the imperative world in a purely functional setting. Next time I'll start to explore how we can mix this approach with a novel (to me) choice of number system to dynamize and deamortize large immutable lookup structures. Happy Halloween! -[Edward Kmett](mailto:ekmett@gmail.com) Oct 31, 2013