Lens/Aeson Traversals/Prisms

As of March 2020, School of Haskell has been switched to read-only mode.

In the first Haskell Cast podcast Rein Henrichs and Chris Forno interviewed Edward Kmett in part about lens and it was suggested that Prisms don't have the same kind of introductory tutorial treatment. That's a shame, though. Prisms arise naturally all the time when using sum types.

You could have invented Data.Aeson.Lens (and maybe you already have)

I learned to use Prisms when using lens [1] with aeson so let us examine an extended use case to motivate using Prisms.

It's fairly common in dynamic languages to consume JSON like this

> it = json.loads('[{"someObject" : { "version" : [1, 0, 0] }}]')
> it[0]["someObject"]["version"][1]

which might, say, be accessing the minor version number of some nested object. In Python here we use an isomorphism (i.e. a two-way mapping that establishes equivalence) between JSON types and fairly common Python types and then just destruct the JSON information as an "array of dictionaries of dictionaries of arrays".

Python users translating that line to Haskell with the JSON library aeson can grow frustrated by the (pathological) example below showing how tedious the manual destructuring in Haskell is when you're forced to be explicit but don't know how to use Lenss or Prisms.

import Prelude hiding (lookup)
import Data.HashMap.Strict (lookup)
import Data.Vector ((!?))

case decode "[{\"someObject\" : { \"version\" : [1, 0, 0] }}]" of
  Nothing        -> error "Oh, I guess sometimes the JSON string might be invalid..."
  Just (Array a) ->
    case a !? 0 of
      Nothing -> error "Oh, previously I assumed there was at least ONE object..." 
      Just (Object o) ->
        case lookup "someObject" o of
      Just _          -> error "Oh, what about if I don't actually have an object?"
  Just _         -> error "Oh, what about if I don't actually have an array?"

Destructuring is far more explicit here and conflates getting the values with all kinds of "schema validation" errors.

Usually the next step is to learn to use the ToJSON/FromJSON machinery to fold all of your schema validation into decode and then build a really robust, independent serialization adapter for whatever internal ADTs you're using to actually model your domain. Which is great when you've got time to kill, but it doesn't really represent that original Python snippet which is far more likely to be a one-off JSON-munging script than a robust program.

So can we do better if we just want to either get our value or fail?

Failure is a monad right?

Of course it is. Let's pick Maybe and inject everything into it to consolidate those errors.

We'll need to extract a few helpers from the above

anObject :: Value -> Maybe Object
anObject (Object m) = Just m
anObject _          = Nothing

anArray :: Value -> Maybe Array
anArray (Array a) = Just a
anArray _         = Nothing

and then using do notation the example is much clearer.

do json   <- decode thatString
   ary    <- anArray json
   zeroth <- ary !? 0
   obj    <- anObject zeroth
   keys   <- lookup "someObject" obj
   keyObj <- anObject keys
   ver    <- lookup "version" key Obj
   verAry <- anArray ver
   verAry !? 1

which is much nicer and, for those with a touch of Monad-fu, can be further simplified by using (>=>) into something that's almost as nice as the original Python code while being far more explicit about the kinds of assumptions and failure modes that exist.

decode >=> anArray  >=> (!? 0) 
       >=> anObject >=> lookup "someObject"
       >=> anObject >=> lookup "version" 
       >=> anArray  >=> (!? 1)

One way to think of this chain is it's a method of traversing our JSON structure by repeated descent into sum types like Value. Getting a Nothing indicates that the descent we would have liked to take is actually not available. With all this talk of "repeated descent" and "traversing" you can begin to expect that there might be a way to achieve this from within the lens ecosystem.

Basic Lenses don't quite work because they don't have a good notion of failure, but there is a straightforward and powerful way to encode this into lens and we'll see that it's essentially exactly what is done in Data.Aeson.Lens.

A short diversion on the nature of Traversals

A Lens can usually be thought of as a first-class method of separating a piece of an object from its whole. In this light, we can think of _1, suitably specialized, as the split

(a, b)   -->    a    AND   ( _ ,  b )

which then lets us operate by either retreiving the piece or updating the whole. An obvious generalization is the Traversal which lets us generalize our target to 0-or-more parts at once. This takes shape in both :: Traversal' (a, a) a which targets both the fst and snd element simultaneously or each :: Traversal' [a] a which targets all of the elements of a list at once.

Traversals are much more powerful than Lenses but, as usual, with more power comes less certainty. Lenses can be viewed or (^.)'d easily since they embody the guarantee that we're focused on exactly one part of the whole. Traversals on the other hand can target subparts which may not exist. Consider what would happen if you tried to view each of a list. If each were a lens then view elements would have type [a] -> a but there are two things that can go wrong:

  1. If the list is empty we'll have to throw an error

  2. If the list has multiple elements we'll have to "summarize" them somehow.

In order to implement view for a traversal, we'll need support from a Monoid instance for a so that we can handle case (1) with mempty and case (2) with mappend. This is why you can sometimes get weird errors in lens such as what happens when we try view each "foo"

    No instance for (Data.Monoid.Monoid Char)
      arising from a use of `each'

as view is trying to combine the multiple subparts with a non-existent Monoid instance. If we try it instead like view each ["f", "o", "o"] we can use the (free) Monoid instance for [a]

> view each ["f", "o", "o"]

For convenience, lens gives us two variants on view/(^.) which pre-wrap our subparts in useful monoids. We have preview/(^?) which prewraps the subparts in First and toListOf/(^..) which prewraps the subparts in [].

> "foo" ^? each
Just 'f'

> "foo" ^.. each

We'll make use of (^?) especially in the parts to come.

Traversing JSON with Prisms

Remember that the trouble with using Lenses to peel off layers of our JSON structure is that our expectations about whether or not a layer can actually be peeled away may be incorrect and Lenses assume that we have exactly one valid subpart to target.

More generally, you might say that Lenses target product types well, picking one of several pieces to focus on, but have trouble with sums (coproducts) since we may not have any pieces at all for them to target.

Traversals solve this problem by using a Monoid instance to handle failure and multiple results well, but they do so at the cost of knowing anything at all about how many pieces are being targeted. This is sufficient for JSON, but a little bit of overkill. We know that there will be exactly 0 or 1 Arrays in any Value. That notion is the first part of a Prism.

The haddocks say that a Prism is a Traversal that can also be turned around with re to obtain a Getter in the opposite direction. It turns out that "turning around" is a very powerful property that lets us rebuild entire structures from only their piece and the knowledge that they ought to be accessible via our prismatic Traversal. To drive the distinction home, consider a Traversal over the 3rd element in a list (it's called ix 2). It's 0-or-1 targeted, but given only a piece of the list and that Traversal you cannot rebuild the origin list uniquely. The Traversal laws hold for every Prism and we traverse at most 1 element.

0-or-1 target Traversals that can be "Reviewed" like this form exactly the Prisms. For instance, lens has an entire module and class devoted to Cons-ing things on to the front of lists—an operation that, unlike ix 2 can be "reviewed".

The place you typically will see Prisms is when desconstructing a sum type. Each disjoint constructor will get its own Prism as any value of the sum has either 0 or 1 of those constructors being used. This is exactly the notion we need to "lensify" the aeson Value type, and, indeed, that's exactly what Data.Aeson.Lens from lens provides.

_Object :: Prism' Value Object
_Object = prism' anObject Object

_Array  :: Prism' Value Array
_Array = prism' anArray Array

_String :: Prism' Value String
_String = prism' aString String

_Number :: Prism' Value Number
_Number = prism' aNumber Number

Which is not much more than an introduction of our "failing deconstructors" from the first section.

Data.Aeson.Lens also provides a few Traversals based upon ix but specialized to JSON types

key :: Text -> Traversal' Object Value
nth :: Int  -> Traversal' Array Value

Using these Prisms and Traversals we can pick apart our JSON object as we please.

case decode theString of
  Nothing -> Nothing
  Just it -> it ^? _Array  . nth 0
                 . _Object . key "someObject"
                 . _Object . key "version"
                 . _Array  . nth 1

Or, even better, by using the built-in Data.Aeson.Lens isomorphisms and type classes we can write this directly.

theString ^? nth 0 . key "someObject" . key "version" . nth 1

and have the 1-or-0 target nature of Prisms work its magic in the background. Finally, we're looking at a Haskell example which rivals the original Python's terseness without sacrificing nearly as much type safety.

Want to play with the example? Here's a bit of preamble to get you going in ghci.

-- to play with this example from ghci,
-- install lens 4 or newer and :set -XOverloadedStrings
-- OverloadedStrings is so it can automatically turn the String literals into Text
-- values which is the type used for indexing into Object.

import Data.Aeson
import Data.Aeson.Lens
import Control.Lens

-- this code is intended for ghci experimentation, so we use "let" to bind a value to a name
let jsonBlob = "[{\"someObject\": {\"version\": [1, 0, 3]}}]"

-- example from above
let myVal = jsonBlob ^? nth 0 . key "someObject" . key "version" . nth 1

-- What progressively composing the prisms looks like, note the composition order:
-- λ> jsonBlob ^? nth 0
Just (Object fromList
         Object fromList
             (fromList [Number 1.0, Number 0.0, Number 3.0]))])])

-- λ> jsonBlob ^? nth 0 . key "someObject"
Just (Object fromList
         Array (fromList [Number 1.0, Number 0.0, Number 3.0]))])

-- λ> jsonBlob ^? nth 0 . key "someObject" . key "version"
Just (Array (fromList [Number 1.0, Number 0.0, Number 3.0]))

-- λ> jsonBlob ^? nth 0 . key "someObject" . key "version" . nth 1
Just (Number 0.0)

(Thanks for commentary and corrections: acow, Effnote, edwardk, shachaf. Thanks for updating this to be current for the lens-4.* series to Chris Allen (@bitemyapp))

Comments on Reddit

[1] Note that that isn't aeson-lens, which is a similar package that looks like a bunch of Lenses but is actually invalid, in no small part because it doesn't invoke any Prisms. Also, lens-aeson has been deprecated and its support for Aeson has been merged into the main lens library.