# Enumeration Types Like many programming languages, Haskell allows programmers to create their own _enumeration types_. Here’s a simple example: ```haskell data Thing = Shoe | Ship | SealingWax | Cabbage | King deriving Show ``` This declares a new type called `Thing` with five _data constructors_ `Shoe`, `Ship`, etc which are the (only) values of type `Thing`. [The `deriving Show` is a magical incantation which tells GHC to automatically generate default code for converting `Thing`s to `String`s. This is what GHCi uses when printing the value of an expression of type `Thing`.] ```haskell shoe :: Thing shoe = Shoe listO'Things :: [Thing] listO'Things = [Shoe, SealingWax, King, Cabbage, King] ``` We can write functions on `Thing`s by _pattern matching_: ```active haskell -- /show data Thing = Shoe | Ship | SealingWax | Cabbage | King deriving Show -- show isSmall :: Thing -> Bool isSmall Shoe = True isSmall Ship = False isSmall SealingWax = True isSmall Cabbage = True isSmall King = False main = print (isSmall Cabbage) ``` Recalling how function clauses are tried in order from top to bottom, we could also make the definition of `isSmall` a bit shorter like so: ```active haskell -- /show data Thing = Shoe | Ship | SealingWax | Cabbage | King deriving Show -- show isSmall2 :: Thing -> Bool isSmall2 Ship = False isSmall2 King = False isSmall2 _ = True main = print (isSmall2 Cabbage) ``` # Beyond Enumerations `Thing` is an _enumeration_ type, similar to those provided by other languages such as Java or C++. However, enumerations are actually only a special case of Haskell’s more general algebraic data types. As a first example of a data type which is not just an enumeration, consider the definition of `FailableDouble`: ```haskell data FailableDouble = Failure | OK Double deriving Show ``` This says that the `FailableDouble` type has two data constructors. The first one, `OK`, takes an argument of type `Double`. So `OK` by itself is not a value of type `FailableDouble`; we need to give it a `Double`. For example, `OK 3.4` is a value of type `FailableDouble`. ```active haskell -- /show data FailableDouble = Failure | OK Double deriving Show -- show a = Failure b = OK 3.4 main = print (a,b) ``` _Exercise_: What is the type of `OK`? @@@ For any `x` of type `Double`, `OK x` has type `FailableDouble`. We must therefore have `OK :: Double -> FailableDouble`. @@@ Here's one way we might use our new `FailableDouble` type: ```active haskell -- /show data FailableDouble = Failure | OK Double deriving Show -- show safeDiv :: Double -> Double -> FailableDouble safeDiv _ 0 = Failure safeDiv x y = OK (x / y) main = print (safeDiv 2 0, safeDiv 3 4) ``` More pattern matching! Notice how in the `OK` case we can give a name to the `Double` that comes along with it. For some applications, we might consider mapping a failed computation to a value of zero: ```active haskell -- /show data FailableDouble = Failure | OK Double deriving Show -- show failureToZero :: FailableDouble -> Double failureToZero Failure = 0 failureToZero (OK d) = d main = print (failureToZero Failure, failureToZero (OK 3.4)) ``` Data constructors can have more than one argument: ```active haskell -- /show data Thing = Shoe | Ship | SealingWax | Cabbage | King deriving Show -- show -- Store a person's name, age, and favorite Thing data Person = Person String Int Thing deriving Show brent :: Person brent = Person "Brent" 30 SealingWax stan :: Person stan = Person "Stan" 94 Cabbage getAge :: Person -> Int getAge (Person _ a _) = a main = print (getAge brent) ``` Notice how the type constructor and data constructor are both named `Person`, but they inhabit different namespaces and are different things. This idiom (giving the type and data constructor of a one-constructor type the same name) is common, but can be confusing until you get used to it. # Algebraic Data Types in General In general, an algebraic data type has one or more data constructors, and each data constructor can have zero or more arguments. ```haskell data AlgDataType = Constr1 Type11 Type12 | Constr2 Type21 | Constr3 Type31 Type32 Type33 | Constr4 ``` This specifies that a value of type `AlgDataType` can be constructed in one of four ways: using `Constr1`, `Constr2`, `Constr3`, or `Constr4`. Depending on the constructor used, an `AlgDataType` value may contain some other values. For example, if it was constructed using `Constr1`, then it comes along with two values, one of type `Type11`and one of type `Type12`. One final note: type and data constructor names must always start with a capital letter; variables (including names of functions) must always start with a lowercase letter. (Otherwise, Haskell parsers would have quite a difficult job figuring out which names represent variables and which represent constructors). # Pattern Matching We've seen pattern matching in a few specific cases, but let’s see how pattern-matching works in general. Fundamentally, pattern matching is about taking apart a value by finding out which constructor it was built with. This information can be used as the basis for deciding what to do—indeed, in Haskell, this is the only way to make a decision. For example, to decide what to do with a value of type `AlgDataType` (the made-up type defined in the previous section), we could write something like ```haskell foo (Constr1 a b) = ... foo (Constr2 a) = ... foo (Constr3 a b c) = ... foo Constr4 = ... ``` Note how we also get to give names to the values that come along with each constructor. Note also that parentheses are required around patterns consisting of more than just a single constructor. This is the main idea behind patterns, but there are a few more things to note. - An underscore \_ can be used as a “wildcard pattern” which matches anything. - A pattern of the form `x@pat` can be used to match a value against the pattern `pat`, but also give the name `x` to the entire value being matched. For example: ```active haskell -- /show data Thing = Shoe | Ship | SealingWax | Cabbage | King deriving Show -- Store a person's name, age, and favorite Thing data Person = Person String Int Thing deriving Show brent :: Person brent = Person "Brent" 30 SealingWax -- show baz :: Person -> String baz p@(Person n _ _) = "The name field of (" ++ show p ++ ") is " ++ n main = putStrLn (baz brent) ``` [Previously we had used `print` to display results, but here we use `putStrLn`, because the value we're displaying is already a `String`. Change the code to use `print` to see the difference.] - Patterns can be _nested_. For example, ```active haskell -- /show data Thing = Shoe | Ship | SealingWax | Cabbage | King deriving Show -- Store a person's name, age, and favorite Thing data Person = Person String Int Thing deriving Show -- show checkFav :: Person -> String checkFav (Person n _ SealingWax) = n ++ ", you're my kind of person!" checkFav (Person n _ _) = n ++ ", your favorite thing is lame." main = putStrLn (checkFav (Person "Brent" 30 SealingWax)) ``` Note how we nest the pattern `SealingWax` inside the pattern for `Person`. In general, the following grammar defines what can be used as a pattern: ``` pat ::= _ | var | var @ ( pat ) | ( Constructor pat1 pat2 ... patn ) ``` The first line says that an underscore is a pattern. The second line says that a variable by itself is a pattern; such a pattern matches anything, and “binds” the given variable name to the matched value. The third line specifies _@ patterns_. The last line says that a constructor name followed by a sequence of patterns is itself a pattern; such a pattern matches a value if that value was constructed using the given constructor, and `pat1` through `patn` all match the values contained by the constructor, recursively. [In actual fact, the full grammar of patterns includes yet more features still, but the rest would take us too far afield for now.] Note that literal values like 2 or `'c'` can be thought of as constructors with no arguments. It is as if the types `Int` and `Char` were defined like ```haskell data Int = 0 | 1 | -1 | 2 | -2 | ... data Char = 'a' | 'b' | 'c' | ... ``` which means that we can pattern-match against literal values. (Of course, `Int` and `Char` are not actually defined this way.) # Case Expessions The fundamental construct for doing pattern-matching in Haskell is the `case` expression. In general, a `case` expression looks like ```haskell case exp of pat1 -> exp1 pat2 -> exp2 ... ``` When evaluated, the expression `exp` is matched against each of the patterns `pat1`, `pat2`, ... in turn. The first matching pattern is chosen, and the entire `case` expression evaluates to the expression corresponding to the matching pattern. For example, ```haskell n = case "Hello" of [] -> 3 ('H':s) -> length s _ -> 7 ``` evaluates to 4 (the second pattern is chosen; the third pattern matches too, of course, but it is never reached). In fact, the syntax for defining functions we have seen is really just convenient syntax sugar for defining a `case` expression. For example, the definition of `failureToZero` given previously can equivalently be written as ```active haskell -- /show data FailableDouble = Failure | OK Double deriving Show -- show failureToZero' :: FailableDouble -> Double failureToZero' x = case x of Failure -> 0 OK d -> d main = print (failureToZero' Failure, failureToZero' (OK 3.4)) ``` # Recursive Data Types Data types can be _recursive_, that is, defined in terms of themselves. In fact, we have already seen a recursive type—the type of lists. A list is either empty, or a single element followed by a remaining list. We could define our own list type like so: ```haskell data IntList = Empty | Cons Int IntList ``` Haskell’s own built-in lists are quite similar; they just get to use special built-in syntax ([] and :) (Of course, they also work for any type of elements instead of just `Int`s; more on this in the next lesson.) We often use recursive functions to process recursive data types: ```active haskell -- /show data IntList = Empty | Cons Int IntList -- show intListProd :: IntList -> Int intListProd Empty = 1 intListProd (Cons x xs) = x * intListProd xs main = print (intListProd (Cons 3 (Cons 2 (Cons 4 Empty)))) ``` As another simple example, we can define a type of binary trees with an `Int` value stored at each internal node, and a `Char` stored at each leaf (Don't ask me what you would use such a tree for; it's an example, OK?): ```active haskell data Tree = Leaf Char | Node Tree Int Tree deriving Show tree :: Tree tree = Node (Leaf 'x') 1 (Node (Leaf 'y') 2 (Leaf 'z')) main = print tree ```