Type classes

Why type classes

We'd really like to be able to use == and other notation for more than one thing. But if Haskell were just strictly typed with no additional features, there'd be no way to write

x :: Int
...
if x == 0 then ...

s :: String
...
if s == "yes" then ...

We'd need the impossible:

(==) :: Int -> Int -> Bool
(==) :: String -> String -> Bool

(By the way, ML avoids the problem by defining things like + and - that only work for integers, and +. and -. that only work for floating point numbers, etc.)

Haskell has a tool called type classes that are a huge help with this notation problem:


class Eq t where
    (==) :: t -> t -> Bool
  
instance Eq Int where
    (==) = {- something built in -}

Here's an example of searching a list:


listIncludes x [] = False
listIncludes x (y:ys) =
    if x == y then True else listIncludes x ys
    
main = do
    print (listIncludes 3 [1,2,3,4,5])
    print (listIncludes 10 [1,2,3,4,5])
    print (listIncludes "green" ["red", "blue", "green"])

So what's going on? The type of listIncludes is

listIncludes :: Eq t => t -> [t] -> Bool

which means, listIncludes is a function that takes a t and a list of ts and returns a Bool, but the context out in front of the bold arrow => contains the constraint Eq t, which means that the function also requires t to be an instances of type class Eq.

What the compiler does is roughly as follows:

While compiling listIncludes, it sneaks in an extra parameter called a dictionary, and the call to == is done by fetching the apropriate function out of that dictionary. I'm going to make up some notation here for internal forms that the compiler might work with to give you some idea of what it's doing:

dEqLookup== :: DictionaryEq t -> (t -> t -> Bool)
dEqLookup== d = ...
...
listIncludesImpl :: DictionaryEq t -> t -> [t] -> Bool 
listIncludesImpl dEq x xs = ...
    if (dEqLookup== dEq) x y then ...

Everywhere in your program that you call listIncludes, the compiler implicitly passes a dictionary. So it might transform listIncludes 3 [1,2,3,4,5] into

listIncludesImpl dEqInt 3 [1,2,3,4,5]

and continue compiling from there.

Sometimes the type constraint sort of percolates outward. Example: Here's a function that takes an item, and a list of lists of items, and returns the first of those lists that contains that item if such a list exists, using Haskell's Maybe type.

-- The built-in Maybe type is the standard way of specifying something
-- that might not exist.
-- You might think of it like pointer that is allowed to be null in C or Java.

data Maybe t = Nothing | Just t

firstThatIncludes :: Eq t => t -> [[t]] -> Maybe [t]

firstThatIncludes x (ys : moreYs) =
    if listIncludes x ys
    then Just ys
    else firstThatIncludes x moreYs

firstThatIncludes x [] =
    Nothing

Even though firstThatIncludes doesn't call ==, it calls a function listIncludes that does call ==. Which means that the compiler has to get a dictionary for class Eq from somewhere to pass implicitly to listIncludes. So the dictionary must be passed in to firstThatIncludes as well, which means Eq t has to become a constraint in the context of the type of firstThatIncludes.

The compiler does have other options, by the way. Sometimes it can determine exactly which version of an overloaded function it needs, and it just inlines it. In the case of listIncludes 3 [1,2,3,4,5], the compiler can tell that it's dealing with integers, so it looks up the implementation of == from the instance Eq Int dictionary and probably uses that directly.

Here's how to implement Eq for lists by the way:

instance Eq t => Eq [t] where
    [] == []  =  True
    (x:xs) == (y:ys)  =  (x == y) && (xs == ys)
    _ == _ = False

Explanation: - This gives an implementation of class Eq for the list type parameterized by t - It requires an implementation of class Eq for t, and this is indicated by the constraint Eq t in the context of the instance declaration - The definition of == is just plain baffling to look at. Here's another way to write it out:

instance Eq t => Eq [t] where
    -- The empty list equals the empty list
    (==) [] []  =  True
    
    -- Two nonempty lists are equal if their first elements are equal
    -- and their tails are equal 
    (==) (x:xs) (y:ys)  =  
        (x == y) && (xs == ys)
    
    -- An empty list is never equal to a non empty list.

    (==) [] (_:_) = False
    (==) (_:_) [] = False 
    
        -- Since pattern equations defining a function are checked in order,
        -- we could abbreviate those last two cases down to
        -- (==) _ _ = False
        -- Every case that could return True is handled by something above

Here's how to do Eq for pairs:

instance (Eq t1, Eq t2) => Eq (t1, t2) where
    (x1, x2) == (y1, y2) = (x1 == y1) && (x2 == y2)

Note that this time, the context includes two constraints because a pair might contain two items of different types.

Built-in classes and arithmetic

The arithmetic operators are overloaded through a couple of built-in classes:

class Num t where
    (+) :: t -> t -> t
    (*) :: t -> t -> t
    ...

and Int and Double are instances of Num. But division is sort of funky, so it goes in separate classes:

class Num t => Fractional t where
    (/) :: t -> t -> t
    ...
    
class (Real t, Enum t) => Integral t where
    div :: t -> t -> t
    mod :: t -> t -> t
    ...

Built-in classes and converting data to and from strings

Two important built-in classes are Show and Read. These are required for these all-important functions that convert data objects to and from readable strings:

show :: Show a => a -> String
read :: Read a => String -> a
print :: Show a => a -> IO ()

Show isn't too bad. It includes the show function, plus a couple of others that are sometimes useful for improved efficiency.

Read can be complicated to implement, because it has to deal with operator precedence.

Getting the compiler to derive instances

For a lot of data types, the compiler can automatically create reasonable instances for the classes Read, Show, Eq, and Ord plus a few others. Equality is checked by matching data structures exactly, and ordering is done field by field, with constructors ordered by the order in which they're listed in the definition. Here's the syntax:

data ChessPiece = Pawn | Bishop | Knight | Rook | Queen | King
    deriving (Read, Show, Eq, Ord)

data ChessBoard = ChessBoard [(ChessPieces, Int, Int)] [(ChessPiece, Int, Int)]
    deriving (Read, Show, Eq, Ord)
    
-- This is built in:
data Maybe t = Nothing | Just t
    deriving (Read, Show, Eq, Ord)

Frequently, you need your data types to be instances of these four classes. Reading and showing are really useful for debugging, even if you don't plan to ever store your data items to a file in that format. Eq and Ord are required if you're going to put data into a container like Data.Set or Data.Map.

Exercises

Type checking

Go back to the tree example in Basic Haskell Syntax and put in type signatures for all the functions. You'll have to use the Ord type class, which is for things that can be ordered using <, >, etc.

Some instances

Go back to the set example in Basic Haskell Syntax and implement an instance for trees for the class Eq. For our purposes, two trees are equal if they contain the same items, perhaps in a different branching structure, so these trees have set-like semantics.

Trying out Ord

Make sure you understand these:

main = do
    print (Nothing < Just 1)
    print (Nothing > Just 1)
    print (Just 1 < Just 2)
    print (Just 1 > Just 2)

Then, figure out ahead of time what this will print out, then try it:

data Point = Point Int Int
    deriving (Read, Show, Eq, Ord)
    
data GraphicsPrimitive =
    LineSegment Point Point
    | BezierCurve Point Point Point Point
    deriving (Read, Show, Eq, Ord)

a = Point 0 0
b = Point 25 100
c = Point 75 100
d = Point 100 0

l = LineSegment a b
v = BezierCurve a b c d

main = do
    print l
    print (a == b)
    print (a == Point 0 0)
    print (l < v)
    print (v < l)