Conventional algebraic datatypes are parametric in their type variables. For example, if we define an ADT like
data Expr a = IntLit Int
| BoolLit Bool
| If (Expr Bool) (Expr a) (Expr a)
with the hope that this will statically rule out non-well-typed conditionals, this will not behave as expected since the type of IntLit :: Int -> Expr a
is universially quantified: for any choice of a
, it produces a value of type Expr a
. In particular, for a ~ Bool
, we have IntLit :: Int -> Expr Bool
, allowing us to construct something like If (IntLit 1) e1 e2
which is what the type of the If
constructor was trying to rule out.
Generalised Algebraic Data Types allows us to control the resulting type of a data constructor so that they are not merely parametric. We can rewrite our Expr
type as a GADT like this:
data Expr a where
IntLit :: Int -> Expr Int
BoolLit :: Bool -> Expr Bool
If :: Expr Bool -> Expr a -> Expr a -> Expr a
Here, the type of the constructor IntLit
is Int -> Expr Int
, and so IntLit 1 :: Expr Bool
will not typecheck.
Pattern matching on a GADT value causes refinement of the type of the term returned. For example, it is possible to write an evaluator for Expr a
like this:
crazyEval :: Expr a -> a
crazyEval (IntLit x) =
-- Here we can use `(+)` because x :: Int
x + 1
crazyEval (BoolLit b) =
-- Here we can use `not` because b :: Bool
not b
crazyEval (If b thn els) =
-- Because b :: Expr Bool, we can use `crazyEval b :: Bool`.
-- Also, because thn :: Expr a and els :: Expr a, we can pass either to
-- the recursive call to `crazyEval` and get an a back
crazyEval $ if crazyEval b then thn else els
Note that we are able to use (+)
in the above definitions because when e.g. IntLit x
is pattern matched, we also learn that a ~ Int
(and likewise for not
and if_then_else_
when a ~ Bool
).