Haskell Language Role and Purpose of IO


Haskell is a pure language, meaning that expressions cannot have side effects. A side effect is anything that the expression or function does other than produce a value, for example, modify a global counter or print to standard output.

In Haskell, side-effectful computations (specifically, those which can have an effect on the real world) are modelled using IO. Strictly speaking, IO is a type constructor, taking a type and producing a type. For example, IO Int is the type of an I/O computation producing an Int value. The IO type is abstract, and the interface provided for IO ensures that certain illegal values (that is, functions with non-sensical types) cannot exist, by ensuring that all built-in functions which perform IO have a return type enclosed in IO.

When a Haskell program is run, the computation represented by the Haskell value named main, whose type can be IO x for any type x, is executed.

Manipulating IO values

There are many functions in the standard library providing typical IO actions that a general purpose programming language should perform, such as reading and writing to file handles. General IO actions are created and combined primarily with two functions:

 (>>=) :: IO a -> (a -> IO b) -> IO b

This function (typically called bind) takes an IO action and a function which returns an IO action, and produces the IO action which is the result of applying the function to the value produced by the first IO action.

 return :: a -> IO a

This function takes any value (i.e., a pure value) and returns the IO computation which does no IO and produces the given value. In other words, it is a no-op I/O action.

There are additional general functions which are often used, but all can be written in terms of the two above. For example, (>>) :: IO a -> IO b -> IO b is similar to (>>=) but the result of the first action is ignored.

A simple program greeting the user using these functions:

 main :: IO ()
 main =
   putStrLn "What is your name?" >>
   getLine >>= \name ->
   putStrLn ("Hello " ++ name ++ "!")

This program also uses putStrLn :: String -> IO () and getLine :: IO String.

Note: the types of certain functions above are actually more general than those types given (namely >>=, >> and return).

IO semantics

The IO type in Haskell has very similar semantics to that of imperative programming languages. For example, when one writes s1 ; s2 in an imperative language to indicate executing statement s1, then statement s2, one can write s1 >> s2 to model the same thing in Haskell.

However, the semantics of IO diverge slightly of what would be expected coming from an imperative background. The return function does not interrupt control flow - it has no effect on the program if another IO action is run in sequence. For example, return () >> putStrLn "boom" correctly prints "boom" to standard output.

The formal semantics of IO can given in terms of simple equalities involving the functions in the previous section:

 return x >>= f ≡ f x, ∀ f x
 y >>= return ≡ return y, ∀ y
 (m >>= f) >>= g ≡ m >>= (\x -> (f x >>= g)), ∀ m f g

These laws are typically referred to as left identity, right identity, and composition, respectively. They can be stated more naturally in terms of the function

 (>=>) :: (a -> IO b) -> (b -> IO c) -> a -> IO c
 (f >=> g) x = (f x) >>= g

as follows:

 return >=> f ≡ f, ∀ f
 f >=> return ≡ f, ∀ f
 (f >=> g) >=> h ≡ f >=> (g >=> h), ∀ f g h

Lazy IO

Functions performing I/O computations are typically strict, meaning that all preceding actions in a sequence of actions must be completed before the next action is begun. Typically this is useful and expected behaviour - putStrLn "X" >> putStrLn "Y" should print "XY". However, certain library functions perform I/O lazily, meaning that the I/O actions required to produce the value are only performed when the value is actually consumed. Examples of such functions are getContents and readFile. Lazy I/O can drastically reduce the performance of a Haskell program, so when using library functions, care should be taken to note which functions are lazy.

IO and do notation

Haskell provides a simpler method of combining different IO values into larger IO values. This special syntax is known as do notation* and is simply syntactic sugar for usages of the >>=, >> and return functions.

The program in the previous section can be written in two different ways using do notation, the first being layout-sensitive and the second being layout insensitive:

 main = do
   putStrLn "What is your name?"
   name <- getLine
   putStrLn ("Hello " ++ name ++ "!")

 main = do {
   putStrLn "What is your name?" ;
   name <- getLine ;
   putStrLn ("Hello " ++ name ++ "!")

All three programs are exactly equivalent.

*Note that do notation is also applicable to a broader class of type constructors called monads.