Haskell is a pure language, meaning that expressions cannot have side effects. A side effect is anything that the expression or function does other than produce a value, for example, modify a global counter or print to standard output.
In Haskell, side-effecting computations (specifically, those which can have an effect on the real world) are modelled using IO. Strictly speaking, IO is a type constructor, taking a type and producing a type. For example, IO Int is the type of an I/O computation producing an Int value. The IO type is abstract, and the interface provided for IO ensures that certain illegal values (that is, functions whose types would hide the I/O they perform) cannot exist, because every built-in function which performs I/O has a return type enclosed in IO.
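As a small illustration of how the types separate pure code from I/O, consider the sketch below. The names shout, readName and printGreeting are hypothetical and exist only for this example; getLine and putStrLn are the standard Prelude actions.

```haskell
import Data.Char (toUpper)

-- A pure function: there is no IO in its type, so it cannot perform I/O.
shout :: String -> String
shout = map toUpper

-- An I/O computation which produces a String when it is executed.
readName :: IO String
readName = getLine

-- A function from a pure value to an I/O computation producing ().
-- Pure code such as 'shout' can be used freely when building the action.
printGreeting :: String -> IO ()
printGreeting name = putStrLn ("Hello " ++ shout name)
```

Note that the Prelude offers no function of type IO a -> a, so the result of readName cannot be passed to shout without going through the IO interface.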
When a Haskell program is run, the computation represented by the Haskell value named main, whose type can be IO x for any type x, is executed.
There are many functions in the standard library providing the typical IO actions that a general-purpose programming language should perform, such as reading from and writing to file handles. General IO actions are created and combined primarily with two functions:
(>>=) :: IO a -> (a -> IO b) -> IO b
This function (typically called bind) takes an IO action and a function which returns an IO action, and produces the IO action which runs the first action and then runs the action obtained by applying the function to the value it produced.
return :: a -> IO a
This function takes any value (i.e., a pure value) and returns the IO computation which does no IO and produces the given value. In other words, it is a no-op I/O action.
There are additional general functions which are often used, but all can be written in terms of the two above. For example, (>>) :: IO a -> IO b -> IO b is similar to (>>=) but the result of the first action is ignored.
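To make the claim about derived functions concrete, here is a minimal sketch of two of them written using only (>>=) and return. The names thenIO and mapIO are hypothetical, chosen to avoid clashing with the Prelude; they are intended to behave like (>>) and fmap at the IO type.

```haskell
-- Sequence two actions, discarding the result of the first.
-- Intended to behave like (>>).
thenIO :: IO a -> IO b -> IO b
thenIO first second = first >>= \_ -> second

-- Apply a pure function to the result of an action.
-- Intended to behave like fmap specialised to IO.
mapIO :: (a -> b) -> IO a -> IO b
mapIO f action = action >>= \x -> return (f x)
```

For example, thenIO (putStrLn "a") (putStrLn "b") runs both actions in order, and mapIO length getLine is an IO Int which reads a line and produces its length.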
A simple program greeting the user using these functions:
main :: IO ()
main =
  putStrLn "What is your name?" >>
  getLine >>= \name ->
  putStrLn ("Hello " ++ name ++ "!")
This program also uses putStrLn :: String -> IO () and getLine :: IO String.
Note: the types of certain functions above (namely >>=, >> and return) are actually more general than the types given.
The IO type in Haskell has semantics very similar to those of imperative programming languages. For example, when one writes s1 ; s2 in an imperative language to indicate executing statement s1, then statement s2, one can write s1 >> s2 to model the same thing in Haskell.
However, the semantics of IO diverge slightly from what would be expected coming from an imperative background. The return function does not interrupt control flow: it has no effect on the program if another IO action is run in sequence. For example, return () >> putStrLn "boom" correctly prints "boom" to standard output.
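The behaviour of return in the middle of a sequence can be seen in a slightly longer sketch. The name returnDemo is hypothetical; the point is only that the first return neither stops execution nor determines the final result.

```haskell
-- The first 'return' does not exit: its value is discarded by (>>),
-- the message is still printed, and the result of the whole action
-- is the value produced by the last action in the sequence.
returnDemo :: IO String
returnDemo =
  return "first" >>
  putStrLn "still running" >>
  return "second"
```

Running returnDemo prints "still running" and produces "second".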
The formal semantics of IO can be given in terms of simple equalities involving the functions in the previous section:
return x >>= f ≡ f x, ∀ f x
y >>= return ≡ return y, ∀ y
(m >>= f) >>= g ≡ m >>= (\x -> (f x >>= g)), ∀ m f g
These laws are typically referred to as left identity, right identity, and associativity, respectively. They can be stated more naturally in terms of the function
(>=>) :: (a -> IO b) -> (b -> IO c) -> a -> IO c
(f >=> g) x = (f x) >>= g
as follows:
return >=> f ≡ f, ∀ f
f >=> return ≡ f, ∀ f
(f >=> g) >=> h ≡ f >=> (g >=> h), ∀ f g h
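The laws cannot be checked by comparing IO actions directly, since IO values have no equality. As a rough sketch, one can instead run both sides of each equation for particular arguments and compare the values produced. All names below (half, describe, measure, sameResult and the three checks) are hypothetical, and the check says nothing about the effects performed along the way.

```haskell
-- Small functions returning IO actions, used to exercise the laws.
half :: Int -> IO Int
half n = return (n `div` 2)

describe :: Int -> IO String
describe n = return ("value: " ++ show n)

measure :: String -> IO Int
measure s = return (length s)

-- Run both sides of an equation and compare the produced values.
sameResult :: Eq a => IO a -> IO a -> IO Bool
sameResult lhs rhs =
  lhs >>= \x ->
  rhs >>= \y ->
  return (x == y)

-- return x >>= f  versus  f x
leftIdentityHolds :: IO Bool
leftIdentityHolds = sameResult (return 10 >>= half) (half 10)

-- y >>= return  versus  y
rightIdentityHolds :: IO Bool
rightIdentityHolds = sameResult (half 10 >>= return) (half 10)

-- (m >>= f) >>= g  versus  m >>= (\x -> f x >>= g)
thirdLawHolds :: IO Bool
thirdLawHolds =
  sameResult ((half 10 >>= describe) >>= measure)
             (half 10 >>= (\x -> describe x >>= measure))
```

Each of the three checks produces True for these arguments. This is evidence for specific inputs only, not a proof of the laws.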
Functions performing I/O computations are typically strict, meaning that all preceding actions in a sequence of actions must be completed before the next action is begun. Typically this is useful and expected behaviour: putStrLn "X" >> putStrLn "Y" should print "X" and then "Y", each on its own line. However, certain library functions perform I/O lazily, meaning that the I/O actions required to produce the value are only performed when the value is actually consumed. Examples of such functions are getContents and readFile. Lazy I/O can make it hard to predict when the I/O actually happens and how long resources such as file handles stay open, and it can drastically reduce the performance of a Haskell program, so when using library functions, care should be taken to note which functions are lazy.
do notation
Haskell provides a simpler method of combining different IO values into larger IO values. This special syntax is known as do notation* and is simply syntactic sugar for usages of the >>=, >> and return functions.
The program in the previous section can be written in two different ways using do notation, the first being layout-sensitive and the second being layout-insensitive:
main = do
  putStrLn "What is your name?"
  name <- getLine
  putStrLn ("Hello " ++ name ++ "!")

main = do {
  putStrLn "What is your name?" ;
  name <- getLine ;
  putStrLn ("Hello " ++ name ++ "!")
}
All three programs are exactly equivalent.
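The correspondence between do notation and the underlying functions can also be sketched for a block which binds several results and uses a let binding. The names sumWithDo and sumDesugared are hypothetical; the second definition is a hand-written translation of the first, not output produced by the compiler.

```haskell
-- Written with do notation.
sumWithDo :: IO Int -> IO Int -> IO Int
sumWithDo getA getB = do
  a <- getA
  let doubled = a * 2
  b <- getB
  return (doubled + b)

-- The same computation written by hand with (>>=) and return.
-- Each "x <- action" line becomes "action >>= \x -> ...", and the
-- let line becomes an ordinary let ... in expression.
sumDesugared :: IO Int -> IO Int -> IO Int
sumDesugared getA getB =
  getA >>= \a ->
  let doubled = a * 2 in
  getB >>= \b ->
  return (doubled + b)
```

Both functions run getA, then getB, and produce twice the first result plus the second.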
*Note that do notation is also applicable to a broader class of type constructors called monads.
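As a brief illustration of that note, the same syntax works for Maybe, which is also a monad. The functions safeDiv and divideTwice are hypothetical names for this sketch; no I/O is involved.

```haskell
-- Division which fails with Nothing instead of raising an error.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- The same do syntax, sequencing Maybe computations instead of IO.
-- If either division fails, the whole block produces Nothing.
divideTwice :: Int -> Int -> Int -> Maybe Int
divideTwice a b c = do
  q <- safeDiv a b
  safeDiv q c
```

For example, divideTwice 100 5 2 is Just 10, while divideTwice 1 0 2 is Nothing.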