This is an essay inspired by Philip Wadler's paper "How to Declare an Imperative" [Wadler97]. We will show uncanny similarities between monadic i/o in Haskell, and UNIX filter compositions based on pipes and redirections. UNIX pipes (treated semantically as writing to temporary files) are quite similar to monads. Furthermore, at the level of UNIX programming, all i/o can be regarded monadic.
Let's consider the following UNIX command line:
echo 'aaa bbb' | cat > /dev/tty
From shell's point of view, the filters -- echo
and cat
-- are pure,
referentially-transparent functions of their arguments: input files
and command-line parameters. Operator |
(a pipe), "forces" a command, which is
a promise of output, on its left producing a stream, an argument
for a command on the right-hand side of the pipe. A shell pipe thus is an
equivalent of a monadic >>=
operator.
Furthermore, we can write the following correspondence between Haskell's monadic i/o forms and UNIX shell operations:
Haskell | Shell | |
---|---|---|
bound variable | the name of an existing file | |
command | sh-command, executable, filter | |
a >> b
|
(a; b)
|
|
a >>= b
|
a | b
|
|
done
|
cat /dev/null
|
|
return x
|
cat x
|
|
puts
|
echo
|
As in Haskell, cat /dev/null
and (a; b)
form a monoid; and so do return x
and >>=
. Indeed, cat /dev/null
is the left and the right
unit for the "command concatenator" (a; b)
, and the command
concatenation is associative. By the same token,
Haskell [Wadler97] | Shell | |
---|---|---|
return v >>= \x -> m
is m[x:=v]
|
cat v | cmd
is cmd < v
|
|
m >>= \x -> return x
is m
|
cmd | cat
is cmd
|
|
m >>= \x -> (n >>= \y -> o)
is (m >>= \x->n) >>= \y->o
|
cmd1 | (cmd2 | cmd3)
is ( cmd1 | cmd2 ) | cmd3
|
We can have /bin/sh literally run both parts of the last
expression: cmd1 | (cmd2 | cmd3)
and
( cmd1 | cmd2 ) | cmd3
. The shell will interpret the
parentheses properly. Both expressions will produce the same output!
Incidentally, a UNIX filter is expected to sequentially consume its input and produce a single output stream. Therefore, from the point of view of the shell and filters, the input and the output streams are 'linear', singularly referenced objects. No wonder i/o involving linear streams is monadic.
Doug McIlroy, the inventor of pipes, is said to point out that both pipes and lazy lists behave exactly like coroutines.
One may wonder if the analogy between pipes and monads is accurate, given that m >>= \x-> n
first executes m
and then n
, whereas m | n
runs m
and n
in parallel.
Let us consider a situation where command m
in a pipe m | n
produces only 100 bytes of output and finishes. We also assume that m
is scheduled to run first. As m
's output fits entirely into the pipe buffer (4k in most OSes) m
can finish before n
starts to run. The whole pipe then becomes equivalent to
m > temp_file; n < temp_file
Suppose that that both m
and n
are "proper" filters. That is, m
produces its output strictly sequentially and never reads or scans back what it already wrote; n
reads its input strictly sequentially and only once. For m
and n
, their input and output are linear
datatypes -- the assumption that also holds for Haskell IO commands. We can claim therefore that if the output from m
is bounded, the pipe m | n
is semantically identical to
m > temp_file; n < temp_file
Incidentally, this is how pipes are implemented under MS-DOS, which is a single-task operating system. In any case, the operating system shields consumers and producers of sequential streams from details of the i/o. The processes will use the same OS primitives regardless of physical data exchange channels -- a local file, a remote file, a pipe, a TCP pipe, etc.
A UNIX pipe may therefore be considered an optimization. A pipe
has the same sequential semantics as the data exchange via a temporary file. A pipe however saves the OS trouble creating a
file and doing any physical i/o. In addition, like any lazy
contraption, pipes can handle an unbounded communication. That is
the only thing that sets them apart from Haskell's m >>= \x->
n
. Still, if the result produced by m
is a lazy list or a stream
and n
is not eager to evaluate its argument entirely, even that
distinction between |
and >>=
becomes blurred.
Thus with pipes treated as writing to temporary files, they are quite similar to monads.
A programmer writing an i/o processing code does not actually
expect the code to run as he types it. The processing will
occur when the code is compiled and submitted to an OS for
execution. Or when the last parenthesis is closed and the code is (re)
submitted to an evaluator/interpreter. It is highly unlikely that a
piece of code than handles pushing of a GUI button is written on the fly, at
the moment the button is pressed. Rather, a programmer wrote the
handler well in advance, anticipating
what the program
will do when the button actually gets pressed. Note the future tense
in the above phrases. All data processing is programmed in advance and
performed later, when submitted to an actor -- be it a CPU, an OS,
an evaluator or a do
keyword.
In classical PASCAL, the main program's code starts with
PROGRAM foo(input,output);That is, the "world", the input and output files, are given explicitly to a program as arguments. As long as data processing done by this code is deterministic and the program refrains from opening other files (which the classical PASCAL disallowed), a PASCAL program, as a whole , is truly a pure function of its argument, an input file. For example, if a program copies input into output, it would always produce the same output given the same input. Thus a PASCAL program - again, as a whole - is referentially transparent. Although it may use non-pure read/write operations to accomplish its job...
Philip Wadler's papers on monads, in particular, [Wadler97], as well as his additional comments have been the source of great inspiration, insight, and enjoyment.
[Wadler97] How to Declare an Imperative. ACM Comp. Surveys, Vol. 29, No. 3, September 1997
[Monadic-io-Scheme] Monadic scheme of i/o
<http://okmij.org/ftp/Scheme/misc.html#monadic-io>
[exec-with-piped] Writing agents in sh: conversing through a pipe
<http://okmij.org/ftp/Communications.html#sh-agents>
This site's top page is http://okmij.org/ftp/
oleg-at-okmij.orgConverted from SXML by SXML->HTML