display | more...

In haskell:

stutter :: [a] -> [a]
stutter [] = []
stutter l@(x : []) = l
stutter l@(x : y : []) = l
stutter l@(x : y : z : xs) = x : y : z : [] : stutter(y:z:xs)

So, what does this all mean? The first line is a type signature; it says that stutter maps a list of elements of any type a to another list of elements of type a. a is a type variable: it can stand for any type, however complex. The concept is similar to templates in C++ or generics in Ada. While it's good practice, especially for the sake of documentation, to provide a type signature, it's not really necessary. Much like ML, haskell uses Hindley-Milner type inference: by examining the way the function uses its arguments, the compiler can almost always figure out the most generic possible type of the function.

The remaining lines define the stutter function, using a pattern matching system similar to those in ML, erlang, and related languages. The pattern matching system is quite neat: patterns look like the data structures they match, except with variables. This is similar in some ways to Lisp's destructuring-bind, but much more powerful. Not only that, but it makes haskell (and ML) programs look a lot like mathematical definitions.

Anyway, back to the code. The first line after the type signature says that stutter of the empty list is the empty list. The next two say that stutter of a list with one or two elements is simply that list. `x:y:[]' means `a list beginning with x, followed by y, followed by nothing'. The : operator in haskell means basically the same thing as `.' (dotted-pair notation) in Lisp, except that : is right-associative; that is, x:y:[] means x:(y:([])); or, in Lisp terms, (x . (y . nil)) = (x y). The @ makes l a synonym for the entire expression within parentheses.

Having set up our inductive basis, we now can express the inductive step (recursion) in the last line. Here, xs (pronounced like the plural of `x') stands for `the rest of the list', and can be the empty list; in Lisp, xs would be accessed as (cdr (cdr (cdr l))). The line as a whole says that, for a list l of three or more elements, stutter of l is a list consisting of the first three elements of l, followed by stutter of the tail (cdr) of l. And there you have it.

Two interesting things to note. First of all, stutter works for more than just strings (in haskell, simply lists of characters). It also works for any list type, be it a list of integers ([Char]), a list of lists of strings ([[[Char]]], or a list of functions, each of which which maps lists of elements of type b to functions from integers to Maybe bs ([[b] -> Integer -> Maybe b), and so forth.

Second, and perhaps the defining characteristic of haskell, is the language's use of lazy evaluation. Suppose we have a very long string, and we want to know what the first five characters of its stutterization are. Of course, we could just write another function to do this. However, with haskell one can simply say:

take 4 (stutter verylongstring)

That's it. Because, in haskell, things are not evaluated until they are needed (e.g. for printing or some such), we can leave it at that. The system will not try to compute the full value of stutter verylongstring unless you need one of the last elements. For that matter, verylongstring could be an infinite list (more properly known as a stream). Until some non-deferable code such as I/O forces evaluation of this expression (directly or indirectly), the code might never be executed at all. I will not go into the details here, but with luck someone (perhaps I at a later date) will create a node on lazy evaluation or call by need.

One problem with the above function, as well as (as of 3 Apr 2002) the Lisp function in WWWWolf's writeup above, is that it is not tail-recursive (q.v.). Thus the maximum number of activation records on the stack is linear with the length of the input list. Since the definition of haskell specifies tail-recursion elimination, a tail-recursive version would be better:

tstutter :: [a] -> [a]

tstutter x = let stut :: [a] -> [a] -> [a]
                 stut [] lis = lis
                 stut l@(x : []) lis = x : lis
                 stut l@(x : y : []) lis = y : x : lis
                 stut (x : y : z : xs) lis =
                     stut (y:z:xs) (z : y : x : lis)
             in  reverse (stut x [])

In this version, stut is tail-recursive. Because the recursive call to itself is the last thing done by stut, the compiler can simply replace the parameters on the stack and execute the call---no new activation record needs to be pushed onto the stack.

In the tail-recursive version, the parameter lis to stut accumulates partial results of the computation as we recurse down the list. Recall that appending to the end of a singly-linked list is O(n) on the size of the list; thus, rather than tacking things onto the end of the accumulator parameter, stut actually builds the list backwards by prepending elements---an O(1) operation. We then reverse this list in tstutter. The transformation taking stutter to tstutter appears over and over again in functional programming.

Incidentally, tstutter demonstrates another interesting feature of haskell. Much like Python, it can be layout-sensitive. The occurrences of stut after the let should be lined up---otherwise, they may be interpreted incorrectly. Programs can be written without layout, using braces and semicolons. This is typically only done by compilers and automated program generation tools.