Streams and laziness

Functional Programming with OCaml

Streams and laziness

Module 7 · Lecture 4

KC Sivaramakrishnan
IIT Madras

We have spent the course building finite data: a list with n elements, a tree with so many nodes, an array of a fixed length. But many problems are naturally described by infinite sequences: the primes, the Fibonacci numbers, the input lines from a socket, the moves available from every position in a game tree. We do not want all of them at once (we cannot fit them in memory); we just want the first few, on demand.

OCaml is a strict language. let xs = 1 :: heavy_computation () runs heavy_computation () immediately, even if we never look at the tail. To represent something infinite we need to delay evaluation: keep the head, hold off on the tail until somebody asks. This lecture builds the stream, the standard functional data structure for that, in two steps. First a stream backed by a thunk (a unit -> 'a function); then a stream backed by a Lazy.t, which adds memoization so each tail is computed at most once.

This lecture: streams and laziness

Recursive values

OCaml lets you write let rec f x = ... for recursive functions. A less-discussed feature: let rec works for values too, as long as the recursive occurrence is hidden behind a constructor.

let rec ones = 1 :: ones

OCaml accepts this. The toplevel reports val ones : int list = [1; <cycle>]: the value is an infinite list whose tail is itself, represented at runtime by a single cons cell that points back to itself. Even though the list is conceptually infinite, it uses finite memory.

let rec zero_ones = 0 :: 1 :: zero_ones

val zero_ones : int list = [0; 1; <cycle>]. Two cons cells, the second pointing back at the first.

Recursive values

let rec ones = 1 :: ones let rec zero_ones = 0 :: 1 :: zero_ones

The catch: this is a cyclic list, not a stream. The only way the cycle exists is that the recursive occurrence sits behind a constructor (::); the structure was built once, in finite memory, and any traversal will eventually loop. That is enough for some uses (a circular buffer of fixed shape) but useless for most others. Try to convert zero_ones to a string with List.map string_of_int zero_ones and the call diverges: List.map is strict and walks the list, the list is infinite, the call never returns.

(* List.map string_of_int zero_ones  (diverges, do not run) *)

We need a data structure where each tail is computed only on demand.

Infinite data structures, properly

The fix: make the tail a function of type unit -> 'a stream, not a fully-built 'a stream. Calling the function "forces" the tail; not calling it leaves the rest of the structure unbuilt. This is the trick at the heart of streams.

type 'a stream = Cons of 'a * (unit -> 'a stream)

A stream has a head (a value) and a thunk for the tail (a zero-argument function that, when called, returns the next stream node). There is no Nil constructor: every stream is infinite. (You can add a Nil if you want finite streams; we omit it for simplicity.)

The recursive definition of zero_ones becomes:

let rec zero_ones = Cons (0, fun () -> Cons (1, fun () -> zero_ones))

The toplevel reports val zero_ones : int stream = Cons (0, <fun>). The tail is a function value, displayed as <fun>; it has not been called yet.

Streams: type definition + accessors

type 'a stream = Cons of 'a * (unit -> 'a stream) let hd (Cons (x, _)) = x let tl (Cons (_, xs)) = xs () let rec zero_ones = Cons (0, fun () -> Cons (1, fun () -> zero_ones)) let _ = hd zero_ones (* = 0 *) let _ = hd (tl zero_ones) (* = 1 *)

Two small accessor functions. hd returns the head; tl forces the thunk and returns the next node.

let hd (Cons (x, _)) = x let tl (Cons (_, xs)) = xs ()

hd zero_ones gives 0 without forcing anything beyond the outer cons. tl zero_ones calls the tail thunk, which builds the second node Cons (1, ...) and returns it.

Consuming a stream

A stream is infinite, but you almost always want a finite prefix of it: the first 10 primes, the first 30 Fibonacci numbers. The take function is the bridge between the infinite stream world and the finite list world.

let rec take n s = if n = 0 then [] else hd s :: take (n - 1) (tl s)

take walks the stream, forcing one tail per element, building up a list of length n. The first n nodes get computed; the rest stay as thunks.

let _ = take 10 zero_ones (* = [0; 1; 0; 1; 0; 1; 0; 1; 0; 1] *)

A companion, drop, throws away the first n elements and returns the rest of the stream:

let rec drop n s = if n = 0 then s else drop (n - 1) (tl s)

drop 1 zero_ones gives a stream whose first element is 1.

Consuming a stream

let rec take n s = if n = 0 then [] else hd s :: take (n - 1) (tl s) let rec drop n s = if n = 0 then s else drop (n - 1) (tl s) let _ = take 10 zero_ones (* = [0; 1; 0; 1; 0; 1; 0; 1; 0; 1] *) let _ = hd (drop 3 zero_ones) (* = 1 *)

Higher-order functions on streams

The shapes you saw on lists in the higher-order-functions module carry over almost line-for-line. map applies a function to every element; filter keeps only the elements satisfying a predicate; zip walks two streams in lockstep.

let rec map f s = Cons (f (hd s), fun () -> map f (tl s)) let rec filter p s = if p (hd s) then Cons (hd s, fun () -> filter p (tl s)) else filter p (tl s) let rec zip f s1 s2 = Cons (f (hd s1) (hd s2), fun () -> zip f (tl s1) (tl s2))

Notice the shape: each builds a Cons whose tail is a thunk that recurses. The recursion is not eager; it happens one tail at a time, when the consumer asks for more.

let zero_ones_str = map string_of_int zero_ones let _ = take 6 zero_ones_str (* = ["0"; "1"; "0"; "1"; "0"; "1"] *)

The mapped stream is itself infinite; we extract a finite prefix with take.

A subtlety in filter: if no element of the stream satisfies the predicate, filter will spin forever looking for one. (Streams are infinite; there is no "end" to give up at.) Use filter only when you know matching elements occur regularly.

map: transform every element

let rec from n = Cons (n, fun () -> from (n + 1)) let rec map f s = Cons (f (hd s), fun () -> map f (tl s)) let _ = take 5 (map (fun x -> x * x) (from 1)) (* = [1; 4; 9; 16; 25] *)

filter: keep elements satisfying a predicate

let rec filter p s = if p (hd s) then Cons (hd s, fun () -> filter p (tl s)) else filter p (tl s) let _ = take 5 (filter (fun x -> x mod 3 = 0) (from 1)) (* = [3; 6; 9; 12; 15] *)

zip: pair up two streams under a function

let rec zip f s1 s2 = Cons (f (hd s1) (hd s2), fun () -> zip f (tl s1) (tl s2)) let _ = take 5 (zip (+) (from 1) (from 100)) (* = [101; 103; 105; 107; 109] *)

Primes by the sieve of Eratosthenes

A famous use of streams: produce the infinite stream of primes using the sieve of Eratosthenes.

The idea:

Each step contributes one prime and produces a smaller (still infinite) stream of candidates.

First we need a stream of natural numbers from n upward:

let rec from n = Cons (n, fun () -> from (n + 1))

Then the sieve itself:

let rec sieve s = let p = hd s in Cons (p, fun () -> sieve (filter (fun x -> x mod p <> 0) (tl s))) let primes = sieve (from 2) let _ = take 10 primes (* = [2; 3; 5; 7; 11; 13; 17; 19; 23; 29] *)

A clean expression of an algorithm that is awkward to state with finite data. The recursion is exactly the recurrence: "the primes are p followed by the primes of (tl s with multiples of p removed)."

Primes: sieve of Eratosthenes

let rec sieve s = let p = hd s in Cons (p, fun () -> sieve (filter (fun x -> x mod p <> 0) (tl s))) let primes = sieve (from 2) let _ = take 10 primes (* = [2; 3; 5; 7; 11; 13; 17; 19; 23; 29] *)

Lazy values

A unit -> 'a thunk is the cheapest way to delay evaluation, but it has one cost: every time you force it, the body runs again. We can see this directly by putting a print_endline in the body and forcing the thunk twice:

let t () = print_endline "running thunk"; 10 + 20 let _ = t () (* prints "running thunk"; = 30 *) let _ = t () (* prints "running thunk" AGAIN; = 30 *)

The string running thunk prints twice: the body ran on each force. For a pure, cheap body that is harmless; for an expensive computation it is wasteful, and for an effect you wanted to happen once it is a bug.

OCaml has a built-in primitive for memoized delay: the lazy keyword. lazy e constructs a value of type 'a Lazy.t that wraps the expression e unevaluated. The first time the value is forced (with Lazy.force), e runs and the result is cached (this caching is what "memoization" means); every subsequent force returns the cached value without re-running e.

let v = lazy (print_endline "running lazy"; 10 + 20)

val v : int Lazy.t = <lazy>. Notice nothing has printed: the expression has not run.

let _ = Lazy.force v (* prints "running lazy"; = 30 *)

The result and a "has been forced" flag are stored inside the Lazy.t.

let _ = Lazy.force v (* prints NOTHING; = 30 *)

The body did not run again: the cached value is reused. That is the whole difference between a thunk (recomputes) and a lazy value (memoizes).

The thunk's one cost: it reruns every time

let t () = print_endline "running thunk"; 10 + 20 let _ = t () (* prints "running thunk"; = 30 *) let _ = t () (* prints "running thunk" AGAIN; = 30 *)

lazy: delay, then cache

let v = lazy (print_endline "running lazy"; 10 + 20) (* val v : int Lazy.t = <lazy> nothing printed yet *)

Lazy.force: run once, then it's cached

let _ = Lazy.force v (* prints "running lazy"; = 30 *) let _ = Lazy.force v (* prints NOTHING; = 30 *)

The lazy keyword is syntactic sugar; under the hood, 'a Lazy.t is a small mutable record (the cache lives in a ref slot inside). Forcing reads the slot; if empty, it runs the body and fills it. This is the simplest example of memoization, and we will return to it as a general technique in the next lecture.

Lazy streams

We can replace the (unit -> 'a stream) thunk in the stream type with 'a stream Lazy.t. Each tail is now memoized: forced once, cached, and reused.

type 'a lstream = LCons of 'a * 'a lstream Lazy.t let lhd (LCons (x, _)) = x let ltl (LCons (_, t)) = Lazy.force t

(The l prefix is just to keep this type distinct from the thunk-based stream in this lecture.)

Higher-order functions look almost the same; only the tail wrapping changes from fun () -> ... to lazy ....

let rec ltake n s = if n = 0 then [] else lhd s :: ltake (n - 1) (ltl s) let rec lzip f s1 s2 = LCons (f (lhd s1) (lhd s2), lazy (lzip f (ltl s1) (ltl s2)))

Lazy streams

type 'a lstream = LCons of 'a * 'a lstream Lazy.t let lhd (LCons (x, _)) = x let ltl (LCons (_, t)) = Lazy.force t let rec ltake n s = if n = 0 then [] else lhd s :: ltake (n - 1) (ltl s) let rec lzip f s1 s2 = LCons (f (lhd s1) (lhd s2), lazy (lzip f (ltl s1) (ltl s2))) let rec lfrom n = LCons (n, lazy (lfrom (n + 1))) let _ = ltake 5 (lfrom 0) (* = [0; 1; 2; 3; 4] *)

Fibonacci as a stream

A neat trick: the Fibonacci sequence 1, 1, 2, 3, 5, 8, 13, 21, ... satisfies f(n) = f(n-1) + f(n-2). If fibs is the sequence, then tl fibs is the same sequence shifted by one. Zipping fibs with tl fibs element-wise under + gives the sequence shifted by two. Prepending 1; 1 to that shifted sum recovers fibs. With the thunk-based stream (Cons / zip / tl from earlier), this is a one-liner:

let rec fibs = Cons (1, fun () -> Cons (1, fun () -> zip (+) fibs (tl fibs))) let _ = take 10 fibs (* = [1; 1; 2; 3; 5; 8; 13; 21; 34; 55] *)

The definition is recursive: fibs refers to itself inside its own body. The thunk wrapping is what makes this safe: the inner references to fibs and tl fibs are inside delayed expressions, so they are not forced when fibs is being constructed.

It produces the right answer, but it is exponentially slow. Each tl fibs rebuilds the shifted stream from scratch, and the zip forces both fibs and tl fibs again at every step, so the nth element costs roughly f(n) thunk-forces, the same exponential blow-up as the naive recursive fib.

Fibonacci as a thunk stream: correct but slow

let rec fibs = Cons (1, fun () -> Cons (1, fun () -> zip (+) fibs (tl fibs))) let _ = take 10 fibs (* = [1; 1; 2; 3; 5; 8; 13; 21; 34; 55] *)

Fixing the blow-up with a lazy stream

Swap the thunk-based stream for the lazy stream (LCons / lzip / ltl). The definition is identical in shape; only the tail wrapping changes from fun () -> ... to lazy .... We name it lfibs to keep it distinct from the thunk-based fibs:

let rec lfibs = LCons (1, lazy ( LCons (1, lazy (lzip (+) lfibs (ltl lfibs))))) let _ = ltake 10 lfibs (* = [1; 1; 2; 3; 5; 8; 13; 21; 34; 55] *)

Now each lfibs node is forced once and memoized. The lzip on the right re-reads already-computed nodes instead of rebuilding them, so ltake 30 lfibs is linear. This is the payoff of memoization: same code shape, exponential to linear, just by replacing the thunk with a lazy.

Fixing the blow-up: lazy stream

let rec lfibs = LCons (1, lazy ( LCons (1, lazy (lzip (+) lfibs (ltl lfibs))))) let _ = ltake 10 lfibs (* = [1; 1; 2; 3; 5; 8; 13; 21; 34; 55] *)

Timing the difference

The exponential-vs-linear claim is easy to see. A tiny helper times a thunk and returns its result alongside the elapsed milliseconds. Sys.time reports the processor time the program has used, not the wall clock; for a single-threaded pure computation like this the two track each other closely, and Sys.time works in the browser cells too.

let time_it f = let t0 = Sys.time () in let r = f () in (r, (Sys.time () -. t0) *. 1000.)

Now race the two streams at a largish prefix. Both compute the same 30 Fibonacci numbers; only the cost differs:

(* thunk stream: recomputes the prefix exponentially *) let _ = time_it (fun () -> take 30 fibs) (* = (..., 100s of ms) *) (* lazy stream: each node memoized, linear *) let _ = time_it (fun () -> ltake 30 lfibs) (* = (..., ~0 ms) *)

The thunk version takes hundreds of milliseconds (and roughly doubles for each extra element); the lazy version is effectively instant. These two cells are not run when the page loads; click Run on each to see the gap on your own machine.

Timing it: thunk vs lazy

let time_it f = let t0 = Sys.time () in let r = f () in (r, (Sys.time () -. t0) *. 1000.) let _ = time_it (fun () -> take 30 fibs) (* slow: ~100s of ms *) let _ = time_it (fun () -> ltake 30 lfibs) (* fast: ~0 ms *)

A quick check

What does take 4 (map (fun x -> x * x) (from 1)) evaluate to?

Why: from 1 is the infinite stream 1, 2, 3, 4, .... map (fun x -> x * x) squares every element, producing 1, 4, 9, 16, .... take 4 returns the first four as a list.

Why does lazy give you something a plain unit -> 'a thunk does not?

Why: Both delay evaluation. The unique thing lazy adds is memoization: the first force runs the body and stashes the result; subsequent forces return the cached value. A thunk has no cache; calling it twice runs the body twice.

Activity

Activity

Write take_while : ('a -> bool) -> 'a stream -> 'a list that returns the longest prefix of s whose elements satisfy p. The result is a list, not a stream: it terminates as soon as p returns false. Verify with take_while (fun x -> x < 5) (from 0) = [0; 1; 2; 3; 4].

Write take_while : ('a -> bool) -> 'a stream -> 'a list. It walks the stream and collects elements until the predicate returns false. If the predicate never returns false, the walk never terminates; callers must make sure it eventually does.

type 'a stream = Cons of 'a * (unit -> 'a stream) let hd (Cons (x, _)) = x let tl (Cons (_, t)) = t () let rec from n = Cons (n, fun () -> from (n + 1)) let rec take_while p s = failwith "not implemented"
Show reference solution

Reference solution:

let rec take_while p s = if p (hd s) then hd s :: take_while p (tl s) else []

The recursion mirrors the structure of take: forces the head, forces the tail thunk only when the predicate keeps holding. The caller must guarantee p eventually returns false on an infinite stream, otherwise take_while runs forever.

Show reference solution

Activity solution

let rec take_while p s = if p (hd s) then hd s :: take_while p (tl s) else [] let _ = take_while (fun x -> x < 5) (from 0) (* = [0;1;2;3;4] *) let _ = take_while (fun x -> x < 0) (from 10) (* = [] *)
  • Forces the head, then the tail (only if the predicate still holds): laziness preserved.
  • Terminates on an infinite stream only if p eventually fails; caller's job to guarantee that.

What's next

The next lecture generalises the caching trick that makes Lazy.t faster than a plain thunk. We write a memo combinator that wraps any function with a cache, time a memoized Fibonacci against the naive recursive one, and use the same idea to make a slow recursive edit-distance function fast. We close with the catch that ties memoization, laziness, and streams together: all three presume purity: a function that does I/O cannot be safely cached.

What's next

Lecture 5: memoization.

Reading

Sources

This lecture follows the structure and worked examples of IIT Madras CS3100, Streams, Laziness and Memoization (KC Sivaramakrishnan, Monsoon 2020). The stream type, the sieve of Eratosthenes example, the lazy Fibonacci derivation, and the framing of thunks vs Lazy.t are drawn directly from the source lecture; prose is freshly authored for the NPTEL format.