Streams and laziness
We have spent the course building finite data: a list with n
elements, a tree with so many nodes, an array of a fixed length.
But many problems are naturally described by infinite sequences:
the primes, the Fibonacci numbers, the input lines from a socket,
the moves available from every position in a game tree. We do not
want all of them at once (we cannot fit them in memory); we just
want the first few, on demand.
OCaml is a strict language. let xs = 1 :: heavy_computation ()
runs heavy_computation () immediately, even if we never look at
the tail. To represent something infinite we need to delay
evaluation: keep the head, hold off on the tail until somebody
asks. This lecture builds the stream, the standard functional
data structure for that, in two steps. First a stream backed by a
thunk (a unit -> 'a function); then a stream backed by a
Lazy.t, which adds memoization
so each tail is computed at most once.
Recursive values
OCaml lets you write let rec f x = ... for recursive functions.
A less-discussed feature: let rec works for values too, as long
as the recursive occurrence is hidden behind a constructor.
OCaml accepts this. The toplevel reports
val ones : int list = [1; <cycle>]: the value is an infinite
list whose tail is itself, represented at runtime by a single
cons cell that points back to itself. Even though the list is
conceptually infinite, it uses finite memory.
val zero_ones : int list = [0; 1; <cycle>]. Two cons cells, the
second pointing back at the first.
The catch: this is a cyclic list, not a stream. The only way
the cycle exists is that the recursive occurrence sits behind a
constructor (::); the structure was built once, in finite
memory, and any traversal will eventually loop. That is enough
for some uses (a circular buffer of fixed shape) but useless for
most others. Try to convert zero_ones to a string with
List.map string_of_int zero_ones and the call diverges: List.map
is strict and walks the list, the list is infinite, the call
never returns.
(* List.map string_of_int zero_ones (diverges, do not run) *)
We need a data structure where each tail is computed only on demand.
Infinite data structures, properly
The fix: make the tail a function of type unit -> 'a stream,
not a fully-built 'a stream. Calling the function "forces" the
tail; not calling it leaves the rest of the structure unbuilt.
This is the trick at the heart of streams.
A stream has a head (a value) and a thunk for the tail (a
zero-argument function that, when called, returns the next
stream node). There is no Nil constructor: every stream is
infinite. (You can add a Nil if you want finite streams; we
omit it for simplicity.)
The recursive definition of zero_ones becomes:
The toplevel reports
val zero_ones : int stream = Cons (0, <fun>). The tail is a
function value, displayed as <fun>; it has not been called yet.
Two small accessor functions. hd returns the head; tl forces
the thunk and returns the next node.
hd zero_ones gives 0 without forcing anything beyond the
outer cons. tl zero_ones calls the tail thunk, which builds
the second node Cons (1, ...) and returns it.
Consuming a stream
A stream is infinite, but you almost always want a finite
prefix of it: the first 10 primes, the first 30 Fibonacci
numbers. The take function is the bridge between the infinite
stream world and the finite list world.
take walks the stream, forcing one tail per element, building
up a list of length n. The first n nodes get computed; the
rest stay as thunks.
A companion, drop, throws away the first n elements and
returns the rest of the stream:
drop 1 zero_ones gives a stream whose first element is 1.
Higher-order functions on streams
The shapes you saw on lists in the
higher-order-functions module carry
over almost line-for-line. map applies a function to every
element; filter keeps only the elements satisfying a predicate;
zip walks two streams in lockstep.
Notice the shape: each builds a Cons whose tail is a thunk that
recurses. The recursion is not eager; it happens one tail at a
time, when the consumer asks for more.
The mapped stream
is itself infinite; we extract a finite prefix with take.
A subtlety in filter: if no element of the stream satisfies the
predicate, filter will spin forever looking for one. (Streams
are infinite; there is no "end" to give up at.) Use filter only
when you know matching elements occur regularly.
Primes by the sieve of Eratosthenes
A famous use of streams: produce the infinite stream of primes using the sieve of Eratosthenes.
The idea:
- Start with the stream
[2; 3; 4; 5; 6; 7; ...]. - The head is
2: a prime. - Filter out every multiple of
2from the tail. - The new head is
3: a prime. - Filter out every multiple of
3from the tail. - Repeat.
Each step contributes one prime and produces a smaller (still infinite) stream of candidates.
First we need a stream of natural numbers from n upward:
Then the sieve itself:
A clean expression of an algorithm that is awkward to state with
finite data. The recursion is exactly the recurrence: "the
primes are p followed by the primes of (tl s with multiples
of p removed)."
Lazy values
A unit -> 'a thunk is the cheapest way to delay evaluation, but
it has one cost: every time you force it, the body runs again. We
can see this directly by putting a print_endline in the body
and forcing the thunk twice:
The string running thunk prints twice: the body ran on each
force. For a pure, cheap body that is harmless; for an expensive
computation it is wasteful, and for an effect you wanted to
happen once it is a bug.
OCaml has a built-in primitive for memoized delay: the lazy
keyword. lazy e constructs a value of type 'a Lazy.t that
wraps the expression e unevaluated. The first time the value
is forced (with Lazy.force), e runs and the result is
cached (this caching is what "memoization" means); every
subsequent force returns the cached value without re-running e.
val v : int Lazy.t = <lazy>. Notice nothing has printed: the
expression has not run.
The result and a
"has been forced" flag are stored inside the Lazy.t.
The body did
not run again: the cached value is reused. That is the whole
difference between a thunk (recomputes) and a lazy value
(memoizes).
The lazy keyword is syntactic sugar; under the hood,
'a Lazy.t is a small
mutable record (the cache lives in a ref
slot inside). Forcing reads the slot; if empty, it runs the body
and fills it. This is the simplest example of memoization, and
we will return to it as a general technique in the
next lecture.
Lazy streams
We can replace the (unit -> 'a stream) thunk in the stream type
with 'a stream Lazy.t. Each tail is now memoized: forced once,
cached, and reused.
(The l prefix is just to keep this type distinct from the
thunk-based stream in this lecture.)
Higher-order functions look almost the same; only the tail
wrapping changes from fun () -> ... to lazy ....
Fibonacci as a stream
A neat trick: the Fibonacci sequence
1, 1, 2, 3, 5, 8, 13, 21, ... satisfies f(n) = f(n-1) + f(n-2).
If fibs is the sequence, then tl fibs is the same sequence
shifted by one. Zipping fibs with tl fibs element-wise under
+ gives the sequence shifted by two. Prepending 1; 1 to
that shifted sum recovers fibs. With the thunk-based stream
(Cons / zip / tl from earlier), this is a one-liner:
The definition is recursive: fibs refers to itself inside its
own body. The thunk wrapping is what makes this safe: the inner
references to fibs and tl fibs are inside delayed
expressions, so they are not forced when fibs is being
constructed.
It produces the right answer, but it is exponentially slow.
Each tl fibs rebuilds the shifted stream from scratch, and the
zip forces both fibs and tl fibs again at every step, so
the nth element costs roughly f(n) thunk-forces, the same
exponential blow-up as the naive recursive fib.
Fixing the blow-up with a lazy stream
Swap the thunk-based stream for the lazy stream (LCons /
lzip / ltl). The definition is identical in shape; only the
tail wrapping changes from fun () -> ... to lazy .... We name
it lfibs to keep it distinct from the thunk-based fibs:
Now each lfibs node is forced once and memoized. The lzip
on the right re-reads already-computed nodes instead of
rebuilding them, so ltake 30 lfibs is linear. This is the
payoff of memoization: same code shape, exponential to linear,
just by replacing the thunk with a lazy.
Timing the difference
The exponential-vs-linear claim is easy to see. A tiny helper
times a thunk and returns its result alongside the elapsed
milliseconds. Sys.time reports the processor time the
program has used, not the wall clock; for a single-threaded
pure computation like this the two track each other closely,
and Sys.time works in the browser cells too.
Now race the two streams at a largish prefix. Both compute the same 30 Fibonacci numbers; only the cost differs:
The thunk version takes hundreds of milliseconds (and roughly doubles for each extra element); the lazy version is effectively instant. These two cells are not run when the page loads; click Run on each to see the gap on your own machine.
A quick check
What does take 4 (map (fun x -> x * x) (from 1)) evaluate to?
[0; 1; 4; 9][1; 4; 9; 16][1; 2; 3; 4]- diverges
Why: from 1 is the infinite stream 1, 2, 3, 4, ....
map (fun x -> x * x) squares every element, producing
1, 4, 9, 16, .... take 4 returns the first four as a list.
Why does lazy give you something a plain unit -> 'a thunk
does not?
lazyis faster to construct than a thunk.lazycaches the result; a thunk re-runs the body on each force.lazyworks for infinite values; thunks do not.lazyis statically checked; thunks are not.
Why: Both delay evaluation. The unique thing lazy adds is
memoization: the first force runs the body and stashes the
result; subsequent forces return the cached value. A thunk has no
cache; calling it twice runs the body twice.
Activity
Write take_while : ('a -> bool) -> 'a stream -> 'a list. It
walks the stream and collects elements until the predicate
returns false. If the predicate never returns false, the
walk never terminates; callers must make sure it eventually
does.
Show reference solution
Reference solution:
The recursion mirrors the structure of take: forces the head,
forces the tail thunk only when the predicate keeps holding. The
caller must guarantee p eventually returns false on an
infinite stream, otherwise take_while runs forever.
Show reference solution
What's next
The next lecture generalises the
caching trick that makes Lazy.t faster than a plain thunk. We
write a memo combinator that wraps any function with a cache,
time a memoized Fibonacci against the naive recursive one, and
use the same idea to make a slow recursive edit-distance
function fast. We close with the catch that ties memoization,
laziness, and streams together: all three presume purity: a
function that does I/O cannot be safely cached.
Reading
- Cornell CS3110, Lazy evaluation: https://cs3110.github.io/textbook/chapters/adv/promises.html
- Real World OCaml, Lazy values: https://dev.realworldocaml.org/imperative-programming.html#scrollNav-3
Sources
This lecture follows the structure and worked examples of
IIT Madras CS3100, Streams, Laziness and Memoization
(KC Sivaramakrishnan, Monsoon 2020). The stream type, the
sieve of Eratosthenes example, the lazy Fibonacci derivation,
and the framing of thunks vs Lazy.t are drawn directly from
the source lecture; prose is freshly authored for the NPTEL
format.