Tutorial for Module 7
This tutorial pulls together the machinery from the rest of the module. We build a small functional queue (FIFO: first-in, first-out) using the classic two-stack trick. We package it as a module with a signature that hides the representation. Then we turn it into a functor parameterised by the element type, with a printer attached. By the end of the lecture you will have used every piece of vocabulary from Module 7 in one worked example.
The two-stack queue is a small classic of functional programming.
A queue is a FIFO: you push onto one end and pop from the other.
If you implement it naively as a single list, push and pop
can't both be O(1): one of them has to walk to the far end. The
trick is to keep two lists, one for the front (in normal order)
and one for the back (in reverse order). enqueue conses onto
the back; dequeue pulls from the head of the front; when the
front runs out, we reverse the back to become the new front.
This is the same trick that backs many real-world queue implementations in functional languages. It is the kind of small, elegant data structure that is exactly the sort of thing a module system is designed for: a few operations on an abstract type, maintaining an invariant we do not want callers to see or break.
A picture first: how the two lists work
Before the code, trace three enqueues and a dequeue. enqueue
always conses onto back, so back holds the newest elements in
reverse arrival order. dequeue pops the head of front; when
front is empty, it reverses back onto front, which flips
those elements back into arrival order, so the oldest comes out
first.
front (pop here) back (push here)
enqueue 1 [] [1]
enqueue 2 [] [2; 1]
enqueue 3 [] [3; 2; 1]
dequeue front is empty: reverse back onto front
[1; 2; 3] []
pop the head, returns 1
[2; 3] []
After the reverse, the next dequeues are plain list-head pops:
that is the seed of the amortised-O(1) argument we make below.
Both enqueue and dequeue are amortised O(1): enqueue is a
single cons, and each element is reversed onto the front only
once in its lifetime.
The implementation, unsealed
We start with the raw implementation, with no signature attached.
Let's walk through each piece.
The type 'a queue is a record with two fields, both 'a list.
The invariant (a property the implementation maintains and the
caller relies on) is: if there is anything in the queue, the
front list is the oldest elements in FIFO order, and the back
list is the newest elements in reverse FIFO order. Equivalently,
the conceptual queue is front @ List.rev back.
empty is the queue with both lists empty.
is_empty checks both lists. A queue is empty only if both are
empty: if the front is empty but the back is not, there are still
elements waiting (they will get moved to the front on the next
dequeue).
enqueue x q adds x to the new end: it conses onto the back.
This is O(1).
dequeue is the interesting one. Three cases:
[], []: the queue is empty; returnNone.x :: rest, _: the front has at least one element; return it, and the rest of the queue is{ front = rest; back = q.back }.[], back: the front is empty but the back is not; reverse the back into a fresh front and recurse.
The third case is where the amortised analysis kicks in. A
single List.rev is O(n), so in the worst case a single
dequeue takes O(n). But each element is only reversed once in
its lifetime in the queue: it gets pushed onto the back, sits
there, then is reversed onto the front exactly once. Over the
total lifetime of the queue, each element pays O(1) of reversal
cost. Amortised, dequeue is O(1). Worst-case, it is O(n).
The signature
We have a working queue, but the representation is exposed. A
caller can construct a record { front = [1; 2]; back = [3; 4] }
directly, possibly violating our invariant. Even worse, a caller
can come to rely on the two-list shape, so that any future
change to the representation breaks their code.
The fix from the signatures lecture: a signature with an abstract type.
Running it from outside the module:
The dequeue now returns <abstr> in place of the record: from
outside Queue, the second component is an opaque 'a t.
The QUEUE signature lists exactly the operations callers can
use. The type 'a t is abstract: outside Queue, you cannot see
that it is a two-list record. The constructors empty and
enqueue are how you build queues; dequeue is how you take
them apart; is_empty lets you check. That is the entire
surface.
Why hide the representation?
The two reasons we hid internals, applied here.
Invariants. Our queue depends on front containing the
elements in FIFO order and back containing them in reverse. If
callers could write { front = [3; 2]; back = [1] } directly,
they could create a queue that violates the invariant; subsequent
operations would return elements in the wrong order. The
signature prevents this: the only way to make a queue is to start
from Queue.empty and use Queue.enqueue, which maintain the
invariant.
Change. Maybe later we want to switch to a different
implementation: a Dynarray, a doubly-linked list, an array
ring buffer. As long as the new implementation provides empty,
is_empty, enqueue, dequeue with the same types, no caller
needs to change. The signature is the contract; we are free to
change anything below it.
Turning it into a functor
Suppose now we want a queue parameterised by the element type, with a typed printer attached. (Maybe we want a debugger view, or a logger that prints queue contents.) The element type can no longer be free: we need a way to turn an element into a string.
The mechanism from the functors lecture: a functor. We start by writing a signature describing what we need from the element type.
Apply it to int and run:
Notice print shows the queue as a single FIFO sequence, not the
internal two-list split: it joins front @ List.rev back, exactly
the conceptual queue. The printer respects the abstraction it
sits behind: callers see [1, 2, 3], never the front / back
representation. Internally front is empty and back is
[3; 2; 1] here, but that never reaches the output.
A few things to read out of this. The Make functor takes a
parameter E of signature ELT: any module with a type t and
a to_string : t -> string. Inside Make, we reference E.t
(the element type) and E.to_string (the printer). The resulting
module type fixes elt = E.t, so IQ.elt is int once we
instantiate with the int module.
This is the same shape as
Map.Make: a
constraint on the element type (via the parameter signature), and
a generic implementation parameterised by that constraint.
What is notable about the functor
The functor is specialised once you apply it: IQ.elt is
exactly int. Trying to enqueue a string into IQ is a type
error caught at compile time, the same way trying to add a string
to an int list is. But the implementation of the queue
operations is written once: the body of Make does not know or
care what the element type is, except through E.to_string.
To build a string queue you write one line, then use it like any other queue:
This is exactly how Map.Make, Set.Make, and Hashtbl.Make in
the standard library work: one implementation, many specialised
instantiations. You write the data structure once and reuse it
forever.
A quick check
In the two-stack queue, what is the worst-case time complexity of
a single dequeue operation?
- O(1)
- O(n)
- O(log n)
- O(n^2)
Why: when the front list is empty and the back has n
elements, dequeue calls List.rev on the back, which is O(n).
Amortised across many operations the cost is O(1) per element
(each element is reversed only once), but a single dequeue can
be O(n).
Given module Make (E : ELT) = struct ... end, what happens if
we try Make (struct type t = int end) (forgetting to_string)?
- Returns a partial module.
- Sets
to_stringtostring_of_intautomatically. - Compile error: signature mismatch,
to_stringnot provided. - Runtime error when
to_stringis called.
Why: the functor argument must satisfy ELT, which requires
both t and to_string. Missing one is rejected at the
functor application, at compile time, before any code runs.
Activity
Add a length operation to the queue. Both the signature and the
struct need updating. The starter has the unsealed version; your
job is to add length everywhere.
Show reference solution
Show reference solution
This is the value of the signature in action. Adding a feature
requires touching both files (or both halves of an inline
declaration): the signature gets a val, the implementation gets
a let. The compiler checks both directions. If you add to the
signature but forget the implementation, you get Signature mismatch: missing value 'length'. If you add to the implementation
but forget the signature, the new function exists but is
inaccessible from outside. The compiler keeps the contract
between interface and implementation consistent.
The implementation length q = List.length q.front + List.length q.back is itself O(n) (each List.length walks its list). For a
queue that you query often, you might cache the length in a third
field of the record, updated by each enqueue and dequeue. The
signature would not change; only the implementation would. This
is the kind of optimisation the abstract type lets you do
silently.
Activity: a queue of queues
The element type of 'a Queue.t is unconstrained, so 'a can be
any type, including another queue. Instantiate a queue of queues
of integers, type int Queue.t Queue.t, with two inner queues,
then reach in: dequeue the first inner queue, and dequeue its
first element.
Show reference solution
What you should be able to do now
You have now seen every piece of the imperative and modular OCaml
toolkit. Refs and
arrays for mutation when the
algorithm wants it. Exceptions for
unexpected failures, alongside
option and result for
predictable ones. Modules for
grouping and namespacing.
Signatures for hiding internals.
Functors for writing generic data
structures parameterised by element operations. Together they are
enough to structure a real OCaml project at scale.
Module 8 turns to two more advanced abstractions: monads, which sequence computations cleanly across effects (option, result, state, exceptions, IO), and GADTs, generalized algebraic data types, which let you encode richer constraints in the type system. Both are common in serious OCaml code; both reward the groundwork we have laid through Modules 1 through 7.
Reading
- Cornell CS3110, Functors (the functional queue is a worked example late in the chapter): https://cs3110.github.io/textbook/chapters/modules/functors.html
- Real World OCaml, Functors: https://dev.realworldocaml.org/functors.html
Sources
This lecture's prose, worked examples, and quizzes are original to
this course. Materials referenced during preparation are listed in
the Reading section above; Cornell CS3110 and Real World OCaml
are CC BY-NC-ND-licensed and have not been derivatively reused.
See LICENSES.md
at the repository root for the full source posture.