Tutorial for Module 7

Functional Programming with OCaml

Tutorial: a queue functor

Module 7 · Lecture 9

KC Sivaramakrishnan
IIT Madras

This tutorial pulls together the machinery from the rest of the module. We build a small functional queue (FIFO: first-in, first-out) using the classic two-stack trick. We package it as a module with a signature that hides the representation. Then we turn it into a functor parameterised by the element type, with a printer attached. By the end of the lecture you will have used every piece of vocabulary from Module 7 in one worked example.

The two-stack queue is a small classic of functional programming. A queue is a FIFO: you push onto one end and pop from the other. If you implement it naively as a single list, push and pop can't both be O(1): one of them has to walk to the far end. The trick is to keep two lists, one for the front (in normal order) and one for the back (in reverse order). enqueue conses onto the back; dequeue pulls from the head of the front; when the front runs out, we reverse the back to become the new front.

This is the same trick that backs many real-world queue implementations in functional languages. It is the kind of small, elegant data structure that is exactly the sort of thing a module system is designed for: a few operations on an abstract type, maintaining an invariant we do not want callers to see or break.

What this tutorial does

A picture first: how the two lists work

Before the code, trace three enqueues and a dequeue. enqueue always conses onto back, so back holds the newest elements in reverse arrival order. dequeue pops the head of front; when front is empty, it reverses back onto front, which flips those elements back into arrival order, so the oldest comes out first.

            front (pop here)        back (push here)
enqueue 1   []                      [1]
enqueue 2   []                      [2; 1]
enqueue 3   []                      [3; 2; 1]

dequeue     front is empty: reverse back onto front
            [1; 2; 3]               []
            pop the head, returns 1
            [2; 3]                  []

After the reverse, the next dequeues are plain list-head pops: that is the seed of the amortised-O(1) argument we make below. Both enqueue and dequeue are amortised O(1): enqueue is a single cons, and each element is reversed onto the front only once in its lifetime.

Tracing the two-stack queue

            front (pop here)        back (push here)
enqueue 1   []                      [1]
enqueue 2   []                      [2; 1]
enqueue 3   []                      [3; 2; 1]

dequeue     front empty: reverse back onto front
            [1; 2; 3]               []
            pop the head, returns 1
            [2; 3]                  []

The implementation, unsealed

We start with the raw implementation, with no signature attached.

type 'a queue = { front : 'a list; back : 'a list } let empty = { front = []; back = [] } let is_empty q = q.front = [] && q.back = [] let enqueue x q = { q with back = x :: q.back } let rec dequeue q = match q.front, q.back with | [], [] -> None | x :: rest, _ -> Some (x, { q with front = rest }) | [], back -> dequeue { front = List.rev back; back = [] } let q = empty |> enqueue 1 |> enqueue 2 |> enqueue 3 let _ = dequeue q (* = Some (1, ...) *)

Let's walk through each piece.

The type 'a queue is a record with two fields, both 'a list. The invariant (a property the implementation maintains and the caller relies on) is: if there is anything in the queue, the front list is the oldest elements in FIFO order, and the back list is the newest elements in reverse FIFO order. Equivalently, the conceptual queue is front @ List.rev back.

empty is the queue with both lists empty.

is_empty checks both lists. A queue is empty only if both are empty: if the front is empty but the back is not, there are still elements waiting (they will get moved to the front on the next dequeue).

enqueue x q adds x to the new end: it conses onto the back. This is O(1).

dequeue is the interesting one. Three cases:

The third case is where the amortised analysis kicks in. A single List.rev is O(n), so in the worst case a single dequeue takes O(n). But each element is only reversed once in its lifetime in the queue: it gets pushed onto the back, sits there, then is reversed onto the front exactly once. Over the total lifetime of the queue, each element pays O(1) of reversal cost. Amortised, dequeue is O(1). Worst-case, it is O(n).

The implementation

type 'a queue = { front : 'a list; back : 'a list } let empty = { front = []; back = [] } let is_empty q = q.front = [] && q.back = [] let enqueue x q = { q with back = x :: q.back } let rec dequeue q = match q.front, q.back with | [], [] -> None | x :: rest, _ -> Some (x, { q with front = rest }) | [], back -> dequeue { front = List.rev back; back = [] }

The implementation: a run

let q = empty |> enqueue 1 |> enqueue 2 |> enqueue 3 let _ = dequeue q (* = Some (1, ...) *)

The signature

We have a working queue, but the representation is exposed. A caller can construct a record { front = [1; 2]; back = [3; 4] } directly, possibly violating our invariant. Even worse, a caller can come to rely on the two-list shape, so that any future change to the representation breaks their code.

The fix from the signatures lecture: a signature with an abstract type.

module type QUEUE = sig type 'a t val empty : 'a t val is_empty : 'a t -> bool val enqueue : 'a -> 'a t -> 'a t val dequeue : 'a t -> ('a * 'a t) option end module Queue : QUEUE = struct type 'a t = { front : 'a list; back : 'a list } let empty = { front = []; back = [] } let is_empty q = q.front = [] && q.back = [] let enqueue x q = { q with back = x :: q.back } let rec dequeue q = match q.front, q.back with | [], [] -> None | x :: rest, _ -> Some (x, { q with front = rest }) | [], back -> dequeue { front = List.rev back; back = [] } end

Running it from outside the module:

let q = Queue.empty |> Queue.enqueue 1 |> Queue.enqueue 2 |> Queue.enqueue 3 let _ = Queue.dequeue q (* = Some (1, <abstr>) *) let _ = Queue.is_empty Queue.empty (* = true *)

The dequeue now returns <abstr> in place of the record: from outside Queue, the second component is an opaque 'a t.

The QUEUE signature lists exactly the operations callers can use. The type 'a t is abstract: outside Queue, you cannot see that it is a two-list record. The constructors empty and enqueue are how you build queues; dequeue is how you take them apart; is_empty lets you check. That is the entire surface.

The signature

module type QUEUE = sig type 'a t val empty : 'a t val is_empty : 'a t -> bool val enqueue : 'a -> 'a t -> 'a t val dequeue : 'a t -> ('a * 'a t) option end

Sealing the implementation

module Queue : QUEUE = struct type 'a t = { front : 'a list; back : 'a list } let empty = { front = []; back = [] } let is_empty q = q.front = [] && q.back = [] let enqueue x q = { q with back = x :: q.back } let rec dequeue q = match q.front, q.back with | [], [] -> None | x :: rest, _ -> Some (x, { q with front = rest }) | [], back -> dequeue { front = List.rev back; back = [] } end

Sealed: the run

let q = Queue.empty |> Queue.enqueue 1 |> Queue.enqueue 2 |> Queue.enqueue 3 let _ = Queue.dequeue q (* = Some (1, <abstr>) *)

Why hide the representation?

The two reasons we hid internals, applied here.

Why hide the representation?

Two reasons we've seen before:

Invariants. Our queue depends on front containing the elements in FIFO order and back containing them in reverse. If callers could write { front = [3; 2]; back = [1] } directly, they could create a queue that violates the invariant; subsequent operations would return elements in the wrong order. The signature prevents this: the only way to make a queue is to start from Queue.empty and use Queue.enqueue, which maintain the invariant.

Change. Maybe later we want to switch to a different implementation: a Dynarray, a doubly-linked list, an array ring buffer. As long as the new implementation provides empty, is_empty, enqueue, dequeue with the same types, no caller needs to change. The signature is the contract; we are free to change anything below it.

Turning it into a functor

Suppose now we want a queue parameterised by the element type, with a typed printer attached. (Maybe we want a debugger view, or a logger that prints queue contents.) The element type can no longer be free: we need a way to turn an element into a string.

The mechanism from the functors lecture: a functor. We start by writing a signature describing what we need from the element type.

module type ELT = sig type t val to_string : t -> string end module Make (E : ELT) = struct type elt = E.t type t = { front : elt list; back : elt list } let empty = { front = []; back = [] } let is_empty q = q.front = [] && q.back = [] let enqueue x q = { q with back = x :: q.back } let rec dequeue q = match q.front, q.back with | [], [] -> None | x :: rest, _ -> Some (x, { q with front = rest }) | [], back -> dequeue { front = List.rev back; back = [] } let print q = let items = q.front @ List.rev q.back in print_endline ("[" ^ String.concat ", " (List.map E.to_string items) ^ "]") end

Apply it to int and run:

module IQ = Make (struct type t = int let to_string = string_of_int end) let q = IQ.empty |> IQ.enqueue 1 |> IQ.enqueue 2 |> IQ.enqueue 3 let () = IQ.print q (* prints: [1, 2, 3] *)

Notice print shows the queue as a single FIFO sequence, not the internal two-list split: it joins front @ List.rev back, exactly the conceptual queue. The printer respects the abstraction it sits behind: callers see [1, 2, 3], never the front / back representation. Internally front is empty and back is [3; 2; 1] here, but that never reaches the output.

Turning it into a functor: the parameter

module type ELT = sig type t val to_string : t -> string end

The functor body

module Make (E : ELT) = struct type elt = E.t type t = { front : elt list; back : elt list } let empty = { front = []; back = [] } let is_empty q = q.front = [] && q.back = [] let enqueue x q = { q with back = x :: q.back } let rec dequeue q = match q.front, q.back with | [], [] -> None | x :: rest, _ -> Some (x, { q with front = rest }) | [], back -> dequeue { front = List.rev back; back = [] } let print q = let xs = q.front @ List.rev q.back in print_endline ("[" ^ String.concat ", " (List.map E.to_string xs) ^ "]") end

Applying the functor

module IQ = Make (struct type t = int let to_string = string_of_int end) let q = IQ.empty |> IQ.enqueue 1 |> IQ.enqueue 2 |> IQ.enqueue 3 let () = IQ.print q (* prints: [1, 2, 3] *)

A few things to read out of this. The Make functor takes a parameter E of signature ELT: any module with a type t and a to_string : t -> string. Inside Make, we reference E.t (the element type) and E.to_string (the printer). The resulting module type fixes elt = E.t, so IQ.elt is int once we instantiate with the int module.

This is the same shape as Map.Make: a constraint on the element type (via the parameter signature), and a generic implementation parameterised by that constraint.

What is notable about the functor

What's notable about the functor

A string queue: a run

module String_queue = Make (struct type t = string let to_string s = s end) let sq = String_queue.empty |> String_queue.enqueue "a" |> String_queue.enqueue "b" let () = String_queue.print sq (* prints: [a, b] *)

The functor is specialised once you apply it: IQ.elt is exactly int. Trying to enqueue a string into IQ is a type error caught at compile time, the same way trying to add a string to an int list is. But the implementation of the queue operations is written once: the body of Make does not know or care what the element type is, except through E.to_string.

To build a string queue you write one line, then use it like any other queue:

module String_queue = Make (struct type t = string let to_string s = s end) let sq = String_queue.empty |> String_queue.enqueue "a" |> String_queue.enqueue "b" let () = String_queue.print sq (* prints: [a, b] *)

This is exactly how Map.Make, Set.Make, and Hashtbl.Make in the standard library work: one implementation, many specialised instantiations. You write the data structure once and reuse it forever.

A quick check

In the two-stack queue, what is the worst-case time complexity of a single dequeue operation?

Why: when the front list is empty and the back has n elements, dequeue calls List.rev on the back, which is O(n). Amortised across many operations the cost is O(1) per element (each element is reversed only once), but a single dequeue can be O(n).

Given module Make (E : ELT) = struct ... end, what happens if we try Make (struct type t = int end) (forgetting to_string)?

Why: the functor argument must satisfy ELT, which requires both t and to_string. Missing one is rejected at the functor application, at compile time, before any code runs.

Activity

Activity

Add length : 'a t -> int to the queue. Update the signature. What does the compiler require?

Add a length operation to the queue. Both the signature and the struct need updating. The starter has the unsealed version; your job is to add length everywhere.

module type QUEUE = sig type 'a t val empty : 'a t val is_empty : 'a t -> bool val enqueue : 'a -> 'a t -> 'a t val dequeue : 'a t -> ('a * 'a t) option end module Queue : QUEUE = struct type 'a t = { front : 'a list; back : 'a list } let empty = { front = []; back = [] } let is_empty q = q.front = [] && q.back = [] let enqueue x q = { q with back = x :: q.back } let rec dequeue q = match q.front, q.back with | [], [] -> None | x :: rest, _ -> Some (x, { q with front = rest }) | [], back -> dequeue { front = List.rev back; back = [] } end let queue_length _q : int = failwith "not implemented"
Show reference solution

Activity solution: the signature

module type QUEUE = sig type 'a t val empty : 'a t val is_empty : 'a t -> bool val length : 'a t -> int val enqueue : 'a -> 'a t -> 'a t val dequeue : 'a t -> ('a * 'a t) option end

One new val: length : 'a t -> int.

Show reference solution

Activity solution: the module

module Queue : QUEUE = struct type 'a t = { front : 'a list; back : 'a list } let empty = { front = []; back = [] } let is_empty q = q.front = [] && q.back = [] let length q = List.length q.front + List.length q.back let enqueue x q = { q with back = x :: q.back } let rec dequeue q = match q.front, q.back with | [], [] -> None | x :: rest, _ -> Some (x, { q with front = rest }) | [], back -> dequeue { front = List.rev back; back = [] } end let queue_length q = Queue.length q let q = Queue.empty |> Queue.enqueue 1 |> Queue.enqueue 2 |> Queue.enqueue 3 let _ = queue_length q (* = 3 *)
  • The starter's queue_length stub forwards to the new Queue.length; the tests call queue_length.

The compiler enforces both sides

This is the value of the signature in action. Adding a feature requires touching both files (or both halves of an inline declaration): the signature gets a val, the implementation gets a let. The compiler checks both directions. If you add to the signature but forget the implementation, you get Signature mismatch: missing value 'length'. If you add to the implementation but forget the signature, the new function exists but is inaccessible from outside. The compiler keeps the contract between interface and implementation consistent.

The implementation length q = List.length q.front + List.length q.back is itself O(n) (each List.length walks its list). For a queue that you query often, you might cache the length in a third field of the record, updated by each enqueue and dequeue. The signature would not change; only the implementation would. This is the kind of optimisation the abstract type lets you do silently.

Activity: a queue of queues

The element type of 'a Queue.t is unconstrained, so 'a can be any type, including another queue. Instantiate a queue of queues of integers, type int Queue.t Queue.t, with two inner queues, then reach in: dequeue the first inner queue, and dequeue its first element.

Activity: a queue of queues

Show reference solution

Activity solution: nesting comes for free

let inner1 = Queue.empty |> Queue.enqueue 1 |> Queue.enqueue 2 let inner2 = Queue.empty |> Queue.enqueue 3 let qoq = Queue.empty |> Queue.enqueue inner1 |> Queue.enqueue inner2 (* qoq : int Queue.t Queue.t *) let first_elt = match Queue.dequeue qoq with | Some (first, _) -> (match Queue.dequeue first with Some (x, _) -> Some x | None -> None) | None -> None (* = Some 1 *)
  • No new code: the abstract 'a t already works for any 'a, including another queue.

What you should be able to do now

What you should be able to do now

After Module 7 you can:

Next: Module 8 covers monads and GADTs.

You have now seen every piece of the imperative and modular OCaml toolkit. Refs and arrays for mutation when the algorithm wants it. Exceptions for unexpected failures, alongside option and result for predictable ones. Modules for grouping and namespacing. Signatures for hiding internals. Functors for writing generic data structures parameterised by element operations. Together they are enough to structure a real OCaml project at scale.

Module 8 turns to two more advanced abstractions: monads, which sequence computations cleanly across effects (option, result, state, exceptions, IO), and GADTs, generalized algebraic data types, which let you encode richer constraints in the type system. Both are common in serious OCaml code; both reward the groundwork we have laid through Modules 1 through 7.

Reading

Sources

This lecture's prose, worked examples, and quizzes are original to this course. Materials referenced during preparation are listed in the Reading section above; Cornell CS3110 and Real World OCaml are CC BY-NC-ND-licensed and have not been derivatively reused. See LICENSES.md at the repository root for the full source posture.