Functors
We have been writing functions that take values and return values all course. Modules, the last two lectures introduced, are collections of values, types, and helpers grouped under a name. This lecture introduces the natural next step: functions from modules to modules. In OCaml these are called functors, and they are how the language expresses generic data structures and algorithms.
The name comes from category theory;
in OCaml the practical meaning is narrower: a functor takes one or
more modules and produces a new module. We will keep saying
"function" for one informally, even though strictly speaking OCaml
functors live at the module level, not the value level: you cannot
store one in a list or return one from an if. They are
parameterised modules, with module-level parameters substituting for
value-level ones.
The standard library uses functors everywhere. Map.Make,
Set.Make, and Hashtbl.Make are all functors. When you want a
map keyed by strings, you write module M = Map.Make(String); the
result is a full map module specialised for string keys. This
lecture explains how that works and shows you how to write your
own.
Why we need them
Most of the parameterisation we have done so far has been
parametric polymorphism: a function or type that works for any
element type without knowing anything about it.
List.map: ('a -> 'b) -> 'a list -> 'b list
is parametric in 'a and 'b; it treats elements as opaque tokens
and never looks inside them.
But many data structures cannot get away with that. A binary
search tree of 'a needs to compare values of 'a to decide
which side of a node to put them on. A hash table needs to hash
keys and compare them for equality. A sorted set needs an
ordering. Parametric polymorphism alone does not give you a
comparison function or a hash function: those would have to come
from somewhere.
Take Map specifically, since it is the example we use below. A
map answers find k: is there a binding for the key k, and
what value does it carry? To make that fast, the standard library
keeps the bindings in a balanced binary search tree ordered by
key. Looking up k walks down from the root, and at each node it
compares k with that node's key to decide whether to go left,
go right, or stop because it found it. That is O(log n)
comparisons rather than scanning all n bindings one by one. But
comparing two keys needs a compare function on the key type, and
parametric polymorphism cannot supply one: a polymorphic map would
treat its keys as opaque tokens and never look inside them. The
comparison has to be handed in from outside. That is precisely
what the argument module to Map.Make provides.
Here is where functors come in. A functor takes as input a module
providing the required operations (compare, hash, whatever)
and produces a new module: the data structure specialised to that
input. The input module is the parameter; the output module is
the instantiation. You apply the functor by passing it an
argument module, exactly like calling a function.
The pattern: Map.Make
The standard library's Map module is internally structured
around a functor called Make. Here is the everyday use:
A few things to read out of that code. Map.Make is the functor.
Its argument is the inline module struct type t = int; let compare = Int.compare end. The argument module provides the
key type and a comparison function on that type. Map.Make
returns a full map module specialised to that key type: empty,
add, find, find_opt, mem, and all the other standard map
operations. We bind the result to the name Int_map.
The values of the map are still polymorphic. Int_map.t is
parameterised: an int -> string map and an int -> bool map
are different types, but both come from the same Int_map
module. The key type is fixed by the functor application; the
value type is parametric.
Strings as keys
The standard String module already provides what Map.Make
needs: a type t (which happens to be string) and a
String.compare function. So we can pass it directly:
String has the right shape because the standard library is
designed for it: String.t = string and String.compare has the
required type string -> string -> int. The standard Int,
Char, and Bytes modules all satisfy the same interface; you
can use them as the argument to Map.Make directly.
What does the functor look like inside?
Conceptually, Map.Make is defined something like this:
module Make (Key : sig
type t
val compare : t -> t -> int
end) = struct
type key = Key.t
type 'a t = ... (* balanced tree implementation *)
let empty = ...
let add k v m = ...
let find k m = ...
let find_opt k m = ...
end
(Make lives inside the standard library's Map module, so you
write Map.Make.) Make takes a parameter Key of an inline
signature: any module
with a type t and a compare : t -> t -> int. Inside Make,
the type Key.t and the function Key.compare are available;
the body of the functor implements the map (with a balanced
binary search tree, for the standard library's Map), referring
to Key.compare for ordering comparisons.
The actual implementation in the OCaml standard library is a few hundred lines of balanced-binary-search-tree code. But the interface is the shape above. The functor decides "I need a key type and a comparison on it"; the user supplies those; the functor returns a full map.
Writing your own functor
Here is a toy Set functor, building on the ORDERED signature
we wrote at the
end of the previous lecture.
A working sorted-list set in about twenty lines, parameterized
over any ordered type. The list is kept in ascending order
(smallest first), which is what lets add stop at the right
insertion point and mem give up as soon as it passes where x
would be.
Notice the syntax: module SetLite (E : ORDERED) = struct ... end declares a functor SetLite. The parameter E has signature
ORDERED; the parentheses around (E : ORDERED) are required. The
body of the functor is a structure, like any other module. Inside,
we reference E.t (the element type, abstract here) and
E.compare (the comparison function). The functor returns a
module with elt, t, empty, mem, add.
To apply the functor, write SetLite (M) where M is some
module satisfying ORDERED. We use an inline module to make
Int_set. The same trick works for any ordered type:
SetLite (String) would give us a string set,
SetLite (Int_ord) likewise if we had defined an Int_ord
module. One caveat if you do define such a module: sealing it as
module Int_ord : ORDERED = ... hides t, so the resulting
set's element type would be abstract and you could never build
an element to insert. Real code keeps t usable with a
signature constraint, ORDERED with type t = int, which the
practice problems cover.
This is the pattern. Define a signature describing what your data
structure needs from the element type (compare, or hash, or
equal + hash, or zero + +); write a functor parameterized
by a module of that signature; instantiate the functor for each
concrete element type you want.
Functors versus Java generics
A natural question: how does this compare to List<E> in Java or
Vec<T> in Rust?
In Java, the standard library's TreeSet<E> requires E to
implement the Comparable<E> interface. That is the constraint
on the element type. The user writes TreeSet<String> and the
compiler checks that String implements Comparable. The same
information OCaml carries in a module (a type t and a
compare function), Java carries in an interface and a class
that implements it.
The two approaches are roughly equivalent in expressiveness for the data-structure use case. The OCaml approach scales more cleanly when you want to combine multiple constraints (an ordered, hashable type with a default value, say): you just expand the signature. It is also more flexible because a single type can satisfy multiple module signatures in different ways (provide two different orderings on the same type by writing two modules). The Java approach is more familiar to programmers arriving from C++ or C#, where generics are the everyday mechanism.
Including a functor's output
A common pattern: you want a Map-like module that has all the
operations of a standard Map, plus a few of your own. The
include keyword from the previous lecture combines with functor
application to make this easy.
include Map.Make(Int) runs the functor, gets back a full
int-keyed map module, and copies all its definitions into the
enclosing structure. The let pp ... then adds a printer on top.
The resulting Int_map is "everything Map.Make(Int) would have
given you, plus our pp."
This pattern is the way to extend a standard library data structure with project-specific helpers. You see it constantly in real OCaml codebases.
A quick check
Why is the standard library's Set module a functor (Set.Make)
rather than a plain Set type parameterized by 'a?
- To save memory.
- Because OCaml does not support polymorphic types.
- Because a set needs an ordering on its elements, and parametric polymorphism alone does not provide one.
- To make the API more complicated.
Why: a balanced binary search tree, the standard
implementation of Set, needs to compare elements to keep itself
sorted. A polymorphic 'a Set.t would have no way to know how to
compare two 'as. The functor takes the comparison as part of its
argument module, fixing the element type and the ordering in one
go.
What is the result of this snippet? (Map.Make(String) orders
its keys with String.compare, the usual lexicographic
ordering.)
21Not_found- error
Why: the map associates "a" with 1 and "b" with 2.
M.find "a" m returns the value bound to "a", which is 1.
The order of add calls does not matter for which value "a"
is bound to: add is a functional update (returns a new map),
and the most recent binding for a key wins.
Activity
Fill in add and count. The representation is an association
list pairing each distinct element with its current count.
Show reference solution
The point of the activity is that one functor argument
(ORDERED) supports many data structures: the same compare we
used for a set drives a multiset, a map, a priority queue, and so
on. Writing Bag as a functor (rather than fixing elt = int)
means it works for strings, dates, or any type you can order,
exactly like SetLite.
For comparison, the standard library's Map.Make we saw earlier
chains its inserts with the |> operator,
and each add returns a new map sharing structure with the old
one (persistent balanced trees, O(log n) per insert). The
association-list Bag here is simpler but O(n); swapping in a
balanced-tree representation later would not change its interface,
which is the whole point of sealing a module behind a
signature.
What's next
The module tutorial is next. We build a small functional queue using the classic two-stack trick: keep two lists, one for the front and one for the back in reverse order; push onto the back and pop from the front, refilling the front from the (reversed) back when it runs out. We package the queue as a module with a signature hiding the representation, then turn it into a functor parameterised by the element type. It is a worked example that touches every piece of machinery from this module.
Reading
- Cornell CS3110, Functors: https://cs3110.github.io/textbook/chapters/modules/functors.html
- Real World OCaml, Functors: https://dev.realworldocaml.org/functors.html
Sources
This lecture's prose, worked examples, and quizzes are original to
this course. Materials referenced during preparation are listed in
the Reading section above; Cornell CS3110 and Real World OCaml
are CC BY-NC-ND-licensed and have not been derivatively reused.
See LICENSES.md
at the repository root for the full source posture.