map: transform every element
map takes a function f and a list xs, and produces a new list
where every element is f applied to the corresponding element of
xs. Its type signature, ('a -> 'b) -> 'a list -> 'b list, says
the same thing in formal language: give me a function from 'a to
'b, give me a list of 'as, and I will give you a list of 'bs
with the same length and order.
It is by some distance the most-used higher-order function in
everyday OCaml. Any time you have a list and you want a new list of
"the same things, but transformed somehow," map is the right tool.
This lecture goes through map carefully because the patterns we
develop here (extracting a walk, writing recursive higher-order
functions, thinking about polymorphic types)
carry over to filter, fold,
and everything else in this module.
Writing map from scratch
We saw a hint of map at the end of Lecture 1.
Here is the full derivation. Suppose we want two list functions:
double_each doubles every element of an int list. string_lengths
turns a string list into the int list of their lengths. Notice
the second function's input and output element types differ; the
output list is still a list, the same length, but built from values
computed from each input.
The two functions share a skeleton: dispatch on the list, return
[] on [], and on a cons, compute a fresh head and cons it onto
the recursive call. The only thing that differs is what each
function does to the head. Pull that out as a parameter f:
That is map. Nine lines, including the example. The original
double_each and string_lengths collapse into one-liners on top of
it:
Each is a partial application
of map: we have supplied the function argument and left the list
argument unbound. The result of each partial application is a
function that, given a list of the right element type, returns the
transformed list. This is exactly the
Abstraction Principle
from Lecture 1, made concrete.
The OCaml standard library calls this function List.map. The
implementation is essentially what we just wrote. From now on, use
List.map rather than redefining it; the only reason we wrote it
ourselves is to see clearly what it does.
The type signature is a contract
Look at the type signature carefully:
val map : ('a -> 'b) -> 'a list -> 'b list
It says four useful things at once.
First, input and output element types can differ. Look at the
'a and 'b: nothing forces them to be the same. So we can map
int list to string list with string_of_int:
Second, the function is polymorphic. A single List.map works
for int list -> int list, int list -> string list, string list -> int list, anything. The compiler infers the specific 'a and
'b from the function and the list you pass in.
Third, the result is a list. The output is always a list, never
some other shape. map does not collapse a list to a number, or
sort it, or split it; it just transforms each element in place.
Fourth, the result has the same length as the input. The
implementation makes one output element per input element. No
duplications, no omissions. If you want a different length, you
want a different function: filter (drops
elements), filter_map
(drops and transforms), or fold_left
(returns anything you want).
Reading types this carefully pays off. Once you internalise what a type signature is telling you, you can predict the rough shape of a function before reading its body. With higher-order functions, this is most of the battle: the type signature does much of the documenting work.
Examples
The first call doubles every element. The second uses the standard
library function string_of_int directly as the function argument:
because OCaml functions are first-class values, we pass the function
by name without writing a lambda. The third call uses String.length
similarly to turn a list of strings into a list of their lengths.
Partial application + map
Combining the operator-as-function trick
from Lecture 1 with map gives some of the most compact OCaml in
the standard library:
(+) 10 is (+) (the addition function) applied to one of its two
arguments, leaving a function int -> int that adds 10. Pass that
function to List.map, and you get the list with 10 added to every
element. The whole expression has no anonymous functions in it, yet
it does the same work as List.map (fun x -> x + 10) [1; 2; 3]. The
shorter form takes a little getting used to but quickly becomes
natural to read.
You will see this idiom often. It is one of the small payoffs of a language where operators are values and partial application is free.
map does not change the length
A property worth saying out loud:
map is the right tool only when input length and output length
should match. If you find yourself reaching for map and then
filtering the result to discard some elements, the right tool was
probably filter_map
(Lecture 3); if you want to collapse the list to a single value, the
right tool is fold (Lecture 4). Picking the
right tool is half the job; this is the easy bit.
Tail recursion and List.map
Our map from earlier is not tail-recursive:
The recursive call map f t is not the last thing the function
does. After it returns, we still have to cons f h onto its result.
So the call sits in the call stack waiting for the recursion to
return, just like the naive recursive sum we saw in
Module 3.
For lists of a few thousand elements this is fine: OCaml's default stack size is generous. For lists of millions of elements, the naive version eventually overflows.
You might think the obvious fix is to introduce an accumulator and recurse tail-fashion:
This is tail-recursive, but it is awful. The expression acc @ [f h]
is a list append, which walks the entire acc to find its end. So
each recursive step does linear work, and there are n steps:
total O(n^2). What was a linear-time function is now quadratic.
Tail-recursive, sure, but at a brutal cost.
The cleaner fix is to cons onto the accumulator (which is constant time), accepting that the accumulator will end up reversed, and reverse it at the end:
Two passes through the list (the fold-style traversal, then the
reverse), but each pass is linear and tail-recursive. Total: still
O(n) time, O(1) stack.
Where does List.rev come from? It is itself a small
tail-recursive function with the same accumulator-and-cons shape
as our tail-recursive map:
No magic: walk the list left to right, cons each element onto
acc (so each element moves to the front of the accumulator),
return acc at the end. The same tail-recursion pattern we just
applied to map.
What does the standard library do? On modern OCaml (5.1 and
later), List.map keeps the natural cons-onto-the-recursive-call
definition, but the compiler recognises that shape and turns it
into a loop; the technique is called tail recursion modulo cons
(TMC), and it makes List.map stack-safe even on very long lists.
On older OCaml, List.map really could overflow, and the standard
workaround was List.rev (List.rev_map f xs): List.rev_map is
tail-recursive but builds the result reversed, so you reverse it
back, paying a second pass.
The bigger point is that higher-order functions hide these
tradeoffs from the caller. You write List.map f xs and stop
thinking about it; the library (and compiler) authors did the
worrying about stack safety once, so every caller benefits.
map on options
List.map is the most familiar instance of a deeper pattern. Any
container-like type that "holds" elements can have its own map.
OCaml's option type
is the simplest example after lists. An 'a option is either None
(no value) or Some x (a value of type 'a). The standard library
provides Option.map:
The first call returns Some 6: the function is applied to the
contained value. The second returns None: there is no contained
value, so the function is not called and the None passes through.
This is exactly the same pattern as list map: walk the structure,
transform the elements, preserve the structure.
map on trees
For your own data types, you can define map yourself. Here is a
binary tree with
values at internal nodes:
The result has the same shape as the input; only the values are
transformed. We will return to this generalisation in
Module 7 when we look at how libraries like
Map and Set package these patterns into reusable abstractions.
Worked example: producing nicely-formatted strings
The first map is projection: pull a field out of every record.
This is so common that some libraries (Jane Street's Core, for
instance) define a shorthand. The second map is transformation:
combine fields of each record into a derived string. Both are the
same map; only the per-element function differs.
A quick check
What is the result of List.map String.length ["hi"; "hello"; ""]?
["2"; "5"; "0"][2; 5; 0][2; 5]3
Why: String.length takes a string and returns an int. So
this is mapping string list to int list. The empty string has
length 0, so it is not dropped (that would be filtering).
List.map always preserves length.
What is the type of List.map fst?
('a * 'b) -> 'a('a * 'b) list -> 'a list('a -> 'b) -> 'a list -> 'b list'a list * 'b list -> 'a list
Why: fst : 'a * 'b -> 'a extracts the first component of a
pair. Partially applying List.map to fst gives a function
('a * 'b) list -> 'a list. So List.map fst [(1,"a"); (2,"b")]
returns [1; 2].
Now a code challenge:
map walks a list of values and applies one function to each.
Flip the roles: write
apply_all : ('a -> 'b) list -> 'a -> 'b list that walks a list
of functions and applies each one to a single value x,
collecting the results in order. So
apply_all [f; g; h] x = [f x; g x; h x].
Show reference solution
Reference solution:
let rec apply_all fs x =
match fs with
| [] -> []
| f :: rest -> f x :: apply_all rest x
The skeleton is exactly the map walk; what changed is which side
is the data. The list now holds the functions, and the single value
x plays the role the function f played in map. Functions are
values, so a list of functions is as ordinary as a list of ints.
Activity
Show reference solution
What's next
map transforms but never drops. The next lecture is
filter: the higher-order function for
dropping elements based on a predicate.
Reading
- Cornell CS3110, Map: https://cs3110.github.io/textbook/chapters/hop/map.html
- Real World OCaml, Lists and patterns: https://dev.realworldocaml.org/lists-and-patterns.html
Sources
This lecture's prose, worked examples, and quizzes are original to
this course. Materials referenced during preparation are listed in
the Reading section above; Cornell CS3110 and Real World OCaml
are CC BY-NC-ND-licensed and have not been derivatively reused.
See LICENSES.md
at the repository root for the full source posture.