Guards: when-clauses on patterns
A pattern by itself matches on shape: this constructor, that literal, a wildcard. Sometimes you want to filter further on a computation: not just "the head of the list is some integer", but "the head of the list is a positive integer." Pure patterns cannot express this. They cannot compare two bound names. They cannot ask "is this string longer than 5 characters?" The pattern language is deliberately restricted because the restriction is what lets the compiler check exhaustiveness and compile the dispatch efficiently.
The escape hatch is the guard: a when-clause attached to a
pattern. A guard is a boolean expression evaluated after the
pattern matches. If the pattern matches and the guard is true,
the clause fires. If either fails, the matcher moves on to the
next clause.
Guards extend what you can express in a match, but they come at
a cost: the exhaustiveness checker cannot count a guarded clause
as covering anything, because the guard might fail. We will see
why and what to do about it in this lecture, and revisit the
trade-off in Lecture 5.
A first example
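A minimal sketch of the function the following paragraphs walk through, reconstructed from their description. The name sign is the one the lecture uses later; the exact layout is an assumption.

```ocaml
(* Reconstruction from the prose: three clauses, two of them guarded. *)
let sign n =
  match n with
  | n when n > 0 -> "positive"  (* pattern n always matches; the guard filters *)
  | n when n < 0 -> "negative"  (* a fresh n, local to this clause *)
  | _ -> "zero"                 (* unguarded catch-all: only 0 reaches it *)
```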
Read the first clause: "match the value against n (which
matches anything and binds it to n), and check that n > 0."
If both hold, return "positive". If the pattern matched but the
guard was false, this clause does not fire and the matcher
proceeds to the next clause.
Three things to notice. First, the same name n is bound in
each of the first two clauses. Each binding is local to its
clause: the n in clause 1 is unrelated to the n in clause 2.
Second, the order matters: positive is checked before negative,
and the wildcard catches everything else (i.e., zero). Third,
the guard is an arbitrary OCaml expression of type bool, not a
new mini-language. You can call functions, do arithmetic,
compare strings, anything that returns a bool.
Without guards, you would write sign with nested if:
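A sketch of what that nested-if version might look like, assuming the same three result strings as the guarded version:

```ocaml
(* Same behaviour as the guarded match, written as a chain of ifs. *)
let sign n =
  if n > 0 then "positive"
  else if n < 0 then "negative"
  else "zero"
```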
The two versions compute the same thing. The guarded match
version is preferred when:
- You are already pattern matching for other reasons (the function dispatches on a variant, say, and you also need a numerical predicate).
- You want the cases lined up vertically, with pattern and predicate side by side.
If the entire body is just a chain of threshold comparisons (as
in sign), nested ifs are fine. Reach for guards when the
pattern part is doing real work too.
Guards on more interesting patterns
The pattern can do most of the work, with the guard filtering the rest:
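One possible shape for such a match, reconstructed from the discussion that follows. Only the string "on the y-axis" and the two guards come from the text; the function name describe and the other result strings are assumptions.

```ocaml
(* Hypothetical reconstruction: classify a point given as an int pair. *)
let describe (p : int * int) =
  match p with
  | (x, y) when x = y -> "on the diagonal"  (* compares two bound names *)
  | (x, _) when x = 0 -> "on the y-axis"    (* could also be the pattern (0, _) *)
  | _ -> "elsewhere"                        (* unguarded clause closes the match *)
```

Note that (0, 0) satisfies both guards; because clauses are tried in order, it is reported as being on the diagonal.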
The first clause has a guard x = y that compares the two
components of the pair. This is the kind of test pure patterns
cannot express. A pattern can say "the first component is the
literal 0", but it cannot say "the first component equals the
second", because the pattern language has no way to refer to
another binding from itself. The guard solves this with a clean
escape hatch: bind both pieces with the pattern, then compare
them in the guard.
Notice the second clause does not compare the first component
to the second; it compares the first component to the literal
0. We could equivalently write | (0, _) -> "on the y-axis"
with a pure literal pattern. The guarded version generalises:
swap x = 0 for x < 0 or x mod 2 = 0 and you have a predicate
on a bound value that pure patterns cannot express.
Guards see what the pattern bound
A guard can use any name that the pattern in the same clause introduces. It cannot peek at bindings from other clauses, and it cannot use names that have not been bound yet.
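A sketch of the three-clause example discussed next, reconstructed from the prose. The function name starts_negative is hypothetical.

```ocaml
(* Does the list start with a negative number? *)
let starts_negative lst =
  match lst with
  | [] -> false                 (* clause 1: empty list *)
  | x :: _ when x < 0 -> true   (* clause 2: the guard sees x bound by the pattern *)
  | _ -> false                  (* clause 3: non-empty, head not negative *)
```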
In the second clause, the pattern x :: _ binds x to the head
of the list. The guard x < 0 references that binding. If the
list is empty, the pattern fails and we never get to the guard.
If the list is non-empty but the head is non-negative, the
pattern matches but the guard fails, and the matcher moves on.
If the matcher reaches the third clause (the wildcard), the
input must be non-empty with a non-negative head: clause 2's
pattern matched, but its guard failed. An empty list never gets
that far, because clause 1 has already caught it and returned
false. So the final wildcard is only ever reached when the head
was non-negative, and it returns false as well.
A subtle point: when a guard fails, the matcher continues to the
next clause; it does not re-attempt the same pattern with
different bindings. So x :: _ when x < 0 either fires (if the
pattern matches and the guard holds) or moves on; there is no
backtracking.
Exhaustiveness with guards is conservative
Here is the rub. Consider this match, where the two guards together logically cover every integer:
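A reconstruction of the kind of match meant here, assuming the name classify and these result strings. Because the layout is assumed, the line and character positions a compiler reports for it may differ from any warning text quoted in this lecture.

```ocaml
(* Both clauses are guarded. Together the guards cover every int,
   but the checker cannot see that and reports warning 8. *)
let classify n =
  match n with
  | n when n > 0 -> "positive"
  | n when n <= 0 -> "non-positive"
```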
Lines 2-4, characters 5-38:
Warning 8 [partial-match]: this pattern-matching is not exhaustive.
All clauses in this pattern-matching are guarded.
The guards n > 0 and n <= 0 between them handle every n.
We can see that. The compiler cannot, and it still warns
(warning 8).
The exhaustiveness checker reasons about patterns, not about arbitrary boolean expressions. A guard might call a function, read from a file, depend on state; from the compiler's point of view every guard is a black box that "might fail at runtime." So even though we know the two guards partition the integers, the checker treats both clauses as "might be skipped" and flags the whole match: "All clauses in this pattern-matching are guarded." Note that it names no witness value; with guards in play it cannot point at a concrete uncovered input.
When the gap is real, the compiler is right; when the guards happen to be total, the compiler errs on the side of warning anyway. Both produce the same warning text, and from the warning alone you cannot tell which case you are in. Compare a version with a genuine gap:
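A sketch of a match with a real gap, again assuming the name classify and these result strings. Neither guard accepts 0.

```ocaml
(* Both clauses are guarded and 0 satisfies neither guard, so
   classify 0 falls off the end of the match at runtime. *)
let classify n =
  match n with
  | n when n > 0 -> "positive"
  | n when n < 0 -> "negative"
```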
Lines 2-4, characters 5-33:
Warning 8 [partial-match]: this pattern-matching is not exhaustive.
All clauses in this pattern-matching are guarded.
Here zero really is uncovered: at runtime classify 0 raises
Match_failure. Same warning, real bug. The fix in both cases
is the same: add an unguarded clause that covers the rest.
The cost of allowing arbitrary computation in guards is that the checker has to be conservative. It proves what it can prove (which constructors and shapes the patterns cover); the rest is up to you.
So the rule for guards is: add an unguarded catch-all to close the match, unless you are certain the guarded clauses are total (and you do not mind the warning).
Don't reach for guards when a pattern would do
Guards are powerful, but they are not free. Each guard suppresses some compiler help. So a useful discipline: do not reach for a guard when a pure pattern would express the same thing.
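For illustration, a guard doing a job that a literal pattern could do. The function name is_zero is hypothetical.

```ocaml
(* A guard used for a plain equality test against a literal. *)
let is_zero n =
  match n with
  | n when n = 0 -> true
  | _ -> false
```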
This works, but it is unnecessarily heavy. The pattern can do the comparison directly:
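A sketch of a zero test written with a literal pattern and no guard. The function name is_zero is hypothetical.

```ocaml
(* The literal pattern 0 performs the comparison; no guard is needed. *)
let is_zero n =
  match n with
  | 0 -> true
  | _ -> false
```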
The pure-pattern version is shorter, more obviously correct, and
keeps exhaustiveness checking intact. Reach for a when guard only
when the pattern language is genuinely not enough:
- Numeric inequalities: n > 0, n <= max_size.
- String predicates: String.length s > 5, String.starts_with ~prefix:"foo" s.
- Relationships between two bound names: x = y, a < b.
- Calls to external predicates: is_valid_id name.
These are things the pattern language cannot encode. For anything else, prefer the pattern.
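To make a few of these concrete, here is a small sketch. The function describe, the constant max_size, and its value are all hypothetical, chosen only to exercise three kinds of guard-only predicate.

```ocaml
let max_size = 100  (* assumed limit, for illustration only *)

(* Hypothetical: describe a (label, size) pair. *)
let describe (label, size) =
  match (label, size) with
  | (_, n) when n > max_size -> "too large"                    (* numeric inequality *)
  | (s, _) when String.length s > 5 -> "long label"            (* string predicate *)
  | (s, n) when String.length s = n -> "size equals label length"  (* relates two bindings *)
  | _ -> "ordinary"                                            (* unguarded catch-all *)
```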
Guards and side effects
A guard is just an expression, and an expression can have side effects. Resist the urge.
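A sketch of guards with prints in them, reconstructed from the output described next. The function name noisy_sign is hypothetical.

```ocaml
(* Anti-example: each guard prints before it tests. *)
let noisy_sign n =
  match n with
  | n when (print_endline "checking n > 0"; n > 0) -> "positive"
  | n when (print_endline "checking n < 0"; n < 0) -> "negative"
  | _ -> "zero"
```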
The toplevel prints checking n > 0, then checking n < 0,
then returns "zero". For input 0 the matcher tries the
first clause's guard, finds it false, tries the second clause's
guard, finds it false, and only then takes the wildcard. Two
guard evaluations for a single call.
If the guards had been pure boolean tests, those extra evaluations would be invisible. With prints in them, the order and count of calls become an observable mess that depends on the value, the clause ordering, and the compiler's matrix decomposition. Debugging is a small nightmare.
The discipline: guards should be pure, i.e., return a bool
without observable side effects. If you need a side effect,
sequence it before the match or inside the right-hand side.
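One way to follow that discipline, as a sketch. The function name logged_sign and the messages are hypothetical.

```ocaml
let logged_sign n =
  (* This effect runs exactly once, before any clause is tried. *)
  print_endline "classifying";
  match n with
  | n when n > 0 -> "positive"   (* guards stay pure *)
  | n when n < 0 -> "negative"
  | _ ->
    (* An effect on the right-hand side runs only if this clause fires. *)
    print_endline "hit the zero case";
    "zero"
```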
Compound guards and short-circuiting
A guard is just a bool expression, so you can combine multiple
conditions with && and ||:
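A sketch of a compound guard using the names low, high, and n from the discussion below. The function name in_range and the result strings are assumptions.

```ocaml
(* The guard combines two comparisons with &&. *)
let in_range low high n =
  match n with
  | n when low <= n && n <= high -> "in range"
  | _ -> "out of range"
```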
The Boolean operators short-circuit, so this is well-behaved:
low <= n is evaluated first, and n <= high only if the first
was true. Standard rules apply.
For complex guards, you can also call a named predicate:
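A sketch with a named predicate. The rule inside is_valid_id (at least three characters, not starting with an underscore) is invented for illustration, as is the function greet.

```ocaml
(* Hypothetical validity rule; && short-circuits, so name.[0] is only
   read when the string is known to be non-empty. *)
let is_valid_id name =
  String.length name >= 3 && name.[0] <> '_'

let greet name =
  match name with
  | "" -> "anonymous"                              (* pure literal pattern *)
  | name when is_valid_id name -> "hello, " ^ name (* guard calls the predicate *)
  | _ -> "invalid id"
```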
If the predicate gets large, lift it to a named function and
call it from the guard. That keeps the match readable.
Two checks
What does this evaluate to?
- "positive"
- "non-positive"
- Compile error
- Match_failure
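The expression this check asks about is not shown here. Judging from the explanation that follows, it is presumably of this shape (a reconstruction, with the binding name result added as an assumption):

```ocaml
(* Reconstructed quiz expression: a guarded clause followed by a wildcard. *)
let result =
  match 0 with
  | n when n > 0 -> "positive"
  | _ -> "non-positive"
```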
Why: the first clause's guard n > 0 is false when n = 0, so
the clause does not fire. The wildcard catches everything else
and returns "non-positive".
Why does the compiler warn about this match as non-exhaustive?
let classify = function
  | n when n >= 0 -> "non-negative"
  | n when n < 0 -> "negative"
- The patterns n and n clash.
- The wildcard is missing.
- The compiler does not reason about arbitrary boolean guards; it cannot prove the two guards together are total.
- Order is wrong.
Why: every guarded clause is treated as "may fail" by the
exhaustiveness checker. Even though n >= 0 and n < 0 cover
all integers between them, the compiler will not check arithmetic
facts. The fix is an unguarded | _ -> ... to close the match.
A code task:
A football scoreline is a pair (ours, theirs) of goal counts.
Write match_result : int * int -> string that returns:
- "win" when we scored more than they did,
- "loss" when they scored more than we did,
- "draw" otherwise.
Use a tuple pattern with when-guards that compare the two bound
names. Make sure the match is exhaustive (no warning 8).
Reference solution
The shape: tuple pattern binds the two counts, guards compare the
bindings (ours > theirs, then ours < theirs), and an unguarded
wildcard returns "draw" so the checker sees the match is closed.
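The shape just described can be sketched as follows. This is one possible solution, not necessarily the original reference code.

```ocaml
(* The tuple pattern binds both counts; the guards compare the bindings. *)
let match_result : int * int -> string = function
  | (ours, theirs) when ours > theirs -> "win"
  | (ours, theirs) when ours < theirs -> "loss"
  | _ -> "draw"  (* unguarded, so the match is exhaustive *)
```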
Common pitfalls
Pitfall 1: omitting the catch-all after guards. The compiler
warns; do not silence the warning, add a wildcard or unguarded
clause. Otherwise, an input that no guard covers raises
Match_failure at runtime.
Pitfall 2: using a guard where a pattern would do. Pure patterns preserve exhaustiveness; guards do not. Save guards for predicates the pattern language cannot express.
Pitfall 3: side effects in guards. A guard may be evaluated
zero, one, or more times depending on the matcher. Keep guards
pure: return a bool, do nothing else.
Pitfall 4: assuming the matcher backtracks. It does not. If a pattern matches but its guard fails, the matcher moves to the next clause; it does not retry the same pattern with different bindings.
Activity
Try it before reading on.
Reference solution
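The activity statement and its solution code are not reproduced here. Assuming the task is to classify a triple of side lengths by how many of them are equal, a sketch matching the explanation below might be the following. The function name classify_triangle and the labels "isosceles" and "scalene" are assumptions, and the sketch does not check that the sides form a valid triangle.

```ocaml
(* Hypothetical solution: classify by how many sides are equal. *)
let classify_triangle = function
  | (a, b, c) when a = b && b = c -> "equilateral"         (* all three equal *)
  | (a, b, c) when a = b || b = c || a = c -> "isosceles"  (* some pair equal *)
  | _ -> "scalene"                                         (* all different *)
```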
The first guard catches the all-three-equal case. The second
catches "any pair equal"; because the previous clause has already
ruled out the equilateral case, this amounts to "exactly two
equal." The wildcard catches everything else, i.e., all three
different. Without the wildcard, the compiler warns that the
match is not exhaustive, and a call on a triple like (1, 2, 3)
would crash with Match_failure at runtime.
What's next
We have now seen four pattern forms (literal, variable,
wildcard, structured) and one extension (when guards).
Lecture 5 zooms in on the static
check that has been hovering in the background: exhaustiveness. Why
it matters, how the compiler proves it, what to do when it warns,
and why it is the single biggest argument for using
variants in your designs.
Reading
- Cornell CS3110, Pattern matching guards: https://cs3110.github.io/textbook/chapters/data/pattern_matching.html
- Real World OCaml, Lists and patterns (guards): https://dev.realworldocaml.org/lists-and-patterns.html
Sources
This lecture's prose, worked examples, and quizzes are original to
this course. Materials referenced during preparation are listed in
the Reading section above; Cornell CS3110 and Real World OCaml
are CC BY-NC-ND-licensed and have not been derivatively reused.
See LICENSES.md at the repository root for the full source posture.