What memory safety is, and why it is a security story

Functional Programming with OCaml

Module 10 · Lecture 1

KC Sivaramakrishnan
IIT Madras

Nine modules of OCaml have built up an intuition that, when your program type-checks, certain whole classes of bugs simply cannot happen. You have not seen a segmentation fault. You have not seen a "use-after-free." You have not seen a buffer overflow. That is not because we were careful; it is because the language rules them out by construction.

This module asks the question that intuition leaves open: what exactly are those classes of bugs, and what are they worth? To answer the first half honestly, we have to look at the canonical unsafe language, which is C, and survey the family of memory bugs that real C programmers write every day and ship to production. To answer the second half, we look at what those bugs cost the world: the security incidents, the industry numbers, and the national policy that has grown up around them.

This lecture is deliberately C-heavy, and the C is runnable. There is a terminal embedded in the page with a small Linux machine and a C compiler; you can build each buggy program and watch it misbehave. That is the point of the module: to see, concretely, what OCaml's design is contrasting against.

What this module covers (5 lectures)

What "undefined" means

The C standard distinguishes three related but importantly different kinds of "we are not telling you exactly what this does":

  1. Unspecified behaviour. The standard lists several possible behaviours and lets the implementation pick one, without requiring documentation. Example: the order of evaluation of subexpressions inside f(g(), h()) is unspecified; the compiler may call g before h or h before g, and either is legal.
  2. Implementation-defined behaviour. The implementation must pick a behaviour from a documented set, and document its choice. Example: the size of int, or whether right-shift of a negative integer is arithmetic or logical, is implementation-defined.
  3. Undefined behaviour. The standard places no requirement on what the program does. The program is said to have no defined meaning at all.

The first two are awkward but manageable: a portable C program avoids relying on unspecified order, and consults the implementation manual where it must depend on implementation-defined choices. The third, undefined behaviour, is qualitatively different. It is the dangerous one.
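Unspecified behaviour can be observed directly. The sketch below records the order in which the two arguments of f(g(), h()) are evaluated; the helper names and the logging scheme are illustrative assumptions, not part of the course's demo files. Either result is conforming, which is why portable code must not depend on it.

```c
/* Hypothetical helpers: each one records its own name when it is called. */
static char call_log[3];   /* two letters plus the terminating NUL */
static int  call_count;

static int g(void) { call_log[call_count++] = 'g'; return 1; }
static int h(void) { call_log[call_count++] = 'h'; return 2; }
static int f(int a, int b) { return a + b; }

/* Calls f(g(), h()) once and returns the order in which g and h ran:
   "gh" or "hg". The standard allows both, and an implementation need
   not document which one it picks. */
const char *observed_order(void) {
    call_count = 0;
    call_log[0] = call_log[1] = call_log[2] = '\0';
    (void)f(g(), h());
    return call_log;
}
```

Printing observed_order() under two different compilers is a quick way to see whether they happen to agree; neither answer is a bug.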

Three flavours of "this is not quite specified"

Flavour                | What the standard requires         | Example
Unspecified            | Pick from a list, no documentation | Order of args in f(g(), h())
Implementation-defined | Pick from a list, document it      | sizeof(int), signedness of char
Undefined              | No requirement at all              | Null deref, signed overflow, use-after-free

Only the third one is the real problem.

The reason undefined behaviour is qualitatively different is that the compiler is allowed to assume it does not happen. A modern C compiler does not insert runtime checks for "did you just trigger UB?", because the standard requires none and treats a program that triggers UB as having no meaning. Instead, the compiler propagates this assumption backwards through its optimisation passes, frequently in ways the programmer did not anticipate.

A famous category of jokes goes: "if your program has undefined behaviour, the compiler is allowed to format your hard drive or summon nasal demons." That is funny but slightly misleading; in practice the compiler does not actively try to misbehave. What it does is subtler: it optimises code on the assumption that UB never occurs, and the resulting program then does something the programmer did not intend. From the perspective of someone debugging the failure, it looks like the compiler "misbehaving"; the compiler's defence is that the source program was already broken.

A concrete UB-driven miscompile

Consider this C function, simplified from a real Linux kernel patch:

struct tun_struct *tun = ...;
struct sock *sk = tun->sk;
if (!tun) return POLLERR;
return POLLIN;

A C programmer reads this as: read tun->sk into sk, then check whether tun is null; if it is null, return an error, otherwise return the OK code. The intent is defensive.

Now read what the compiler sees. On line 2 the code dereferences tun (to read tun->sk). Dereferencing a null pointer is undefined behaviour. The compiler concludes that, since the program is well-defined, tun cannot have been null at line 2. Working backwards, by line 3 tun is known to be non-null. Therefore the test if (!tun) is dead code: it can never be true. The compiler removes the check.

The defensive null check is silently deleted. The bug appears later, when an attacker arranges for tun to actually be null. This was CVE-2009-1897, a real Linux kernel vulnerability. It is not a compiler bug: the C standard makes the deletion legal. The bug is in the source code, which performs the check after the dereference; the compiler reads the dereference as a promise that tun is non-null, and that promise makes the later check redundant.
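The shape of the repair is to test the pointer before the first dereference. The following is a minimal sketch under stated assumptions: the struct definitions and the DEMO_ constants are hypothetical stand-ins so that the fragment compiles on its own, and they are not the kernel's real declarations.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types and poll flags. */
struct sock { int unused; };
struct tun_struct { struct sock *sk; };
enum { DEMO_POLLIN = 0x001, DEMO_POLLERR = 0x008 };

/* Test first, dereference second. Nothing before the check tells the
   compiler that tun is non-null, so the check cannot be assumed away. */
int poll_checked(struct tun_struct *tun) {
    if (!tun)
        return DEMO_POLLERR;
    struct sock *sk = tun->sk;   /* safe: tun is known non-null here */
    (void)sk;                    /* unused in this sketch */
    return DEMO_POLLIN;
}
```

The only change from the buggy version is the order of two statements, which is why this class of mistake is so easy to make and so hard to spot in review.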

The smaller, self-contained version of the same effect is an overflow check. Signed-integer overflow is also undefined, so the compiler may assume x + 100 never wraps below x, and delete a guard written that way. Here is the whole program, deleted_check.c:

#include <stdio.h>
#include <limits.h>

/* Returns 1 if adding 100 to x would "overflow" (wrap negative). */
int will_overflow(int x) {
    return (x + 100) < x;          /* UB when x is near INT_MAX */
}

int main(void) {
    int x = INT_MAX - 50;
    if (will_overflow(x))
        printf("guard fired: %d + 100 would overflow\n", x);
    else
        printf("no overflow reported for %d + 100\n", x);
    return 0;
}

will_overflow is the guard a careful C programmer might write: if x + 100 comes out smaller than x, the addition must have wrapped around, so report it. With x = INT_MAX - 50, the sum genuinely does wrap to a negative number at runtime, so the guard should fire. But x + 100 < x can only be true if signed overflow happened, and signed overflow is undefined, so the compiler is entitled to assume it never does, concludes x + 100 < x is always false, and deletes the whole if. You can watch it happen in the terminal below. Compiled normally the guard is gone and the message never prints; compiled with -fwrapv, which tells the compiler to define signed overflow as wraparound, the UB assumption is gone, the guard can be true, and it fires. Both builds use -O2; the only difference is whether overflow is undefined.

ocaml-vm:~/m10# make check_ub check_safe
cc -O2 -o check_ub deleted_check.c
cc -O2 -fwrapv -o check_safe deleted_check.c
ocaml-vm:~/m10# ./check_ub
no overflow reported for 2147483597 + 100
ocaml-vm:~/m10# ./check_safe
guard fired: 2147483597 + 100 would overflow

Same source, same input, opposite behaviour. The default build is not wrong by the standard: a program that uses signed arithmetic implicitly promises that the arithmetic never overflows, and the compiler took that promise at its word.
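A guard that survives optimisation has to detect the overflow without performing it. One common way, sketched here with a hypothetical function name, is to compare x against the largest value for which the addition is still in range. INT_MAX - 100 is an ordinary constant expression that cannot overflow, so every operation in the function is defined.

```c
#include <limits.h>

/* Returns 1 if x + 100 would exceed INT_MAX, without ever computing
   x + 100. No step here can overflow, so the compiler has no UB
   assumption to exploit and must keep the comparison. */
int will_overflow_checked(int x) {
    return x > INT_MAX - 100;
}
```

With x = INT_MAX - 50 this returns 1 in both the default and the -fwrapv build. GCC and Clang also offer __builtin_add_overflow as a compiler extension for the same purpose; it is not part of the C standard the lecture is discussing.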

"Defensive" code, miscompiled

struct tun_struct *tun = ...;
struct sock *sk  = tun->sk;     // dereferences tun
if (!tun) return POLLERR;       // dead code per the compiler
return POLLIN;

Why compilers cling to UB

If undefined behaviour is dangerous, why does the C standard have it at all? Two reasons.

First, performance. C was designed to be a thin layer over the hardware. Many CPU operations behave differently on different platforms: integer overflow wraps on x86 but traps on some embedded chips; unaligned loads are fast on x86 but illegal on older ARM. If the standard demanded one specific behaviour everywhere, the compiler would have to insert checks, or emulate the "wrong" behaviour, on every platform where the hardware disagreed. By calling these cases undefined, the standard tells the compiler "do whatever the hardware does, insert no checks, optimise hard."

Second, optimisation. Dead-code elimination, loop unrolling, and pointer-aliasing assumptions all rely on knowing what the program is and is not allowed to do. UB carves out the cases the optimiser can assume away. A compiler that took every "could this be UB?" question at face value would generate much slower code.

These are real reasons. C earned its place partly because it runs nearly as fast as hand-written assembly. The price is that the language gives you many ways to write code with no defined meaning, and the compiler will silently optimise on the assumption that you did not.

The categories of UB that matter

The C standard lists hundreds of specific UB cases. They cluster into four broad categories:

  1. Memory. Reading or writing memory the program does not own. This is the category that becomes most of the world's CVEs.
  2. Integer and arithmetic. Signed-integer overflow, division by zero, shifts by negative or oversized amounts.
  3. Aliasing and concurrency. Reading a value through a pointer of the wrong type, and data races (two threads touching the same memory without synchronisation; the subject of a later lecture).
  4. Lifetime. Using a pointer after the memory it points to has been freed, or after a stack frame has been destroyed.
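For the integer category, the defensive pattern is to reject the undefined inputs before performing the operation. The two helpers below are a sketch with hypothetical names; they assume the usual case where unsigned has no padding bits, so its width is sizeof(unsigned) * CHAR_BIT.

```c
#include <limits.h>
#include <stdbool.h>

/* Division is undefined for b == 0 and for INT_MIN / -1 (the quotient
   does not fit in int). Reject both cases before dividing. */
bool checked_div(int a, int b, int *out) {
    if (b == 0 || (a == INT_MIN && b == -1))
        return false;
    *out = a / b;
    return true;
}

/* Shifting by a negative count, or by the width of the type or more, is
   undefined. Working on unsigned values also avoids the undefined cases
   of left-shifting a signed integer. */
bool checked_shl(unsigned value, int count, unsigned *out) {
    if (count < 0 || count >= (int)(sizeof value * CHAR_BIT))
        return false;
    *out = value << count;
    return true;
}
```

Both functions report failure through the return value and leave *out untouched, so the caller decides what an invalid operation should mean.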

This module's primary targets are the first and the fourth. The named memory bugs we survey next live there.

Four UB categories

Categories 1 and 4 host the four named bugs we survey next.

The memory-safety zoo

Within the memory-and-lifetime cluster, four named bugs come up again and again in real systems. They divide along a single axis: is the access in the wrong place (spatial) or at the wrong time (temporal)?

Figure: the four canonical memory bugs, classified as spatial or temporal.

Use-after-free (temporal)

In C, the programmer requests memory with malloc and returns it with free. Once you free a block, the allocator may hand it to the next malloc. If you keep using the original pointer after that, you are reading bytes the program no longer owns. The pointer is to a valid address; that address simply no longer belongs to you.

The program uaf.c shows the shape: fill a heap-allocated record, free it, then read it again through the old pointer.

struct session { int user_id; char role[16]; };

struct session *s = malloc(sizeof *s);
s->user_id = 1001;
strcpy(s->role, "guest");
printf("before free: id=%d role=%s\n", s->user_id, s->role);

free(s);                       /* the block goes back to the heap */

/* s is now a DANGLING pointer: reading through it is UB. */
printf("after free:  id=%d role=%s\n", s->user_id, s->role);

After free(s), the allocator may hand that block to the next malloc, but s still holds its address. The second printf reads through that dangling pointer. Run it in the terminal:

ocaml-vm:~/m10# make uaf && ./uaf
before free: id=1001 role=guest
after free:  id=1001 role=guest

The read after free still returns 1001: the bytes are still there, and nothing crashed. That silence is what makes use-after-free dangerous. The block is now on the allocator's free list; the next allocation of that size may hand it out, and then the same read returns someone else's data. What happens depends on the C library, the load, and, in an attack, the attacker, which is the heap-spray step in the exploit pipeline below.
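A common C convention that narrows this window is to overwrite the pointer with NULL at the moment of freeing, so a later use becomes a testable null rather than a silent stale read. The sketch below reuses the struct session from uaf.c; the three function names are hypothetical and not part of the demo.

```c
#include <stdlib.h>
#include <string.h>

struct session { int user_id; char role[16]; };

/* Allocate and fill a session; returns NULL if malloc fails. */
struct session *session_create(int user_id, const char *role) {
    struct session *s = malloc(sizeof *s);
    if (!s)
        return NULL;
    s->user_id = user_id;
    strncpy(s->role, role, sizeof s->role - 1);
    s->role[sizeof s->role - 1] = '\0';   /* strncpy may not terminate */
    return s;
}

/* Free the session and overwrite the caller's pointer with NULL.
   A second call sees NULL, and free(NULL) is defined to do nothing. */
void session_destroy(struct session **sp) {
    free(*sp);
    *sp = NULL;
}

/* Readers test for NULL instead of trusting a possibly stale pointer. */
int session_user_id(const struct session *s) {
    return s ? s->user_id : -1;
}
```

This is a convention, not a guarantee: it clears only the one pointer passed to session_destroy. Any other copy of the address still dangles, which is the gap a garbage collector or an ownership discipline closes.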

Real-world incident. The Chromium team reported in 2020 that about 70 percent of Chrome's high-severity security bugs were memory-safety issues, and that the largest single category, about 36 percent of the total, was use-after-free.

Buffer overflow (spatial)

In C, arrays do not carry their length at runtime. A copy into a buffer trusts the caller to pass a length that fits; if the length is wrong (or attacker-controlled), the copy walks past the buffer's end and overwrites whatever was next in memory. overflow.c copies its argument into a 16-byte buffer with strcpy, which does no bounds check, and keeps a marker value (canary) just past it:

volatile int canary = 0x600d;  /* sits just past `buf` */
char buf[16];

strcpy(buf, argv[1]);          /* no bounds check: copies verbatim */

printf("buf    = %s\n", buf);
printf("canary = 0x%x (expected 0x600d)\n", canary);

A short argument fits and the canary reads back 0x600d. A long argument runs past the 16 bytes and overwrites the neighbouring canary (and, further out, the return address), so the program prints a corrupted canary and then crashes:

ocaml-vm:~/m10# ./overflow hello
buf    = hello
canary = 0x600d (expected 0x600d)
ocaml-vm:~/m10# ./overflow AAAAAAAAAAAAAAAAAAAAAAAAAAAA
buf    = AAAAAAAAAAAAAAAAAAAAAAAAAAAA
canary = 0x41414141 (expected 0x600d)
Segmentation fault

The 0x41414141 is four As (0x41) sitting where the canary used to be: the write went straight off the end of buf.
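The missing bounds check can be written by hand. The following sketch, with a hypothetical function name, refuses the copy when the source (plus its terminating NUL) does not fit in the destination, instead of trusting the caller as strcpy does.

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst only if it fits, including the terminating NUL.
   Returns 0 on success, -1 if src is too long; on failure dst is left
   untouched. */
int copy_bounded(char *dst, size_t dst_size, const char *src) {
    size_t len = strlen(src);
    if (len >= dst_size)
        return -1;
    memcpy(dst, src, len + 1);
    return 0;
}
```

Called as copy_bounded(buf, sizeof buf, argv[1]) in place of the strcpy, the 28-character argument is rejected and the canary stays intact. The cost is that every call site must pass the right size, which is exactly the discipline C leaves to the programmer.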

Real-world incident. This bug class is as old as the networked internet. The 1988 Morris worm, one of the first internet worms, spread in part by smashing a fixed stack buffer in the BSD finger daemon: the daemon read a request with gets, which has no length argument and no bounds check, so an over-long request ran off the end and overwrote the return address. It disrupted a large fraction of the machines then on the internet and led directly to the founding of the first CERT. The canonical CVE on the read side of buffer overflow (an out-of-bounds read, or over-read) is Heartbleed (CVE-2014-0160) in OpenSSL, where a length field was used to copy bytes from a buffer without checking it against the buffer's real size. We walk it end to end in the tutorial that closes this module. Twenty-six years separate the two (1988 and 2014); the bug class is the same.

Uninitialised read (spatial)

A C local, when allocated, holds whatever bytes were last at that location. Reading it before writing returns those leftover bytes. This does not crash and does not corrupt anything; it just leaks. Of the four, this is the one that fits the spatial/temporal split least well: the access is in bounds and the memory is alive, so it is neither out-of-bounds (spatial) nor out-of-lifetime (temporal). It is really a read before write; we group it under spatial only because it reads bytes that are physically present but not logically yours. If the bytes encode a secret (a key, a password, an address that defeats ASLR), the read leaks the secret. uninit.c writes a recognisable pattern into a buffer in one function, then reads a different, never-written buffer over the same stack slot in the next:

void stash_secret(void) {
    char secret[32];
    for (int i = 0; i < 32; i++)
        secret[i] = "DEADBEEF"[i % 8];   /* leave a pattern behind */
}

void read_uninitialised(void) {
    char leak[32];                        /* never written to */
    for (int i = 0; i < 16; i++) putchar(leak[i]);
    putchar('\n');
}

stash_secret returns, but its stack bytes are not wiped; read_uninitialised declares leak over the same slot and reads it before writing, so it prints the leftover pattern:

ocaml-vm:~/m10# make uninit && ./uninit
uninitialised buffer: DEADBEEFDEADBEEF

In a real program the leftover bytes are not a friendly string; they are whatever the last caller left there, which is exactly how kernel info-leaks expose secrets.
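The C-side defence is to initialise at the point of declaration. The sketch below uses a hypothetical function name and counts the non-zero bytes in a local buffer declared with the "= {0}" initialiser, which zeroes every element, so nothing left behind by an earlier call can show through.

```c
#include <stddef.h>

/* Counts non-zero bytes in a local buffer that is initialised where it
   is declared. The "= {0}" initialiser zeroes all 32 bytes, so no
   leftover stack contents are readable through leak. */
size_t nonzero_bytes_in_fresh_buffer(void) {
    char leak[32] = {0};
    size_t count = 0;
    for (size_t i = 0; i < sizeof leak; i++)
        if (leak[i] != 0)
            count++;
    return count;
}
```

This protects the reader, not the writer: the secret written by an earlier function is still sitting on the stack until something overwrites it. The language does nothing to remind you to write the initialiser, either.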

Real-world incident. Uninitialised reads in Linux kernel networking code have leaked kernel-stack bytes to userspace through uninitialised structure padding; for example CVE-2016-4486, in the rtnetlink code.

Double-free (temporal)

A program frees the same block twice:

char *buf = malloc(64);
free(buf);
free(buf);   /* UB: the block is already on the free list */

The second free corrupts the allocator's bookkeeping, often in ways an attacker can steer to make a future malloc return a pointer they chose. CVE-2022-4450, a double-free reachable through OpenSSL's PEM_read_bio_ex, is a recent example.

So what? The industry numbers

A memory bug, in isolation, sounds like a reliability problem: a crash, some garbage printed. The reason the industry treats these as a strategic risk is the scale at which they occur, and what an attacker can do with one. First the scale.

For most of the 2010s there was an argument about whether memory-safety bugs were really that dominant in shipping software. It ended around 2019, when several large vendors independently published the same number from their own internal data.

Microsoft's Security Response Center reported that roughly 70 percent of the vulnerabilities Microsoft assigned a CVE to each year were memory-safety issues, across twelve years of data with the proportion flat, despite hundreds of millions of dollars spent on C tooling. Chromium reported the same 70 percent for its high-severity security bugs. The lesson everyone drew: "be more careful" had been tried, at enormous expense, and the proportion had not moved.

Figure: memory-safety share of severe bugs across four studies.

Same team, two languages

The cleanest evidence is Google's Android team. Android ships a large layer of managed code (Java, Kotlin) over a lower layer of native code (C and C++), and from 2019 the team began moving new code to memory-safe languages, with Rust entering the native layer. Their report Memory Safe Languages in Android 13 (Google security blog, 2022) shows what happened: memory-safety bugs dropped from 76 percent of Android's vulnerabilities in 2019 to 35 percent in 2022, tracking the fall in new C and C++ code, and the team had found zero memory-safety vulnerabilities in Android's Rust code. Same product, same team, same release process. The variable that explains the difference is the language, not the discipline. The bugs that remain are still the worst ones: in 2022, memory-safety issues were a minority of Android's bugs but 86 percent of its critical-severity vulnerabilities.

A fourth data point comes from Google's Project Zero, which tracks zero-day exploits detected in actual use. Its review of 2021 (The More You Know, The More You Know You Don't Know, 2022) found that 39 of the 58 in-the-wild 0-days that year, about 67 percent, were memory-corruption bugs. That is a different selection: these are the bugs attackers actually chose to invest in. They keep choosing memory-safety bugs because they keep working.

From a memory bug to arbitrary code execution

The other half of "so what" is that a memory bug is rarely just a crash. With modest attacker effort it becomes arbitrary code execution: the attacker gets your program to run code of their choosing, with your program's privileges. Once that happens, every other security boundary is bypassed.

The chain has a stable five-step shape, using a use-after-free as the example. It does not require writing any exploit code to understand.

  1. A free leaves a dangling pointer (the bug).
  2. The attacker heap-sprays, forcing many allocations whose bytes they control, so the freed slot refills with their data.
  3. The program's next access through the dangling pointer reads those bytes as the original type (type confusion), typically following an attacker-chosen function pointer.
  4. To run on a system that marks data non-executable, the attacker uses return-oriented programming: a chain of addresses into existing code fragments, each ending in ret, assembled into the computation they want.
  5. The final payload runs.

Mitigations exist (address randomisation, non-executable data, stack canaries, control-flow integrity) and each makes this harder, but none closes the class: the attacker pays the engineering cost once and exploits thousands of installations, while the defender pays the runtime cost everywhere, forever. After three decades of layered mitigations, Microsoft's 70 percent line has not moved.

Figure: the five-step exploit pipeline, from free to payload.

The policy turn

This is why memory safety is no longer only a technical argument. In December 2023, cyber agencies from five nations (the US CISA, NSA, and FBI, with their counterparts in the UK, Australia, Canada, and New Zealand) published The Case for Memory Safe Roadmaps, urging vendors to publish dated plans for moving components to memory-safe languages. Its named list of memory-safe languages includes Rust, Go, Java, C#, Swift, Python, JavaScript, and the ML family, including OCaml. Two months later the White House Office of the National Cyber Director published Future Software Should Be Memory Safe, making the same case from the altitude of the largest single customer of commercial software in the world. Neither document hedges: they name the language choice as the highest-leverage intervention available.

This is the context in which an OCaml course talks about memory safety. The course's job here is to make the mechanism precise, so you can read those reports and know exactly what guarantee you are buying.

The policy turn (2023-2024)

How OCaml stands, and safety by design

Every bug in this lecture is impossible in safe OCaml, by construction. There is no free, so use-after-free and double-free cannot be written. Indexing a string, bytes, or array is bounds-checked and raises Invalid_argument rather than walking off the end. Every binding is initialised at the point of binding, so there is no uninitialised read. The next lecture makes each of these precise and shows where in the runtime the rule is enforced.
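To make the bounds-check rule concrete from the C side, here is a rough analogue of what a checked index does: the array carries its length, and every read compares the index against it first. The struct and function names are hypothetical, and this is only a sketch of the idea, not a description of how the OCaml runtime is implemented.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical array type that carries its length alongside its data. */
struct int_array {
    size_t len;
    const int *data;
};

/* Bounds-checked read: reports failure instead of reading out of range.
   OCaml raises Invalid_argument at the equivalent point. */
bool int_array_get(const struct int_array *a, size_t i, int *out) {
    if (i >= a->len)
        return false;
    *out = a->data[i];
    return true;
}
```

The difference from OCaml is who is responsible: in C a caller can bypass int_array_get and index a->data directly, whereas in safe OCaml there is no unchecked path to write.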

This is the security-flavoured version of the argument from the earlier "why functional programming" lecture: pure functions and immutable data eliminate whole categories of bugs not by being careful but by removing the means to write them. Here we eliminate UAF by removing free, overflow by mandating a bounds check, uninitialised reads by requiring binding-time initialisation. We do not need to be careful; we need the unsafe operation to be un-writable.

Safety by design

Rust as a different answer

OCaml is not the only language that rules out the four bugs by construction. Rust does it without a garbage collector, using a borrow checker: a static analysis in the type system that tracks ownership and borrowing. Each value has one owner; you may borrow it immutably many times or mutably once, not both; when the owner goes out of scope, the value is freed. This is a compile-time discipline with no runtime machinery, but the cost moves to the programmer, who must structure the program so ownership is statically expressible. Cyclic or shared structures need extra machinery (Rc, RefCell, Arc, Mutex).

The trade-off in one line: OCaml has a small constant runtime overhead and no proof obligation on the programmer; Rust has no runtime overhead and a proof obligation the borrow checker must accept. A later module returns to this, adding a type-level discipline on top of OCaml's GC.

Rust: another answer

Property                   | OCaml               | Rust
Discipline                 | GC + runtime checks | borrow checker (compile-time)
Runtime cost               | small constant      | none for the safety story
Proof obligation           | none                | borrow checker must accept your code
Cyclic / shared structures | natural             | extra machinery (Rc, Arc)

Both rule out the four-bug zoo. Different placement of the discipline: OCaml at runtime, Rust at type-check time.

Activity

A C program contains this code, where x is a signed int:

if (x + 1 < x) {
  printf("overflow happened!\n");
}

On a typical optimising compiler with -O2, what happens?

Why: signed-integer overflow is undefined behaviour in C. The compiler may assume it never occurs. Under that assumption x + 1 > x is always true, so x + 1 < x is always false, the body is dead code, and the compiler removes it. The terminal demo (deleted_check.c) shows exactly this: the guard is deleted by default and reappears only under -fwrapv, which defines overflow and removes the UB assumption. The correct way to write the check is to compare against INT_MAX - 1 before adding.

Which of the following bugs is impossible in safe OCaml?

Why: OCaml's GC eliminates the lifetime question: memory is freed only when nothing reachable refers to it, so "reading from freed memory" cannot occur in safe code. The other three are bugs the type system does not catch: non-exhaustive matching warns but can be opted out of; infinite loops are undecidable in general; typos that type-check are exactly what testing (the previous module) is for.

The exploit pipeline for a use-after-free typically involves several conceptual steps. Which ordering is correct?

Why: the chain starts with a free that leaves a dangling pointer (the bug). The attacker heap-sprays to fill the freed slot with controlled bytes; the next access reads them as the original type (type confusion), often following an attacker-chosen function pointer; return-oriented programming assembles execution from existing code on a non-executable-data system; the payload then runs.

Common pitfalls

"UB is a compiler bug." It is not. The standard explicitly permits the compiler to do anything when UB is triggered. The bug is in the source program.

"If it works on my machine, it has no UB." Many UB-driven bugs are input- or timing-sensitive, and a line that "works" on one compiler version can miscompile on the next as the optimiser gets smarter. The deleted_check.c demo shows the same fragility: same source, same -O2 optimisation level, one compiler flag apart, opposite behaviour.

"Memory-safe languages are slow." OCaml's native compiler is typically within a small constant factor of C, and the GC adds a fraction of a percent in most workloads. The performance argument for C is real but small; the safety argument against it is large.

"Just be careful when writing C." Decades of trying, at the cost of hundreds of millions of dollars, did not move Microsoft's 70 percent. "Be more careful" is not a working mitigation at scale.

"Memory safety means secure." It does not; it closes one large class, not all of them. Log4Shell (CVE-2021-44228), the 2021 remote-code-execution flaw in the ubiquitous Java logger Log4j, was not a memory bug at all: a crafted log string triggered a JNDI lookup that fetched and ran attacker-controlled code, in a fully memory-safe language. The 70 percent that memory safety removes is the highest-leverage single intervention, but the other 30 percent, injection, logic errors, misused crypto, is still yours to get right.

What's next

We now have a precise catalogue of the bugs and a sense of what they cost. The next lecture makes the OCaml side precise: which language construct rules out which bug, where in the runtime the rule lives, and what it costs. That is where the 63-bit-int aside from the literals lecture finally pays off: tagged pointers, block headers, and the GC's job.

Sources

This lecture's prose, worked examples, C demos, and quizzes are original to this course. The industry reports (Microsoft MSRC, Chromium, Google Android, Google Project Zero) and government memoranda (White House ONCD, the CISA/NSA/FBI joint publication) are public documents authored by their respective agencies and vendors; we quote and link to them rather than reproducing them. The exploit description is deliberately conceptual and includes no working exploit code. See LICENSES.md at the repository root for the full source posture.