A tour of OCaml
The previous two
lectures argued for functional programming in general and for OCaml
in particular. This lecture is different: it is a whirlwind
tour. By the end you will have seen, in working form, the basic
building blocks of an OCaml program: literals, arithmetic, booleans,
strings, let bindings, functions, and the type inference that
holds the whole thing together. You will not have mastered any of
these; Module 2 onwards goes deep on each.
The goal here is to get the shape into your head, so the rest of
the course has a frame to hang on.
Every cell on this page is runnable. The first cell takes a few seconds to spin up the in-browser OCaml runtime; after that, each cell evaluates instantly. Click Run on every single cell as you read. Edit them. Try variations. The fastest way to learn a language is to play with it, and the in-browser cells exist precisely to make that play frictionless.
The toplevel
Most OCaml work happens in two places: an editor (where you write files of code that get compiled) and a toplevel (an interactive REPL where you type expressions and see results immediately). The cells on this page are a toplevel. You type an expression, click Run, and the toplevel responds with the value and its type.
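The cell itself is not reproduced in this text. As a minimal sketch, assuming the cell held the expression 1 + 2, this is the kind of input that produces the response discussed next. The trailing ;; is how a terminal toplevel marks the end of a phrase; the in-browser cells may not need it.

```ocaml
(* An expression with no name attached. Typed at the toplevel, the
   response is:   - : int = 3 *)
1 + 2;;
```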
The response is - : int = 3, read as "the result has no name (the
-), it has type int, and it equals 3." The toplevel does this
for every expression. Type, then value. This double answer (type +
value) is unusual for a REPL; most languages just print the value.
OCaml prints the type because the type is information and a key
part of how you understand what your program does.
In a desktop install of OCaml, you would start the toplevel from
the shell by running ocaml (the basic one) or
utop (a nicer-to-use
version with editing and syntax highlighting). On this website,
the toplevel runs entirely in your browser via
x-ocaml; there is no server,
no installation, no account. Everything stays on your
machine. The same OCaml runtime that ships with the language is
compiled to JavaScript and runs locally; the only network call is
the initial download of that runtime.
Integers
OCaml has a built-in integer type, int, with the standard
arithmetic operators.
Operator precedence works as you would expect from school: * binds
tighter than +, so this is 2 + 12 = 14, not (2 + 3) * 4 = 20.
The expression evaluates to 14, of type int.
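The precedence rule is easy to check for yourself. A small sketch, with binding names (without_parens, with_parens) chosen here purely for illustration:

```ocaml
(* Multiplication binds tighter than addition, so this parses as
   2 + (3 * 4). *)
let without_parens = 2 + 3 * 4      (* 14 *)

(* Parentheses override the default grouping. *)
let with_parens = (2 + 3) * 4       (* 20 *)
```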
Integer division uses /, but in OCaml as in C and Java (and unlike
in Python 3), it truncates toward zero: it throws away any fractional part.
Later cells in this lecture write let _ = ... in front of an
expression; the _ is a "don't-care" name that just lets the
toplevel print the value without binding it to anything. We cover
the pattern in
the lecture on let bindings.
The result is 3, not 3.4 and not 4. The companion operator
mod gives you the remainder:
The result is 2, because 17 = 3 * 5 + 2. The identity a = (a / b) * b + a mod b holds whenever b is non-zero.
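A sketch of the division and remainder behaviour described above, extended to a negative dividend. The binding names and the helper identity_holds are illustrative, not part of the lecture's own cells.

```ocaml
let quotient = 17 / 5           (* 3: the fractional part is dropped *)
let remainder = 17 mod 5        (* 2 *)

(* With a negative dividend, / rounds toward zero and mod takes the
   sign of the dividend. *)
let neg_quotient = (-17) / 5    (* -3 *)
let neg_remainder = (-17) mod 5 (* -2 *)

(* The identity from the text, written as a function. b must not be 0. *)
let identity_holds a b = a = (a / b) * b + a mod b
```

Calling identity_holds 17 5, identity_holds (-17) 5, and identity_holds 17 (-5) returns true in each case.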
OCaml's int is a machine integer: on a 64-bit machine it is 63
bits wide (not 64). The missing bit is used by the runtime to
distinguish int values from pointers, which is part of how the
garbage collector stays fast. We will see the full story in a
later module on memory safety; for now, just know that the range
is about ±4.6 × 10^18, which is plenty for almost
any practical computation. If you need bigger, the
zarith library gives you
arbitrary precision.
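You can ask the runtime for these limits directly. A minimal sketch using the standard library values Sys.int_size, max_int, and min_int; the numbers in the comments assume a 64-bit native build, and other targets may report different ones.

```ocaml
(* Number of bits in an int on this platform: 63 on a 64-bit native
   build. Other targets may differ. *)
let bits = Sys.int_size

(* Largest and smallest int; on a 64-bit native build these are
   4611686018427387903 and -4611686018427387904. *)
let largest = max_int
let smallest = min_int

(* Going past the largest int wraps around silently; it does not
   raise an error. *)
let wrapped = max_int + 1 = min_int      (* true *)
```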
Floats
OCaml has a separate type, float, for IEEE-754 double-precision
floating-point numbers. The arithmetic operators on float are
different from the ones on int: addition is +., multiplication
is *., and so on. The trailing . is part of the operator name.
The result is 3.5, of type float. Try this without the dots:
This refuses to compile with an error message like "The constant
1.0 has type float but an expression was expected of type int". OCaml
is telling you that + expects two int arguments, and 1.0 is a
float, so the call is ill-typed. If you wanted float addition,
you had to use +..
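A short sketch of the dotted operators in use. The binding names are illustrative, and the ill-typed variant is left as a comment so the block still compiles.

```ocaml
(* Float arithmetic uses the dotted operators. *)
let sum = 1.5 +. 2.0            (* 3.5 *)
let product = 1.5 *. 2.0        (* 3.0 *)
let quotient = 7.0 /. 2.0       (* 3.5: no truncation on floats *)

(* The undotted version is rejected by the type checker, so it is
   left commented out here:
   let bad = 1.0 + 2.0 *)
```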
This will catch you out at first. Most other languages overload +
to do "whatever makes sense" given the types of its arguments:
integer add for two ints, float add for two doubles, string
concatenation for two strings in Python and JavaScript. OCaml does
not do this. Every operator has one meaning, fixed by the symbol.
The reason is that operator overloading complicates type inference
and makes type errors hard to read. If you read a + b in OCaml,
you immediately know both a and b are int and the result is
int. If you read a + b in C++, you have to know the types of
a and b to know what the operator does. This is a trade-off
between concise syntax (overloaded + is shorter) and clarity
(separate operators are unambiguous), and OCaml comes down on the
clarity side. We will see this design philosophy repeated for
strings (^ for concatenation, not +) and for many other
operators.
If you genuinely want to mix integer and float arithmetic in one expression, you convert explicitly:
float_of_int is the OCaml function that turns an int into a
float. There is also int_of_float, which goes the other way
(and truncates). The standard library exposes these directly; in
recent versions they are also available as Float.of_int and
Float.to_int with friendlier names.
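The conversions above, sketched with illustrative binding names. Note the direction of rounding for negative numbers: truncation goes toward zero, not down.

```ocaml
(* Convert the int to a float before mixing it into float arithmetic. *)
let mixed = float_of_int 3 +. 0.5        (* 3.5 *)

(* Going back to int drops the fractional part, rounding toward zero. *)
let down = int_of_float 3.9              (* 3 *)
let up = int_of_float (-3.9)             (* -3 *)

(* The same conversions under their Float module names. *)
let mixed' = Float.of_int 3 +. 0.5
let down' = Float.to_int 3.9
```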
Booleans and comparison
OCaml's boolean type is bool with the values true and false.
Comparison operators return bool.
Returns true, of type bool. The full set: <, <=, >, >=
for ordering, = for equality, <> for inequality.
Note this is =, not ==. In OCaml, = is structural equality:
it compares values by content, recursively. Two strings are = if
they have the same bytes; two lists are = if they have the same
elements in the same order; two records are = if their fields are
correspondingly =. This is the operator you almost always want.
There is also ==, which is physical equality (pointer
comparison). It exists for advanced uses we will see later; the
short version is: do not use == in your code unless you are sure
you specifically want pointer comparison. The companion negations are
<> for structural inequality and != for physical inequality. We
revisit this distinction in the operators lecture
of Module 2.
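To see the two equalities disagree, you need two values with the same contents that live in different places. A minimal sketch, assuming String.make as a convenient way to build two separate strings; the binding names are illustrative.

```ocaml
(* Two strings built separately: same contents, different memory. *)
let a = String.make 3 'x'
let b = String.make 3 'x'

let same_contents = a = b       (* true: structural equality *)
let same_object = a == b        (* false: two distinct allocations *)
let self_identical = a == a     (* true: literally the same value *)

(* Structural equality also looks inside lists. *)
let lists_equal = [1; 2; 3] = [1; 2; 3]     (* true *)
```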
&& and || are short-circuit, as in C and Java. The right
argument is only evaluated if needed.
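Short-circuiting is what makes guard conditions safe. A sketch with a hypothetical helper, divides, whose right operand would raise Division_by_zero if it ever ran with d = 0:

```ocaml
(* n mod d raises Division_by_zero when d = 0, but && only evaluates
   its right operand when the left one is true. *)
let divides d n = d <> 0 && n mod d = 0

let yes = divides 5 10      (* true *)
let no = divides 3 10       (* false *)
let guarded = divides 0 10  (* false, and no exception *)
```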
Repeat after me: in OCaml, the everyday equality operator is =,
with one equals sign. The other one, ==, is reserved for advanced
cases (pointer-identity comparison) and almost never what beginners
want. Mixing them up compiles fine and can silently return the wrong
answer, which is why it tops the list of beginner gotchas. We will
return to this in the operators lecture.
Strings
Strings are sequences of bytes, written between double quotes.
The concatenation operator is ^, not +. Same logic as for
numeric operators: each operator has one meaning, and string
concatenation is a different operation from numeric addition, so it
gets a different operator.
String.length returns the number of bytes (which equals the number
of characters for ASCII text, but not for multibyte UTF-8 sequences).
The String module is the standard library's collection of string
operations: String.length, String.get, String.sub,
String.concat, and so on. We will use these throughout the course.
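A quick sketch of concatenation and the String functions just named. The strings and binding names are illustrative.

```ocaml
let greeting = "hello" ^ ", " ^ "world"         (* "hello, world" *)
let len = String.length greeting                (* 12 *)
let first = String.get greeting 0               (* 'h' *)
let middle = String.sub greeting 7 5            (* "world" *)
let joined = String.concat "-" ["a"; "b"; "c"]  (* "a-b-c" *)
```

String.sub takes a start index and a length, not a start and an end, which is a common source of off-by-one mistakes.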
A practical note on Unicode: OCaml's string is byte-oriented, not
codepoint-oriented. String.length "café" (where é is the UTF-8
two-byte sequence) returns 5, not 4. For Unicode-aware string
processing, you reach for an external library like
uutf for parsing,
uucp for character
properties, or Camomile
for older codebases. Most string code you write will not need any
of these; plain String is fine for concatenating, slicing, and
searching bytes. We will revisit this when we cover modules in
Module 7.
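The byte-versus-character gap is easy to observe. In this sketch the accented letter is written with explicit byte escapes, an assumption made here so that the example behaves the same whatever encoding the source file is saved in.

```ocaml
(* The escapes \xc3\xa9 spell out the two UTF-8 bytes of the letter
   e-acute, so the result does not depend on the encoding of the file. *)
let cafe = "caf\xc3\xa9"

let bytes = String.length cafe          (* 5, not 4 *)
let ascii_bytes = String.length "cafe"  (* 4 *)
```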
Let bindings
let is how you give a name to a value.
After this, pi is in scope and refers to the value 3.14159. The
toplevel reports val pi : float = 3.14159. The keyword val is the
toplevel's way of saying "here is a name binding"; it is not part of
the OCaml source code.
Same syntax for functions. let name args = body defines a function
name taking args and returning the value of body. There is no
separate function keyword, no def, no void, no public static.
Calling a function is just juxtaposition: area_of_circle 2.0. No
parentheses around the argument. No commas. This is the
single biggest syntactic surprise to people coming from C or Python.
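Put together, the value binding and the function binding might look like the sketch below. The body of area_of_circle is an assumption (the usual pi times r squared); the lecture's own cell is not reproduced in this text.

```ocaml
let pi = 3.14159

(* One parameter, r; the body is a single expression. *)
let area_of_circle r = pi *. r *. r

(* Application is juxtaposition: function name, a space, the argument. *)
let area = area_of_circle 2.0       (* about 12.566 *)

(* Parentheses are only needed to group a compound argument. *)
let bigger = area_of_circle (1.0 +. 2.0)
```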
Two important things to internalise about let bindings:
Bindings are immutable by default. When you write let pi = 3.14159, you are not declaring a variable that you will later
reassign. You are introducing a name for a value, and that name
refers to that value, period. There is no pi = 3.14160 later to
"update" it. If you wanted to refer to a different number, you
introduce a new name. This is one of the cultural shifts from
imperative programming; we discussed it in
the previous lecture.
Bindings are not addresses. In C, int pi = 3 allocates a slot
in memory and stores 3 there; later, pi = 4 writes 4 to that
slot. In OCaml, let pi = 3.14159 does not give you a slot to overwrite;
it just introduces a name. The compiler is free to inline the value
wherever the name is used, share it across uses, or skip allocating
anything for it at all. Names are not storage.
Let in expressions
You can introduce a name local to an expression with let ... in:
The name r_sq is in scope from the in keyword to the end of the
enclosing expression (in this case, the body of circle_area).
Outside that scope, r_sq does not exist.
This is the local-binding form. let ... in is an expression: the
whole thing let r_sq = r *. r in 3.14159 *. r_sq is an
expression that evaluates to the value of its body, with r_sq
bound during evaluation. You can nest these freely.
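The local-binding form described above, followed by a nested variant. circle_area follows the expression quoted in the text; cylinder_volume is a hypothetical extra added here to show nesting.

```ocaml
(* r_sq exists only inside the body of circle_area. *)
let circle_area r =
  let r_sq = r *. r in
  3.14159 *. r_sq

(* Nested local bindings: each in opens the scope for what follows. *)
let cylinder_volume r h =
  let base = circle_area r in
  let volume = base *. h in
  volume
```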
There are two let forms and they look very similar; do not confuse
them. At the top level (in the toplevel or at the start of a file),
let name = value introduces a global binding that is visible
from that point onward. Inside an expression, let name = value in expr introduces a local binding that is in scope only inside
expr. They are different things: the first is a declaration, the
second is an expression.
Module 2 spends a full lecture on this distinction.
Shadowing
You can introduce a binding with a name that already exists in scope. The new binding shadows the old one for any subsequent reference to that name, but the old binding is still there (immutability!); you just cannot reach it by name anymore.
After these three lines, y = 2. Read the second line carefully:
let x = x + 1. On the right-hand side of the =, the name x
still refers to the old binding (where x = 1), so x + 1 = 2.
After the binding completes, the name x now refers to this new
value, 2. The third line, let y = x, picks up the new x,
so y = 2.
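The three lines being described, reconstructed from the walkthrough above:

```ocaml
let x = 1
let x = x + 1   (* the x on the right is the old binding, so this is 2 *)
let y = x       (* picks up the new x *)
```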
The phrase "old binding is still there" is worth dwelling on,
because it captures a subtle but important point. If, before the
second let x = x + 1, you defined a function that captured the
first x, that function continues to see x = 1 forever. The
binding it captured is unchanged; the name x now points to
something different, but the original value is alive as long as the
function holds it. This is one of the consequences of immutability:
captures are stable.
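A sketch of that capture scenario. The function name get_x and the bindings captured and current are illustrative.

```ocaml
let x = 1

(* get_x captures the binding of x that is in scope right here. *)
let get_x () = x

(* Shadow x. This creates a new binding; it does not touch the old one. *)
let x = x + 1

let captured = get_x ()     (* 1: the function still sees the first x *)
let current = x             (* 2: the name now refers to the new binding *)
```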
Type inference
This is OCaml's signature feature. The compiler works out the types of your expressions and functions automatically; you do not have to write them down.
The toplevel reports val add : int -> int -> int = <fun>. This is
the type of add: it takes two ints and returns an int. The
notation int -> int -> int is read right-associatively: it is
"function from int to (function from int to int)". We will
unpack this thoroughly in Module 3 when we cover currying.
How did OCaml know that x and y are int? It saw the +
operator. + requires both arguments to be int, so the
expression x + y forces both x and y to have type int. The
result of + on two ints is an int, so add x y returns
int. The compiler chains these constraints together and reports
the resulting type.
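The function under discussion, plus a small preview of what the arrow notation means in practice. The names five, add_ten, and fifteen are illustrative.

```ocaml
(* No annotations anywhere: + forces x, y, and the result to be int. *)
let add x y = x + y

let five = add 2 3      (* 5 *)

(* Supplying only the first argument gives back a function from int
   to int, which is what the arrow notation describes. *)
let add_ten = add 10
let fifteen = add_ten 5 (* 15 *)
```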
Type inference is what makes OCaml feel as light to write as
Python, even though the language is statically typed. You get the
safety of a strong type system without the syntactic burden of
writing types everywhere. This is a much bigger deal than it might
sound. In Java, you write int x = 5; because the compiler insists.
In OCaml, you write let x = 5 and the compiler figures out x : int itself. Multiply this by ten thousand bindings in a real
program and the verbosity savings are large.
The toplevel reports val add_f : float -> float -> float = <fun>.
Same inference, different constraint: +. requires float
arguments, so the function has type float -> float -> float.
A useful pedagogical point: the operator drives the inference.
When you look at OCaml code and want to know the types, look at the
operators. + says int. +. says float. ^ says string. &&
says bool. :: (which we have not seen yet) says list. Each
operator has a fixed type, and the rest of the inference falls out
from there. Reading OCaml type errors well is largely about
identifying which operator created the constraint that the compiler
is complaining about.
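One function per operator makes the point concrete. In this sketch the function names other than add_f are hypothetical; the comment beside each gives the type the compiler infers.

```ocaml
let add_f x y = x +. y          (* float -> float -> float *)
let shout s = s ^ "!"           (* string -> string *)
let both p q = p && q           (* bool -> bool -> bool *)
let prepend_zero xs = 0 :: xs   (* int list -> int list *)
```

None of these has an annotation; in each case the single operator in the body is enough to pin down every type.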
Type annotations, when you want them
You can write explicit type annotations. They are not required and are usually omitted, but they are sometimes useful.
The annotation (x : int) says that x is int; the : int after
the parameter list says the return type is int. The compiler
checks that the annotations agree with what it would have inferred;
if you write a wrong annotation, you get a type error.
Same idea, a different syntax: a top-level type annotation on the
whole function. The fun x -> ... syntax is OCaml's
lambda;
we will come back to it in Module 3.
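The two annotation styles side by side, as a sketch. The second function is named add' here only so that both can live in one block.

```ocaml
(* Annotating each parameter and the return type. *)
let add (x : int) (y : int) : int = x + y

(* Annotating the name with the whole function type instead. *)
let add' : int -> int -> int = fun x y -> x + y

(* A wrong annotation is a compile-time error, so this stays a comment:
   let broken (x : float) : int = x + 1 *)
```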
The OCaml community convention is to leave type annotations off
local helpers and put them on top-level functions in a module's
public interface (its .mli file, which we will cover in
Module 7).
Annotations on public APIs are documentation: they tell the reader
what the function expects without forcing the reader to read its
body. Annotations on private code clutter without paying their way,
because the compiler already knows the types.
Putting it together
A worked example combining what we have seen:
Walk through what the toplevel reports for each binding:
- val kelvin_of_celsius : float -> float = <fun>: takes a float (the operator +. forces this), returns a float.
- val celsius_of_kelvin : float -> float = <fun>: same pattern.
- val boiling_kelvin : float = 373.15: a float value, the result of 100.0 +. 273.15.
- val back_to_celsius : float = 100. (the toplevel prints 100.0 as 100.): round-trips, as expected.
This is the rhythm of OCaml work: define small functions, call them on values, look at the toplevel's response, build up. Names compose into expressions, expressions become values, values become arguments to the next function. There are no statements; there is no main; there is just expression after expression, each producing a value.
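The four bindings walked through above could be written as follows. The function bodies are an assumption, inferred from the names and from the reported values.

```ocaml
let kelvin_of_celsius c = c +. 273.15
let celsius_of_kelvin k = k -. 273.15

let boiling_kelvin = kelvin_of_celsius 100.0
let back_to_celsius = celsius_of_kelvin boiling_kelvin
```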
A quick check
What does the toplevel print for let pi = 3.14?
- pi : float = 3.14
- val pi : float = 3.14
- let pi = 3.14
- Nothing; bindings are silent.
Why: the toplevel reports new bindings with the val keyword,
followed by the name, type, and value. The shape is exactly val NAME : TYPE = VALUE. This is a formatting convention of the
toplevel itself, not part of the OCaml source.
Now a small code challenge. Try to make this pass:
Define a function fahrenheit_of_celsius : float -> float that
converts Celsius to Fahrenheit using the formula F = C * 9/5 + 32.
Watch the operators: you are working with floats.
If you got it: well done. If you got a "this expression has type
int but an expression was expected of type float" error, that is
the operator-mismatch error: you probably wrote 9 / 5 (integer
division, which truncates to 1) instead of 9.0 /. 5.0.
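For comparison once you have tried it yourself, here is one possible solution. It is a sketch, not the only correct answer; the bindings freezing and boiling are illustrative checks.

```ocaml
(* Every literal is a float and every operator is dotted. *)
let fahrenheit_of_celsius c = c *. 9.0 /. 5.0 +. 32.0

let freezing = fahrenheit_of_celsius 0.0     (* 32. *)
let boiling = fahrenheit_of_celsius 100.0    (* 212. *)
```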
Activity
What is the type of let f x = x +. 1.0 in OCaml?
- int -> int
- int -> float
- float -> float
- OCaml cannot infer this without an annotation.
Why: +. is the float addition operator. Its left operand
must be float (and so must its right). The constant 1.0 is
already a float. Therefore x must be float, and x +. 1.0
returns a float. So f : float -> float. If we had written +
instead, OCaml would have inferred int -> int (the operator
determines the type).
This is the punchline of the lecture, and it is worth memorising: the operator drives the inference. OCaml's arithmetic has no "this could be either an int or a float, you decide" notion. Each operator has exactly one type, and that type fixes the types of its operands and its result. When the compiler complains about a type mismatch, the first thing to look at is which operator forced which type.
What's next
Module 1 has two more lectures: a hello-world walkthrough
and the Module 1 tutorial. After
those, Module 2 zooms in on expressions:
how let bindings work in depth, how type inference handles more
complex cases, how if/then/else is an expression (not a
statement), and how all of these compose into real, if small,
programs.
Reading
- Real World OCaml, A Guided Tour: free online, covers very similar ground at a more leisurely pace: https://dev.realworldocaml.org/guided-tour.html
- Cornell CS3110, OCaml syntax and semantics: the textbook treatment of the same material: https://cs3110.github.io/textbook/chapters/basics/basics.html
- John Whitington, OCaml from the Very Beginning, Chapter 1: even gentler pace if you want a step-by-step introduction.
Sources
This lecture's prose, worked examples, and quizzes are original to
this course. Materials referenced during preparation are listed in
the Reading section above; Cornell CS3110 and Real World OCaml
are CC BY-NC-ND-licensed and have not been derivatively reused.
See LICENSES.md
at the repository root for the full source posture.