Your first OCaml program

Functional Programming with OCaml

Your first OCaml program: hello, world (and beyond)

Module 1 · Lecture 4

KC Sivaramakrishnan
IIT Madras

This lecture: hello, world

In the previous lecture we toured the values and expressions of OCaml: numbers, booleans, strings, let bindings, type inference. We have not yet written anything that runs in the sense of "does something visible to the world." This short lecture closes that gap. We will write the canonical first program in any language, talk about what makes it work in OCaml specifically, and introduce two concepts you will use in every program from here on: the unit type and the sequencing operator ;.

The lecture is short because the material is simple. The reason it deserves its own lecture is that OCaml's "hello world" looks slightly different from the equivalent in C, Java, or Python, and the difference is worth understanding properly before you move on. The shape of an OCaml program is a sequence of let bindings, and that shape is not obvious if you arrive from a language where programs look like int main() { ... } or if __name__ == "__main__": ....

Hello, world

Here is the shortest interesting OCaml program. Click Run.

let () = print_endline "hello, world"

The cell prints hello, world on its own line, and that is all: there is no val ... report. Every let binding you ran in the previous lecture earned an echo like val x : int = 42, but a let () binding binds no names, so there is nothing for the toplevel to report. The output of the program is the string; the absence of a val line tells us the binding exists purely for its side effect.

We just wrote an executable program.

That single line is doing more than it looks. Let's read it carefully, because every piece is doing real work and you will see this shape in essentially every OCaml file you ever write.

Reading the line

let () = print_endline "hello, world"

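Parsed left to right:

let introduces a binding.

() is the pattern on the left-hand side of the binding; it matches only the unit value ().

= separates the pattern from the expression whose value is matched against it.

print_endline is a standard-library function of type string -> unit: it writes its argument to standard output, followed by a newline.

"hello, world" is a string literal, the argument passed to print_endline.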

Read together, the line says: "evaluate print_endline "hello, world" (which writes the string and returns a value of type unit), and match the resulting () against the pattern () (which always succeeds, because the right-hand side has type unit and () is the only unit value)." The match-against-() part may sound silly ("of course it succeeds; there is nothing else it could be"); the point is that we are checking the type. If you accidentally wrote let () = 42, the compiler would reject it because 42 does not have type unit. The pattern serves as a type assertion.

This is the standard idiom for "a side-effecting expression executed at the top level." The shape is let () = EXPR where EXPR is something whose return value you do not care about (typically a print or a write). It tells anyone reading your code: this line is here for its effect.

What is unit?

OCaml has a type called unit whose only value is written (). The notation is suggestive: () is the empty tuple. Just as (1, 2) is a pair and (1, 2, 3) is a triple, () is a "zero-tuple", and the type of zero-tuples is unit. There is exactly one such value, so the type carries no information beyond its own existence.

If unit carries no information, why does the language bother to have it? The answer is that OCaml is an expression-based language: every construct, including ones with side effects, has a value of some type. Even print_endline, which seems to be "all side effect, no value", has to return something so it has a type. That something is (), of type unit. The convention is that any function whose purpose is the side effect (not the return value) returns unit.

let _ = print_endline "first"
let _ = print_endline "second"
let _ = print_endline "third"

The closest analogues in other languages are C's void, Java's void, Python's None, and Rust's () (which Rust took from the ML family). All of these mean "this function does not return a useful value." The difference from void is that in OCaml (as in Rust), unit is a real type with a real value (). You can store () in a list ([(); (); ()] is a valid OCaml list of length 3) or return () from a function. Storing () in a data structure is unusual; the value carries no information, so it is hard to do anything useful with one. Returning () and taking () as an argument, on the other hand, are everyday OCaml: every function called for its side effect returns (), and fun () -> ... is the standard idiom for a thunk, a computation deferred until someone calls it. You will see this all over the standard library and the ecosystem.

The phrase "no useful value" is doing a lot of work. unit is the absence of information, but it is the absence-as-a-value, not the absence-as-a-type-system-feature. This distinction will matter again when we get to Option (Module 4), which is the way OCaml handles "this function might or might not return a useful value." Option is not the same as unit; the two answer different questions.
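To make "unit is a real type with a real value" concrete, here is a minimal sketch. The names result, units, count, and say_later are made up for this illustration; only print_endline, List.length, and Printf.printf come from the standard library.

```ocaml
(* The result of print_endline is an ordinary value of type unit. *)
let result = print_endline "printing now"

(* () can be stored in a list like any other value. *)
let units = [(); (); ()]
let count = List.length units

(* A thunk: nothing is printed until the function is applied to (). *)
let say_later = fun () -> print_endline "deferred until called"

let () = Printf.printf "stored %d unit values\n" count
let () = say_later ()
```

Note the order of the output: "printing now" appears as soon as result is bound, while the thunk's message appears only when say_later () is evaluated on the last line.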

Which of these expressions has type unit?

Why: unit is the type of expressions that exist for their side effect: their only useful behaviour is what they do, not what they return. print_endline writes its argument to stdout and returns () (the only value of type unit). The other three return values of types int, string, and int respectively. Note: it is not that 42 cannot be unit; the literal 42 has type int, period. You cannot "cast" a non-unit value to unit (though you can discard a value with let _ = ... or ignore ..., which we will see soon).

A program is a sequence of let bindings

Now that we have the shape of one binding, we have the shape of an entire OCaml program. A program is, at the top level, a sequence of let bindings, evaluated in order from top to bottom.

let greeting = "hello, "
let name = "NPTEL"
let () = print_endline (greeting ^ name)

Three lines, three bindings. The first two bind names to string values; the third evaluates print_endline for its side effect and matches the resulting unit value against (). Each later binding can refer to names introduced by earlier ones.

Compare this to the Java equivalent, which would be something like:

public class Hello {
    public static void main(String[] args) {
        String greeting = "hello, ";
        String name = "NPTEL";
        System.out.println(greeting + name);
    }
}

The Java version has substantial scaffolding: a class declaration, a main method, a special argument convention. The OCaml version is just three lines of bindings. There is no main, no class, no boilerplate. The compiler treats the file as a sequence of bindings; when you compile and run the program, those bindings are evaluated in order. Side effects (like printing) happen as their bindings are evaluated.

This is more like Python or a Bash script in that respect: a file of statements that run top-to-bottom. The difference from Python is that every "statement" in OCaml is really a let binding (or sometimes a module declaration, which we will see in Module 7). There are no bare statements; every top-level line binds a value to a pattern, whether or not that pattern introduces a name. Even let () = print_endline "x" is "binding the value () to the pattern () (i.e. type-checking it as unit) while evaluating the right-hand side for its effect."
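The top-to-bottom evaluation order can be observed by interleaving value bindings with printing bindings. This is a small sketch; the names width and area are chosen only for the illustration.

```ocaml
let () = print_endline "step 1: no names bound yet"

let width = 4
let () = Printf.printf "step 2: width = %d\n" width

(* A later binding may use any name introduced above it. *)
let area = width * width
let () = Printf.printf "step 3: area = %d\n" area
```

Moving the step 3 line above the definition of area would make the file fail to compile, because area would not yet be in scope at that point.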

Worked example with names

A slightly less trivial program:

let pi = 3.14159
let radius = 5.0
let area = pi *. radius *. radius
let () = print_endline (Printf.sprintf "area = %.4f" area)

Four bindings. The first three are pure value bindings: pi, radius, and area get their values. The fourth is the printing side effect. Printf.sprintf is like printf from C, but it returns the formatted string instead of writing to stdout; we then pass that string to print_endline. The format specifier %.4f means "a float, four digits after the decimal point," same as in C's printf.

The Printf module provides type-safe formatted output: the compiler reads the format string at compile time, infers the types required by each % specifier, and checks them against the arguments you pass. If you write Printf.sprintf "%d" 3.14, OCaml rejects it at compile time, because %d expects an int but 3.14 is a float. This is unusual; in C, format-string mismatches are runtime bugs (or, with newer compilers, lint warnings). OCaml puts them in the type system. We will not dwell on Printf now; just know that it works and that the common format specifiers (%d, %s, %f) behave as they do in C.
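As a brief sketch of the type checking described above, the following uses Printf.sprintf and Printf.printf with three specifiers. The name summary and the sample values are made up for the illustration.

```ocaml
(* %s expects a string, %d an int, %.2f a float; the arguments match. *)
let summary = Printf.sprintf "%s bought %d items for %.2f" "Asha" 3 9.5
let () = print_endline summary

(* Printf.printf writes to stdout directly instead of returning a string. *)
let () = Printf.printf "%s bought %d items for %.2f\n" "Asha" 3 9.5

(* Uncommenting the next line should make the compiler reject the file:
   %d expects an int, but 3.14 is a float.
   let bad = Printf.sprintf "%d" 3.14 *)
```

Both printing lines produce the same text, Asha bought 3 items for 9.50; the only difference is whether the formatted string is returned or written out.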

What if you forget let ()?

The toplevel is more permissive than a compiled file. You can type a bare expression and the toplevel will evaluate it.

print_endline "hello"

This prints hello, and the toplevel reports - : unit = () (no binding name). A compiled .ml file is stricter: a bare expression is accepted only at the very start of the file or after a ;; separator. Written directly after another binding, it is parsed as a continuation of that binding's right-hand side, which usually produces a confusing type error rather than the behaviour you wanted. Wrapping the call in let () = ... avoids the problem, and it also makes the compiler check that the expression really has type unit, so a computation whose result you forgot to use is caught rather than silently discarded.

A useful habit: always wrap top-level side-effecting calls in let () = ..., even in the toplevel where you don't strictly need to. It documents intent and catches accidents.
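One concrete way a missing let () = can go wrong in a compiled file is sketched below, assuming the bare expression follows another binding and no ;; is used. The name count is chosen only for the illustration.

```ocaml
let count = 5
let () = print_endline "hello"

(* Without the wrapper, the file would read:

     let count = 5
     print_endline "hello"

   OCaml parses those two lines as  let count = 5 print_endline "hello",
   that is, as an attempt to apply 5 as a function to two arguments,
   and rejects the file with a type error. *)
```

Line breaks carry no meaning in OCaml, so the keyword let is what tells the parser that a new top-level item has started.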

Why bother: let () = 42 is refused

let () = 42

The let () = 42 example above is genuinely useful as a teaching case. Press Run on the cell; the compiler refuses with an error message like "The constant 42 has type int but an expression was expected of type unit." This is the pattern-match-as-type-check property doing its job. You said the result should be (); the compiler checked that you actually produced a unit; you didn't (you produced an int); the compiler complains.

If you ever genuinely want to discard a non-unit value at the top level, the idiom is let _ = EXPR. The _ pattern matches anything and the compiler does not complain about the type. This is fine for discarding the result of a function whose effect you wanted but whose return value you do not need.
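A minimal sketch of the let _ = EXPR idiom follows. The function add_and_log is a hypothetical helper written for this example: it prints a message and also returns an int.

```ocaml
(* Hypothetical helper: prints a message, then returns the sum. *)
let add_and_log a b =
  let () = Printf.printf "adding %d and %d\n" a b in
  a + b

(* We want the printed message but not the int, so the result is discarded. *)
let _ = add_and_log 2 3

(* Writing  let () = add_and_log 2 3  instead would be rejected,
   because the right-hand side has type int, not unit. *)
```

The trade-off is that let _ = gives up the type check that let () = provides, so it is best reserved for cases where the value really is meant to be thrown away.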

Sequencing with ;

A common need: do several side-effecting things in order. The OCaml operator for this is ; (a single semicolon). It sequences two expressions: evaluate the first (which should have type unit), then evaluate the second, and the whole expression has the type of the second.

let square x = x * x
let cube x = x * x * x
let () =
  Printf.printf "square 5 = %d\n" (square 5);
  Printf.printf "cube 5 = %d\n" (cube 5);
  Printf.printf "square 5 + cube 5 = %d\n" (square 5 + cube 5)

Two function definitions and then one let () = ... whose right-hand side is three printing expressions sequenced together with ;. Run it; you should see three lines.
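Because a sequence takes the type of its last expression, ; is not limited to let () = bindings. In this small sketch (the name answer is made up for the illustration) the sequence ends in an int, so the binding has type int.

```ocaml
(* The first expression is evaluated for its effect; the value of the
   whole sequence is the value of the last expression, here 42. *)
let answer =
  print_endline "computing the answer";
  42

let () = Printf.printf "answer = %d\n" answer
```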

There is one subtlety about ; worth knowing: its left operand is expected to have type unit. If you write e1; e2 where e1 does not have type unit, the compiler warns you: "this expression should have type unit". The warning exists because semicolon-sequencing is for side effects; if e1 produces a value that is not unit, you are throwing that value away, which is almost always a mistake.

If you genuinely want to throw a non-unit value away (rare, but it happens), use ignore e1; e2. The ignore function takes anything and returns (), suppressing the warning.
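Here is a short sketch of ignore in a sequence. The function shout is a hypothetical helper written for this example; String.uppercase_ascii and String.length are standard-library functions.

```ocaml
(* Hypothetical helper: prints s in upper case and returns its length. *)
let shout s =
  print_endline (String.uppercase_ascii s);
  String.length s

let () =
  ignore (shout "hello");  (* the int result is discarded explicitly *)
  print_endline "done"
```

Without ignore, the line shout "hello"; would trigger the warning described above, because the left operand of ; would have type int.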

Single ; versus double ;;

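You may have seen ;; in OCaml tutorials or books and wondered how it differs from ;. They are very different things. A single ; is an operator inside an expression: it sequences two expressions, as described in the previous section. A double ;; is not an operator at all: it marks the end of a top-level phrase, and it exists mainly so that the interactive toplevel knows when you have finished typing.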

Two semicolons are different from one

let () =
  print_endline "first";
  print_endline "second"

If you copy-paste code from an old OCaml tutorial that uses ;; between every declaration, the code still compiles, but the ;; is redundant: you can leave it in or take it out without changing the meaning. Modern OCaml code rarely uses ;;. In this course we will not use it.
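For comparison, the sketch below puts the older ;;-terminated style next to the style used in this course. The names a, b, c, and d are made up for the illustration; both halves mean the same thing.

```ocaml
(* Older style: every top-level phrase is terminated with ;; *)
let a = 1;;
let b = a + 1;;

(* Style used in this course: the next let already ends the previous phrase. *)
let c = 1
let d = c + 1

let () = Printf.printf "b = %d, d = %d\n" b d
```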

A small code challenge

Define greet : string -> unit that prints hello, NAME! where NAME is the function's argument. Each greeting should appear on its own line.

let greet name = failwith "not implemented"

Hint: the body of greet should use print_endline together with ^ (string concatenation) or Printf.sprintf. If you used Printf.sprintf, you may want "hello, %s!" as the format string.

Activity

What does this program print?

let () =
  print_endline "A";
  print_endline "B"
let () = print_endline "C"

Predict before pressing Run.

Why: top-level let bindings execute in source order, from top to bottom. The first let () = ... body uses ; to sequence two prints; both run, A first then B. The second let () = ... runs after, printing C. The same shape works for any OCaml file: the top-level bindings are a sequence, evaluated in order.

Activity discussion

A, then B, then C.

This is what an OCaml program is, in the simplest form: a file of let bindings, evaluated top to bottom, where some bindings have side effects. There is no main(). There is no entry point. The file is the program; its bindings are the steps. Later, when we introduce modules (Module 7), we will see how to organise larger programs into structured units, but the basic shape stays the same.

What's next

We have covered the surface mechanics of writing a program. The next lecture is Module 1's tutorial: worked temperature-conversion problems end to end. After that, next week (Module 2) we slow down and look at the type system: what int, float, string, bool really are, how type inference works, how if/then/else is an expression (not a statement), and how all of these compose into more interesting programs. By the end of Module 2 you will be writing real (if small) functions in OCaml comfortably.

Reading

Sources

This lecture's prose, worked examples, and quizzes are original to this course. Materials referenced during preparation are listed in the Reading section above; Cornell CS3110 and Real World OCaml are CC BY-NC-ND-licensed and have not been derivatively reused. See LICENSES.md at the repository root for the full source posture.