Introduction to Caml

These lectures cover The Transcripts of Caml sessions run in class are available.

Background on ML

ML is Versions of ML We are using OCaml because it is currently the best-supported version of ML and it has a very fast compiler. Unfortunately almost all the English books are the Standard ML dialect.

ML Novelties

Has some novel features relative to Java (and C++). These are all conceptually more advanced ideas.

Caml Basics

Read the Core Language section of the manual to suppplement these notes. For more depth and examples, read (mainly) Chapter 2 of the online O'Reilly Book.

The top loop

OCaml's top loop: Here is a Caml top-loop session, starting from a UNIX prompt. (In these notes, blue typewriter font is typed by the user, and maroon typewriter font is the computer reply.)
% ocaml
        Objective Caml version 3.06

# let x = 3+4;;
val x : int = 7
# x+4;;
- : int = 11
let syntax allows for local variable declarations in Caml:
# let x = 4 in x+3;;
- : int = 7
Declarations typed in at the top level are like an open-ended let:

# let x=4;;
val x : int = 4
# let y = x+3;;
val y : int = 7
# x*x;;
- : int = 16
Notice how the types are being inferred for us (pretty simple to do here but harder for more complex programs).

Simple Types

int, float, string, char, bool types:

4, 4.3, "hi", 'c', true

These types have all the standard kinds of operations on them, in the libraries.

Libraries

Int/Real non-Overloading

Caml is pure in that it does not overload the meaning +/* etc to work on integers and floats.
# 2.3 * 5.7;;
Characters 6-9:
This expression has type float but is here used with type int

# 2.3 *. 5.7;;
- : float = 13.110000
Caml also never performs implicit coercions, all coercions must be explicit.
# 2.3 *. 5;;
Characters 7-8:
This expression has type int but is here used with type float

Why is Caml doing this?

Caml is Expression-based

Caml is expression-based, there are no pure "commands" like in Java/C++; instead, commands are also expressions, they return values.
# (if (2=3) then 5 else 6) + 1;;
- : int = 7
One sometimes annoying consequence of the above is the two branches of the if need to return the same type.
if (2=3) then 5 else 6.5;;
Characters 21-24:
This expression has type float but is here used with type int

Let

let allows local declarations to be defined.

let x = ... in .... is somewhat analogous to local variable definition


{
int x = ...;
...
}
in C, but the Caml variable x is given a value that is immutable (can never change).

The top-loop is analogous to an open-ended let: each entry defines a new nested let-scope with a new declaration, and the scope can never be closed.

Built-in simple datatypes: lists, tuples

Caml's built-in list/tuple data objects are implicitly allocated for you: no need to malloc/free them. A Garbage collector automatically collects them when they are unreachable and thus no longer used.

Lists

Lists are ... lists of Caml values. Defining a new list is a triviality, even easier than in Java.
# [2;1+2;4];;
- : int list = [2; 3; 4]
# ["e"; String.concat "" ["f";"g"] ;"h"];;
- : string list = ["e"; "fg"; "h"]
Notice how the function call String.concat ["f";"g"] does not require ( ... ) around the function's arguments, and how they are space- and not comma-separated. Thats because Lists must be uniform in their type ("homogenous").
# [3;"e"];;
Characters 3-6:
This expression has type string but is here used with type int
List operations are numerous.
# let x = [2;3];;
val x : int list = [2; 3]
# let y = 1::x;; (* this is called "consing" an element on to a list *)
val y : int list = [1; 2; 3]
# x;;
val x : int list = [2; 3] (* y is a NEW list; the
list x is IMMUTABLE  and didn't change *)
# x @ y;;        (* appending lists *)
- : int list = [2; 3; 1; 2; 3]
# List.hd x;; (* head: first element of a list *)
- : int = 2
# List.tl x;;  (* the tail or rest of a list *)
- : int list = [3]
The hd/tl operations are not in the core library (the module name must be used when referring to them) because you should generally not use them. (And, we will penalize any use of them in homeworks). Instead, use pattern matching:
# match x with h::t -> h;;
Characters 0-22:
Warning: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
[]
- : int = 2
Tail is very similar:
# match x with h::t -> t;;
Characters 0-22:
Warning: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
[]
- : int list = [3]

Tuples

Tupes are fixed-length lists, but the fields may be of differing types ("heterogenous").
# let x = (2,"hi");;
val x : int * string = 2, "hi"
# match x with (y,z) -> y;;
- : int = 2
Note tuple type syntax: int * string, etc.

All the data mentioned so far is immutable - it is impossible to change an entry in an existing list, tuple, or record!
Also, all variables are immutable. Thus the language described so far is a pure functional subset of ML.
Mutable features will be discussed later.

Caml Functions

Here is a simple recursive fibonnaci function definition.
#let rec fib n =
   if n < 2 then 1 else fib(n-1) + fib(n-2);;
val fib : int -> int = <fun>
#   fib 33;;
- : int = 5702887
Other important aspects of Caml functions Multiple-argument functions are not built-in; generally use Currying to define them.
# let rec comb n m = (* assumes 0 <= m <= n *)
                        if m=0 or m=n then 1
                        else comb (n-1) m + comb (n-1) (m-1);;
   val comb : int -> int -> int = <fun>
# comb 10 4;;
- : int = 210
Look at the type of comb: int -> int -> int which is int -> (int -> int): it takes an integer and returns a function which expects another integer! Lets test this:
# let comb10 = comb 10;;
val comb10 : int -> int = <fun>
# comb10 4;;
- : int = 210
# comb10 3;;
- : int = 120
Indeed, we can give comb only one argument, in which case it returns a function that we can later use. More on Currying below.

Mutually recursive functions must be defined simultaneously:

let rec
        take(l) = match l with [] -> []
                | hd :: tl ->  hd::skip tl
and
        skip(l) = match l with [] -> []
                | hd :: tl -> take tl;;
          val take : 'a list -> 'a list = <fun>
val skip : 'a list -> 'a list = <fun>
# take [1;2;3;4;5;6;7;8;9;10];;
- : int list = [1; 3; 5; 7; 9]
# skip [1;2;3;4;5;6;7;8;9;10];;
- : int list = [2; 4; 6; 8; 10]
This example also shows a pattern match with multiple cases, either empty list or nonempty list. More on patterns now.

Patterns

Patterns make function definitions much more succinct, as we just saw.

let rec rev l =
  match l with [] -> []
  |   x::xs -> rev xs @ [x];;
The pattern matching process Patterns also can be defined in let to attribute values to multiple variables:

let l = [1;2;3;4;5;6;7;8;9;10];;
val l : int list = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]
# let (evens,odds) = (skip l, take l);; (* skip and take defined above *)
val evens : int list = [2; 4; 6; 8; 10]
val odds : int list = [1; 3; 5; 7; 9]
Similarly patterns can be used in function definitions.

# let add (x,y) = x + y;;
val add : int * int -> int = <fun>
# add (2,3);;
- : int = 5
This looks like a function of two arguments, but its a function of one argument which matches a pair pattern. Note, in Caml it is better to use Curried function definitions for multiple-argument functions, not tuples.

Immutable Declarations

Don't forget let is immutable; its bound to screw you up at some point.

let x = 5 in
  let f y = x + 1 in
    let x = 7 in f 0 ;;
- : int = 6 (* old value of x is what f refers to *)
Here's the one that will mess with your mind: the same thing as above but with the declarations typed into the top loop (the top loop is conceptually an open-ended series of lets which never close).
# let x = 5;;
val x : int = 5
# let f y = x + 1;;
val f : 'a -> int = <fun>
# f 0;;
- : int = 6
# let x = 7;; (* not an assignment to above x -- a new declaration *)
val x : int = 7
# f 0;;
- : int = 6
Programming moral: When interactively editing a group of functions that call each other, re-submit all of the functions to the top loop when you change any one of them.
Here is another example of let:

let hundredthPower x =
   let four = x*.x*.x*.x in
   let twenty = four*.four*.four*.four*.four in
        twenty*.twenty*.twenty*.twenty*.twenty;;
          val hundredthPower : float -> float = <fun>
# hundredthPower(2.0);;
- : float = 1.26765060023e+30

Higher-Order Functions

ML is highly tuned to allowing higher-order functions, functions that either take other functions as argument or return functions as results, or both.

Higher-order functions are an important component of a programmer's toolkit.

The classic example of a function that takes another function as argument is the map function on lists. It takes a list and a function and applies the function to every element of the list.
let rec map f l =
match l with []    -> []
      |      x::xs -> f(x) :: map f xs;;
val map : ('a -> 'b) -> 'a list -> 'b list = <fun>
The 'a/'b types are polymorphic ("any") type, more on them below.
# map (function x -> x*10) [4;2;7];;
- : int list = [40; 20; 70]
map is so common it is built into Caml as List.map.

Perhaps the simplest higher-order function is the composer, in mathematics expressed as g o f. it takes two functions and returns a new function which is their composition:


let circle g f = (function x -> g(f(x)));;
val circle : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b = <fun>
Here is an example of circle in action.
# let plus3times2 = circle (function x -> x*2) (function x -> x+3);;
val plus3times2 : int -> int = <fun>
# plus3times2 10;;
- : int = 26
As we have seen before, functions are just expressions so can also be immediately applied after being defined:

# circle (function x -> x*2) (function x -> x+3) 10;;
- : int = 26
Oddly enough, this looks like application of a three-argument function circle, but circle only takes two arguments! All three arguments to circle are in fact Curried: one is applied, the result is a function, and then that function is immediately applied to the next argument. Let us focus on this topic now.

Currying

Currying is an important concept of functional programming; it is named after logician Haskell Curry. Multi-argument functions as defined thus far are Curried, lets look at what is really happening.

Here is a two-argument function defined in our usual manner.


# let myadd x y = x + y;;
val myadd : int -> int -> int = <fun>
# myadd 3 4;;
- : int = 7

Here is another completely equivalent way to define the same function:

# let myadd x =
  function y  -> x + y;;
  val myadd : int -> int -> int = <fun>
    (* the -> type constructor associates to the RIGHT: int -> (int -> int) *)
# myadd 3 4;; (* parenthesized as (myadd 3) 4 *)
- : int = 7

# let inc3 = myadd 3;;
val inc3 : int -> int = <fun>

# inc3 4;;
- : int = 7 (* same result as myadd 3 4 in the end *)
The main observation is myadd is a function returning a function, so the way we supply two arguments is

Here is a third equivalent way to define myadd, as an anonymous function returning another anonymous function.


#let myadd = function x -> function y -> x + y;;
val myadd : int -> int -> int = <fun>
 
Note thus far we have curried only two-argument functions; in general, n-argument currying is possible.

Functions can also take pairs as arguments to achieve the effect of a two-argument function:

# let mypairadd (x,y) = x+y;;
val mypairadd : int * int -> int = <fun>
# mypairadd (2,3);;
- : int = 5
So, either we can Curry or we can pass a pair. We can also write higher-order functions to switch back and forth between the two forms.
# let curry f = function x -> function y -> f (x,y);;
val curry : ('a * 'b -> 'c) -> 'a -> 'b -> 'c = <fun>
# let uncurry f = function (x,y) -> f x y;;
val uncurry : ('a -> 'b -> 'c) -> 'a * 'b -> 'c = <fun>
# uncurry myadd;;
- : int * int -> int = <fun>
# curry mypairadd;;
- : int -> int -> int =  <fun>
# uncurry map;; (* map defined above *)

- : ('_a -> '_b) -> '_a list -> '_b list = <fun>

# curry(uncurry myadd);; (* a no-op *)
- : int -> int -> int = <fun>
Look at the types: these mappings in both directions in some sense "implement" the well-known isomorphism on sets: A * B -> C = A -> B -> C

A bigger example

Here is a more high-powered example of the use of currying.


# let rec foldr f l y =
match l with [] -> y 
  |   x::xs -> f x (foldr f xs y);;
     val foldr : ('a -> 'b -> 'b) -> 'a list -> 'b -> 'b = <fun>
# let prod = foldr (function a -> function x -> a * x);;
val prod : int list -> int -> int = <fun>
# let prod0 = prod [1;2;3;4];;
val prod0 : int -> int = <fun>
# (prod0 1, prod0 2);;
- : int * int = 24, 48
Here is an analysis of this recursive function.
for the arbitrary 2-element list [x1;x2], the call
foldr f [x1; x2] y
computes to
f x1 (foldr f [x2]
y)
which in turn computes to
f x1 (f x2 (foldr f []
y)))
which computes to
f x1 (f x2 y)
From this we can assert that the general result returned from foldr f [x1;x2;...;xn] y is

f x1 (f x2  f ...(f xn y)...))))
Currying allows us to specialize foldr to a particular function f, as with prod above.

Proving program properties by induction

We should in fact be able to prove this property by induction. Its easier if we reverse the numbering of the list.

Lemma. foldr f [xn;...;x1] y computes to f xn (f xn-1 f ...(f x1 y)...))) for n greater than 0.
Proof. Proceed by induction on the length of the list [xn;..;x1].
Base Case n=1, i.e. the list is [x1]. The function computes to f x1 (foldr f [] y) which computes to f x1 y as hypothesized.
Induction Step. Assume

foldr f [xn;...;x1] y
computes to
f xn (f xn-1  f ...(f x1 y)...)))
and show
foldr f [xn+1;xn;...;x1] y
computes to
f xn+1 (f xn  f ...(f x1 y)...))))
Computing
foldr f [x1;x2;...;xn;xn+1] y
, it matches the pattern with x being xn+1 and xs being [xn;...;x1].
Thus the recursive call is
foldr f [xn;...;x1] y
which by our inductive assumption computes to
f xn (f xn-1  f ...(f x1 y)...)))
And, given this result for the recursive call, the whole function then returns
f xn+1 (...result of recursive call...)
which is
f xn+1 (f xn (f xn-1  f ...(f x1 y)...)))
which is what we needed to show.
QED.

The above implementation is inefficient in that f is explicitly passed to every recursive call. Here is a more efficient version with identical functionality.

let efficientfoldr f =
 let rec localfun l y =
      match l with [] -> y 
      |   x::xs -> f x (localfun xs y)
 in localfun
;;
This function also illustrates how functions may be defined in a local scope. Observe localfun is defined locally but then exported since it is the return value of f.

Question: How does the return value localfun know where to look for f when its called??

# let summate = efficientfoldr (function a -> function x -> a+x);;
val summate : int list -> int -> int = <fun>
# summate [1;2;3;4] 0;;
- : int = 10
--summate is just localfun, but somehow it "knows" that f is (function a -> function x -> a+x), even though f is undefined at the top level:
# f;;
Characters 0-1:
Unbound value f
localfun in fact knew the right f to call, so it must have been kept somewhere: in a closure.

At function definition point, the current values of variables not local to the function definition are remembered in a closure.
Function values in ML are thus really a pair consisting of the function (pointer) and the closure.

Without making a closure, higher-order functions will do unexpected things. Java, C++, C can pass and return function (pointers), but all functions are defined at the top level so they have no closures.

Submitting functions to the top loop

You should never type code directly in the top loop! Its impossible to fix errors. Instead, you should edit in a file. There are several reasonable modes to interact with caml:
  1. Use any editor, and save each group of interlinked functions in a separate file, for example "myfunctions.ml". Then, from the top loop type
    # #use "myfunctions.ml";;
    -- this will submit everything in the file to the top loop. Note its #use, not just use.
  2. Use any editor, and copy-and-paste code into caml. This is great for smaller functions but eventually you want to use the method above.
  3. (best if you are using UNIX; may even work under Windows now:) Use the emacs editor and the caml mode documented on the course emacs web page. Edit in the manner of 1. above, with groups of functions in a file. Then, use the
    
    C-c C-e		caml-eval-phrase
    C-c C-r		caml-eval-region
    C-c C-s		caml-show-subshell
    C-c `		caml-goto-phrase-error
    commands of caml mode to submit the current definition (phrase) to caml, the selected region to caml, to show the actual shell window, and to find your errors respectively. When you are playing with little examples, just type a small bit in any emacs buffer and click on it and use the caml-eval-phrase command to send it to caml. caml-eval-phrase at the end of a file submits all the expressions in the file.

Print


print_string("hi\n");;
hi
- : unit = ()
Caml has a print_x function for the atomic types x. Again there is no overloading of meaning here.

Caml Types

We have generally been ignoring the type aspect of Caml up to now. Its time to focus on typing in more detail.

Type Declarations

Caml infers types for you, but you can add explicit type declarations if you like.

let myadd (x:int) (y:int) = x + y;;
val myadd : int -> int -> int = <fun>
You can in fact put type assertions on any variable in an expression to clarify what type the variable has:

let myadd (x:int) (y:int) = (x:int) + y;;
val myadd : int -> int -> int = <fun>

Type abbreviations

You can also make up your own name for any type
# type intpair = int*int;;
type intpair = int * int
# let f (p : intpair) = match p with (l,r) -> l+r;;
val f : intpair -> int = <fun>
# f (2,3);;
- : int = 5

Polymorphic Types and Type Inference


# let id x = x;;
val id : 'a -> 'a = <fun>
# id 3;;
- : int = 3
# id true;;
- : bool = true
Parametric and object polymorphism The general intuition to have about the type inference algorithm is everything starts out as having arbitrary types 'a, 'b, etc, but then the operations infer constraints that "this thing has the same type as that thing".

Use of type-specific atomic operators obviously restricts polymorphism:

let doublenegate x = not (not x);;
val doublenegate = function : bool -> bool
When a function is defined via let to have polymorphic type, every use can be at a different type:

# let id = (function x -> x) in
    match id(true) with true -> id(3) | false -> id(4);;
  - : int = 3
Note that if the function is not declared via let, polymorphism is not allowed. The following code would run just like the code above (if it would typecheck):
# (function id -> match id(true) with true -> id(3) | false -> id(4))(function x -> x);;
                                                 ^
This expression has type int but is here used with type bool

Variant Type declarations

Variant types in ML are the analogue of union/variant types in C/Pascal. Following in the ML tradition of lists and tuples, they are not mutable.

ML variant types must be declared.
Why? Variant Types are often recursive, and recursive types cannot be inferred in ML.

Here is a really simple variant type declaration to get warmed up:


type height = Tall | Medium | Short
type height = Tall | Medium | Short
Three constructors have been defined. These are now official constants. Constructors must be capitalized, and variables must be lower-case in Caml.
Tall;;
- : height = Tall
The previous type is only an enumerated type. Much more interesting variant types can be defined. Lists could have been predefined as a variant type:

# type 'a mylist = Nil | Cons of 'a * 'a mylist;;
type 'a mylist = Nil | Cons of 'a * 'a mylist
# Cons (3, Cons (3+1,Nil));;
- : int mylist = Cons (3, Cons (4, Nil))
 
This form of type has several new features:

Trees:


type 'a tree = EmptyTree | Node of 'a * 'a tree * 'a tree
Patterns are also variant type-constructor-friendly:

let rec myappend p =
 match p with (Nil,l2)  -> l2
     |  (Cons(x,xs),l2) -> Cons (x, append(xs,l2))
  val myappend : 'a mylist * 'a mylist -> 'a mylist = <fun>
# let rec myreverse l =
 match l with Nil  -> Nil
     |  Cons(x,xs) -> myappend(myreverse xs , Cons (x,Nil));;
val myreverse : 'a mylist -> 'a mylist = <fun>
# myreverse (Cons (3, Cons(4, Cons(2, Nil))));;
- : int mylist = Cons (2, Cons (4, Cons (3, Nil)))

Variant types are extremely happy with polymorphism; observe how myreverse above is polymorphic over 'a mylists.

Record Declations

type onetwo =  {one : int; two : string};;
type onetwo = { one : int; two : string; }
# let x = {one = 2; two = "hi"};;
val x : onetwo = {one=2; two="hi"}
# x.one;;
- : int = 2
# match x with {one = x; two = s} -> x;;
- : int = 2

State

Variables in ML are never directly mutable themselves; they are only (indirectly) mutable if they hold an Indirect mutability means the variable itself can't change, but what it points to can.
And, items are immutable unless their mutability is explicitly declared.

References are the simplest unit of mutability.

# let x = ref 4;;
val x : int ref = {contents=4}
This allocates a fresh mutable location named x. Records can have mutable fields, and so a reference is in fact implemented in Caml as a little record with one mutable field, contents:
#type 'a ref = { mutable contents: 'a };; (* a type abbreviation; its how ref is defined in OCaml *)
type 'a ref = { mutable contents : 'a; } 
references are records so you can't directly operate on them.
# x+1;;
Characters 0-1:
This expression has type int ref but is here used with type int
# x.contents + 1;; (* x is a record *)
- : int = 5
# !x + 1;;    (* shorthand notation for the above *)
- : int = 5
ref variables may be modified by assignment:
# x.contents <- 6;;  (* mutate the contents record field *)
- : unit = ()
# x := 6;; (* equivalent shorthand notation for the above *)
- : unit = () (* (), the empty tuple of type unit, is the result *)
# !x + 1;;
- : int = 7
Only ref typed variables or mutable records may be assigned to. The notion of immutable variables is one of the great strengths of ML.

Note, let doesn't turn into a mutation operator with refs, either:


let x = ref 5;;
val x : int ref = {contents = 5}
let f () = !x;;
val f : unit -> int = 
let x = ref 6;; (* not an assignment to x *)
val x : int ref = {contents = 6}
f ();;
- : int = 5
 

Mutable records of general form can be created by putting mutable in front of mutable fields:


#type mutable_point = { mutable x: float; mutable y: float };;
type mutable_point = { mutable x : float; mutable y : float; } 
 
#let translate p dx dy =
   p.x <- p.x +. dx; p.y <- p.y +. dy;;
val translate : mutable_point -> float -> float -> unit = <fun>
 
#let mypoint = { x = 0.0; y = 0.0 };;
val mypoint : mutable_point = {x=0.000000; y=0.000000}
 
#translate mypoint 1.0 2.0;;
- : unit = ()
 
#mypoint;;
- : mutable_point = {x=1.000000; y=2.000000}
Now that we have references, the while loop construct becomes useful (without references, a while loop would either never execute or loop infinitely -# pretty useless!):
#  (x := 1; (while !x < 10 do x := !x + 1 done); !x);;
- : int = 10

Arrays

Caml arrays are fairly self-explanatory. Their syntax isn't the greatest, and they have to be initialized before you can use them.
# Array.make 10 "hi";; (* size and initial value are the params here *)
- : string array =
[|"hi"; "hi"; "hi"; "hi"; "hi"; "hi"; "hi"; "hi"; "hi"; "hi"|]
# let arr = [| 4; 3; 2 |];;
val arr : int array = [|4; 3; 2|]
# arr.(2);;
- : int = 2
# arr.(2) <- 55;;
- : unit = ()
# arr;;
- : int array = [|4; 3; 55|]

Exceptions

Exceptions are a criticial component of a modern programming language. Example uses

exception Foo;;
exception Foo
# let f _ = raise Foo;;
val f : 'a -> 'b = <fun>

# f ();;
Uncaught exception: Foo.
- 
As you can see, exceptions are top-level definable units. Exceptions can be handled. This means even though something bad happened, the program can detect this fact, recover, and continue:

# let g _ = try f () with Foo -> print_string("exception raised; returning 5\n"); 5 ;;
val g : 'a -> int = <fun>
# g();;
exception raised; returning 5
- : int = 5
The call f() followed by handle is syntax for handling errors that may arise when f() is called (including errors that may arise if f in turn calls some other function etc).

Exception foo is only handled when it happens in the call to f() in g.
Exception handling thus always has a scope.

Exceptions that pass up an argument.
Useful both for print diagnostics and error recovery.


exception Goo of string;;
exception Goo
-let f _ = raise (Goo "bad stuff");;
val f : 'a -> 'b = <fun>
# f ();;
Uncaught exception: Goo "bad stuff".
# let g () = try f () with Goo s ->
                  (print_string("exception raised: ");print_string(s);print_string("\n"));;
  val g : unit -> unit = <fun>
# g();;
exception raised: bad stuff
- : unit = ()

Examples

See Basic Examples for some simple programs.

Modules

Examples: Why? Fundamentals of module structure: Desirable features not always found: The C/C++ module system Problems with it The Java module system: packages Some languages with good module systems: Modula-2, ML, Ada.

The Caml module system

See The Caml manual Chapter 4.

Basic entites of Caml modules: structures and functors

Structures and Signatures

Structures Here is an example.
module Mapping = 
  struct

     exception NotFound

     (* create the empty mapping *)

     let create = []

     (* lookup(d,M) finds the range value r such that
        (d,r) is a pair in mapping M *)

     let rec lookup pr = 
       match pr with (d,[]) -> raise NotFound
       |               (d,(e,r)::es) -> 
	   if d=e then r
	   else lookup(d,es)

     (* insert(d,r,M) puts (d,r) in mapping M and removes
        any other pair (d,s) that was present in M *)

     let rec insert triple =
       match triple with (d,r,[]) -> [(d,r)]
       |   (d,r,(e,s)::es) ->
             if d = e then (d,r)::es
             else (e,s)::insert(d,r,es)
     end;;
(Note this example should probably use curried functions, not tuple arguments)
Signatures can be inferred if not declared. For the above, the signature inferred is:
module Mapping :
  sig
    exception NotFound
    val create : 'a list
    val lookup : 'a * ('a * 'b) list -> 'b
    val insert : 'a * 'b * ('a * 'b) list -> ('a * 'b) list
  end
Observe the syntactic details.

Structure internals may be referenced by qualified names Mapping.insert, etc:

# Mapping.insert(4,"tru",Mapping.create);;
- : (int * string) list = [4, "tru"]
# let m1 = Mapping.insert(4,"tru",Mapping.create);;
val m1 : (int * string) list = [4, "tru"]
# let m2 = Mapping.insert(44,"ru",m1);;
val m2 : (int * string) list = [4, "tru"; 44, "ru"]
#
The whole structure may be made available at the "top level" of the namespace by the declaration open Mapping:
# open Mapping;;
# insert;;
- : 'a * 'b * ('a * 'b) list -> ('a * 'b) list = <fun>
Built-in Caml modules: the standard library. Signatures can be explicitly declared as follows:
# module type INSERTMAPPING =
  sig
    exception NotFound
    val create : 'a list
    val insert : 'a * 'b * ('a * 'b) list -> ('a * 'b) list
  end;;
module type INSERTMAPPING =
  sig
    exception NotFound
    val create : 'a list
    val insert : 'a * 'b * ('a * 'b) list -> ('a * 'b) list
  end
 
This (stupid) signature leaves out the lookup operation -- this is how internal functions can be hidden. You probably wouldn't want to hide lookup but you get the idea.

Once you have declared a signature, you can restrict a structure to that signature as follows:

# module InsertOnlyMapping = (Mapping : INSERTMAPPING);;
module InsertOnlyMapping : INSERTMAPPING
# InsertOnlyMapping.insert;;
- : 'a * 'b * ('a * 'b) list -> ('a * 'b) list = <fun>
# InsertOnlyMapping.lookup;; (* this guy is hidden *)
Characters 0-24:
Unbound value InsertOnlyMapping.lookup
 
The above module includes some functions, a value, and an exception declared. Modules will often also include type declarations, as we will see later.

Functors


let lt(x,y) = String.lowercase x < String.lowercase y

module StringBST = struct

  type 'label btree =
      Empty |
         Node of 'label * 'label btree * 'label btree
      
  let rec lookup(x, tree) = 
    match tree with Empty -> false
    |   Node(y,left,right) ->
             if lt(x,y) then lookup(x, left)
             else if lt(y,x) then lookup(x, right)
             else (* x=y *) true

  let rec insert(x, tree) = 
    match tree with Empty -> Node(x,Empty,Empty)
     |   Node(y,left,right) as t ->
             if lt(x,y) then Node(y,insert(x,left),right)
             else if lt(y,x) then Node(y,left,insert(x,right))
             else (* x=y *) t (* do nothing; x was
                                  already there *)

     exception EmptyTree

     (* deletemin(T) returns a pair consisting of the least
        element y in tree T and the tree that results if we
        delete y from T.  It is an error if T is empty *)

     let rec deletemin l =
       match l with Empty -> raise EmptyTree
     |   Node(y,Empty,right) -> (y,right) (* The
                 critical case.  If the left subtree is empty,
                 then the element at current node is min. *)
     |   Node(w,left,right) ->
             let
                 (y,l) = deletemin(left)
             in
                 (y, Node(w,l,right))

     
     let rec delete(x, tree) = 
       match tree with Empty -> Empty
     |   Node(y,left,right) ->
             if lt(x,y) then Node(y,delete(x,left),right)
             else if lt(y,x) then Node(y,left,delete(x,right))
             else (* x=y *)
                 match (left,right) with
                     (Empty,r) -> r |
                     (l,Empty) -> l |
                     (l,r) ->
                         let
                             (z,r1) = deletemin(r)
                         in
                             Node(z,l,r1)

end;;
val lt : string * string -> bool = <fun>
module StringBST :
  sig
    type 'a btree = Empty | Node of 'a * 'a btree * 'a btree
    val lookup : string * string btree -> bool
    val insert : string * string btree -> string btree
    exception EmptyTree
    val deletemin : 'a btree -> 'a * 'a btree
    val delete : string * string btree -> string btree
  end

# module type STRINGLESS = 
 sig 
   val lt : string * string -> bool 
 end;;
module type STRINGLESS = sig val lt : string * string -> bool end

# module StringLess : STRINGLESS = (* can give signature at declaration time *)
struct
 let lt(x,y) = String.lowercase x < String.lowercase y
end;;
module StringLess : STRINGLESS

# module StringBSTFunctor=
  functor (StringLt : STRINGLESS) ->
  struct

  type 'label btree =
      Empty |
         Node of 'label * 'label btree * 'label btree
      
  let rec lookup(x, tree) = 
    match tree with Empty -> false
    |   Node(y,left,right) ->
             if StringLt.lt(x,y) then lookup(x, left)
             else if StringLt.lt(y,x) then lookup(x, right)
             else (* x=y *) true

  let rec insert(x, tree) = 
    match tree with Empty -> Node(x,Empty,Empty)
     |   Node(y,left,right) as t ->
             if StringLt.lt(x,y) then Node(y,insert(x,left),right)
             else if StringLt.lt(y,x) then Node(y,left,insert(x,right))
             else (* x=y *) t (* do nothing; x was
                                  already there *)

     exception EmptyTree

     (* deletemin(T) returns a pair consisting of the least
        element y in tree T and the tree that results if we
        delete y from T.  It is an error if T is empty *)

     let rec deletemin l =
       match l with Empty -> raise EmptyTree
     |   Node(y,Empty,right) -> (y,right) (* The
                 critical case.  If the left subtree is empty,
                 then the element at current node is min. *)
     |   Node(w,left,right) ->
             let
                 (y,l) = deletemin(left)
             in
                 (y, Node(w,l,right))

     
     let rec delete(x, tree) = 
       match tree with Empty -> Empty
     |   Node(y,left,right) ->
             if StringLt.lt(x,y) then Node(y,delete(x,left),right)
             else if StringLt.lt(y,x) then Node(y,left,delete(x,right))
             else (* x=y *)
                 match (left,right) with
                     (Empty,r) -> r |
                     (l,Empty) -> l |
                     (l,r) ->
                         let
                             (z,r1) = deletemin(r)
                         in
                             Node(z,l,r1)

end;;
module StringBSTFunctor :
  functor (StringLt : STRINGLESS) ->
    sig
      type 'a btree = Empty | Node of 'a * 'a btree * 'a btree
      val lookup : string * string btree -> bool
      val insert : string * string btree -> string btree
      exception EmptyTree
      val deletemin : 'a btree -> 'a * 'a btree
      val delete : string * string btree -> string btree
    end

Observations about functor declarations: What state are we in now, after defining this functor? Here is how you resolve the imports and make a structure.
# module StringBST = StringBSTFunctor (StringLess) (* apply functor to make structure *)
;;

module StringBST :
  sig
    type 'a btree =
      'a StringBSTFunctor(StringLess).btree =
        Empty
      | Node of 'a * 'a btree * 'a btree
    val lookup : string * string btree -> bool
    val insert : string * string btree -> string btree
    exception EmptyTree
    val deletemin : 'a btree -> 'a * 'a btree
    val delete : string * string btree -> string btree
  end

Type declarations are particularly useful in signatures: you can hide the details of a type by declaring

 type 'a btree
alone in the signature, hiding what 'a btree really is (this makes btree abstract):
# module type STRINGBST =
  sig
    type 'a btree  (* hide the details of the btree type in the signature *)
    val lookup : string * string btree -> bool
    val insert : string * string btree -> string btree
    exception EmptyTree
    val deletemin : 'a btree -> 'a * 'a btree
    val delete : string * string btree -> string btree
  end;;
module type STRINGBST =
  sig
    type 'a btree
    val lookup : string * string btree -> bool
    val insert : string * string btree -> string btree
    exception EmptyTree
    val deletemin : 'a btree -> 'a * 'a btree
    val delete : string * string btree -> string btree
  end
# module AbstractStringBST = (StringBST : STRINGBST);;
module AbstractStringBST : STRINGBST
#
In the above example, we applied the functor StringBSTFunctor and then hid some elements of the result. What you probably want to do is fix StringBSTFunctor to always produce the abstract type in the result. That is done as follows.

# module AbstractStringBSTFunctor = (StringBSTFunctor : functor(Lt:STRINGLESS) -> STRINGBST);;
module AbstractStringBSTFunctor : functor (Lt : STRINGLESS) ->
STRINGBST

Problems with functors

Separate Compilation

Coding in the separate compilation method in Caml is a lot closer to the C/Java programming style. Here is an example, the file stack.ml implementing stacks in the ocaml library:
type 'a t = { mutable c : 'a list }
exception Empty
let create () = { c = [] }
let clear s = s.c <- []
let copy s = { c = s.c }
let push x s = s.c <- x :: s.c
let pop s =
  match s.c with
    hd::tl -> s.c <- tl; hd
  | []     -> raise Empty
let top s =
  match s.c with
    hd::_ -> hd
  | []     -> raise Empty
let is_empty s = (s.c = [])
let length s = List.length s.c
let iter f s = List.iter f s.c
And, here is the interface file, stack.mli:
type 'a t
exception Empty
val create : unit -> 'a t
val push : 'a -> 'a t -> unit
val pop : 'a t -> 'a
val top : 'a t -> 'a
val clear : 'a t -> unit
val copy : 'a t -> 'a t
val is_empty : 'a t -> bool
val length : 'a t -> int
val iter : ('a -> unit) -> 'a t -> unit
Notice the interface hides the type 'a t and so outsiders will not be able to directly manipulate the stack data structure.

Compiling

Last modified: Fri Jan 29 08:19:19 EST 2010