Modules Basics
- We have seen OCaml modules in action, e.g. List.map, Float.(2. = 3.), Fn.id, etc.
- We also covered how modules are collections of functions, values, types, and other modules
- Now we want to cover how individual .ml files define modules
  - and, how to hide some items in a module (think private of Java/C++) via .mli file signatures
- Also we will cover how most modules are libraries of auxiliary functions but how modules may also define executables.
- … there are also many more fancy module features which we will cover later
- We are going to use a running example to explain these concepts; see set-example.zip for the full example
.ml files are modules
The contents of the file simple_set.ml in the above zip are the following:
open Core

type 'a t = 'a list

let emptyset : 'a t = []

let add (x : 'a) (s : 'a t) : 'a t = (x :: s)

let rec remove (x : 'a) (s : 'a t) (equal : 'a -> 'a -> bool) : 'a t =
  match s with
  | [] -> failwith "item is not in set"
  | hd :: tl ->
    if equal hd x then tl
    else hd :: remove x tl equal

let rec contains (x : 'a) (s : 'a t) (equal : 'a -> 'a -> bool) : bool =
  match s with
  | [] -> false
  | hd :: tl ->
    if equal x hd then true else contains x tl equal
- The above code defines module Simple_set since it is in the file simple_set.ml
  - Capitalize first letter (only) in file name and remove .ml to get module name
- Modules are just collections of top-level definable things (things you could type into top loop)
- Assignment 1 file submission.ml is in fact making a module as well, named Submission. dune utop fires up OCaml with that module loaded; open Submission;; then allows you to avoid typing the long name Submission.fibonacci etc.
- This particular set module is just a set implemented as a list; it is in fact a multiset
- The line type 'a t = 'a list is a type abbreviation: 'a t is a synonym for 'a list
  - below we will show how to hide the fact that it is a list.
- Naming a type just t is the standard for “the” underlying type of a module
  - When outsiders use this module the type will be Simple_set.t, read “Simple set’s type”
  - Core extensively uses this type naming convention in libraries: List.t, Option.t etc.
- Notice how, for the functions needing =, we have to pass it in explicitly to be polymorphic
  - In Core.Set there is in fact a much better solution, but it involves fancier modules which we cover later
Building the library
This file can be built as a library module with the dune file in src/dune (remember to execute dune from the project top-level; it automatically finds the build files in subdirectories):
(library
  (name simple_set)
  (modules simple_set)
  (libraries core)
)
And if you want to play with your library module, the command dune utop from the same directory will load it into the top loop:
myshell $ dune utop
...
utop # Simple_set.add 4 Simple_set.emptyset;;
- : int list = [4]
- One potentially annoying thing here is that the list we used to implement our set gets exposed
- But, we can use type abstraction to hide this; next topic
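The points above about passing equality explicitly and about the set really being a multiset can be checked with a small sketch. It assumes an inline copy of the module, wrapped in module Simple_set = struct ... end and written without open Core so that it runs standalone; the names s and s' are only for illustration.

```ocaml
(* Inline copy of simple_set.ml, without [open Core], so this runs standalone. *)
module Simple_set = struct
  type 'a t = 'a list

  let emptyset : 'a t = []
  let add (x : 'a) (s : 'a t) : 'a t = x :: s

  let rec remove (x : 'a) (s : 'a t) (equal : 'a -> 'a -> bool) : 'a t =
    match s with
    | [] -> failwith "item is not in set"
    | hd :: tl -> if equal hd x then tl else hd :: remove x tl equal

  let rec contains (x : 'a) (s : 'a t) (equal : 'a -> 'a -> bool) : bool =
    match s with
    | [] -> false
    | hd :: tl -> if equal x hd then true else contains x tl equal
end

(* Adding 4 twice keeps both copies: this is really a multiset. *)
let s = Simple_set.(add 4 (add 4 (add 7 emptyset)))

(* The caller supplies the equality, here Int.equal from the standard library. *)
let () = assert (Simple_set.contains 4 s Int.equal)

(* remove drops only the first matching element, so one 4 is still present. *)
let s' = Simple_set.remove 4 s Int.equal
let () = assert (Simple_set.contains 4 s' Int.equal)
let () = assert (not (Simple_set.contains 5 s' Int.equal))
```

Because the type is not yet hidden, s can still be compared directly against a list such as [4; 4; 7].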
Other ways to load a module into the top loop besides dune utop
- If you type #use "simple_set.ml";; it is just like copy/pasting the code of the file in – you won’t get a module.
- If you want to “paste a file in the top loop as a module”, there is a command for that: #mod_use "simple_set.ml";;
- And if that was not enough there is one more method: you can #use_output "dune top"
  - this runs the shell command dune top and pastes the output into the top loop; that dune command generates byte code files and then spits out a bunch of #load commands to load all the libraries as well as your code.
Information Hiding with Module Types aka Signatures
- Modules also have types, they are called either module types or signatures
- The latter term is used in math, e.g. “a DFA has signature D = (S, Σ, τ, s0, F)”
- When a module is defined in a file simple_set.ml, make a file simple_set.mli for its corresponding module type
  - the added “i” is for “interface”
- You don’t need an .mli file if there is nothing to hide, the module type will be inferred
  - But, even if nothing is hidden the .mli is important as a document of what is provided to users
  - all assignments come with an .mli file so you can get used to that format.
So, here is the simple_set.mli file from the above zip after we have hidden the type of 'a t by removing = 'a list:
type 'a t (* can also hide the type here by not giving it in signature: remove the = 'a list *)
val emptyset : 'a t
val add : 'a -> 'a t -> 'a t
val remove : 'a -> 'a t -> ('a -> 'a -> bool) -> 'a t
val contains : 'a -> 'a t -> ('a -> 'a -> bool) -> bool
Now if we dune utop
with this added file we get
myshell $ dune utop
...
utop # Simple_set.add 4 Simple_set.emptyset;;
- : int Simple_set.t = <abstr>
- Notice how the int list result type from before is now int Simple_set.t
  - it is the t type from module Simple_set and the parameter 'a there is instantiated to int.
- Also notice that the value is <abstr>, not [4] like before; since the type is hidden so are the values
- This is both
  - advantageous (program to interfaces, not implementations)
  - not advantageous (sometimes hard to see what is going on, also can make it harder to test)
- We will come back to this topic later in the course
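The same hiding can be tried in a single file or in utop, as a sketch. Here a module type plays the role of simple_set.mli, and the sealing annotation Simple_set : SIMPLE_SET plays the role of pairing the .ml file with its .mli. The name SIMPLE_SET and the trimmed list of functions are assumptions made for illustration.

```ocaml
(* Plays the role of simple_set.mli: the definition of 'a t is not given. *)
module type SIMPLE_SET = sig
  type 'a t
  val emptyset : 'a t
  val add : 'a -> 'a t -> 'a t
  val contains : 'a -> 'a t -> ('a -> 'a -> bool) -> bool
end

(* Plays the role of simple_set.ml, sealed by the signature above. *)
module Simple_set : SIMPLE_SET = struct
  type 'a t = 'a list
  let emptyset = []
  let add x s = x :: s
  let rec contains x s equal =
    match s with
    | [] -> false
    | hd :: tl -> equal x hd || contains x tl equal
end

let s = Simple_set.add 4 Simple_set.emptyset

(* Clients can only go through the interface. *)
let () = assert (Simple_set.contains 4 s Int.equal)

(* The next line would be rejected by the type checker, since outside the
   module int Simple_set.t is no longer known to be int list:
   let _ = (s = [4]) *)
```

This is also why testing gets harder: a test can no longer compare a set against a list literal and has to observe it through functions such as contains.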
Making an OCaml executable
- So far all we have made is libraries; let us now make a small OCaml executable.
- We will make a main module Set_main (in file set_main.ml of course) which takes a string and a file name and looks for that line in the file.
Here is what we need to add to the dune
file along with the above to build the executable:
(executable
  (name set_main)
  (libraries simple_set core)
  (modules set_main)
)
Running executables
- If you declared an executable in dune as above, it will make a file my_main_module.exe, so in our case that is set_main.exe
- To exec it you can do dune exec ./src/set_main.exe "open Core" src/simple_set.ml
- Which is really just _build/default/src/set_main.exe "open Core" src/simple_set.ml
set_main.ml
- We will now inspect
set_main.ml
in VSCode so we can use the tool tips to check out various types
The Stdio.In_channel library
- set_main.ml uses the In_channel module to read in file contents
  - (Note that I/O is a side effect, I/O functions do things besides the value returned)
- It is part of the Stdio module (which is itself included in Core so Core.In_channel is the same as Stdio.In_channel)
- The Documentation is here; we will go through it to observe a few points
  - First, now that we covered abstract types we can see there is an abstract type t here
  - As with our own set, it is “the underlying data” for the module, in this case file handles
  - It is hidden though so we don’t get access to the details of how “files are handled”
  - If you are used to object-oriented programming you are looking for a constructor/new; in functional code look for functions that only return a t, that is making a new t: create here.
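As a rough sketch of this create-then-use pattern, the following reads a file line by line. It deliberately uses the OCaml standard library's channel functions (open_in, input_line, close_in) rather than Stdio so that it runs without extra libraries; Stdio.In_channel offers analogous operations on its own t. The helper names read_lines and file_has_line are hypothetical, and this is not the code of set_main.ml.

```ocaml
(* Read all lines of a file. open_in plays the role of the constructor:
   it returns a fresh in_channel, the file handle. *)
let read_lines (filename : string) : string list =
  let ic = open_in filename in
  let rec loop acc =
    match input_line ic with
    | line -> loop (line :: acc)
    | exception End_of_file -> List.rev acc
  in
  let lines = loop [] in
  close_in ic;
  lines

(* Does the file contain a line exactly equal to [target]? *)
let file_has_line (target : string) (filename : string) : bool =
  List.exists (String.equal target) (read_lines filename)
```

Note that in_channel is itself abstract: the code never looks inside the handle, it only passes it to functions of the library.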
Optional arguments tangent
- One topic we skipped over which is in many of these libraries is optional arguments
- They are named arguments but you don’t need to give them, indicated by a ? before the name.
- If you do give them, they are like named arguments, use ~name: syntax
- e.g. in In_channel.create, val create : ?binary:Base.bool -> Base.string -> t
  - an optional flag ~binary:true could be passed to make a binary file handle
  - example usage: In_channel.create ~binary:false "/tmp/wowfile"
- Many languages now support optional arguments (not so 10 years ago - newer feature)
Writing your own functions with optional arguments is easy: inside the function the value arrives as an option type (Some v if the argument was given, None if it was omitted)
# let f ?x y = match x with Some z -> z + y | None -> y;;
val f : ?x:int -> int -> int = <fun>
# f ~x:1 2;;
- : int = 3
# f 2;;
- : int = 2
- Use them when they are the right thing: will reduce clutter of passing often un-needed items.
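A closely related form, shown here as a small sketch: an optional parameter can be given a default with ?(x = ...), in which case the body sees a plain value instead of an option. The names g and maybe_x are made up for illustration.

```ocaml
(* With a default, x has type int inside the body, not int option. *)
let g ?(x = 0) y = x + y

let () = assert (g 2 = 2)        (* default x = 0 is used *)
let () = assert (g ~x:1 2 = 3)   (* caller overrides the default *)

(* An already-computed option can be forwarded with the ?x: syntax. *)
let maybe_x : int option = Some 10
let () = assert (g ?x:maybe_x 2 = 12)
```

The default form saves the match on Some/None when all that is needed is a fallback value.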
The Sys library
- We are using this library to read in the command line args, via Sys.get_argv.
- We will also take a quick look at its documentation here
  - Notice how this particular module has no carrier type t, it is just a collection of utility functions.
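A sketch of how command-line arguments might be picked apart. It uses Sys.argv from the OCaml standard library so that it runs without Core; with Core the array would come from Sys.get_argv () instead. The function parse_args and its two-argument convention are assumptions for illustration, not the code of set_main.ml.

```ocaml
(* Element 0 of the argument array is the program name; the rest are the
   user-supplied arguments. Expect exactly a search string and a file name. *)
let parse_args (argv : string array) : (string * string) option =
  match Array.to_list argv with
  | [ _program; search; filename ] -> Some (search, filename)
  | _ -> None

let () =
  match parse_args Sys.argv with
  | Some (search, filename) ->
    Printf.printf "search for %S in %s\n" search filename
  | None -> prerr_endline "usage: <program> <search-string> <file-name>"
```

Keeping the parsing in a function that takes the array as a parameter makes it easy to test without actually running the executable.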
Modules within modules
- It is often useful to have modules inside of modules for further code “modularization”
- To declare one, in e.g. foo.ml (which itself defines the items for module Foo using the above convention), add module Sub = struct let blah = ... ... end where the ... are the same kinds of declarations that are in files like foo.ml.
- This syntax is also how we can directly define a module in utop without putting it in a file.
- In the remainder of the file you can access the contents of Sub as Sub.blah, and outside of the foo.ml file Foo.Sub.blah will access it.
- Assignment 3 includes some nested modules, this time with more purpose; we will take a look.
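The access paths described above can be seen in a minimal sketch. Since everything here sits in one file, an outer module Foo = struct ... end stands in for the contents of a hypothetical foo.ml; the value 42 and the names double and doubled_blah are made up for illustration.

```ocaml
(* Stands in for the contents of a hypothetical foo.ml. *)
module Foo = struct
  module Sub = struct
    let blah = 42
    let double x = 2 * x
  end

  (* Inside Foo, the nested module is reached as Sub. *)
  let doubled_blah = Sub.double Sub.blah
end

(* From outside, the path goes through both modules. *)
let () = assert (Foo.Sub.blah = 42)
let () = assert (Foo.doubled_blah = 84)
```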