Functional Programming and Software Engineering

Modules Basics

We have seen OCaml modules in action, e.g. List.map, Float.(2 = 3), Fn.id, etc.
We also covered how modules are collections of functions, values, types, and other modules
Now we want to cover how individual .ml files define modules
- and, how to hide some items in a module (think private of Java/C++) via .mli file signatures
Also we will cover how most modules are libraries of auxiliary functions but how modules may also define executables.
… there are also many more fancy module features which we will cover later
We are going to use a running example to explain these concepts; see set-example.zip for the full example

`.ml` files are modules

The contents of the file simple_set.ml in the above is the following:

open Core

type 'a t = 'a list
let emptyset : 'a t = []
let add (x : 'a) (s : 'a t) : 'a t = (x :: s)
let rec remove (x : 'a) (s: 'a t) (equal : 'a -> 'a -> bool) : 'a t =
  match s with
  | [] -> failwith "item is not in set"
  | hd :: tl ->
    if equal hd x then tl
    else hd :: remove x tl equal
let rec contains (x: 'a) (s: 'a t) (equal : 'a -> 'a -> bool) : bool =
  match s with
  | [] -> false
  | hd :: tl ->
    if equal x hd then true else contains x tl equal

The above code defines module Simple_set since it is in the file simple_set.ml
- Capitalize first letter (only) in file name and remove .ml to get module name
Modules are just collections of top-level definable things (things you could type into top loop)
Assignment 1 file submission.ml is in fact making a module as well, named Submission.
- dune utop fires up OCaml with that module loaded;
- open Submission;; then allows you to avoid typing long name Submission.fibonacci etc.
This particular set module is just a set implemented as a list; it is in fact a multiset
The line type 'a t = 'a list is a type abbreviation, 'a t is a synonym for 'a list
- below we will show how to hide the fact that it is a list.
Naming a type just t is the standard for “the” underlying type of a module
- When outsiders use this module the type will be Simple_set.t, read “Simple set’s type”
- Core extensively uses this type naming convention in libraries: List.t, Option.t etc.
Notice how the functions needing = we have to pass it in explicitly to be polymorphic
- In Core.Set there is in fact a much better solution but involves fancier modules which we cover later
  Building the library

This file can be built as a library module with the dune file in src/dune (remember to execute dune from the project top-level, it automatically finds the build files in subdirectories)

(library
 (name simple_set)
 (modules simple_set) 
 (libraries core)
)

And if you want to play with your library module, command dune utop from the same directory will load it into the top loop:

myshell $ dune utop
...
utop # Simple_set.add 4 Simple_set.emptyset;;
- : int list = [4]

One thing potentially annoying here is the fact that we used a list to implement our set gets exposed here
But, we can use type abstraction to hide this; next topic

Other ways to load a module into the top loop besides `dune utop`

If you type #use "simple_set.ml";; it is just like copy/pasting the code of the file in – you won’t get a module.
If you want to “paste a file in the top loop as a module”, there is a command for that:
#mod_use "simple_set.ml";;
And if that was not enough there is one more method: you can #use_output "dune top"
- this runs the shell command dune top and pastes the output into the top loop; that dune command generates byte code files and then spits out a bunch of #load commands to load all the libraries as well as your code.

Information Hiding with Module Types aka Signatures

Modules also have types, they are called either module types or signatures
- The latter term is used in math, e.g. “a DFA has signature D = (S, Σ, τ, s0, F)”
When a module is defined in a file simple_set.ml, make a file simple_set.mli for its corresponding module type
- the added “i” is for “interface”
You don’t need an .mli file if there is nothing to hide, the module type will be inferred
- But, even if nothing is hidden the .mli is important as a document of what is provided to users
- all assignments come with an .mli file so you can get used to that format.

So, here the simple_set.mli file from the above zip after we have hidden the type of 'a t by removing = 'a list:

    type 'a t  (* can also hide the type here by not giving it in signature: remove the = 'a list  *)
    val emptyset : 'a t
    val add: 'a -> 'a t ->'a t
    val remove : 'a -> 'a t ->  ('a -> 'a -> bool) -> 'a t
    val contains: 'a -> 'a t ->  ('a -> 'a -> bool) -> bool 

Now if we dune utop with this added file we get

myshell $ dune utop
...
utop # Simple_set.add 4 Simple_set.emptyset;;
- : int Simple_set.t = <abstr>

Notice how the int list result type from before is now int Simple_set.t
- it is the t type from module Simple_set and the parameter 'a there is instantiated to int.
Also notice that the value is <abstr>, not [4] like before; since the type is hidden so are the values
This is both
- advantageous (program to interfaces, not implementations)
- not adventageous (sometimes hard to see what is going on, also can make it harder to test)
We will come back to this topic later in the course

Making an OCaml executable

So far all we have made is libraries; let us now make a small OCaml executable.
We will make a main module Set_main (in file set_main.ml of course) which takes a string and a file name and looks for that line in the file.

Here is what we need to add to the dune file along with the above to build the executable:

(executable
  (name set_main)
  (libraries simple_set core)
  (modules set_main)
)

Running executables

If you declared an executable in dune as above, it will make a file my_main_module.exe so in our case that is set_main.exe
To exec it you can do dune exec ./src/set_main.exe "open Core" src/simple_set.ml
Which is really just _build/default/src/set_main.exe "open Core" src/simple_set.ml

`set_main.ml`

We will now inspect set_main.ml in VSCode so we can use the tool tips to check out various types

The `Stdio.In_channel` library

set_main.ml uses the In_channel module to read in file contents
- (Note that I/O is a side effect, I/O functions do things besides the value returned)
It is part of the Stdio module (which is itself included in Core so Core.In_channel is the same as Stdio.In_channel)
The Documentation is here; we will go through it to observe a few points
- First, now that we covered abstract types we can see there is an abstract type t here
- As with our own set, it is “the underlinying data” for the module, in this case file handles
- It is hidden though so we don’t get access to the details of how “files are handled”
- If you are used to object-oriented programming you are looking for a constructor/new; in functional code look for functions that only return a t, that is making a new t: create here.

Optional arguments tangent

One topic we skipped over which is in many of these libraries is optional arguments
They are named arguments but you don’t need to give them, indicated by a ? before the name.
If you do give them, they are like named aguments, use ~name: syntax
e.g. in In_channel.create, val create : ?⁠binary:Base.bool -> Base.string -> t
- an optional flag ~binary:true could be passed to make a binary file handle
- example usage: In_channel.create ~binary:false "/tmp/wowfile"
Many languages now support optional arguments (not so 10 years ago - newer feature)

Writing your own functions with optional arguments is easy: the value passed in is an option type

# let f ?x y = match x with Some z -> z + y | None -> y;;
val f : ?x:int -> int -> int = <fun>
# f ~x:1 2;;
- : int = 3
# f 2;;
- : int = 2

Use them when they are the right thing: will reduce clutter of passing often un-needed items.

The `Sys` library

We are using this library to read in the command line args, via Sys.get_argv.
We will also take a quick look at its documentation here
- Notice how this particular module has no carrier type t, it is just a collection of utility functions.

Modules within modules

It is often useful to have modules inside of modules for further code “modularization”
The way it is declared is in e.g. foo.ml (which itself defines the items for module Foo using the above convention), add
```
module Sub = struct 
 let blah = ...
 ...
end
```
where the ... are the same kinds of declarations that are in files like foo.ml.
This syntax is also how we can directly define a module in utop without putting it in a file.
In the remainder of the file you can access the contents of Sub as Sub.blah, and outside of the foo.ml file Foo.Sub.blah will access.
Assignment 3 includes some nested modules, this time with more purpose; we will take a look.

FPSE

Modules Basics

.ml files are modules

Building the library

Other ways to load a module into the top loop besides dune utop