Side effects

  • Side effects are operations which do more than return a result: mutate, I/O, exceptions, threads, etc.
  • So far we have not seen many side effects but a few have snuck in: printing, file input, exceptions
  • Principle of idiomatic OCaml (and style for this class): avoid effects, unless they are a real improvement, or a necessity (e.g. I/O).
  • Reminder: don’t use mutation on your homeworks, and limit use of other effects as well.

Side effects of OCaml include

  • Mutatable state - changing the contents of a memory location intead of making a new one
    • Three built-in sorts in OCaml: references, mutable record fields, and arrays.
    • Plus many libraries: Stack, Queue, Hashtbl, Hash_set, etc
  • Exceptions (we saw a bit of this already, failwith "ill-formed" etc)
  • Input/output (in basic modules lecture we looked at file input and results printing for example)
  • Concurrency and parallelism (will cover later)

State

  • Variables in OCaml are never directly mutable
  • But, they can hold a reference to memory that can be mutated
  • i.e. it is only indirect mutability - variable itself can’t change, but what it points to can.

Mutable References

  • References, mutable references, refs, reference cells, and cells are all more or less synomyms
let x = ref 4;; (* have to declare initial value when creating *)
val x : int ref = {contents = 4}

Meaning of the above: x forevermore (i.e. forever unless shadowed) refers to a fixed cell. The contents of that fixed cell, currently 4, can change, but not x.

# let x = ref 4;;
val x : int ref = {contents = 4}
# x + 1;;
Line 1, characters 0-1:
Error: This expression has type int ref but an expression was expected of type
         int
# !x + 1;; (* need !x to get out the value; parallels *x in C *)
- : int = 5
# x := 6;; (* assignment - x must be a ref cell.  Returns () - only performs side effect *)
- : unit = ()
# !x + 1;; (* Mutation happened to contents of cell x *)
- : int = 7
# 

Variables are still themselves immutable

  • To be clear, let doesn’t turn into a mutation operator with ref:
let x = ref 4;;
let f () = !x;;

x := 234;;
f();;

let x = ref 6;; (* shadows previous x definition, NOT an assignment to x !! *)
f ();; (* 234, not 6 *)

Null or Nil initial cell contents in OCaml, and Weakly Polymorphic types

  • If you don’t yet have a well-formed initial value, use an option:
    let x = ref None;;
    val x : '_weak1 option ref = {contents = None}
    
  • Note the type here, '_weak1 option ref, this is a weakly polymorphic type
  • Which really is not polymorphic at all - what it means is the type can be only a single type
    • which is not known yet
  • If you think about it, there is no other possibility, can’t put int and string in same cell
    • would not know the type when taking out of cell.
# x := Some 3;;
- : unit = ()
# !x;;
- : int option = Some 3 (* `'_weak1` = `int` now, permanently *)
  • At various points OCaml will infer only weak types on certain things
  • Most of the time it is because it would be incorrect not to
  • But occasionally OCaml is too dumb to realize things are not weak
    • there are advanced workarounds for this case which we will not cover

Mutable Records

  • Along with refs we can declare some record fields mutable
  • 'a ref is really implemented by a mutable record with one field, contents:
  • 'a ref in fact abbreviates the type { mutable contents: 'a }
  • The keyword mutable on a record field means it can mutate
let x = { contents = 4};; (* identical to x's definition above *)
x := 6;;
x.contents <- 7;;  (* same effect as previous line: backarrow updates a field *)

!x + 1;;
x.contents + 1;; (* same effect as previous line *)

Declaring Mutable Record Types

  • Default on each field is the value will be immutable
  • Put mutable qualifier on each field that you want to mutate
  • Principle of least mutability: only put mutable on fields you have to mutate
type mutable_point = { mutable x: float; mutable y: float };;
let translate p dx dy =
  p.x <- (p.x +. dx); (* observe use of ";" here to sequence effects *)
  p.y <- (p.y +. dy);;
let mypoint = { x = 0.0; y = 0.0 };;
translate mypoint 1.0 2.0;;
mypoint;;
  • Note here that the x and y fields of the point are mutable, but the point as a whole you cannot swap in a different point for.

Tree with mutable subtrees

(* version using ref: *)
type 'a mtree = MLeaf | MNode of 'a * 'a mtree ref * 'a mtree ref;;
(* But, use this type with mutable records - no `!` needed: *)
type 'a mtree = MLeaf | MNode of { data : 'a; mutable left : 'a mtree; mutable right : 'a mtree};;
  • Note that in this mtree we can only mutate the subtrees, not the data
  • Also, cannot replace a leaf at top of tree with a non-leaf.
  • The idea is to put mutablility only where you are doing mutation, no more no less.
  • So if the tree structure never changes but the node values can, only make the data mutable.

Example use: mutate right tree

# let mt = MNode {data = 3; left = MLeaf; right = MLeaf};;
val mt : int mtree = MNode {data = 3; left = MLeaf; right = MLeaf}
# match mt with 
| MLeaf -> ()
| MNode ({data;left;right} as r) -> r.left <- MNode {data = 5; left = MLeaf; right = MLeaf};;
- : unit = ()
# mt;;
- : int mtree =
MNode
 {data = 3; left = MNode {data = 5; left = MLeaf; right = MLeaf};
  right = MLeaf}
  • Note the use of the ... as r in the pattern, sometimes something needs a name that didn’t have one
  • And of course notice that mt actually changed here unlike with immutables

Physical equality

  • Occasionally in imperative programs you need to check for “same pointer”
  • phys_equal is Core notion for for “same pointer” (use == in non-Core)
# phys_equal 2 2;;
- : bool = true
# let x = ref 4;;
val x : int ref = {contents = 4}
# let y = x;;
val y : int ref = {contents = 4}
# phys_equal x y;;
- : bool = true (* same pointer *)
# let z = ref 4;;
val z : int ref = {contents = 4}
# phys_equal x z;;
- : bool = false (* different pointers *)

Control structures to help with mutution

  • Sequencing via “;” becomes useful with side effects
print_string "hi"; print_string "\n";;
  • Observe that operations that only have a side effect return () : unit
    • :=, <-, printing, etc
    • But sometimes operators will both have effects and return something
    • Sometimes need to sequence that and you may get an annoying warning if so:
# let incr = 
let count = ref 0 in 
let incr () = count := !count + 1; !count in incr;;
- : unit -> int = <fun>
# incr() ; incr();;
Line 1, characters 0-6:
Warning 10: this expression should have type unit.
...
  • Gives a warning since it is concerned that the first incr does not return unit.
  • This warning is actually good most of the time in fact, it means ; was used incorrectly
  • To silence warning (once you are clear you are doing the right thing):
# ignore(incr()) ; incr() (* or, can use let _ = incr() in incr() *)
  • for and while loops are useful with mutable state
  • But they are almost always a code smell in OCaml, usually a data structure iterator like map fold etc is better.
  • Here is a while .. do .. done loop; for syntax also standard
let x = ref 1 in
while !x < 10 do
  printf "count is %i ...\n" !x;
  x := !x + 1;
done;;
  • Fact: while loops are useless without mutation: would either never loop or infinitely loop
  • Same for e1 ; e2 – if e1 has no side effects may as well delete it, it is dead code!
  • Note that e1; e2 is exactly the same as writing let () = e1 in e2

Arrays

  • They are mutable, and they are also constant time to access nth element unlike lists
  • But, extending an array is inefficient: cannot share sub-array due to mutation
  • Entered and shown as [| 1; 2; 3 |] (added “|”) in top-loop to distinguish from lists.
  • Have to be initialized before using
    • In general, there is no such thing as “uninitialized” in OCaml.
    • If you need “undefined”/”null” array, make it an int option array and init to None’s.
let arrhi = Array.create ~len:10 "hi";; (* length and initial value *)
let arr = [| 4; 3; 2 |];; (* make a literal array *)
arr.(0);; (* access (unfortunately already used [] for lists so a bit ugly) *)
arr.(0) <- 55;; (* update *)
arr;;
(* Don't use for loops for arrays, use your favorite iterators: *)
Array.map ~f:(fun x -> x + 1) arr;; (* standard map - produces a new array *)
Array.map_inplace ~f:(fun x -> x + 1) arr;; (* This *changes* the array based on map-function *)
(* Here are some conversions *)
let a = Array.of_list [1;2;3];;
let l = Array.to_list a;;

Exceptions

  • As mentioned earlier, exceptions are powerful but dangerous
    • They are OK if they are always handled close to when they are raised
    • If the handler is far away it can lead to buggy code
    • We will aim for idiomatic use of OCaml exceptions in FPSE: local necessary ones only.
  • Core discourages over-use of exceptions in its library function signatures
    • Avoid the blah_exn library functions unless the handler is close by

There are a few simple built-in exceptions which we used some already:

failwith "Oops";; (* Generic code failure - exception is named Failure *)
invalid_arg "This function works on non-empty lists only";; (* Invalid_argument exception *)

Also there are library functions we covered that raise exceptions

# List.zip_exn [1;2] [2;3;4];;
Exception: (Invalid_argument "length mismatch in zip_exn: 2 <> 3")

OCaml syntax for defining raising and handling exceptions

  • New exception names need to be declared via exception like types needs to be declared
  • Unfortunately types do not include what exceptions a function may raise
    • an outdated aspect of OCaml; even Java has this
  • The value returned by an exception is very similar in looks to a variant.
exception Boom of string;; (* Note like with variants the `of` is optional, no payload required *)

let f _ = raise @@ Boom "keyboard on fire";; (* raise is ultimately how all exceptions are raised *)
f ();;

let g () =
  try f ()
  with
  | Boom s -> printf "exception Boom raised with payload string \"%s\"\n" s
;;
g ();;

Mutating data structures in Core

  • The Stack and Queue modules in Core are mutable data structures.
  • (There are no immutable libraries for stack/queue - just use lists)
  • (There is also Hash_set which is a (hashed) mutable set and Hashtbl which is a mutable hashtable; more on those later)
  • Here is a simple example of playing around with a Stack for example.
# let s = Stack.create();;
val s : '_weak3 t = <abstr> (* Stack.t is the underlying implementation and is hidden *)
# Stack.push s "hello";;
- : unit = ()
# Stack.push s "hello again";;
- : unit = ()
# Stack.push s "hello one more time";;
- : unit = ()
# Stack.to_list s;; (* a handy function to see what is there; top on left *)
- : string list = ["hello one more time"; "hello again"; "hello"]
# Stack.pop s;;
- : string option = Some "hello one more time"
# Stack.pop_exn s;; (* exception raised if empty here *)
- : string = "hello again"
# Stack.pop_exn s;;
- : string = "hello"
# Stack.pop s;;
- : string option = None
# Stack.exists s ~f:(fun s -> String.is_substring s "time");; (* Stack has folds, maps, etc too *)
- : bool = true

Summing Up Effects: A Parentheses Matching Function

  • To show how to use effects and some of the trade-offs, we look at a small example
  • See file matching.ml which has several versions of a simple parenthesis matching function
  • It shows uses of Stack, and some trade-offs of using exceptions vs option type.
  • Lastly there is a pure functional version which is arguably simpler
  • Yes you don’t need that mutation!