Side effects
- Side effects are operations which do more than return a result: mutate, I/O, exceptions, threads, etc.
- So far we have not seen many side effects but a few have snuck in: printing, file input, exceptions
- Principle of idiomatic OCaml (and style for this class): avoid effects, unless they are a real improvement, or a necessity (e.g. I/O).
- Reminder: don’t use mutation on your homeworks, and limit use of other effects as well.
Side effects of OCaml include
- Mutatable state - changing the contents of a memory location intead of making a new one
- Three built-in sorts in OCaml: references, mutable record fields, and arrays.
- Plus many libraries:
Stack
,Queue
,Hashtbl
,Hash_set
, etc - Faster because rebuilding avoided, but slower due to impossibility of sharing sub-components
- Exceptions (we saw a bit of this already,
failwith "ill-formed"
etc) - Input/output (in basic modules lecture we looked at file input and results printing for example)
- Concurrency and parallelism (will cover later)
State
- Variables in OCaml are never directly mutable
- But, they can hold a reference to memory that can be mutated
- i.e. it is only indirect mutability - variable itself can’t change, but what it points to can.
Mutable References
- References, mutable references, refs, reference cells, and cells are all more or less synomyms
'a ref
is a type, andval ref : 'a -> 'a ref
is a function that makes a ref cell.
# let x = ref 4;; (* have to declare initial value when creating *)
val x : int ref = {contents = 4}
Meaning of the above: x
forevermore (i.e. forever unless shadowed) refers to a fixed cell. The contents of that fixed cell, currently 4
, can change, but not x
.
# let x = ref 4;;
val x : int ref = {contents = 4}
# x + 1;;
Line 1, characters 0-1:
Error: This expression has type int ref but an expression was expected of type
int
- Addition with
(+)
works on integers, butx
is of typeint ref
. - Get the value from a ref cell with the
!
prefix operator.- It simply gets the (immutable) value that the ref cell points to.
- It does not get the memory location that it points to.
# !x + 1;; (* use !x to get out the value; similar to *x in C *)
- : int = 5
# x := 6;; (* assignment with (:=). x must be a ref cell. Returns () - only performs side effect *)
- : unit = ()
# !x + 1;; (* Mutation happened to contents of cell x *)
- : int = 7
And !
does not return the memory location to which a ref cell points, so this is a syntax error:
let x = ref 4
let !x = 5 (* syntax error. !x is a value, not a valid assignee *)
In this way, !x
in OCaml is not like *x
in C.
Tangent on unit
unit
is a terminal type. Only one value called()
has typeunit
, and it is totally useless.- All you can do is pass it around.
- So what is it good for?
Since ()
is useless, any function that returns it is either useless or performs a side effect. It is almost certainly the latter.
# print_endline;; (* returns unit; has the side effect of printing *)
- : string -> unit = <fun>
# (:=);; (* returns unit; has the side effect of assignment to LHS *)
- : 'a ref -> 'a -> unit = <fun>
# Hashtbl.set;; (* returns unit, so it has the side effect of assignment *)
- : ('a, 'b) Core.Hashtbl.t -> key:'a -> data:'b -> unit = <fun>
Hashtbl.set
returnsunit
so it must be a mutable data structure.- On the flip side, functions taking
unit
as argument are often also only performing side effects.
# Stack.create;; (* takes unit, so it is making a new mutable data structure *)
- : unit -> 'a Core.Stack.t = <fun>
# Stack.create ();; (* Note the convention of putting a space here *)
- : '_weak1 Core.Stack.t = <abstr> (* Its abstract, we can't see internals.. more on weak types soon *)
Variables are still themselves immutable
- To be clear,
let
doesn’t turn into a mutation operator withref
:
let x = ref 4;;
let f () = !x;;
x := 234;;
f();;
let x = ref 6;; (* shadows previous x definition, NOT an assignment to x !! *)
f ();; (* 234, not 6 *)
Null or Nil initial cell contents in OCaml, and Weakly Polymorphic types
- If you don’t yet have a well-formed initial value, use an
option
:
let x = ref None;;
val x : '_weak1 option ref = {contents = None}
- Note the type here,
'_weak1 option ref
, this is a weakly polymorphic type - Which really is not polymorphic at all - what it means is the type can be only a single type
- which is not known yet
- To the first order, a weakly polymorphic type is like a “Schrodinger’s type”.
- It is ready to be any (single) type until it is observed (i.e. used), after which it is fixed.
- If you think about it, there is no other possibility, can’t put int and string in same cell
- would not know the type when taking out of cell.
# x := Some 3;;
# !x;;
- : int option = Some 3 (* now we see '_weak1 was touched and its now forevermore an int *)
- At various points OCaml will infer only weak types on certain things
- Most of the time it is because it would be incorrect not to
- But occasionally OCaml is too dumb to realize things are not weak
- there are advanced workarounds for this case which we will not cover
The weak types are here so that we cannot do this:
let x = ref None (* Puts Schrodinger's cat in the box. It is weakly typed, not polymorphic. *)
let _ = x := Some 5 (* Observes Schrodinger's cat: fixes the weak type to be int *)
let _ = x := Some "hello" (* type error! x is not a string ref *)
Mutable Records
- Along with refs we can declare some record fields
mutable
'a ref
is really implemented by a mutable record with one field, contents:'a ref
in fact abbreviates the type{ mutable contents: 'a }
- And
ref
is a just a function to make creation convenient. - And
(:=)
is just a function to make assignment convenient. - And
(!)
is just a function to make reading convenient and explicit.
- And
- The keyword mutable on a record field means it can mutate
let x = { contents = 4 };; (* 100.0% identical to `let x = ref 4` *)
x.contents <- 7;; (* identical to `x := 6` *)
x.contents + 1;; (* identical to `!x + 1` *)
Declaring Mutable Record Types
- Default on each field is that the value is immutable
- Put
mutable
qualifier on each field that you want to mutate - Principle of least mutability: you should only put
mutable
on fields you have to mutate
type mutable_point = { mutable x : float ; mutable y : float };;
let translate p dx dy =
p.x <- (p.x +. dx); (* observe use of ";" here to sequence effects *)
p.y <- (p.y +. dy);;
let mypoint = { x = 0.0; y = 0.0 };; (* new mutable record *)
translate mypoint 1.0 2.0;; (* changes fields inside mypoint *)
mypoint;;
-
Here, the
x
andy
fields of the point are mutable, but the point as a whole you cannot swap in a different point for. -
Note that
;
is the standard sequencing operator- But in OCaml everything is an expression so its a bit non-standard
e ; e'
is roughly the same aslet () = e in e'
: evaluatee
, ignore result, evaluatee;
.(5 + 2); true
will give you a warning since5
is not of typeunit
- The reasoning here is if you are using
;
the first thing must be a side effect- and, as we covered above those functions will nearly always return
unit
.
- and, as we covered above those functions will nearly always return
Tree with mutable subtrees
(* version using ref: *)
type 'a mtree_ref = MLeaf | MNode of 'a * 'a mtree ref * 'a mtree ref;;
(* But, use this type with mutable records - no `!` needed: *)
type 'a mtree = MLeaf | MNode of { data : 'a ; mutable left : 'a mtree ; mutable right : 'a mtree };;
- Note that in this
mtree
we can only mutate the subtrees, not the data - Also, cannot replace a leaf at top of tree with a non-leaf.
- The idea is to put mutablility only where you are doing mutation, no more no less.
- So if the tree structure never changes but the node values can, only make the
data
mutable.
Example use: mutate right tree
# let mt = MNode { data = 3 ; left = MLeaf ; right = MLeaf };;
val mt : int mtree = MNode {data = 3; left = MLeaf; right = MLeaf}
# match mt with
| MLeaf -> ()
| MNode ({data;left;right} as r) -> (* "as" keyword captures the whole pattern under one name *)
r.left <- MNode {data = 5; left = MLeaf; right = MLeaf};;
- : unit = ()
(* Lets now verify that mt mutated *)
# mt;;
- : int mtree =
MNode
{ data = 3
; left = MNode { data = 5 ; left = MLeaf ; right = MLeaf }
; right = MLeaf }
Physical equality
- Occasionally in imperative programs you need to check for “same pointer”.
- It’s also useful in functional programming for fast comparison when data is shared.
- There’s no need to compare entire structures if their memory addresses are identical.
phys_equal
isCore
’s notion for for “same pointer” (use==
in non-Core).
# phys_equal 2 2;; (* memory layout of 2 is always the same *)
- : bool = true
# let x = ref 4;;
val x : int ref = {contents = 4}
# let y = x;; (* make y an alias for x *)
val y : int ref = {contents = 4}
# phys_equal x y;;
- : bool = true (* same pointer *)
# let z = ref 4;; (* new cell. totally different from x and y *)
val z : int ref = {contents = 4}
# phys_equal x z;;
- : bool = false (* different pointers *)
We can use phys_equal
to see that data is shared in functional data structures.
# let big_list = List.init 10000 ~f:Fn.id ;;
# let x = 10 :: big_list ;;
# let y = 11 :: big_list ;;
# phys_equal x y ;;
- : bool = false
# phys_equal (List.tl_exn x) (List.tl_exn y) ;;
- : bool = true (* the tails are shared! They're both big_list *)
Control structures to help with mutution
- As mentioned above, side effecting operations usually return
unit
- But occasionally they don’t, and you might want to use
;
with them which OCaml will complain about:
# let incr =
let count = ref 0 in
let incr () = count := !count + 1; !count in
incr;;
- : unit -> int = <fun>
# incr() ; incr();; (* Increment twice *)
Line 1, characters 0-6:
Warning 10: this expression should have type unit.
...
- Gives a warning since first
incr()
does not returnunit
- To silence warning (once you are clear you are doing the right thing):
# ignore (incr ()); incr () (* or, let _ = incr () in incr () *)
for
andwhile
loops are useful with mutable state- But they are almost always a code smell in OCaml, usually a data structure iterator like map fold etc is better.
- Here is a
while .. do .. done
loop;for
syntax also standard
let x = ref 1 in
while !x < 10 do
printf "count is %i ...\n" !x;
x := !x + 1;
done;;
- Fact:
while
loops are useless without mutation: would either never loop or infinitely loop - Same for
e1 ; e2
– ife1
has no side effects, you may as well delete it. It is dead code! - Remember that
e1; e2
is exactly the same as writinglet () = e1 in e2
Arrays
- They are mutable, and they are also constant time to access nth element unlike lists
- But, extending an array is inefficient: cannot share sub-array due to mutation
- And, sub-components of different arrays cannot be shared since they may change
- Entered and shown as
[| 1; 2; 3 |]
(added “|
”) in top-loop to distinguish from lists. - Have to be initialized before using
- In general, there is no such thing as “uninitialized” in OCaml.
- If you need “undefined”/”null” array, make it an
int option array
and init toNone
’s.
let arrhi = Array.create ~len:10 "hi";; (* length and initial value *)
let arr = [| 4; 3; 2 |];; (* make a literal array *)
arr.(0);; (* access (unfortunately already used [] for lists so a bit ugly) *)
arr.(0) <- 55;; (* update like with mutable record fields *)
arr;; (* see that arr has changed *)
(* Don't use for loops for arrays, use your favorite iterators: *)
Array.map ~f:(fun x -> x + 1) arr;; (* standard map - produces a new array *)
Array.map_inplace ~f:(fun x -> x + 1) arr;; (* This *changes* the array using the map function *)
(* Here are some conversions *)
let a = Array.of_list [1;2;3];;
let l = Array.to_list a;;
Exceptions
- As mentioned earlier, exceptions are powerful but dangerous
- They are OK if they are always handled close to when they are raised
- If the handler is far away it can lead to buggy code
- We will aim for idiomatic use of OCaml exceptions in FPSE: local necessary ones only.
Core
discourages over-use of exceptions in its library function signatures- Avoid the
blah_exn
library functions unless the handler is close by
- Avoid the
There are a few simple built-in exceptions which we used some already:
failwith "Oops";; (* Generic code failure - exception is named Failure *)
invalid_arg "This function works on non-empty lists only";; (* Invalid_argument exception *)
Also there are library functions we covered that raise exceptions
# List.zip_exn [1;2] [2;3;4];;
Exception: (Invalid_argument "length mismatch in zip_exn: 2 <> 3")
OCaml syntax for defining raising and handling exceptions
- New exception names need to be declared via
exception
liketype
s needs to be declared - Unfortunately, oCaml types do not include what exceptions a function may raise
- an outdated aspect of OCaml; even Java has this with
raises
on method declarations
- an outdated aspect of OCaml; even Java has this with
- The value returned by an exception is very similar in looks to a variant.
- under the hood, the
exn
type is an “extensible variant”
- under the hood, the
Extend the exn
type with your exception using the exception
keyword.
- Everything following the
exception
keyword is just like a variant constructor declaration. - There is no need for
of
if you don’t want data in your exception, just like a variant with no payload (e.g.None
).
exception Boom of string;;
let f _ = raise @@ Boom "keyboard on fire";; (* raise is ultimately how all exceptions are raised *)
f ();; (* this raises the exception *)
let g () =
try f ()
with
| Boom s -> printf "exception Boom raised with payload string \"%s\"\n" s
;;
g ();;
Mutating data structures in Core
- The
Stack
andQueue
modules inCore
are mutable data structures. - (There are no immutable stack/queue libraries in
Core
- just uselist
s) - (There is also
Hash_set
which is a (hashed) mutable set andHashtbl
which is a mutable hashtable; more on those later) - Here is a simple example of playing around with a
Stack
.
# let s = Stack.create();;
val s : '_weak3 t = <abstr> (* Stack.t is the underlying implementation and is hidden *)
# Stack.push s "hello";;
- : unit = () (* returns unit because s is mutated *)
# Stack.push s "hello again";;
- : unit = ()
# Stack.push s "hello one more time";;
- : unit = ()
# Stack.to_list s;; (* a handy function to see what is there; top on left *)
- : string list = ["hello one more time"; "hello again"; "hello"]
# Stack.pop s;;
- : string option = Some "hello one more time"
# Stack.pop_exn s;; (* exception raised if empty here *)
- : string = "hello again" (* s changed from the last pop, so this pop is different! *)
# Stack.pop_exn s;;
- : string = "hello"
# Stack.pop s;;
- : string option = None
# Stack.exists s ~f:(fun s -> String.is_substring s "time");; (* Stack has folds, maps, etc too *)
- : bool = false
Summing Up Effects With an Example: A Parentheses Matching Function
- To show how to use effects and some of the trade-offs, we look at a small example
- See file matching.ml which has several versions of a simple parenthesis matching function
- It shows uses of
Stack
, and some trade-offs of using exceptions vs option type. - Lastly there is a pure functional version which is arguably simpler
- Yes, you don’t need that mutation!