List programming
- First we will do a few more recursive functions over lists
- Then we will show how the
Core.List
library functions allow a great many (most?) operations to be written without recursion - This is combinator programming, solve a task by composing operations on a few standard combinators (“combiners”)
- (Note for Assignment 1 Part I you will not be able to use the library combinators, they are for Part II)
Reversing a list
- Let us write a somewhat more interesting function, reversing a list.
- Lists are immutable so it is going to create a completely new list, not change the original.
- This style of programming is called “Data structure corresponds to control flow” - the program needs to touch and reconstruct the whole data structure as it runs.
let rec rev l =
match l with
| [] -> []
| hd :: tl -> rev tl @ [hd]
;;
rev [1;2;3];; (* recall input list is the tree 1 :: ( 2 :: ( 3 :: [])) *)
- Correctness of a recursive function by induction: assume recursive call does what you expect in arguing it is overall correct.
- For this example, can assume
rev tl
always reverses the tail of the list,- (e.g. in computing
rev [1;2;3]
we matchhd
=1
andtl
=[2;3]
and can assumerev [2;3]
=[3;2]
)
- (e.g. in computing
- Given that fact,
rev tl @ [hd]
should clearly reverse the whole list- (e.g.
[3;2] @ [1]
=[3;2;1]
for the example)
- (e.g.
- QED, the function is proved correct! (actually partially correct, this induction argument does not rule out infinite loops)
Of course rev
is also in Core.List
since it is a common operation:
# List.rev [1;2;3];;
- : int list = [3; 2; 1]
Another Example: zero out all the negative elements in a list of numbers
- C solution:
for
-loop over it and mutate all negatives to 0 - OCaml immutable list solution: recurse on list structure, building the new list as we go
let rec zero_negs l =
match l with
| [] -> []
| hd :: tl -> (if hd < 0 then 0 else hd) :: zero_negs tl
in
zero_negs [1;-2;3];;
Core.List library functions
- We already saw a few of these previously, e.g.
List.rev
andList.nth
. List
is a module, think fancy package. It contains functions plus values plus types plus even other modulesList
is itself in the moduleCore
so the full name forrev
isCore.List.rev
- but we put an
open Core
in our.ocamlinit
(and in the template for A1) so you can just write e.g.List.rev
- but we put an
- (Note that
List.hd
is also available, but you should nearly always be pattern matching to take apart lists; don’t useList.hd
on the homework.) - Let us peek at the documentation
Core.List
to see what is available; we will cover a few of them now.
Some simple but very handy List
library functions
List.length ["d";"ss";"qwqw"];;
List.is_empty [];;
List.last_exn [1;2;3];; (* get last element; raises an exception if list is empty *)
List.join [[1;2];[22;33];[444;5555]];;
List.append [1;2] [3;4];; (* Usually the infix @ syntax is used for append *)
… And their types
- The types of the functions are additional hints to their purpose, get used to reading them
- Much of the time when you mis-use a function you will get a type error
- Recall that
'a list
etc is a polymorphic aka generic type,'a
can be any type
# List.length;;
- : 'a list -> int = <fun>
# List.is_empty;;
- : 'a list -> bool = <fun>
# List.last_exn;;
- : 'a list -> 'a = <fun>
# List.join;;
- : 'a list list -> 'a list = <fun>
# List.append;;
- : 'a list -> 'a list -> 'a list = <fun>
# List.map;; (* We will do this one below *)
- : 'a list -> f:('a -> 'b) -> 'b list = <fun>
- We coded
nth
andrev
before, here is one more,join
:
let rec join (l: 'a list list) = match l with
| [] -> [] (* "joining together a list of no-lists is an empty list" *)
| l :: ls -> l @ join ls (* "by induction assume (join ls) will turn list-of-lists to single list" *)
OCaml tuples and some List
library functions using tuples
- Along with lists
[1;2;3]
OCaml has tuples,(1,2.,"3")
- It is like a fixed-length list, but tuple elements can have different types
- You can also pattern match on tuples
# (1,2.,"3");;
- : int * float * string = (1, 2., "3")
# [1,2,3];; (* a common error, parens not always needed so this is a singleton list of a 3-tuple, not a list of ints *)
- : (int * int * int) list = [(1, 2, 3)]
- Here is a simple function to break a list in half using the
List.split_n
function- a pair of lists is returned by
split_n
, dividing it at the nth position
- a pair of lists is returned by
let split_in_half l = List.split_n l (List.length l / 2);;
split_in_half [2;3;4;5;99];;
- Now, using the
List.cartesian_product
function we can make all possible pairs of (front,back) elements- (Also observe how these OCaml combinators have overlap with math combinators we already knew)
let all_front_back_pairs l =
let front, back = split_in_half l in
List.cartesian_product front back;; (* observe how let can itself pattern match pairs *)
val all_front_back_pairs : 'a list -> ('a * 'a) list = <fun>
# all_front_back_pairs [1;2;3;4;5;6];;
- : (int * int) list =
[(1, 4); (1, 5); (1, 6); (2, 4); (2, 5); (2, 6); (3, 4); (3, 5); (3, 6)]
- Fact: lists of pairs are isomorphic to pairs of lists (of the same length)
- zipping and unzipping library functions can convert between these two equivalent forms.
List.unzip @@ all_front_back_pairs [1;2;3;4;5;6];;
- Note the use of
@@
here, recall it is function application but with “loosest binding”, avoids need for parens - Here is an even cooler way to write the same thing, with pipe operation
|>
(based on shell pipe|
)
[1;2;3;4;5;6] |> all_front_back_pairs |> List.unzip;;
- In a series of pipes, the leftmost argument is data, and all the others are functions
- The data is fed into first function, output of first function fed as input to second, etc
- This is exactly what the shell
|
does with standard input / standard output. -
Please use pipes as much as possible on Part II of Assignment 1 - will make the code more readable
List.zip
is the opposite of unzip: take two lists and make a single list pairing elements
List.zip [1;2;3] [4;5;6];;
- : (int * int) list List.Or_unequal_lengths.t =
Core.List.Or_unequal_lengths.Ok [(1, 4); (2, 5); (3, 6)]
- The strange result type is dealing with the case where the lists supplied may not be same length
- This type and value are hard to read, let us take a crack at it.
((int * int) list) List.Or_unequal_lengths.t
is the proper parentheses.List.Or_unequal_lengths.t
is referring to the typet
found in theList.Or_unequal_lengths
module (a small module within theList
module)- one of the great things about modules is they can also contain types (like C .h files but more principled)
- We can use the
#show_type
directive in the top loop to see whatt
actually is:
# #show_type List.Or_unequal_lengths.t;;
type nonrec 'a t = 'a List.Or_unequal_lengths.t = Ok of 'a | Unequal_lengths
- This means the value is either
Ok(..)
orUnequal_lenghts
, very similar toresult
oroption
- (Why don’t they just use one of those two here instead?? No idea!)
- The
'a
here is the type parameter, more on those later so don’t sweat it now - The latter case is for zipping lists of different lengths:
List.zip [1;2;3] [4;5];;
- : (int * int) list List.Or_unequal_lengths.t =
Core.List.Or_unequal_lengths.Unequal_lengths
- In the original same-length case we got the result from the first clause in this type,
Core.List.Or_unequal_lengths.Ok [(1, 4); (2, 5); (3, 6)]
. - Note
List.zip_exn
will just raise an exception for unequal-length lists, avoiding all of this wrapper ugliness- but in larger programs we want to avoid exceptions at a distance so it is often worth the suffering
zip/unzip and Currying
We should be able to zip and then unzip as a no-op, one should undo the other (we will use the _exn
version to avoid the above error wrapper issue).
List.unzip @@ List.zip_exn [1;2] [3;4];;
And the reverse should also work as it is an isomorphism:
List.zip_exn @@ List.unzip [(1, 3); (2, 4)];;
Line 1, characters 16-43:
Error: This expression has type int list * int list
but an expression was expected of type 'a list
- Oops! It fails. What happened here?
List.zip_exn
takes two curried arguments, lists to zip (its type is'a list -> 'b list -> ('a * 'b) list
), whereasList.unzip
returns a pair of lists.- No worries, we can write a wrapper (an adapter) turning
List.zip_exn
into a version taking a pair of lists:
let zip_pair (l,r) = List.zip_exn l r in
zip_pair @@ List.unzip [(1, 3); (2, 4)];;
[(1, 3); (2, 4)] |> List.unzip|> zip_pair ;; (* Pipe equivalent form *)
- Congratulations, we just wrote a fancy no-op function 😁
- The general principle here is a curried 2-argument function like
int -> int -> int
is isomorphic toint * int -> int
- The latter form looks more like a standard function taking multiple arguments and is the uncurried form.
- And we sometimes need to interconvert between the two representations
- This conversion is called uncurrying (curried to pair/triple/etc form) or currying (putting it into curried form)
Curry/Uncurry are themselves functions
- We can even write combinators which generically convert between these two forms - !
curry
- takes in uncurried 2-arg function and returns a curried versionuncurry
- takes in curried 2-arg function and returns an non-curried version
let curry f = fun x -> fun y -> f (x, y);;
let uncurry f = fun (x, y) -> f x y;;
Observe the types themselves in fact fully define their behavior:
curry : ('a * 'b -> 'c) -> 'a -> 'b -> 'c
uncurry : ('a -> 'b -> 'c) -> 'a * 'b -> 'c
We can now use our new combinator to build zip_pair
directly:
let zip_pair = uncurry @@ List.zip_exn;;
One last higher-order function: compose
Composition function g o f: take two functions, return their composition
let compose g f = (fun x -> g (f x));;
compose (fun x -> x+3) (fun x -> x*2) 10;;
- The type says it all again,
('a -> 'b) -> ('c -> 'a) -> 'c -> 'b
- Equivalent ways to code
compose
in OCaml:
let compose g f x = g (f x);;
let compose g f x = x |> f |> g;; (* This is the readability winner: feed x into f and f's result into g *)
let compose = (fun g -> (fun f -> (fun x -> g(f x))));;
- We can express the Zip/unzip composition explicitly with
compose
:
# (compose zip_pair List.unzip) [(1, 3); (2, 4)];;
- : (int * int) list = [(1, 3); (2, 4)]
List
module functions which take function arguments
- So far we have done the “easier” functions in
List
; the real meat are the functions taking other functions - Think of these as “recursion patterns”, they will recurse over the list so you don’t have to
let rec
- makes functional code a lot easier to read once you are familiar with these “recursion combinators”
- Lets warm up with
List.filter
: remove all elements not meeting a condition which we supply a function to check
List.filter [1;-1;2;-2;0] (fun x -> x >= 0);;
- Cool, we can “glue in” any checking function (boolean-returning, i.e. a predicate) and
List.filter
will do the rest - Note though that we got a strange warning on the above, “label f was omitted” - ??
- This is because
List.filter
has type'a list -> f:('a -> bool) -> 'a list
– thef:
is declaring a named argument - OCaml 5 gives warnings if you leave off a name so please always use them
- We can put args out of order if we give name via
~f:
syntax:
List.filter ~f:(fun x -> x >= 0) [1;-1;2;-2;0];;
- And, since OCaml functions are Curried we can leave off the list argument to make a generic remove-negatives function.
let remove_negatives = List.filter ~f:(fun x -> x >= 0);;
remove_negatives [1;-1;2;-2;0];;
Note that you can either inline the function as a fun
or can declare it in advance:
let gtz x = x >= 0;;
List.filter ~f:gtz [1;-1;2;-2;0];;
Usually for short functions it is better to inline them, it makes the code more readable.
Let us use filter
to write a function determining if a list has any negative elements:
let has_negs l = l |> List.filter ~f:(fun x -> x < 0) |> List.is_empty |> not;;
- The example shows the power of pipelining, it is easier to see the processing order with
|>
- This is a common operation so there is a library function for it as well: does there exist any element in the list where predicate holds?
let has_negs l = List.exists ~f:(fun x -> x < 0) l;;
Similarly, List.for_all
checks if it holds for all elements.
List.map
List.map
is super powerful, apply some operation we supply to every element of a list:
# List.map ~f:(fun x -> x + 1) [1;-1;2;-2;0];;
- : int list = [2; 0; 3; -1; 1]
# List.map ~f:(fun x -> x >= 0) [1;-1;2;-2;0];;
- : bool list = [true; false; true; false; true]
List.map ~f:(fun (x,y) -> x + y) [(1,2);(3,4)];; (* turns list of number pairs into list of their sums *)
Folding
- Some of the most powerful combinators are the folds:
fold_right
, andfold_left
aka simplyfold
. - They “fold together” list data using an operator.
- Think of fold as something you feed just the “base case” and the “recursive case” code to and it makes a recursive function for you.
- Here for example is how we can summate all elements in a list using
fold_right
# List.fold_right ~f:(+) ~init:0 [3; 5; 7];; (* this computes 3 + (5 + (7 + 0)) *)
- : int = 15
- The
~f
parameter is the binary operation to put between list elements,+
in this example; - The
~init
is the base case of the recursion. It is needed here because+
is a binary operator so an initial value is needed to fold for 0/1 length lists
Note that the +
is not symmetrical in this sequence, fold_right
feeds the individual elements (elt
) into the left side of +
and the accumulated value (accum
) is on the right, more clear in this equivalent syntax:
List.fold_right ~f:(fun elt accum -> elt + accum) ~init:0 [3; 5; 7];;
accum
here gets the accumulated value so far (think result of the recursive call fold is doing for you)elt
is the most recent element to fold in to the accumulated result.- For
fold_right
the+
is parenthesized in right-associative form, that is the “right”. There is also afold_left
, see below
Left folding
List.fold_left
akaList.fold
(use the latter syntax) associates left:
# List.fold ~f:(+) ~init:0 [3; 5; 7];; (* this is ((0 + 3) + 5) + 7 *)
- : int = 15
Which being more explicit about the elt
and accum
of the +
is
# List.fold ~f:(fun accum elt -> accum + elt) ~init:0 [3; 5; 7];; (* this is ((0 + 3) + 5) + 7 *)
- : int = 15
- Here since it is a fold left the accumulator is on the left (compare with folding right above)
- the arguments to
f
are swapped to make that more clear
- the arguments to
- Note that for
~f:(+)
, folding left or right gives the same answer;- but, that is only because
+
happens to be commutative and associative.
- but, that is only because
- For example,
List.fold ~f:(-) ~init:0 [1;2]
is(0 - 1) - 2
is-3
andList.fold_right ~f:(-) ~init:0 [1;2]
is1 - (2 - 0)
is-1
More general use of folds
- The type of
List.fold
is'a list -> init:'accum -> f:('accum -> 'a -> 'accum) -> 'accum
- Note how the two parameters of
f
need not be the same type - The
'a
here is the type of the list elements - The
'accum
is the type of the result being accumulated, which could be different than the type of list elements
Here is a simple fold example where these types are different (here 'a
is char
and 'accum
is string
):
List.fold ['a';'b';'c'] ~init:"" ~f:(fun accum -> fun elt -> accum^(Char.to_string elt));;
- As we saw above,
elt
is the list element andaccum
is the accumulator, the result thus far.- Think inductively about folding: you can assume
accum
properly accumulated the result you want on the tail of the list, just take the head (elt
) and combine withaccum
in thef
body to get the final result and you are good.
- Think inductively about folding: you can assume
We can also compute the same thing with fold_right
:
utop # List.fold_right ['a';'b';'c'] ~init:"" ~f:(fun elt -> fun accum -> (Char.to_string elt)^accum);;
- Notice that the arguments to
f
are swapped here compared to thefold
left version- in
fold
f
should be afun accum elt -> ...
of type('accum -> 'a -> 'accum)
and - in
fold_right
it should befun elt accum -> ...
of type('a -> 'accum -> 'accum)
- in
- Also notice we shifted where the
^accum
is- in the fold left, accum is accumulating the left elements of the list so far, so put new char on back
- in fold right
accum
is the accumulation of the right (tail) elements so put new char on front
A few more fold examples
We can implement List.exists
above with map and fold:
let exists l ~f = (* Note: ~f is **declaring** a named argument f *)
List.map ~f l (* ~f as an argument is shorthand for ~f:f *)
|> List.fold ~f:(||) ~init:false;;
# exists ~f:(fun x -> x >= 0) [-1;-2];;
- : bool = false
# exists ~f:(fun x -> x >= 0) [1;-2];;
- : bool = true
Or, just with a fold:
let exists l ~f =
List.fold l ~f:(fun accum elt -> accum || f elt) ~init:false;;
let map l ~f = List.fold ~f:(fun accum elt -> accum @ [f elt]) ~init:[] l
This function is identical to List.map
described above.
If you wanted to use fold_right
to build map it would be similar but not the same:
let map_right l ~f = List.fold_right ~f:(fun elt accum -> [f elt] @ accum) ~init:[] l;;
This is similar to the char list to string operation above.
Folding and efficiency
Here is possible code for fold_right
to help understand it:
let rec fold_right ~f l ~init =
match l with
| [] -> init
| hd::tl -> f hd (fold_right ~f tl ~init) (* observe it is invoking f after the recursive call *)
Code for fold
aka fold_left
:
let rec fold l ~init ~f =
match l with
| [] -> init
| hd::tl -> fold tl ~init:(f init hd) ~f (*observe f is invoked before the call -- accumulating left-first *)
- Note that the first parameter to f in
fold
is the accumulated value passed down and the second parameter is the current list value - In
fold_right
on the other hand thef
computation happens after the recursive call is complete. - Fold left/right are good example contrasts of how you can accumulate a value up (
fold_right
) vs down (fold
) the recursion
fold
is in fact more efficient than fold_right
so it is preferred all things being equal:
- Observe how the value of the
fold
function above is what is directly returned from the recursion - Such a function is tail recursive
- The compiler doesn’t need to use a call stack for such functions since nothing happens upon return
- so it replaces push/pop with jumps in and one jump out when done – its just a loop.
- Important when lists get really long that you don’t use stack unless required.
- Observe that
fold_right
is not tail recursive, so it needs the stack and will be slower
fold_until
- Let us end on perhaps the most powerful
List
combinator of all,fold_until
. - This is an extention to
fold
adding the functionality ofbreak
of C etc looping but in a functional style.
let summate_til_zero l =
List.fold_until l ~init:0
~f:(fun acc i -> match i, acc with
| 0, sum -> Stop sum
| _, sum -> Continue (i + sum))
~finish:Fn.id
let stz_example = summate_til_zero [1;2;3;4;0;5;6;7;8;9;10]
- The
Stop
variant is like break, here takesum
as the final value Continue
wraps the continue-folding case, which addsi
to runningsum
here.~finish
can post-process the result if theStop
case was not hit;Fn.id
isfun x -> x
, no additional processing here.