Variants
- Variants build or-data (this-or-this-or-this); records build and-data (this-and-this-and-this)
- All data combination is fundamentally either and or or, similar to the fundamental operators of boolean logic.
- We start with variants (or), and then do records (and) next.
Variants
- The optionandresulttypes we have been using are simple forms of variant types
- Variants let your data be one of several forms (either-or), with a label wrapping the data indicating the specific form
- They are related to uniontypes in C orenumsin Java, but are more safe than C and more general than Java
- Like lists and tuples they are by default immutable
Example variant type for doing mixed arithmetic (integers and floats)
type ff_num = Fixed of int | Floating of float;;  (* read "|" as "or" *)
Fixed 5;; (* tag 5 as a Fixed *)
Floating 4.0;; (* tag 4.0 as a Floating *)
- Each case of the variant is wrapped with a constructor which serves for both
    - constructing values of the variant type
- inspecting them by pattern matching
 
- Constructors must start with a Capital Letter to distinguish from variables
- Variants must be declared but once declared type inference can infer them.
- The ofindicates what type is under the wrapper
- (Note constructors look like functions but they are not – you always need to give the argument)
let ff_as_int x =
    match x with
    | Fixed n -> n    (* pattern match like with option/list/result - those types are also variants *)
    | Floating z -> int_of_float z;;
ff_as_int (Fixed 5);;
A function using the above variant type
let ff_add n1 n2 =
   match n1, n2 with    (* note use of pair here to parallel-match on two variables  *)
     | Fixed i1, Fixed i2 -> Fixed (i1 + i2)
     | Fixed i1, Floating f2 ->  Floating(float i1 +. f2) (* need to coerce *)
     | Floating f1, Fixed i2 -> Floating(f1 +. float i2)  (* ditto *)
     | Floating f1, Floating f2 -> Floating(f1 +. f2)
;;
ff_add (Fixed 123) (Floating 3.14159);;
- No data item?  leave off the of.
- Multiple data items in a single variant clause? Wrap with a tuple (or record once we cover them):
type complex = CZero | Nonzero of float * float;;
let com = Nonzero(3.2,11.2);;
let zer = CZero;;
let ocaml_annoyance = Fn.id Nonzero(3.2,11.2);; (* this is a parsing error, it views as (Fn.id Nonzero)(3.2,11.2) *)
let ocaml_annoyance = Fn.id @@ Nonzero(3.2,11.2);; (* so use @@ instead of " " *)
An Example of Variants plus List. libraries
- Lets write Hamming distance calculator for DNA
- Goal beyond using variants is to cover some useful OCaml programming patterns.
- [@@deriving equal]in the below is a macro (called a “ppx extension” in OCaml)
- It automatically generates a function equal_nucleotide(equal_the-types-name-herein general)
- You will need to use this with Coresince regular=will not work onnucleotides.
(* Example derived from 
   https://exercism.io/tracks/ocaml/exercises/hamming/solutions/afce117bfacb41cebe5c6ebb5e07e7ca
   This code needs a #require "ppx_jane";; in top loop to load ppx extension for @@deriving equal 
   Or, in a dune file it will need   (preprocess (pps ppx_deriving.eq)) added to the library decl *)
type nucleotide = A | C | G | T [@@deriving equal]
let hamming_distance (left : nucleotide list) (right : nucleotide list) : ((int, string) result)=
  match List.zip left right with (* recall this returns Ok(list) or Unequal_lengths, another variant *)
  | List.Or_unequal_lengths.Unequal_lengths -> Error "left and right strands must be of equal length"
  | List.Or_unequal_lengths.Ok l ->
    l
    |> List.filter ~f:(fun (a,b) -> not (equal_nucleotide a b))
    |> List.length 
    |> fun x -> Ok(x) (* Unfortunately we can't just pipe to `Ok` since `Ok` is not a function in OCaml - make it one here *)
let hamm_example = hamming_distance [A;A;C;A;T;T] [A;A;G;A;C;T]
Now let’s use fold instead of filter/length
let hamming_distance (left : nucleotide list) (right : nucleotide list) : ((int, string) result)=
  match List.zip left right with
  | List.Or_unequal_lengths.Unequal_lengths -> Error "left and right strands must be of equal length"
  | List.Or_unequal_lengths.Ok (l) ->
    l
    |> List.fold ~init:0 ~f:(fun accum (a,b) -> accum + if (equal_nucleotide a b) then 0  else 1) 
    |> fun x -> Ok(x)
Parametric variant types
We have used several of these but have not looked at the type too carefully.
Here is the system’s declaration of the option type – the #show_type top loop directive (or just #show) will print it:
# #show_type option;;
type 'a option = None | Some of 'a
- The 'ahere is a parameter, which gets filled in by a concrete type to make an actual type.
- e.g. Some("hello") : string option– thestringfills in the parameter'a
- None : 'a optiontype means instantiate the- 'aparameter with polymorphic/generic- 'a
- This may have been more clear if OCaml used function notation for these types, e.g.
 type option('a) = None | Some of 'aandSome("hello") : option(string)
And here is result:
# #show_type result;;
type ('a, 'b) result = ('a, 'b) result = Ok of 'a | Error of 'b
- Same idea but a pair of type parameters; 'bis the type of theError.
- Observe Ok(4) : (int, 'a) resultandError("bad") : ('a, string) result
Lastly, List.zip has a special type for its return value which is very similar to option:
# #show_type List.Or_unequal_lengths.t;;
type 'a t = 'a List.Or_unequal_lengths.t = Ok of 'a | Unequal_lengths
Recursive data structures
- A common use of variant types is to build recursive data structures (trees)
- Functional programming is fantastic for computing over tree-structured data
- Recursive types can refer to themselves in their own definition
    - similar in spirit to how C structs can be recursive (but, no pointers needed here)
 
- 
    Unlike with functions, no need for rec
- Homebrew lists liztas a warm-up - the built-inlisttype is in fact not needed- Note this example is just for understanding, use the built-in lists if you just want lists.
 
type 'a lizt = Mt | Cons of 'a * 'a lizt;; (* the recursive "'a lizt" on the rhs is a lizt of 'a *)
let lizt_eg = Cons(3,Cons(5,Cons(7,Mt)));; (* analogous to 3 :: 5 :: 7 :: [] = [3;5;7] *)
Coding over lizts is nearly identical to built-in lists; here is mapping:
let rec lizt_map (ml : 'a lizt) ~(f : 'a -> 'b) : ('b lizt) =
  match ml with
    | Mt -> Mt
    | Cons(hd,tl) -> Cons(f hd,lizt_map tl ~f)
let map_eg = lizt_map (Cons(3,Cons(5,Cons(7,Mt)))) ~f:(fun x -> x - 1)
Lets look at the built-in list type:
# #show_type list;;
type 'a list = [] | (::) of 'a * 'a list
This is the exact same structure as lizt
Binary trees
- Binary trees are like lists but with two self-referential sub-structures instead of one
- Here is a tree with data in the intermediate nodes but not in the leaves.
type 'a bin_tree = Leaf | Node of 'a * 'a bin_tree * 'a bin_tree
Here are some simple example trees
let bt0 = Node("whack!",Leaf, Leaf);;
let bt1 = Node("fiddly ",
            Node("backer ",
               Leaf,
               Node("crack ",
                  Leaf,
                  Leaf)),
            bt0);;
let bt2 = Node("fiddly ",
            Node("backer ",
               Leaf,
               Node("crack ",
                  Leaf,
                  Leaf)),
            bt0);;
(* Type error, like list, must have uniform type: *)
Node("fiddly",Node(0,Leaf,Leaf),Leaf);;
Combinators for Binary Trees
- Since lists are built-in we get a massive library of functions on them.
- For these binary trees (and in general for whatever variant types you roll yourself) there is no such luxury.
    - This is because there is no single canonical form of tree, there are many different kinds of trees used
 
- Still, that doesn’t mean you should just code everything by recursing over the tree.  Instead
    - Define the combinators you need (maps, folds, node counts, etc.) using let rec
- Use your combinators without needing let rec
 
- Define the combinators you need (maps, folds, node counts, etc.) using 
- Here is a simple recursive function over binary trees:
    let rec add_gobble binstringtree = match binstringtree with | Leaf -> Leaf | Node(y, left, right) -> Node(y^"gobble",add_gobble left,add_gobble right)
- (Remember, as with lists this is not mutating the tree, its building a new one)
- Observe: this is an instance of the general operation of building a tree with same structure but applying an operation on each node value
- i.e. it is a map operation over a tree.  Let us code mapand use it to add gobbles.
let rec map (tree : 'a bin_tree) ~(f : 'a -> 'b) : ('b bin_tree) =
   match tree with
   | Leaf -> Leaf
   | Node(y, left, right) ->
       Node(f y,map ~f left,map ~f right)
(* using tree map to make a non-recursive add_gobble *)
let add_gobble tree = map ~f:(fun s -> s ^ "gobble") tree
- Fold is also natural on binary trees, apply operation f to node value and each subtree result.
    - This is a fold right (post-processed), folding left on a tree isn’t sensible because there are two subtrees to go down in to.
 
let rec fold (tree : 'a bin_tree) ~(f : 'a -> 'acc -> 'acc -> 'acc) ~(leaf : 'acc) : 'acc =
   match tree with
   | Leaf -> leaf
   | Node(y, left, right) ->
       f y (fold ~f ~leaf left) (fold ~f ~leaf right)
(* using tree fold *)
let int_summate tree = fold ~f:(fun elt laccum raccum -> elt + laccum + raccum) ~leaf:0 tree;;
int_summate @@ Node(3,Node(1,Leaf,Node(2,Leaf,Leaf)),Leaf);;
(* fold can also do map-like operations - the folder can return a tree *)
let inc_nodes tree = fold ~f:(fun elt la ra -> Node(elt+1,la,ra)) ~leaf:Leaf tree;;
- Many of the other Listfunctions have analogues on binary trees and recursive variants in general- length(- sizeor- depthfor a tree),- forall,- exists,- filter(filter out a subtree), etc etc.
 
- For some operations we need to know how to compare the tree elements
    - e.g. if it is a binary (sorted) tree an insertion requires comparison
- here for example we compare node elements for integer node values
 
let rec insert_int (x : int) (bt : int bin_tree) : (int bin_tree) =
   match bt with
   | Leaf -> Node(x, Leaf, Leaf)
   | Node(y, left, right) ->
       if x <= y then Node(y, insert_int x left, right)
       else Node(y, left, insert_int x right)
;;
- Like list operations this is not mutating – it returns a whole new tree.
- But, recall for lists that if we have a list lthen0 :: lcan share theldue to immutability
- So, for here, only one path through tree is not shared: on average only log n new nodes need to be made. More later in lecture on efficiency.
let bt' = insert_int 4 bt;;
let bt'' = insert_int 0 bt';; (* thread in the most recent tree into subsequent insert *)
- For non-integers, we need to explicitly supply any equal or comparison function.
- Library functions needing to compare will in fact take a comparision operation as argument
- For example in the Listlibrary, theList.sortfunction
- Here is an example of how to sort a string list with List.sort:
List.sort ["Zoo";"Hey";"Abba"] ~compare:(String.compare);; (* pass string's comparison function as argument *)
(* insight into OCaml expected behavior for compare: *)
# String.compare "Ahh" "Ahh";; (* =  returns 0 : equal *)
- : int = 0
# String.compare "Ahh" "Bee";; (* < returns -1 : less *)
- : int = -1
# String.compare "Ahh" "Ack";; (* > returns 1 : greater *)
- : int = 1
So, our more general tree insert should follow the lead of List.sort:
let rec insert x bt ~compare =
   match bt with
   | Leaf -> Node(x, Leaf, Leaf)
   | Node(y, left, right) ->
       if (compare x y) <= 0 then Node(y, insert x left compare, right)
       else Node(y, left, insert x right compare)
;;
let bt' = insert 4 bt ~compare:(Int.compare);;
- In general all the built-in types have both compareandequal(which is same as(=)) defined
- Define your own compare/equal for your own types if you need it
- Appending [@@ppx_deriving equal]to type decl as we saw above in Hamming DNA example will automatically define functionequal_mytypefor your typemytype
- Appending [@@ppx_deriving compare]is similar but will define functioncompare_mytype.
- Appending [@@ppx_deriving equal compare]will get both
Polymorphic Variants Briefly
- OCaml has an additional form of variant which has different syntax and is overlapping in uses: polymorphic variants
- A better term would be “inferred variants” - you don’t need to declare them via type.
# `Zinger(3);; (* prefix constructors with a backtick for the inferred variants *)
- : [> `Zinger of int ] = `Zinger 3
- This looks a bit useless, it inferred a 1-ary variant type
- But the “>” in the type means there could be other variants showing up in the future.
# let f b = if b then [`Zinger 3] else [`Zanger "hi"];;
val f : bool -> [> `Zanger of string | `Zinger of int ] list = <fun>
- We can of course pattern match as well:
# let zing_zang z = 
match z with
| `Zinger n -> "zing! "^(Int.to_string n)
| `Zanger s -> "zang! "^s
val zing_zang : [< `Zanger of string | `Zinger of int ] -> string = <fun>
Observe how the type now has a < instead of a >; the meaning is it is those fields or fewer.
# zing_zang @@ `Zanger "wow";;
- : string = "zang! wow"
# zing_zang @@ `Zuber 1.2;;
Line 1, characters 13-23:
Error: This expression has type [> `Zuber of float ]
       but an expression was expected of type
         [< `Zanger of string | `Zinger of int ]
       The second variant type does not allow tag(s) `Zuber
- Generally you should use the non-polymorphic form by default
- The main advantage of the polymorphic form is sharing tags amongst different types
    - regular variants like Ok(4)must be in only one type,resultforOkinCore
- variants like `Zanger "f"can be in[> `Zanger of string ],[> `Zanger of string | `Zinger of int ], etc
- really OCaml should just have one form; the two forms are historical baggage.
 
- regular variants like