Idiomatic Functional Programming
This a major theme of the course; we covered some of this already but let’s put it all together.
- Design principles, design patterns, refactoring (OO) = principles & idioms (FP)
- Principles: overarching principles; Idioms: more focused ideas to aid in achieving principles
FP Principles
- “Concise is nice”
- Goal of making code as short as possible
- From the classic Strunk and White English writing guide:
A sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts (and, we add that code should have no unnecessary constructs).
- Concise code means on a larger program more will fit in your brain’s working set
- Modularity / focus of responsibility
- Make clear divisions of responsibility between different modules and functions
- Attempt to make this factoring of responsibilities the most elegant which will aid in theme 1. above.
- Generally avoid side effects; it will help you achieve 1. and 2.
- Recall how pure functional code is referentially transparent, the behavior is all in the interface with no “hidden” actions.
- Side effect world view is a state machine vs functional view as a pipeline explicitly passing data on
- But, if the pipeline metaphor is failing, side effects may make it overall more concise
- Speed
- There is always a trade-off in programming between efficiency and elegance
- Much of the time it is possible to prioritize concision and modularity over running time and space
- Note that Python and JavaScript are ~5-10 times slower than C or OCaml, a case in point for speed not a priority
- But, sometimes speed really matters - When data sets get large or algorithms get complex
- Do generally avoid high polynomial or exponential algorithms on potentially large inputs
- Also pay more attention when data sets get extremely large, even n vs n log n gets noticeable there.
FP Idioms
Here is a list of idioms, many of which are review as we touched on them before
Don’t Repeat Yourself (DRY from OO):
- Extract duplicate code into its own function
- Code usually won’t be exact duplicate; extract different bits as function parameters so the different bits are passed in
- May also entail replacing specific types with generic types
'a
or functor parameter typest
,elt
etc
Hide it behind an interface
- If a function is auxiliary to only one other function, define it in the body of the latter.
- i.e.
let f x = let aux y = ..<aux's body>.. in .. <f's body> ..
- i.e.
- If a function is not local to a single function but is not used outside its module, leave it out of the module type (e.g.
.mli
file) which will hide it to module users - Make a new module for a new data type, and include operations on the type (only) in it
- This is not just for generic data structures lke
Map
/Set
, it is for app-specific data structures - Example:
ChessBoard
is a nontrivial data type for a chess game, make it its own module
- This is not just for generic data structures lke
- Hide types
t
in module types if users don’t need to see the details - Functional code has everything in the interface so it will make a more precice interface
- Or on the other hand perhaps you can have a simpler interface if it is imperative, e.g.
fresh_name : () -> string
making a different string each time called
- Or on the other hand perhaps you can have a simpler interface if it is imperative, e.g.
Have a focus of responsibility
- Each function and module should have a clear focus that can be summarized in a sentence
- Divide one function/module into two if it is doing two different things
- Don’t add random stuff to module if it doesn’t fit with it’s summary purpose
- If you need more than is in an existing module, make a new one and
include
the old one
Concision
(Some of this was already covered in the FPSE Style Guide)
- Combinize: replace recursion with
map
s,fold
s and the like- and, for your own data structures write your own combinators and then use in place of
rec
- and, for your own data structures write your own combinators and then use in place of
- Use advanced pattern matching (
as
,with
, deep patterns, partial record patterns,_
, etc) - Use
|>
in place of call sequences, and make your functions amenable to piping- Make sure to have the underlying pipe-type be the first unnamed parameter
- Core solution: name most of the parameters in
List
etc (but not the list)
- Use
@@
in place of parentheses - Inline simple
let
definitions to make code read as a concise sentence- Also a small function called only once or twice may read better inlined
- Conversely, make more
let
definitions if the code is too convoluted
Efficiency
- Our main goal is conciseness, but in some cases efficiency does matter
- We already discussed this issue a bit (e.g. cons vs append, tail recursion), here is a bit more and we will cover even more later.
Tail recursion
- We largely covered this earlier
- A tail-recursive function is a function where there is no work to do after returning from recursive calls
- Just bubble up the result
- Observe: since there is no need to mark the call point to resume from, no stack is needed!
- Overwrite the parameters going “down”, and return the base case at the bottom, done!
- Compilers take advantage of this to get rid of stack in this case
- Moral: to save space/time you may want to tail-call
Folding left vs right and tail calls
fold_right
is not tail-recursive; here is an implementation:
let rec fold_right ~f ~init l =
match l with
| [] -> init
| h::t -> f h (fold_right ~f ~init t)
let summate l = fold_right ~f:(+) ~init:0 l
let concatenate l = fold_right ~f:(^) ~init:"" l
- Observe: after each recursive call completes, it must compute
f h rec-call-result
- That is how folding right can start from the right, it computes
f
on the way up the call stack. - Folding left is the opposite, computing
f
and passing accumulated result down the call stack:
let rec fold_left ~f ~init l =
match l with
| [] -> init
| h::t -> fold_left ~f ~init:(f init h) t
let summate l = fold_left ~f:(+) ~init:0 l
let concatenate l = fold_left ~f:(^) ~init:"" l
- Observe that in the recursive case when the
fold_left
completes at the base case there is no more work to do - Thus,
fold_left
is tail-recursive- The
init
of the base case is in fact the final result.
- The
- So, if the choice doesn’t matter, use
fold_left
overfold_right
Core
hasfold
as an abbreviation offold_left
to bias you to use it over right fold.- If you are doing millions of iterations, you may have to refactor your code to use tail calls
- Or you could run out of memory
- Yes, this is a bit of a wart on the FP approach, sometimes you need to be aware of non-tail calls
Imperative vs functional data structures
List
vs Array
- Adding an element to the front (extending list) is constant time for list, O(n) for array
- array needs to be copied to get more memory
- different lists can share tail due to referential transparency
- Update of one element in an array is O(1); updating one element of a list is worst-case O(n) - re-build whole list
- Random access of nth element: O(n) list, O(1) array.
- If you want fast random access to a “list” that is not growing / shrinking / changing, can in fact use an
array
.
Map
vs Hashtbl
Map
is implemented like thedict
of the homework, but the tree is rebalanced always- O(log n) worst case time for
Map
to look up, add, or change an entry - “O(1) amortized” for
Hashtbl
- will only matter for really big data sets.
Set
vs Hash_set
-
See previous,
Set
is likeMap
andHash_set
is likeHashtbl
-
Here is a summary of OCaml data structure complexity (for the standard OCaml library but same results as
Core
version) -
Is functional ever any better? Yes!
- If you have many closely related maps, e.g. repeatedly forking off a map into two sub-versions
- Due to referential transparency the cost of copying is zero!!
- Multiple Lists similarly can share common portions
- See Real World OCaml benchmarks (scroll down) for example benchmarks of this
Examples of Idiomatic FP
-
Here are example codebases we will spend some time inspecting and critiqueing.
- Minesweeper at Exercism.io
- This functional implementation shows several nice OCaml patterns. Here is the version we reviewed in class which has several variations on the original implementation.
- Will look at an imperative approach with too much imperative-think in it but with some good properties as well
- We also made a variation on the functional version to be less inefficient and arguably cleaner
- ocaml-cuid is a utility to generate highly random string IDs for webpages etc.
- Lots of nice piping here plus use of functors to build Unix and JavaScript variations
- dolog is a very simple logging utility
- Shows some nice use of state,
include
, and aMake
functor.
- Shows some nice use of state,
- Minesweeper at Exercism.io