This article was originally published at 47deg.com on January 28, 2021.
This is the fourth post in the Lambda Calculus Through JavaScript series. If you’re just joining us, make sure to go back and start with Lambda calculus through JavaScript, part 1.
We have definitely passed the apprentice phase of lambda calculus: we know the rules governing the language, how to encode Booleans and numbers, and have a sense of how fixed points allow us to encode recursion. But nobody writes a program using only Booleans and numbers (unless you’re a Turing machine or a low-level C programmer); being able to define new data types is a must. As usual, we’ll discover that lambda calculus gives us the ingredients to introduce this concept without extending the language, just by translation.
Folding over lists and trees
No introduction to functional programming is complete without a discussion of reduces or fold over lists or arrays. In JavaScript, this operation takes the following form: you give it a way to combine the accumulator with the current value being processes, and (optionally) a value to start the operation.
arr.reduce((acc, current) => do something, initialValue)
This operation is a building block for the rest of functionality in this style. Using reduce
, you can define map
, filter
, concat
, . . . in general, any function you may wish . . . over arrays.
There’s a tight link between the reduce
function and the usual definition of lists in most functional programming languages. Using ReScript syntax, lists are defined as a choice, at every step, between ending the list (using Nil
) or adding a new element at the head of another list (using Cons
). The 'a
is ReScript’s (and OCaml’s) syntax for a generic argument.
type rec list<'a> =
| Cons({ hd: a, tl: list<'a> })
| Nil
Here is the type of the corresponding reduce
function – well, I have altered the order of some arguments to match the JavaScript version:
let reduce: (list<'a>, ('b, 'a) => 'b, 'b) => 'b
The link works as follows: to get the reduce
function, we’ve taken the types of the arguments to each constructor – a
and list<'a>
in the first case, none in the second –; replaced any recursive components by 'b
– only in the Cons
case in this case –; and finally ask for a function over each argument returning 'b
– in the case of Nil
, the absence of arguments means this function is simply a value.
You can play this game with any data type you define in this form (usually called algebraic data types). For example, here is a type for binary trees:
type rec tree<'a> =
| Node({ value: a, left: tree<'a>, right: tree<'a> })
| Lead({ value: a })
Performing the same algorithm gives you the following type for a fold over trees. If you feel brave, try defining the function yourself 😉
let reduceTree(tree<'a>, ('a, 'b, 'b) => 'b, ('a) => 'b) => 'b
Folding over Booleans
What if we applied this procedure to types we’ve already discussed, like Booleans? First of all, we need the definition of the type as an algebraic data type. Fortunately, most functional languages do this in a similar fashion:
type boolean =
| True
| False
Since none of the constructor has arguments, the corresponding type of the fold is as follows.
let reduceBoolean(boolean, 'b, 'b) => 'b
If the type alone does not ring a bell, let me give you an example of how you would use this function to define the negation of a Boolean value.
let not = (b) => reduceBoolean(b, False, True)
This is exactly the same way in which we used if_
back in part 2!
Swapping data and folds
Time for the magic trick 🪄 It turns out that, in the same way we ditched actual Boolean values and replaced them with functions, we can do the same thing with any data type. The core idea is that we are representing data by its fold. The procedure outlined above to obtain the type of the fold is still relevant here to tell us the shape of the function.
More concretely, let’s figure out how we represent lists in this fashion. Remember, we said that lists are represented by functions with the type. Because I don’t want to enter into a discussion about how to actually type that function – because we need something called higher-rank types due to the 'b
appearing on the right but not on the left – I’ll simply leave the equals sign in an intuitive level.
type list<'a> "=" (('b, 'a) => 'b, 'b) => 'b
As discussed above, each list is constructed out of Nil
and Cons
. In other words, if we describe how to encode these two constructors as folds, we can represent any list we want. If we apply reduce
to Nil
, the result is the second argument given to it, the initialValue
. That should then be the behavior of Nil
represented as a fold:
const nil_ = (callback) => (initialValue) => initialValue
Once again, this is now our definition of empty lists within the lambda calculus.
The Cons
constructor takes two arguments, and then returns a list/fold. This means that the skeleton of the function should be:
const cons_ = hd => tl =>
callback => initialValue => ???
The result of reducing a list with more than one element is the same as applying the callback
to the current value – that is simply hd
– and the recursive application of reduce to the rest. Putting on our “a list is a fold” hat, this last part is simply the application of tl
to the arguments.
const cons_ = hd => tl =>
callback => initialValue =>
callback(hd, tl(callback, initialValue))
The final part of the trick is that now there’s no need to define reduce
, since the lists themselves are the fold. If we were to write reduce
, we would get to a very similar situation to defining if_
: it’s just the application of the first argument to the rest!
const reduce = list => callback => initialValue => list(callback, initialValue)
This is an old trick
This representation is known as Church encoding, since it spawned as a generalization of how Alonzo Church represented Booleans in his work. This is not the only way to represent data types, though, there are other possibilities:
- The Scott encoding replaces data types not by their fold functions, but by their pattern matching function. In the case of list, that means that the recursive application of the callback is not represented; you simply get back values representing the head and tail.
- The Boehm-Berarducci encoding introduced Church encoding into the typed lambda calculus. As outlined above, giving proper types to Church encodings requires some advanced techniques, which we could work around because we work in an untyped setting.
Note that, by extension, the three techniques are usually referred to simply as Church encodings.
Representing pairs
As a final exercise for the reader, what is the Church encoding of a pair? As a reminder, we can define the types of pairs as follows. Note that this is not completely idiomatic ReScript (why would you introduce a constructor for only one choice?), but ties the whole idea together.
type pair<'x, 'y> =
| Pair({ left: 'x, right: 'y })
Solution: we have one constructor, so the shape of the corresponding reduce function takes only one functional argument. There’s no recursive appearance of tuple
, so we don’t have to make any substitution. The result is the following type:
let reducePair(tuple<'x, 'y>, ('x, 'y) => 'b) => 'b
Finally, we represent the constructor. This has two arguments, the two components of the pair. And there’s only one thing we can do: apply the corresponding callback function.
const pair_ = l => r => callback => callback(l, r)
If you think you don’t have pairs, but you have higher-order functions, then you actually have pairs!
Enough for today
We finally have a language we can call a proper programming language. Conditionals, recursion, and data types, in combination with the primitive functions, are the basic tools in functional programming.