At Azavea, full-time software developers have the opportunity to put 10% of their time towards learning and research. I am using my 10% time to explore a pure, statically-typed functional programming language called Idris by working my way through the excellent Type-Driven Development with Idris written by Edwin Brady, the creator of the language.
The thesis of the book is that instead of trying to use tests to check for the presence of errors, we can use types to demonstrate the absence of errors in our code. Not only that, we can make types work for us by leveraging the properties of Idris and its powerful type system to help us write code. In type-driven development, we start with the types and the compiler guides us along as we implement our program.
Let’s take a look at how.
Here is a simple function signature in Idris:
add : Int -> Int -> Int
Function signatures describe a function’s inputs and outputs. In Idris, you can think of the function signature as a contract that describes exactly what the function must do. In the above example, we are defining a function named
add which takes two
Ints as input and returns an
Int as output (
Int is a built-in type representing integers):
add : Int -> Int -> Int ^^^ ^^^ ^^^ input input output
We’ll always start with a function signature and work towards the implementation from there. In this case, the implementation would look like this:
add x y = x + y
add takes two arguments
y and we add them together using the
+ operator. The body of our function is an expression which evaluates to a value that is implicitly returned by the function. As such, no explicit returns are needed. We end up with this:
add : Int -> Int -> Int add x y = x + y
We can fire up an Idris REPL within the context of the Idris file we’re working in (
Add.idr) by running
idris Add.idr and then call our function like so (notice the lack of parentheses):
*Add> add 1 2 3 : Int
We can see from the output (
3 : Int) that we got back a
3 that is of type
This function, like all Idris functions, is pure. That means that the same input to the function must produce the same output and the function can have no side effects (nothing besides returning a value). In short, pure functions are dependable and predictable. They can always be relied on to do exactly what they say and nothing more; they won’t fetch data from a server, mutate some global state, launch a rocket into the sun, or any other shenanigans.
As we will see later, this property of purity will be leveraged extensively by the compiler to not only make sure our programs are correct, but also to help us write them!
You can think of types as a description of how data should be used. When practicing type-driven development in Idris, our function definitions flow from the function signature and the types contained within them. The function signatures are more than just documentation enforced by the compiler — they’re a starting point for our implementation.
Let’s start with a very simple example:
invert : Bool -> Bool
We want to write a function that will take a
Bool and return a
Bool (note that types in Idris must start with a capital letter). The
Bool type is defined using the
data keyword like so:
data Bool = False | True
Bool is a union type that can be either
False. Thus, our
invert function should switch between these two possibilities (
Now, to help us code in Idris, there are plugins available for common text editors such as Vim, Emacs, and Atom which provide a set of standard commands to facilitate type-driven development. To start, we can use the definition command (
<LocalLeader>d in Vim) to build out an initial function definition:
invert : Bool -> Bool invert x = ?invert_rhs
We see that Idris has started implementing this function for us and already has added an
x argument for our function equation. Cool!
But what about the
?invert_rhs? That is called a type hole. A type hole always starts with a question mark and represents a yet-to-be-implemented part of your program. If we check the type of our type hole (
<LocalLeader>t in Vim) using the plugin we see this output:
x : Bool -------------------------------------- invert_rhs : Bool
That’s telling us that
x is a
invert_rhs (the expression our function body evaluates to) is too, as we would expect.
Next we can use the case-split command (
<LocalLeader>c in Vim) to continue implementing the function. By case-splitting on the argument, we get:
invert : Bool -> Bool invert False = ?invert_rhs_1 invert True = ?invert_rhs_2
Idris has helpfully leveraged what it knows about
Bool and filled in the possible values for the implementation of our function. It has created a new function equation for each case to pattern match on. Now it’s up to us to fill in those type holes to complete our function:
invert : Bool -> Bool invert False = True invert True = False
Here’s how the function can be implemented start-to-finish using interactive editing:
We can run the function like so:
*Invert> invert True False : Bool *Invert> invert False True : Bool
If you’re not used to static typing, you may at this point be wondering what might happen if you gave
invert the wrong type, something that’s not a
Bool. For example, let’s try to apply
String instead of a
invert : Bool -> Bool invert False = True invert True = False result : Bool result = invert "banana"
We get this compilation error:
| 6 | result = invert "banana" | ~~~~~~~~~~~~~ When checking right hand side of result with expected type Bool When checking an application of function Main.invert: Type mismatch between String (Type of "banana") and Bool (Expected type)
This is not valid Idris code; it does not compile and we can’t run it since it did not pass type checking. This is actually a good thing since the compiler won’t let us write incorrect code and thus catches many errors during development before they can become bugs.
One of the great things about static typing is that you’ll never have to write a test for what happens if your function gets or returns the wrong type because it simply can’t happen — it would be impossible to write a test case like that because it wouldn’t compile! The more powerful the type system, the more precisely we can define our types and the more errors we can prevent at runtime. And as we will continue to see, Idris’s type system is quite powerful!
Defining Custom Types
We can of course define our own types based on our needs. In this example we are going to encode a simple AST in
Expr for evaluating mathematical expressions which supports a few simple operations: addition, subtraction, and multiplication.
data Expr = Val Int | Multiply Expr Expr | Add Expr Expr | Subtract Expr Expr
This type is recursive, meaning it is defined in terms of itself (as such, expressions can be nested). We have types for our three binary operations we have represented (addition, subtraction, and multiplication) with the constructors
Multiply, respectively, as well as
Val to hold our integer literals.
Expr more closely in the REPL using
:doc which will show the documentation for any function or type:
*Expr> :doc Expr Data type Main.Expr : Type Constructors: Val : Int -> Expr Multiply : Expr -> Expr -> Expr Add : Expr -> Expr -> Expr Subtract : Expr -> Expr -> Expr
We can see that
Expr is in fact a
Subtract are constructors for that type, meaning that they are simply functions that return an
Expr. Notice that we didn’t have to actually write any documentation; Idris knows this about
Expr by virtue of the type definition.
Let’s look at some examples to see how we could use
Expr. We can represent the value
4 by itself like so:
*Expr> Val 4 Val 4 : Expr
4 * 2 like so:
*Expr> Multiply (Val 4) (Val 2) Multiply (Val 4) (Val 2) : Expr
4 * (2 - 1) like so:
*Expr> Multiply (Val 4) (Subtract (Val 2) (Val 1)) Multiply (Val 4) (Subtract (Val 2) (Val 1)) : Expr
Notice the type in all of these examples is
Now let’s write a function to evaluate our
Expr with the following signature:
evaluate : Expr -> Int
Using our handy plugin, let the compiler be our guide! Running the definition command gives us:
evaluate : Expr -> Int evaluate x = ?evaluate_rhs
And case-splitting on our
x argument gives us:
evaluate : Expr -> Int evaluate (Val x) = ?evaluate_rhs_1 evaluate (Multiply x y) = ?evaluate_rhs_2 evaluate (Add x y) = ?evaluate_rhs_3 evaluate (Subtract x y) = ?evaluate_rhs_4
Once we finish fleshing out our implementation by filling in the type holes, we get this:
evaluate : Expr -> Int evaluate (Val x) = x evaluate (Multiply x y) = evaluate x * evaluate y evaluate (Add x y) = evaluate x + evaluate y evaluate (Subtract x y) = evaluate x - evaluate y
We can now use our
evaluate function to evaluate the examples shown earlier:
*Expr> evaluate (Val 4) 4 : Int *Expr> evaluate (Multiply (Val 4) (Val 2)) 8 : Int *Expr> evaluate (Multiply (Val 4) (Subtract (Val 2) (Val 1))) 4 : Int
This shows the process of implementing the function using interactive editing:
Modifying Our Code
Let’s say we want to add a new operation to our program: division. This will require a change to our
data Expr = Val Int | Multiply Expr Expr | Add Expr Expr | Subtract Expr Expr | Divide Expr Expr
Immediately the compiler lets us know of a potential problem:
| 8 | evaluate (Val x) = x | ~~~~~~~~~~~~~~~~~~~~ Main.evaluate is not total as there are missing cases
The error message indicates that the compiler is looking for our
evaluate function to account for the possible input of
Divide x y. This is because Idris enforces totality, an even stronger property than purity. A total function is guaranteed to produce an output for any well-typed input in finite time, and a function is assumed to be total unless it is explicitly annotated as a partial function using
partial. This means that we are forced to account for all possible inputs, which we are not doing in this case.
To make our function total, we can modify
evaluate like so:
evaluate : Expr -> Int evaluate (Val x) = x evaluate (Multiply x y) = evaluate x * evaluate y evaluate (Add x y) = evaluate x + evaluate y evaluate (Subtract x y) = evaluate x - evaluate y evaluate (Divide x y) = evaluate x / evaluate y
However, once we do so we see another compiler error:
| 12 | evaluate (Divide x y) = evaluate x / evaluate y | ~~~~~~~~~~~~~~~~~~~~~~~ When checking right hand side of evaluate with expected type Int Can't find implementation for Fractional Int
Division is not a valid operation on
Int since we cannot have fractional integers. If we replace our
Double our program will compile and we can use the division operator to calculate
4 / (2 - 1):
*Expr> evaluate (Divide (Val 4) (Subtract (Val 2) (Val 1))) 4.0 : Double
If we check the documentation for our function in the REPL using
:doc, we see that it is total:
*Expr> :doc evaluate Main.evaluate : Expr -> Double The function is Total
In making this change, we see that the strong guarantees Idris affords us around type safety and totality help us avoid errors at compile time.
Sometimes we will want to make our types be able to take a parameter which could be of any type. For instance, the generic type
List takes a type parameter
elem (note that types are always UpperCamelCase whereas type parameters are always lowercase).
Observe that we can have lists of different types:
Idris> [1, 2, 3] [1, 2, 3] : List Integer Idris> ["a", "b", "c"] ["a", "b", "c"] : List String
:: operator (pronounced “cons”) is used for adding an element to the front of a list:
Idris> 0 :: [1, 2, 3] [0, 1, 2, 3] : List Integer
The above examples use syntactic sugar that makes working with
Lists easier. We can use the more verbose syntax to construct a list as well, starting with the empty list primitive
Nil and using
:: to build up our list backwards by glomming head elements onto it:
Idris> 0 :: (1 :: (2 :: (3 :: Nil))) [0, 1, 2, 3] : List Integer
When using syntactic sugar for lists, Idris will desugar the comma-separated elements inside of square brackets to these primitive forms.
To better understand how this works, let’s examine the
List type definition:
data List elem = Nil | (::) elem (List elem)
We can see that
List is a union type of
Nil (empty list) and potentially a chain of other
Lists constructed via cons-ing together heads and other lists (
:: is a constructor that takes a head
elem and a tail
List elem). Notice that all of the
Lists take the same
elem parameter which will necessarily be the same type.
Let’s get some practice working with
Lists by implementing the
takeList function (we’ll call it
takeList to avoid a name conflict with the built-in
take function). This function will retrieve the first
n elements from a list, returning the whole list if too many elements are requested:
Idris> takeList 1 [1, 2, 3]  : List Integer Idris> takeList 2 [1, 2, 3] [1, 2] : List Integer Idris> takeList 3 [1, 2, 3] [1, 2, 3] : List Integer Idris> takeList 4 [1, 2, 3] [1, 2, 3] : List Integer
The function signature will look like this:
takeList : (n : Nat) -> (list : List a) -> List a
Notice that this function does not take an
Int but instead takes a
Nat, a natural number. This type is commonly used for working with collections in Idris since
Nat must be greater than or equal to zero and collections cannot have a negative length. This is yet another example of precisely defining our types to remove potential error cases that one would otherwise need to account for.
Nat is a very simple recursive type:
data Nat = Z | S Nat
We can build up our
Nat recursively starting from
Z (Zero) by using
S (Successor) to add to it:
Idris> Z 0 : Nat Idris> S Z 1 : Nat Idris> S (S Z) 2 : Nat
With that new knowledge of
Nat in hand, let’s get back to implementing our
takeList function, starting with the function signature:
takeList : (n : Nat) -> (list : List a) -> List a
You’ll notice that our function arguments look a little different than what we’ve seen up until this point. This function signature provides names for its arguments. Our first argument of type
Nat is called
n and our second argument of type
List a is called
list. As we will soon see, these names are used for better variable naming when code is generated via interactive editing.
Again using our plugin, let’s first build out an initial definition:
takeList : (n : Nat) -> (list : List a) -> List a takeList n list = ?takeList_rhs
Next, we case-split on
takeList : (n : Nat) -> (list : List a) -> List a takeList Z list = ?takeList_rhs_1 takeList (S k) list = ?takeList_rhs_2
We know that if
Z (zero), then we’re requesting zero elements and we can return an empty list, so we can fill in
 like so:
takeList : (n : Nat) -> (list : List a) -> List a takeList Z list =  takeList (S k) list = ?takeList_rhs_2
The core of our function logic will depend on what we do with
list. Let’s case-split on the second
list to see what values it might be:
takeList : (n : Nat) -> (list : List a) -> List a takeList Z list =  takeList (S k)  = ?takeList_rhs_1 takeList (S k) (x :: xs) = ?takeList_rhs_3
We also know that if the list we’re working with is empty, we want to return an empty list, no matter the value of
k, which means that
?takeList_rhs_1 can also be an empty list, giving us:
takeList : (n : Nat) -> (list : List a) -> List a takeList Z list =  takeList (S k)  =  takeList (S k) (x :: xs) = ?takeList_rhs_3
The type hole
?takeList_rhs_3 is all that remains to be implemented. In the function equation on the last line of our implementation, we’re using the cons operator to pattern match on the head and tail (
xs) of our list so we can work with them. By matching using
::, we not only access the head and tail but also assert that there is a head and tail to match on.
To build out the result that will be returned, we want to grab the head and then continue taking the remaining
k elements from the rest of our list:
takeList : (n : Nat) -> (list : List a) -> List a takeList Z list =  takeList (S k)  =  takeList (S k) (x :: xs) = x :: takeList k xs
Here is how this can be implemented with interactive editing:
While this function works great on lists, in Idris we can do one better. By using a more precise collection type, we can write a
takeList function where you cannot request more elements than exist in the collection and have this be enforced at compile time. Seem impossible? That’s the power of dependent types!
Idris makes it possible to define types that are computed from some other value. These are called dependent types.
Vect is an example of a dependent type: its length is part of its type and can therefore be checked at compile time. To put it another way, the type of
Vect depends on its length. Whereas generics have a type specified as a type parameter, dependent types depend on a value being specified to determine their type.
Here is an example of a
import Data.Vect fourInts : Vect 4 Int fourInts = [1, 2, 3, 4]
Vect has a length of 4 and is made up of
Ints. If we look at the type of
Vect in a REPL using
:t we see:
> :t Vect Vect : Nat -> Type -> Type
Vect takes a
Nat which defines its length explicitly and a
Type for the type of element contained within it and returns a new
Type. This is a distinctive property of Idris: types are first-class. That means we can pass them into and return them from functions. This property is important for making dependent types possible.
In the case of
Vect, crucially, a
Vect 4 Int is a different type than
Vect 5 Int.
Let’s get some practice working with dependent types by writing a
takeVect function that operates on
Vects. As usual, we start with the function signature:
takeVect : (n : Nat) -> Vect (n + m) elem -> Vect n elem
This function signature looks different than others we’ve seen so far, specifically the second argument:
Vect (n + m) elem. This is saying that the
Nat used to construct the
Vect is greater than or equal to
m can be greater than or equal to zero).
To start implementing our function, first we use the definition command:
takeVect : (n : Nat) -> Vect (n + m) elem -> Vect n elem takeVect n xs = ?takeVect_rhs
Followed by case-splitting on
takeVect : (n : Nat) -> Vect (n + m) elem -> Vect n elem takeVect Z xs = ?takeVect_rhs_1 takeVect (S k) xs = ?takeVect_rhs_2
Next we can use the search command (
<LocalLeader>o in Vim) to search for an expression that satisfies the type hole:
takeVect : (n : Nat) -> Vect (n + m) elem -> Vect n elem takeVect Z xs =  takeVect (S k) xs = ?takeVect_rhs_2
In this case, Idris was able to fill in the expression for us since when
n is zero, we necessarily must have an empty
Vect. The last remaining step is to implement
?take_rhs_2 similarly to how it was done for
takeVect : (n : Nat) -> Vect (n + m) elem -> Vect n elem takeVect Z xs =  takeVect (S k) (x :: xs) = x :: takeVect k xs
This shows the process of implementing our function:
So what does this get us? Well, if we try to take more elements than are available, we get a compile time error:
*Take> takeVect 5 [1, 2, 3, 4] (input):1:13:When checking argument xs to constructor Data.Vect.::: Type mismatch between Vect 0 elem1 (Type of ) and Vect (S m) elem (Expected type) Specifically: Type mismatch between 0 and S m
As we can see, by crafting our function signatures correctly, we can leverage the power of dependent types to make invalid operations impossible and have that be enforced at compile time.
As we have seen, instead of using types to check for correctness after the fact, they can be the starting point for your programs. In Idris, your functions flow from your types with the compiler as your guide. By precisely defining our types, once the code has compiled successfully, we can be reasonably confident that it is working as intended. And any future changes can be made knowing that the compiler has you covered.
Though Idris is primarily a research language and is not yet ready for production, the ideas behind it are compelling. Dependent types have many potentially interesting applications to aid in writing correct programs beyond the examples already shown, such as money being dependent on time. As an example in a geospatial context, we could imagine using dependent types to ensure map algebra operations are only performed on rasters with the same CRS and resolution.
Type-driven development is a powerful way to write correct programs, and what’s shown here is only a taste of what Idris has to offer. To learn more, check out the Idris docs or pick up a copy of Type-Driven Development with Idris.