Navigate / search

Minimize mental overhead for code readers

This week at our FunctionalKats meetup in Dublin, tackled the Luhn checksum algorithm.

This is a simple programming task, the challenge was to try and make the code readable.

I rattled off some code that worked, but I wasn’t at all happy with it. It implemented the problem as described. Partitioning the numbers into two groups and dealing with each group in turn.

let Luhn s =
    let double x = x*2

    let sumDigits = function
        | 18 -> 9
        | n when n > 9 -> n%9
        | n -> n

    let odds, evens =
        |> (string >> int)
        |> Array.ofSeq
        |> Array.rev
        |> Array.partitioni (fun i -> i%2 = 0)

    let oddSum = Array.sum odds
    let evenSum =
        |> double
        |> sumDigits
        |> Array.sum
    (oddSum + evenSum) % 10 = 0

The Array.partitioni function doesn’t exist, I had to write that too.

module Array =
    let partitioni f a =
        let first, second =
            |> Array.mapi (fun i x -> f i, x)
            |> Array.partition (fun (i, _) -> i)
        ( snd first, snd second)

I didn’t like the idea of partitioning the numbers. That may match the problem description but I think we can do better. I like functions that take in something and transforms it, perhaps through many transformations, but without splitting or branching the logic.

If a function splits the data and does separate transformations, there’s mental overhead involved. Every variable in a function is an extra concept for the reader to understand.

My second attempt tried to rectify this

let Luhn s =
    let double x = x*2

    let sumDigits = function
        | 18 -> 9
        | n when n > 9 -> n%9
        | n -> n

    let mapOdd = id
    let mapEven = (double >> sumDigits)
    let mapNumber i = if i%2=0 then mapOdd else mapEven

    let total =
        |> (string >> int)
        |> Array.ofSeq
        |> Array.rev
        |> Array.mapi mapNumber
        |> Array.sum

    total % 10 = 0

Instead of partitioning the data we map the entire array. The function used in the mapping is different for odd and even positions. The mapNumber function handles the toggling of the two functions.

This is better, but it’s still not right. Those mapOdd and mapEven functions, that IF expression selecting between them. It still feels like two trains of thought. We can do better.

let sumDigits n = Math.DivRem(n, 10) ||> (+)
let isMultipleOf x y = y % x = 0

let Luhn: string->bool = (string >> int) 
    >> Seq.rev
    >> Seq.map2 (*) (Seq.cycle (seq[1;2])) 
    >> sumDigits
    >> Seq.sum
    >> isMultipleOf 10

Now we’re getting much closer to the ideal. The function accepts a string, produces a bool. It gets from string to bool through series of simple transformations. No IF expressions, no splitting of the numbers into groups, no toggling of functions.

The trick lies in generating a cycle of 1’s and 2’s. When you multiply a sequence by 1,2,1,2,1,2…. You double every second item in the sequence. With that done, the rest of the function falls into place.

There are a few other subtle improvements. The sumDigits function is simpler thanks to the use of Map.DivRem and ||>.  The Luhn function itself is a composition of transformations. I also created the Seq.cycle and Seq.rev.

module Seq =
    let rec cycle s = seq { yield! s; yield! cycle s}

    let rev s = s |> Array.ofSeq |> Array.rev |> Seq.ofArray

I’m happy with the function as it stands. The ideas I wanted to get across with this post are

  • Don’t let the description of a problem constrain your thinking. It’s often possible to find solutions that are simpler than the problem would imply.
  • Try to reduce the moving parts, the “concepts” in your functions. Aim for functions that read straight through. A reader should have to push as little as possible onto their mental stack.
  • Try to avoid splitting your data and working on different subsections of it. A function should only accept the data it needs to work on. Keeping that data together as you transform it isn’t always possible. If you can achieve it, your code will usually be simpler.

No More Ad-Hoc Dev Environments

This entry is part 1 of 1 in the series Windows DevBox

I’ve noticed over the past year that I waste a lot of time on environmental issues. With my machine, with configuration, with things that screw me up and then take forever to reverse or work around.

I should be getting to the root causes of those issues, and I am. But yet, despite our best efforts there will always be new issues with never before seen causes.

Our development environment is the foundation for everything we do. It is insane to focus on automated tests, builds, and deployment, while setting up our dev machines in an ad-hoc fashion.

I’m a strong believer that a new hire should be able to get from a blank hard drive to a working machine with minimal fuss.

Why put the new hire through this? Why not have machines ready to go?

A few reasons.

  1. If a new hire can set everything up then your procedure is solid. I don’t want this job done by someone with a head full of undocumented (or even documented) knowledge.
  2. Why waste the time of an existing employee who could be doing something more useful. Leave the setup of machines to the people who will use them, that way they are productive from the moment they sit down.
  3. Machines are out of date as soon as you configure them. A stash of machines waiting for new hires is just an inventory of machines that will need updating by the new hire anyway.The setup procedure should create an up to date machine. Think Lean, think JIT. A clean up to date machine, pulled into existence any time, by anyone. No waiting for the “New Machine Expert” to get around to it. No having to figure out what’s out of date when you finally get it.
  4. Most new hires just want to get some code, build it and run it. They should be able to do that by the end of day one. In fact they should be there by lunch time. Nobody wants to spend their entire first day or first few days following written instructions to set up their machine.
  5. When you solve this problem for new hires, you solve it for everyone. Our dev boxes should be disposable, interchangeable. Knowing that you can get up and running, or back to a known state quickly is liberating. It frees us all from the fear of screwing something up.

These are not new ideas. Vagrant, Docker and others are solving this exact problem. The installation of software is increasingly moving to package managers. Things tend to be a little easier of you are running Linux than say…Windows.

Oh, did I mentioned that all this has to work on Windows?

My dream is to plug a USB stick into a new machine, turn it on, and leave it to create a working development machine. Over the next few posts I’ll document some of the things I’m learning along the way to that ideal.

Mastermind, The Code Breaking Game

Download Code

Mastermind, is a code breaking game for two players.

A “Code Maker” creates a secret sequence of colour pegs. A “Code Breaker” must break the code by taking guesses and working with the feedback from the Code Maker. Feedback is given using Black and White Pegs.

  • A correct colour in the correct position is acknowledged with a Black Peg
  • A correct colour in the wrong position is acknowledged with a White Peg
  • The position of these pegs is not significant. So, a black peg indicates that one of the guessed pegs is correct, but not which one.

See here for a detailed description of the game.

For a programmer both the Code Maker and Code Breaker roles are interesting challenges.

The Code Maker role most compare a guess and a secret code and generate the correct feedback.

The Code Breaker role must issue guesses and figure out the secret code based on the feedback.

Before we look at either role, lets create some types that we can work with.

type Colour = Red|Orange|Yellow|Green|Blue
type Answer = {Black: int; White: int}

The Colours are a discriminated union of the 5 colours that can be used in the code. We also have an Answer that contains a certain number of Black and White pegs.

Each code consists of 4 pegs and colours can be duplicated. We don’t enforce that with types, in fact it’s quite easy to make the code work for arbitrary code lengths. There is one function in the current code that locks us into a code length of 4. I’ll discuss that towards the end of this post.

The Code Maker
Let’s start with a function that can compare a secret code with a guess and return the black/white pegs answer. I’ve put a worked example in the comments so you can follow what’s happening.

// E.g.
// code     = Red, Orange, Yellow, Blue
// guess    = Yellow, Yellow, Green, Blue
// expected = {Black = 1 (Blue); White = 1 (Yellow)}
let check code guess =
    let IsMatch t = fst t = snd t

    // right = [(Blue, Blue)]
    // wrong = [(Red, Yellow); (Orange, Yellow); (Yellow, Green)]
    let right, wrong = code guess
        |> List.partition IsMatch

    // Number of Black Pegs
    // 1 (Blue, Blue)
    let rightColourRightPosition =
        List.length right

    // Number of White Pegs
    // wrongCode  = [Red; Orange; Yellow]
    // wrongGuess = [Yellow; Yellow; Green]
    let wrongCode, wrongGuess = List.unzip wrong 

    // E.g. when colour = Yellow, result = 2
    let howManyOfThisColourOutOfPlace colour =
        |> List.filter(fun c -> c = colour)
        |> List.length

    // Number of White Pegs
    // 1 (Yellow) Although Yellow is guessed twice, there is only one Yellow in the code, so result is 1
    let rightColourWrongPosition =
        wrongCode                                                                          // [Red; Orange; Yellow]
        |> Seq.countBy(id)                                                                 // seq [(Red, 1); (Orange, 1); (Yellow, 1)]
        |> (fun group -> (snd group, howManyOfThisColourOutOfPlace (fst group)))   // seq [(1, 1); (1, 0); (1, 2)] (fst is occurences in code, snd is occurences in guess)
        |> Seq.sumBy Math.Min                                                              // For each colour, sum the lesser of occurences in code and in guess

    {Black = rightColourRightPosition; White = rightColourWrongPosition}

The Code Maker has visibility of the Secret Code (because it create it) and the guess (because the Code Breaker asks to have a guess checked). This will be more interesting when we look at the signature for the solve function.

Let’s not get ahead of ourselves. How to we check a guess against a secret code and calculate the number of Black and White pegs?

We zip the secret code and the guess (which are both lists of colours) this gives us a list of tuples, where each tuple represents the colour of the two pegs at the same position in the code and the guess.

We then partition that list into tuples where the fst and snd match (same colour, same position) and where fst and snd don’t match.

The number of Black pegs is simply the length of the list of tuples that matched.

The number of white pegs is calculated by taking the colours from the list of non-matching tuples and figuring out how many of the colours in the code are also in the guess. Since we partitioned the list, none of the pegs that were in the right place will get in the way of this calculation.

Follow along with the comments in the code and it should be clear.

The Code Breaker
On the face of it writing the code for the Code Breaker seems like it would be harder than for the Code Maker. It has to come up with guesses and then correlate the feedback from all the guesses to crack the code. We also can’t pass our secret code to the Code Breaker because that’s not information that it should have.

What should we pass to the solve function? It should be able to come up with guesses by itself. The only thing it needs is a way of asking the Code Maker to check a guess against the code. So, it needs a way of calling the check function. But it can’t pass the secret code to the check function, it can only pass the guess.

Enter Partial Application.

We can wrap a secret code up in a closure by partially applying the check function. This gives us a function that just accepts a guess and returns an answer. From the Code Breaker’s point of view that’s exactly how the Code Maker should work.

let secretCodeChecker = check [Red; Orange; Yellow; Blue]

Armed with that function the logic of the code breaker is pretty simple.

let solve checkFunction =
    let filterPossibilities possibilities guess =
        let answer = checkFunction guess
        |> List.filter (fun potential -> (check guess potential) = answer)

    let rec solve_iter possible =
        match possible with
        | head::[] -> head
        | head::_ -> solve_iter (filterPossibilities possible head)
        | _ -> raise CanNotBeSolved

    solve_iter possibleCodes

Start with all possibleCodes, take the first of them and guess it. Use the response from the Code Maker to filter out codes that are no longer possible. Repeat until there’s only one possibility left.

There’s a quick and easy way to generate all possible codes

let possibleCodes =
    seq {
            for i in ColourList do
                for j in ColourList do
                    for k in ColourList do
                        for l in ColourList do
                            yield [i;j;k;l]
    |> List.ofSeq

The problem with this is that it locks us into codes of length 4 even though nothing else in our solution has that restriction. A more general recursive function would be nice here, but let’s not waste time on that right now. Let’s get our Code Breaker working.

The most intersting part of breaking the code is filtering out potential codes based on feedback from a guess.

This ridiculously simple to do. We supply a guess to the Code Maker and get back an answer. We then take all of the remaining possible secret codes and we check our guess against each (using the non-partially applied check function) and we keep only the codes that result in the same answer that the Code Maker gave us.

To figure out the secret code that’s wrapped up in the secretCodeChecker function, we just pass it to solve

> solve secretCodeChecker;;

val it : Colour list = [Red; Orange; Yellow; Yellow]

And what about those nested loops that produce all possible codes. How can be turn that into a more general recursive function?

I’ll leave that for you as an exercise. If you look up the ‘Making Change’ example in Structure and Interpretation of Computer Programs here you’ll be in the right area, however instead of simply counting the number of possibilities, you need to actually capture and return them.