Navigate / search

Curried vs Uncurried Functions

Over a year ago I explained Memoization in this post. If you’ve read that, you should be comfortable with the following.

open System.Collections.Generic

let negate x =
    printfn "Negating..."
    -x

let cache f =
    let cache = Dictionary<_, _>()

    fun x ->
        if cache.ContainsKey(x) then
            cache.[x]
        else
            let result = f x
            cache.[x] <- result
            result

let fastNegate = cache negate

We’re pretending that ‘negate’ is an expensive function that we want to avoid calling unnecessarily, so we use the ‘cache’ function to wrap ‘negate’ with some caching behaviour. Negate takes one argument. What happens when we try to cache a function that takes more than one argument, like ‘add x y’?

On first glance it appears to work.

let add x y = 
    printfn "Adding two args"
    x + y

let fastAdd = cache add

val add : x:int -> y:int -> int
val fastAdd : (int -> int -> int)

But if you try to use it the results are, disappointing.

> fastAdd 1 5;;
Adding two args
val it : int = 6

> fastAdd 1 5;;
Adding two args
val it : int = 6

The underlying ‘add’ function is called every time. We can check to see if anything is added to our cache with a ‘printfn’.

let cache f =
    let cache = Dictionary<_, _>()

    fun x ->
        if cache.ContainsKey(x) then
            cache.[x]
        else
            let result = f x
            cache.[x] <- result
            printfn "Added Item to cache, Items now in Cache: %d" cache.Count
            result

> fastAdd 1 2;;
Added Item to cache, Items now in Cache: 1
Adding two args
val it : int = 3

> fastAdd 1 2;;
Adding two args
val it : int = 3

> fastAdd 2 1;;
Added Item to cache, Items now in Cache: 2
Adding two args
val it : int = 3

That’s odd. The first time we call the function, something is added to the cache. The second time, using the same values, nothing is added to the cache. When we try different arguments another item is added to the cache. This suggests the cache is being used.

Remember that functions in FSharp are curried, which means that a function which takes two arguments (like add) is translated (curried) into a series of functions which each take a single argument.

So, when we try to cache

add x y

what we actually cache is the partially applied function

add x

We can test this. As long as we keep the same value for x, the fastAdd function will use the cached version. As soon as we change x, we get a new partially applied function that is different to the one in the cache. We can see this in the example above. It is the change in x from 1 to 2 that triggered the addition of a new item to the cache.

So, we’re not actually caching our *expensive* add function at all. We’re just caching the partial application of it.

We could rewrite our ‘add’ function so that it takes one argument, a tuple.

let add (x, y) =
    x + y

let fastAdd = cache add

> fastAdd (1, 2);;
Added Item to cache, Items now in Cache: 1
val it : int = 3

> fastAdd (1, 2);;
val it : int = 3

> fastAdd (1, 3);;
Added Item to cache, Items now in Cache: 2
val it : int = 4

We’ve ‘uncurried’ the function. Now everything works. But it’s still no good. We’ve duplicated our add function, and remember this all relates to functions that are a lot more complicated than addition.

We could wrap the original curried function so the logic isn’t duplicated. We’re really just putting an adapter around it to change it’s signature.

let add x y = 
    printfn "Adding two args"
    x + y

let uncurriedAdd (x, y) =
    add x y 

let fastAdd = cache uncurriedAdd

val add : x:int -> y:int -> int
val uncurriedAdd : x:int * y:int -> int
val fastAdd : (int * int -> int)

> fastAdd (1, 3);;
Adding two args
Added Item to cache, Items now in Cache: 1
val it : int = 4

> fastAdd (1, 3);;
val it : int = 4

That’s still rubbish though. The signature of ‘fastAdd’ is different to ‘add’, so it isn’t an alternative. If we think in terms of higher order functions, ‘add’ and ‘fastAdd’ have different types. We can’t pass ‘fastAdd’ to a function that is expecting ‘add’ and expect things to just work.

We could write a ‘cache2’ function that follows the same pattern as ‘cache’ but works with functions of two arguments. That will work but it’s a shame not to use the working cache function that we already have.

Feel free to stop reading and figure out a solution. What follows is one way of doing it.

The solutions above weren’t good enough, but we were on the right track. Uncurrying the ‘add’ function to make it compatible with our cache is fine, although we shouldn’t have to write a specific function each time we need to uncurry a function. Through the magic of higher order functions we can write the following.

let uncurryTwoToOne f (x, y) = f x y

That allows us to write the following.

let uncurriedAdd = uncurryTwoToOne add

That saves us doing the uncurrying individually for each function. It still leaves us with a function that isn’t compatible with the original signature of Add. So, caching it is still no good. Remember the ‘cache’ function doesn’t change the signature, it takes a function that accepts one argument and it wraps it to add caching, but it maintains the same signature. That way, the cached version can be used anywhere the original is used.

It looks like we need to wrap the cached version of ‘add’ in yet another function to convert the signature back to curried form. Another higher order function to the rescue.

let curryOneToTwo f x y = f (x, y)

And now we finally have a working solution.

let fastAdd = add |> uncurryTwoToOne |> cache |> curryOneToTwo

> fastAdd 1 3;;
Adding two args
Added Item to cache, Items now in Cache: 1
val it : int = 4

> fastAdd 1 3;;
val it : int = 4

Note that in the ‘fastAdd’ function we are not piping the result of ‘add’ through the chain of functions, the actual ‘add’ function itself is the value. Each function in the chain is passed a function which it wraps before passing it on. Like building up the layers of an onion.

It seems like we should be able to make ‘fastAdd’ into a general purpose higher order function too.

let curried2 f = f |> uncurryTwoToOne |> cache |> curryOneToTwo

Putting it all together. We get the following

open System.Collections.Generic

let uncurryTwoToOne f (x, y) = f x y

let curryOneToTwo f x y = f (x, y)

let cache f =
    let cache = Dictionary<_, _>()

    fun x ->
        if cache.ContainsKey(x) then
            cache.[x]
        else
            let result = f x
            cache.[x] <- result
            printfn "Added Item to cache, Items now in Cache: %d" cache.Count |> ignore
            result

let cache2 f = f |> uncurryTwoToOne |> cache |> curryOneToTwo

let add x y = 
    printfn "Adding two args"
    x + y

let fastAdd = cache2 add

> fastAdd 1 3;;
Adding two args
Added Item to cache, Items now in Cache: 1
val it : int = 4

> fastAdd 1 3;;
val it : int = 4

> fastAdd 1 4;;
Adding two args
Added Item to cache, Items now in Cache: 2
val it : int = 5

> fastAdd 1 4;;
val it : int = 5

Playing for a Draw

It’s September, so time for a small diversion from the technical stuff to something far more important, Hurling.

Prior to 2012 the last All-Ireland final to end in a draw was the 1959 Kilkenny Waterford final. Since 2012 all three finals have had to be replayed. This many draws in a row is wildly unusual, but that in itself isn’t evidence of funny business. Anomalies happen, that’s why we call them anomalies.

Hurlers are operating at levels of strength, skill and fitness unlike anything we have ever seen. If we assume that the top teams are reaching roughly the same heights, aren’t close games more likely? and consequently shouldn’t we see more draws?

Or is there more to it?

This year alone, Kilkenny and Tipperary ended level after normal time in the National Hurling League Final, and Kilkenny were only a point ahead after extra time. Kilkenny and Galway tied the Leinster Semi-Final, Clare and Wexford couldn’t be separated in Normal or Extra time in the Qualifiers, and their replay also went to extra time. And of course Sunday’s Epic also ended all square.

For as long as I can remember there have been suspicions that the GAA hope for draws and the revenue boost that replays provide. It’s quite possible that’s true, but it’s a leap from that to suggest that the GAA actively influence referees to blow up level games, or allow close games to run on a little longer in the hopes of the trailing team equalizing.

I’m sure referees would be extremely unhappy and would absolutely refuse to be influenced in such a way, and I don’t believe it happens.

I do however think referees are doing exactly what I described above, albeit at their own volition rather than in response to pressure. I believe they are stopping tied games as quickly as possible, but allowing close games to run on. Giving the trailing team “A chance”.

Exhibit A – All-Ireland Final – Kilkenny vs Galway – 9 September 2012
3 minutes of injury time were added, with Kilkenny leading by a point.
At 71.50 a free was given to Galway and Jackie Tyrell was booked moments after a trip on Kilkenny’s Tommy Walsh was ignored. Galway tied the score from the resulting free and the referee ended the game a second or two short of the end of injury time.

Exhibit B – All-Ireland Final – Cork vs Clare – 8 September 2013
2 minutes of injury time were added. When the two minutes have elapsed Cork are up by a point. The Referee allows play to continue for 27 seconds. Clare equalize and play is then stopped.

Exhibit C – League Final – Kilkenny vs Tipperary – 4 May 2014
3 minutes of injury time added. Kilkenny a point up when the 3 minutes have elapsed. Play is allowed to continue for 38 seconds, including a missed free to Tipperary. Tipperary equalize from play, the referee doesn’t allow the puckout. Normal time ends level.

In Extra-Time 1 minute of injury time is added. Kilkenny lead by a point when the 1 minute elapses. Referee allows the puckout, Kilkenny win the ball and the game is ended. We of course can’t know if Tipperary would have been allowed a few seconds to equalise if they had won the ball, but the pattern from other games suggests they might.

Exhibit D – Leinster Semi-Final – Kilkenny vs Galway – 22 June 2014
2 minutes of injury time added. Kilkenny score with 2 seconds of injury time remaining to lead by a point.
Play is allowed to run on for 21 seconds, Commentator makes the comment “He has to give them a chance”. Galway equalize and the game ends.

Exhibit E – All-Ireland Final – Kilkenny vs Tipperary – 9 September 2014
1 minute of injury time added. Sides are level, a questionable free is awarded to Tipperary. The incident is similar to the free awarded in the 2012 final between Kilkenny and Galway, but this time the free is given to the defending player rather than the player with the ball. Had the free been given to Kilkenny it would almost certainly have been scored.

The entire 1 minute of injury time is used up taking the free and waiting for a decision from Hawkeye, play resumes with 71 minutes 23 seconds on the clock yet no additional time is played after the puckout. The referee ends the game immediately, with the sides level.

What’s going on?
I don’t believe there is a conspiracy to create draws for the GAA, but I do believe the referees are taking the easy way out by giving teams “a chance” to equalize, or blowing up if the sides are level. The pattern of this happening in the past makes it harder for a referees in the future.

Solution
There has apparently already been a decision to implement a public clock for hurling. That should happen urgently.

Tied championship games should always go to extra time. At the moment there’s a bizarre situation where a League game or a Qualifier game can go to extra time, but other championship games including the All-Ireland final is an immediate replay if the sides are level and the end of normal time.

Of course, hurling games need never go to a replay. Simply allowing tied teams to play on until someone scores would end most games within minutes. Perhaps requiring a team to lead by two points to clinch a game would be fairer, and should still end the game with very little extra time.

I don’t think the GAA try to engineer draws, but I suspect the likelihood of them ever eliminating them is fairly remote.

Aim for DRY, but be willing to fall short.

I’m not going to rehash the reasons why “Don’t repeat yourself” (DRY) is a good thing, I’ll assume you know why duplicating code is bad.

Strictly speaking DRY isn’t really about repeating code, it’s more about representing the same “Information” or “Knowledge” in different places. For that to happen the code doesn’t actually have to be the same.

Conversely, duplicated code doesn’t always represent duplicated knowledge or information, completely different concepts can sometimes be represented by virtually identical code.

Mathias Verraes recently wrote a really good blog post on this very issue, it’s well worth a look.

This post is about aiming for DRY, but deliberately coming up short.

I’ve been adding some NUnit tests to an existing system recently and I kind of liked the way they turned out, despite the fact that I duplicated some code.

The following is heavily modified code, I can’t show the original, but it gets the point across. Please don’t read into anything you see about credentials or security, this code is one step away from Widgets and Wotzits, it’s NOT REAL CODE.

    class Describe_Club_Membership
    {
        [Test]
        public void A_User_Can_Join_The_Club()
        {
            UsingCredentials("appName", "appKey");
            var userId = GetUserId("someUsersEmail@example.com");

            EnsureUserIsNotAMember(userId)

            JoinClub(userId);
            Assert.That(IsAMember(userId));
        }

        [Test]
        public void A_User_Can_Leave_The_Club()
        {
            UsingCredentials("appName", "appKey");
            var userId = GetUserId("someUsersEmail@example.com");

            EnsureUserIsAMember(userId)

            LeaveClub(userId);
            Assert.That(IsNotAMember(userId));
        }

The entire contents of the tests are written using functions that exist only for the purpose of making the tests readable. In a sense these few functions define a Domain Specific Language. Code within any function (but particularly within a test) should always be written at the same level of Abstraction, would should never have code like the following in a test.

    class Describe_Club_Membership
    {
        [Test]
        public void A_User_Can_Join_The_Club()
        {
            var credentialsFactory = new CredentialsFactory("Membership");
            var credentials = credentialsFactory.CreateDefaultCredentials("appName", "appKey");

            var registrationRepo = new RegistrationsRepository(credentials);
            var userId = registrationRepo.GetUserByEmail("someUsersEmail@example.com").FirstOrDefault().Id;

            var binding = new BasicHttpBinding();
            var membershipEndPoint = new EndpointAddress(ConfigurationManager.AppSettings["..."]);
            
            var membershipService = new MembershipServiceClient(binding, membershipEndPoint);

            ... etc...   
        }

You get the idea, everything that needs to be done, is in the test, from pulling information out of Repositories, to reading Configuration Files, to configuring services endpoints to actually calling services, and on and on.

I don’t know about you, but I prefer the first code sample to the second.

Those functions that make up the Domain Specific Language in the first tests, have to go somewhere, and as you can see from the second example, some of them involve a little bit of work. Look again at the first code sample, you’ll see that I have a function called ‘UsingCredentials’ but it doesn’t return credentials. If it did I could pass them into the subsequent functions that need credentials, but in my DSL it’s enough to state the credentials I want to use. Passing the credentials and all the other dependencies around would unnecessarily clutter up the tests.

So, clearly the credentials are being saved somewhere by the ‘UsingCredentials’ function and then subsequent functions can use them.

    class Describe_Club_Membership
    {
        private Credentials _credentials;

        [Test]
        public void A_User_Can_Join_The_Club()
        {
            UsingCredentials("appName", "appKey");
            var userId = GetUserId("someUsersEmail@example.com");

            EnsureUserIsNotAMember(userId)

            JoinClub(userId);
            Assert.That(IsAMember(userId));
        }

        // Test Fixture DSL Commands
        private void UsingCredentials(string appName, string appKey)
        {
            _credentials = CredentialsHelper.GetCredentials(appName, appKey);            
        }

        ...

The UsingCredentials function is actually just a one liner, the heavy lifting is done elsewhere in a helper class. The job of getting credentials may be needed by lots of TestFixtures so I want that logic in a shared component, not duplicated. We’re heading in the right direction of DRY.

So, what then is the point of the UsingCredentials function we see here?

Put simply, I like the way it makes the tests read. I could have used this one liner, and others like it in the test, but a test made up of a series of these kinds of calls is more cluttered and it gives the impression that these helper classes may be significant to the test. I don’t want to give that impression.

I want to declutter the test so that only the logic being tested remains. Wrapping this and other one liners in tiny DSL style commands sends a message that the details of how the helpers are called is not as important than the logic being tested.

But what then if I want to use the same Helper from other Test Fixtures, aren’t I going to have to duplicate the little functions that wrap the one liners?

Yes I am.

This is where I deliberately fall short of DRY.

For one of my fixtures I have 7 of these tiny functions. They vary in complexity but none contain more than one line of code. Code that I do not want to see directly in my tests. In another fixture, I use three of those same wrapper functions, and I’ve duplicated the three that I need.

How could I make use of these DSL functions without duplicating them? One benefit of the wrapper functions is that they can access the credentials or any other member of the fixture without it being passed as a dependency. If I move the wrapper functions out I’m basically back to calling the helper functions and needing to pass all dependencies. In short, the wrapper functions no longer serve any purpose.

I could define the helpers and the credentials in a base class and inherit my test fixtures from that. That is in fact a really nice solution, it’s DRY, the Tests look just as nice and the Test Fixtures don’t even have the tiny wrapper functions in them.

If you wanted to go with the inheritance option I’d be hard pushed to make a case against it. I just don’t like inheritance and I really don’t like it in test suites.

Irrational as it may be I prefer to have these tiny functions so my tests can be as clean as possible, but I’m willing to fall a little short of DRY rather than use inheritance.

If someone is curious when looking at a test and wants to see what a function does or where credentials come from, it’s right there, in the fixture, no climbing an inheritance hierarchy.

As soon as you start using inheritance it sucks you in, you start adding more to the base class, or an extra level of inheritance here and there. It gets messy real fast.

What will happen when not all of the test Fixtures need Credentials? When some subset of fixtures need some other shared member? The Base class starts to become the union of all needs rather than the intersection. And for what? So we can avoid duplicating a handful of one line functions that are unlikely to change. Even if they do change it’s going to be trivial, perhaps a new argument to the helper function.

The key points for me are:

  1. Keep your tests as free of clutter as possible, nothing should remain but the logic being tested, written at a consistent level of abstraction.
  2. If necessary define a tiny Domain Specific Language for your tests as simple one liner functions which call to the real shared code, kept elsewhere.
  3. Keep the DSL functions you need right there in the Test Fixture, duplicate them if necessary. Don’t try to over engineer your DSL using inheritance or any other clever tricks.

The third point about inheritance is maybe up for debate, but I don’t thing the first two points are.