Navigate / search

F# and Databases

This entry is part 1 of 2 in the series F# and Databases

Here’s an F# function that takes a connection string, a SQL query, and wedged between them a function that
can convert the results from a DataReader to a Sequence of suitable typed records. Even if you’re unfamiliar with
F# this code should look familiar, it’s straightforward ADO.Net

module SQLiteLoader
    open System.Data
    open System.Data.SQLite
        
    let Query (connectionString: string) toType (sql: string) = 
      seq { use cn = new SQLiteConnection(connectionString)
            let cmd = new SQLiteCommand(sql, cn) 
            cmd.CommandType<-CommandType.Text
        
            cn.Open()
            use reader = cmd.ExecuteReader()
            while reader.Read() do
                yield reader |> toType
         }

Let’s look at that ‘toType’ function more closely. I have a type called Category with two properties, and I have a function ‘ToCategory’ that takes a SQLiteDataReader and sets the properties of a Category. The ‘ToCategory’ function, or another very like it will be passed to the Query function above when the time comes.

module Categories

    open System.Data.SQLite
    open SQLiteLoader

    type Category =
        { 
            Code: string
            Description: string    
        }

    let ToCategory (reader: SQLiteDataReader ) =
        { 
            Code = unbox(reader.["Code"])
            Description = unbox(reader.["Description"])
        }    

Don’t worry about that unbox function. If you come from a C# or VB.Net background you may know about boxing and unboxing, or, more likely, you know of it, but don’t usually give it much of a thought. C# and VB.Net allow from implicit type conversions so boxing and unboxing usually just works. In F# all type conversions must be explicit. This can catch you by surprise because in F# Type inference is such an impressive feature it’s a bit of a jolt to be caught out by a casting issue. In this case unbox is just casting the value in the reader (an obj) to the desired type.

OK, the stage is set, we have a query that can load data from the data base, we have a type that we’d like that data mapped to, and we have a function to do the mapping. Time to have some fun.

open System
open SQLiteLoader
open Categories

let Display category =
    printfn "%s - %s" category.Code category.Description

[<EntryPoint>]
let main argv = 
    let categories = Query @"Data Source=SkuMaker.db3;Version=3;" ToCategory "Select * from Categories"
    Seq.iter Display categories
    Console.ReadKey() |> ignore;
    0

Getting our categories is just a matter of calling Query and giving it a connection string, our mapping function and a sql query.

Passing the connection string every time we want to query the database is going to get a bit tiresome. Let’s see if we can avoid doing that.

[<EntryPoint>]
let main argv = 
    let Loader = Query @"Data Source=SkuMaker.db3;Version=3;"
    let categories = Loader ToCategory "SELECT * FROM Categories"    
    
    Seq.iter Display categories
    Console.ReadKey() |> ignore;
    0

That’s better, we create a new function Loader which is a partial execution of Query. We’ve used Currying to lock in the connection string. Now we can use Loader in place of Query and it will work exactly the same way, except we’ve eliminated the need to keep passing in a connection string.

This is pretty cool. We have a function that knows where our Database is, and we just need to give it a mapping function and a sql query and it will return a suitably mapped sequence of results.

We can go further.

Taking Loader as our starting point we can use partial execution (currying) again to lock in the ToCategory mapping function and create a new function CategoryLoader that knows about both the connection string and the mapping to categories. This new function just needs to be passed a sql string.

[<EntryPoint>]
let main argv = 
    let Loader = Query @"Data Source=SkuMaker.db3;Version=3;"    
    let CategoryLoader = Loader ToCategory   
    let categories = CategoryLoader "SELECT * FROM Categories"    
    
    Seq.iter Display categories
    Console.ReadKey() |> ignore;
    0

We can use Loader to create loaders for other ‘Types’ and we can use CategoryLoader for different queries that load categories. And we can even create different Mapping functions if we want different representations of a Category.

I’m not really interested in investing any time in creating some all encompassing Data Wrapper for F#, I’ve been down that road in C# and it’s just heartache and struggle and code that ends up more complicated than it needs to be.

In the example above you can see that F# allows us to quickly and easily spin up code that just works, is light weight enough that writing it shouldn’t be a huge burden.

The joy of this is that as we find limitations or better ways of doing things we can simply start doing things differently for situations where it works. We’re not left trying to shoehorn whatever problem we have into the Framework we’ve shackled ourselves to.

Don’t scatter logic all over the call stack

Here’s a problem I come across all the time in legacy code and I have been guilty of it myself in the past. This isn’t going to be rocket science, but, apparently, this stuff doesn’t go without saying. Normal caveats about this being a simplified contrived example apply.

Take a look at this code

static void Main(string[] args)
{
    PrintTradeBalance();
    Console.ReadKey();
}

Oh! to have code this simple right? It’s pretty clear what it does, it prints a trade balance, whatever that is. Let’s not break out the bunting and brass band yet. What does this code ACTUALLY do. Let’s dig into the function.

private static void PrintTradeBalance()
{
    var results = GetTradeBalance();
	
    foreach (var result in results)
        Console.Write("Country {0}, Value {1}\n", result.Country, result.Value);
}

Well it turns out a TradeBalance isn’t a thing it’s a collection, each country has one. Let’s leave the function name aside for a moment. We see the function GETS the trade balances, then prints them. It’s still not really clear what a trade balance is or where it comes from. Lets’ keep digging.

private static IEnumerable<Result> GetTradeBalance()
{
    var cashMovements = GetCashMovements();

    var results = cashMovements
        .GroupBy(i => i.Country)
        .Select(r => new Result
        {
            Country = r.Key,
            Value = r.Sum(c => c.ValueIn - c.ValueOut)
        });

	return results;
}

Now we’re getting somewhere. We get CashMovements and aggregate them by Country, Netting the ValueIn against the ValueOut. But what are CashMovements? Where do they come from?

private static IEnumerable<CashMovement> GetCashMovements()
{
    var transactions = GetTransactions();

    var movements = transactions
        .GroupBy(i=>new { i.Country, i.Direction })
        .Select (m=>new CashMovement
        {
            Country = m.Key.Country,
            ValueIn = m.Where(x => x.Direction == "Export").Sum(c => c.Price * c.Quantity),
            ValueOut = m.Where(x => x.Direction == "Import").Sum(c => c.Price * c.Quantity)
        });
        
    return movements;
}

CashMovements are an aggregation of Transactions, which have a direction (Import/Export) a price and a Quantity.

We could go further and see where the transactions come from, but lets assume they are pulled from a DB.

Problems
Oh code! how do I hate thee? Let me count the ways.

  • We had to burrow down three levels deep just to see where the source data came from.
  • The transformation of the data was spread over those functions we drilled into.
  • We encountered the steps of the transformation in reverse order as we drilled down.

Does any of this matter?

Yes. I think it does.

First let’s address one concern you might have. Yes, the aggregation that’s going on here is relatively simple and could probably be done in one function. That’s not really the point. In practice these kinds of transformations, aggregations and enrichment are far more complicated and this drilling down deeper and deeper into functions calls is not uncommon.

Having to burrow down into a function should mean we are going into more detail, a lower lever of abstraction. In this case that’s not really what’s happening, we are simply drilling down into the NEXT step of the overall algorithm. All of these steps are essentially at the same level of abstraction.

We lose any sense of the overall transformation by stringing together function calls like this. Worse, as mentioned above, the algorithm is actually upside down. We drill down 3 levels to get the source data and then transform it on the way back up the call stack. We could hardly find a less intuitive way of describing the algorithm if we tried.

We’ve also coupled each function to the next. Each of these functions does something useful to data, but we don’t provide the data, The functions pull it out of the next function, which means they need to be aware of the next step in the chain. This is a fundamentally flawed way of thinking about functions, it makes life needlessly complicated. The code is harder to test, harder to understand, harder to maintain.

So, why is this kind of code so common?

I believe it may be a problem in the way we teach programming. We tell students to decompose problems from the top down, pushing complexity down into lower level functions. The result is code like that described above.

There are two kinds of complexity and I don’t think students of programming are adequately taught to understand and handle the different kinds.

The first kind of complexity is domain complexity, the problem at hand. The seemingly arbitrary business rules and edge cases that seem to keep popping up. This kind of complexity is vitally important. This is the stuff your client will ask you to explain, ask you to change, ask you to validate. This sort of complexity shouldn’t be pushed down, it should be pulled out, highlighted.

The transformation of transactions into Trade Balances is a Use Case of our system. The logic needs to be easily accessible and future developers need to be able to grasp what’s happening quickly and change it with confidence.

The other kind of complexity is implementation details. You need to go through a Web Service to get the transactions? That’s an implementation issue. You pull the data from a Database and map it to structs or classes? that’s an implementation issue. These sorts of complexity should be pushed down, hidden, abstracted away so that the business logic stands out.

In the code above we shouldn’t be pushing logic down to lower and lower functions. Spreading the algorithm over the call stack by “pushing down” is nuts. We should be pulling the algorithm up, embracing it, making it jump off the screen when someone opens our code.

This isn’t a rant about functional programming, it doesn’t matter whether you use functional, OO or procedural code the principle here is the same.

The following modified code isn’t astonishing or revolutionary or even beautiful, but it’s better, a little better, and if we could all just get a little better life would be so much easier, legacy code would be a much smaller problem. The changes aren’t even difficult. This is just a question of internalising a very simple idea and you’ll never write code like the code above again.

static void Main(string[] args)
{
    var transactions = GetTransactions();
    var cashMovements = GetCashMovements(transactions);
    var tradeBalances = GetTradeBalances(cashMovements);
    PrintTradeBalances(tradeBalances);
    Console.ReadKey();
}

We get Transactions, turn them into CashMovements, and turn those into TradeBalances and print them.
We can now drill down meaningfully into any of those steps to see how each is done. That is valid use
of drill down, we are going to a lower level of abstraction.

The algorithm also reads the right way around. We start with Transactions and end with TradeBalances.

The only change to the functions is that instead of each calling the next function in the change, each is passed the data that it operates on as a parameter.

private static IEnumerable<CashMovement> GetCashMovements(IEnumerable<Transaction> transactions)
{
    var movements = transactions
        .GroupBy(i => new { i.Country, i.Direction })
        .Select(m => new CashMovement
        {
            Country = m.Key.Country,
            ValueIn = m.Where(x => x.Direction == "Export").Sum(c => c.Price * c.Quantity),
            ValueOut = m.Where(x => x.Direction == "Import").Sum(c => c.Price * c.Quantity)
        });
        
        return movements;
}

private static IEnumerable<TradeBalance> GetTradeBalances(IEnumerable<CashMovement> cashMovements)
{
    var tradeBalances = cashMovements
        .GroupBy(i => i.Country)
        .Select(r => new TradeBalance
        {
            Country = r.Key,
            Value = r.Sum(c => c.ValueIn - c.ValueOut)  
        });

        return tradeBalances;
}

private static void PrintTradeBalances(IEnumerable<TradeBalance> tradeBalances)
{
    foreach (var tradeBalance in tradeBalances)
        Console.Write("Country {0}, Value {1}\n", tradeBalance.Country, tradeBalance.Value);
}

Learning to think Functionally -> Why I still don’t understand Single Case Active Patterns

This entry is part 9 of 11 in the series Learning to think Functionally

I won’t lie to you, my burgeoning relationship with F# hit a bit of a rough patch recently.

While I’ve understood pattern matching from the outset, I’ve only had a vague idea about Active Patterns. Nothing I’ve read has really made them click for me. So, I decided to focus and try and get a better understanding.

What I managed to grasp almost immediately is that Active Patterns are functions and there are a few different types

  • Single Case
  • Multi-Case
  • Partial

This is where the trouble starts. Most explanations seem to begin with the Single Case Active Pattern on the basis that it is the “simplest”.

Here’s a typical example

let (|ToColor|) x =
    match x with
    | "red"   -> System.Drawing.Color.Red
    | "blue"  -> System.Drawing.Color.Blue
    | "white" -> System.Drawing.Color.White
    | _       -> failwith "Unknown Color"

This converts the value x to a color.

And here’s how we would use it

let (ToColor col) = "red"

These kinds of examples are everywhere and my overwhelming feeling on seeing them is WHY? Why not simply use a function?

let ToColor x =
    match x with
    | "red"   -> System.Drawing.Color.Red
    | "blue"  -> System.Drawing.Color.Blue
    | "white" -> System.Drawing.Color.White
    | _       -> failwith "Unknown Color"

The only difference seems to me to be the way the function is called.

Instead of

let (ToColor col) = "red"

we have

let col = ToColor "red"

Call me old fashioned but the regular function call looks better and more understandable to me. What on
earth is going on in the Active Pattern? A Function, with and out parameter, that accepts another parameter
using assignment?

It just seems daft compared to the more straightforward function call that we know and love. There has to be some scenario that makes the Active Pattern useful, but I’m pretty sure simple conversions like this aren’t it.

I did send out a cry for help on Twitter and I got a few replies showing cases where the Single Active Pattern is necessary, however the examples were quite complicated (to my novice eyes), leading me to think that far from being the “simplest” Active Pattern, the Single Case actually fulfils quite a niche purpose which is not straightforward at all. The Simplistic examples published on various blogs do nothing to illustrate where this feature is actually useful.

So, let’s park the Single Case, I’ll return to is when I’m able to explain it properly. For now, I still don’t understand it. In the Next post I’ll explain the Multiple Case Active Pattern, which I do understand.

Simplicity

Ladies and Gentlemen of the class of ‘14

If I could offer you only one tip for the future, simplicity would be it. The long term benefits of simplicity have been proven by Rich Hickey whereas the rest of my advice has no basis more reliable than my own meandering experience. I will dispense this advice now.

Beware the over engineered complexity of your code; oh nevermind; you will not understand the over engineered complexity of your code until it bites you in the ass. But trust me, in 2 years you’ll look back at code you’ve written and recall in a way you can’t grasp now how much unnecessary gold plating you added and how little it really achieved.

You are not as smart as you imagine. Don’t worry about designing for the future; or worry, but know that worrying is as effective as trying to solve an algebra equation by chewing bubblegum. The real troubles in your design are apt to be things that never crossed your worried mind; the kind that blindside you at 4pm on some idle Tuesday.

Do one thing everyday that challenges you.

Refactor

Don’t be reckless with other people’s code, don’t put up with people who are reckless with yours.

Contribute to Open Source.

Don’t waste your time on jealousy; sometimes you’re ahead, sometimes you’re behind the race is long, and in the end, it’s only with yourself.

Remember the compliments you receive, forget the insults; if you succeed in doing this, tell me how.

Keep your old source code, throw away your old performance reviews.

Automate the tests

Don’t feel guilty about not knowing the right way to build software, the best developers I know didn’t know at 22 the best way to build software, the really good ones at 40 still don’t.

Get plenty of sleep.

Be kind to your laptop power supply, you’ll miss it when it’s gone.

Maybe you’ll make MVP, maybe you won’t, maybe you’ll join a startup, maybe you won’t, maybe you’ll go into management at 30, maybe you’ll be pushing code to github on your 75th birthday whatever you do, don’t congratulate yourself too much or berate yourself either your choices are half chance, so are everybody else’s.

Enjoy your development tools, whatever they are, use them every way you can, don’t be afraid of them, or what other people think of them, they are the greatest toys you’ll ever own.

Deploy, even if you have nowhere to do it but in your own living room.

Read other people’s code, even if you don’t understand it all.

Do NOT read “The Clean Coder”, it will only make you feel dirty.

Get to know desktop app development, you never know when it’ll be gone for good.

Code in C++ once, but stop before it makes you hard; code in Ruby once, but stop before it makes you soft.

Speak at Conferences.

Accept certain inalienable truths, complexity will rise, frameworks will not meet all your needs, you too will get old, and when you do you’ll fantasize that when you were young code was simple, frameworks were great and developers respected their colleagues.

Respect your colleagues. Don’t expect anyone else to do your work for you. Maybe you have an unspecified deadline, maybe you have a 10X team member; but you never know when either one might run out.

Don’t mess too much with your process, or by the time you get around to writing code there won’t be enough time to write any.

Be careful whose advice you buy, but, be patient with those who supply it. Advice is a form of nostalgia, dispensing it is a way of fishing the past from the disposal, wiping it off, painting over the ugly parts and recycling it for more than it’s worth.

But trust me on the simplicity.