Python vs. Haskell round 2: Making Me a Better Programmer

Another point for Haskell

In Round 1 I described the task: find the number of “Titles” in an HTML file. I started with the Python implementation, and wrote this test:

def test_finds_the_correct_number_of_titles():
    assert title_count() == 59

Very simple: I had already downloaded the  web page  and so this function had to do just two things: (1) read in the file, and then (2) parse the html & count the Title sections.

Not so simple in Haskell

Anyone who’s done Haskell might be able to guess at the problem I was about to have: The very same test, written in Haskell, wouldn’t compile!

it "finds the correct number of titles" $
    titleCount `shouldBe` 59

There was no simple way to write that function to do both, read the file and parse it, finally returning a simple integer. I realized that the blocker was Haskell’s famous separation between pure and impure code. Using I/O (reading a file) is an impure task, and so anything combined with it becomes impure. My function’s return value would be locked in an I/O wrapper of sorts.

I got frustrated and thought about dumping Haskell. “Just another example of how it’s too difficult for practical work,” I thought. But then I wondered how hard it would be to read in the file as a fixture in the test, and then call the function? I’d just need to pass the html as a parameter. And yep, this worked:

it "finds the correct number of titles" $ do
    html <- readFile "nrs.html"
    titleCount html `shouldBe` 59

As I refactored the code to pass this test, I realized that this is much better: Doing I/O and algorithmic work should always be separate. I had been a little sloppy or lazy in setting up my first task. The app with the Haskell-inspired change will be more reliable and easier to test, regardless which language it ends up being written in.

See also:

Python vs. Haskell round 1: Test Output

The point goes to Haskell

On the surface, it may sound silly to compare these two languages because they’re about opposite as you could get: Python is interpreted, dynamically typed, and slightly weakly typed as well. Haskell on the other hand, is compiled, statically and strongly typed. But they’re both open source, and they both have large collection of libraries which are helpful for my work.

I’m creating, an app displaying the Nevada Revised Statutes, similar to my first, Most of my code is Ruby and Rails, but I want to improve the architecture and maybe move to a new language. So I’ve been parallel developing the scraping & import code in my candidate languages, Python and Haskell.

The screenshots above show the results of running my first failing test in each language — failing for pretty much the same reason in each case. My first test is, parse the NRS index page and return the number of Titles found. Python’s pytest puts a lot of junk on the screen in comparison to Haskell’s hspec. Reading the hspec output is a thing of beauty, really.

Next: Round 2, Making Me a Better Programmer

The Benefits (not features!) of Programming with Haskell

I’m just a couple of months in, and have written my first production Haskell app, a PDF parser for Oregon laws. Programming it feels different, in a good way. Looking over the list below, two themes — easy and fast — stand out. Compared to OO languages:

  • It’s easy to jump back in to previous work;
  • easy to test my code;
  • easy to refactor;
  • easy to code in a familiar style;
  • the programs are lightning fast;
  • I’m delivering features faster;
  • I can use Atom as an IDE;
  • it’s simple to develop on a Mac and deploy on Linux,
  • it’s easier to understand someone else’s code.

Plus, I just subjectively look forward to it. I think this all adds up to a low mental burden, yet the language is very stimulating.

I decided to focus on “benefits” because really, that’s what we’re all after. This may sound controversial, but I don’t think anybody “wants” referential transparency. That’d be like wanting a regular cleaning service. Instead, what we do want is the benefit; a clean home. Or in programming terms, a clean codebase. So let’s talk about those features now:

The features that enable those benefits

I thought a lot about why I’m seeing these benefits. Here’s a quasi root-cause analysis for them:

Easy to come back to a project and do a little work on it, because of low cognitive load, because everything I need to understand a chunk of code is on the screen in front of me, because of pure functions which have very clear input and output, and because my text editor (Atom) acts as a full IDE showing me the types of every variable and signatures of every function.

Easy to refactor, because I can simply cut and paste functions from one file to another, because there are no globals, only constants, and because the functions — pure and impure — are easy to compose. Also because I can change my data structures and the compiler shows necessary changes in the code, e.g. forcing that every property is set, and to the correct type. Because I can rename anything and the compiler shows every reference needing updating.

Easy to code in a familiar style, because new functions can be composed from others, and new operators can be defined — without sacrificing the other benefits.

I can develop on OS X and deploy on Linux, because the libraries are cross-platform.

The programs are lightning fast, because they’re compiled and because Haskell has lots of optimizations, like laziness.

I’m delivering features faster, because I’m writing much less code, and because I don’t need to write many tests. In Ruby, my test suites are perhaps 2-3x the size of my code base. In Haskell, I’m only TDD’ing and testing particular pieces which can be tricky to get right, like regex-based parsing functions. The compiler tests the rest for me automatically. I’m also writing fewer tests because the code itself is declarative (as well as type-checked). In other words, it reads like a test already, and so a test would only be redundant.

Based on my experience so far, Haskell’s my current language of choice for new server-side code.

I’ve decided to make this the first part in a series about my real-world experience of using Haskell for application programming. I’ll follow up with a post on some of the rough edges in the ecosystem to be aware of, and the best path I’ve found for learning Haskell.