Ben Pirt

Jun 6

User-centred API Design

I’ve seen a few thoughts on how to design an API recently. One in particular takes a very pragmatic approach that is pretty much exactly how I would build an API these days (and prompted me to finally drag this out of my Evernote drafts folder and onto my blog). None of them, though, quite covers the thoughts that have been brewing in my head whilst building the Xively API over the last few years (and thinking about the next version). These are high-level design approaches rather than specific technical recommendations - the things you need to think about before you write any code.

The big idea

A few more detailed concepts follow, but I thought I’d kick off with the underlying principle of API design, which is:

The user is king (or queen)

We’re talking here about API design, but one of the first principles of successful design is to design with the user in mind (aka User-Centred Design). You wouldn’t expect any serious website to be designed without any thought for who was going to use it, so why do APIs often get seen as an afterthought, tacked on and unplanned? Putting yourself in the position of the user is a good first step towards understanding what it would be like to use your API. Even better, find some real users (if you’re in the lucky position of having some!), ask them what they find difficult and try out prototypes on them. The other items in this list are all themed around making your API as easy to use as possible. An easy API means that when people come along to kick the tyres of your product, they get further and come away with a better impression of it. Imagine assessing two payment providers, say Stripe and PayPal. Which would you be most likely to use, given the choice? I’ll give you a clue - it probably wouldn’t be PayPal.

ProTip: for those reading this who already have an API - take a day a week to make something that uses the API and you’ll discover how inconsistent and buggy it really is. This will likely be the best way you can use this day. You can also dog-food your own API by using it from JavaScript in the browser, which has become increasingly popular and viable (we’ve been doing this for a few years - highly recommended).

Reduce Surface Area

The more API endpoints you have, the more confused your users will be. If you’re unsure whether you should develop a specific piece of functionality, don’t. Wait until enough real users ask / beg you for it and then consider whether it is worth making your API more complex for everyone in order to build it.

Fail Fast (and Verbosely)

Never try to second-guess what the user wants to happen if there is any uncertainty in the request, because it just masks errors and makes users more confused in the long run. A good comparison here is PostgreSQL (one of my favourite pieces of software) and MySQL - if you try to enter an invalid date in Postgres it will give you a clear error message saying the date is invalid, whereas MySQL will silently coerce it to 0000-00-00 (make sure strict mode is enabled to avoid this). You know what they say about assumptions. Always return a clear error message. Instead of returning error codes, return links to specific pieces of documentation detailing that error more fully.
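To make that concrete, here is a minimal sketch in Python of a validator that refuses to guess. Everything in it is an assumption for illustration: the `at` attribute, the `parse_datapoint` function and the `example.com` documentation URLs are hypothetical, not part of the Xively API.

```python
from datetime import date

# Hypothetical base URL for per-error documentation pages (an assumption).
DOCS_BASE = "https://example.com/docs/errors"


def parse_datapoint(payload: dict) -> dict:
    """Validate a datapoint strictly and never coerce bad input."""
    raw = payload.get("at")
    if raw is None:
        return {
            "status": 422,  # HTTP status for the response
            "error": "Missing required attribute 'at'.",
            "documentation": f"{DOCS_BASE}/missing-attribute",
        }
    try:
        # fromisoformat raises on impossible dates such as 2013-02-30
        # (ValueError) and on non-string input (TypeError).
        parsed = date.fromisoformat(raw)
    except (TypeError, ValueError):
        return {
            "status": 422,
            "error": f"'at' is not a valid ISO 8601 date: {raw!r}",
            "documentation": f"{DOCS_BASE}/invalid-date",
        }
    return {"status": 200, "at": parsed.isoformat()}
```

Calling `parse_datapoint({"at": "2013-02-30"})` produces an error body with a message and a documentation link rather than a quietly "fixed" date.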

One way to do things

I’m going to use my own API as an example of what not to do here. At Xively we have a hierarchical relationship of Feeds > Datastreams > Datapoints. You can get Datapoints into the system using three different endpoints and in about six different ways. We’ve recently been rewriting the documentation and have stripped the methods that we’re documenting (and recommending) down to one because we’ve realised the error of our ways. The next version of the API will definitely follow this mantra. It’s also good for the developer because it reduces application complexity and makes things much easier to maintain without having to deal with many, many edge cases.

Reduce the number of concepts

Every API has a number of concepts that users have to understand in order to be able to use it. It stands to reason that the length of time it takes to become fluent in an API is proportional to the number of concepts you have to grok first. REST is a good approach to this because it reduces the number of possible operations a user has to understand down to four. Then all they need to understand is what each object type is. If you compare REST to a more RPC-based approach, then you often end up with a concept for every operation.
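A rough way to see the difference is to count the concepts. The sketch below assumes the four operations map onto the usual HTTP methods and invents RPC-style method names purely for illustration. The resource names are borrowed from the Feeds > Datastreams > Datapoints hierarchy mentioned earlier.

```python
# Assumed mapping of the four REST operations onto HTTP methods.
REST_OPERATIONS = {
    "POST": "create",
    "GET": "read",
    "PUT": "update",
    "DELETE": "delete",
}

RESOURCES = ["feed", "datastream", "datapoint"]


def rest_concepts(resources):
    """REST: learn the four operations once, plus each object type."""
    return len(REST_OPERATIONS) + len(resources)


def rpc_method_names(resources):
    """RPC style: one (hypothetical) named method per operation per type."""
    return [
        f"{operation}_{resource}"
        for resource in resources
        for operation in REST_OPERATIONS.values()
    ]


if __name__ == "__main__":
    print("REST concepts:", rest_concepts(RESOURCES))
    print("RPC concepts:", len(rpc_method_names(RESOURCES)))
```

With three object types that is seven things to learn against twelve, and each new type adds one concept to the REST count but four to the RPC count.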

Don’t overload endpoints

Often, when trying to reduce the surface area of an API you end up tricking yourself into thinking you’ve achieved this by hiding functionality on existing methods. For example, using a GET parameter to change the behaviour of a specific endpoint. All you’ve done is made that endpoint more complex and harder to use because users now have to have a set of additional conditions baked into their understanding of the endpoint. This is a tricky one to solve though - the only real way of stopping it is to not add the functionality, so again, only build things you know users (and a reasonable number of them, not just one loud one) really need.

Reduce the minimum requirements

Users should be able to get up and running with the minimum possible effort. This means that if you have, for example, a JSON or XML representation of a data model and a lot of its attributes are required, then they will have to spend time and thought adding them all immediately. If you’re able to, make as many of the attributes optional as possible and then make sure you document the minimal way first. Which brings me on to…
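As a sketch of what that can look like, here is a hypothetical `Feed` model in Python with a single required attribute. The attribute names and the `feed_from_json` helper are assumptions for illustration, not the real Xively schema.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class Feed:
    """Hypothetical model: only `title` is required."""
    title: str
    description: Optional[str] = None
    private: bool = False
    tags: List[str] = field(default_factory=list)


def feed_from_json(body: dict) -> Feed:
    """Build a Feed from a decoded JSON object, filling in defaults."""
    known = {f.name for f in fields(Feed)}
    unknown = sorted(set(body) - known)
    if unknown:
        # Reject unknown attributes loudly rather than dropping them.
        raise ValueError(f"Unknown attributes: {unknown}")
    if "title" not in body:
        raise ValueError("Missing required attribute 'title'.")
    return Feed(**body)
```

The smallest valid request is then just `{"title": "Office"}`, and that one-attribute request is the version worth documenting first.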

Documentation

Obviously you should have some. However, the best kind of documentation to get users started quickly is not the endpoint-by-endpoint manual, but quick start guides detailing how to do a specific, commonly performed task. Task-oriented documentation is so much more useful than a manual because it gives the user the context of the action they are trying to perform and then walks them through all of the steps they might need to perform it. Typically this consists of a number of actions that are spread over different pages in the manual, and the user would have no clue how to assemble them into a coherent sequence. Again, put the user first and think about what they are trying to achieve and then how you can best help them understand how to achieve it.

If you don’t start by putting the user first, then there’s no point in implementing any of the guides for how to build a good API because you’ll already have failed.


Nov 23

Software I Love

I’ve been meaning to write something about the software I use on a daily basis for a while now. So here it is, a hat tip to all of those amazing people whose shoulders I’ve been standing on. All of these pieces of software have been high performance and solid in production environments, handling millions or even billions of requests. They form a kind of basic toolbox for solving problems, so I haven’t included some of the things I’ve used and liked that are more niche, or things like web frameworks, because they are somewhat interchangeable. I’m going to come back and write a more detailed piece on each of these, but for now here’s a quick run-down of my favourite software:

PostgreSQL: The engineering quality of Postgres is astounding. The performance is amazing and it just keeps getting better. It’s incredibly rock-solid, uses MVCC and always takes the safe route (as opposed to other databases I might mention). Just not enough superlatives out there to describe it :-)

Nginx: It’s fast, it’s reliable and lets you do all kinds of useful things (try the Lua module for ultimate flexibility). Whenever you need to do any kind of HTTP wrangling you’ll normally be able to do it with this. Oh, and you don’t need to restart it to update the config.

HAProxy: The HA stands for High Availability and it’s not lying. It has a particularly good load balancing algorithm and is rock solid. You can use it in raw TCP/IP mode or do deeper HTTP packet inspection to route more intelligently. For simple sites, Nginx is fine, but for more complex ones with wide ranges in response time HAProxy wins.

Redis: Don’t think of it as a more up-to-date memcached - it’s an entirely different breed of animal. Having shared, persistent and ultra-fast data structures opens up a world of possibilities for working with your data.

Memcached: Extremely simple, but works well for caching your application data. The simple approach to distributing your data using consistent hashing works very well. Starting to be encroached upon by Redis now but I can’t leave it out because it’s served me so well over the years.

Statsd / Graphite: I’ve bundled these because they work so well together, but they’re separate pieces of software. These tools make it so easy to collect metrics from anywhere that you have no excuse for not doing so. If you’re going to make things better you need metrics.

RabbitMQ: Rabbit just keeps getting better and better. If you need to do any kind of real-time push communication, Rabbit is your guy. The topic exchanges are incredibly useful and it scales up very nicely thanks to Erlang’s capabilities.

Beanstalkd: If you just need a single-server, in-memory queue (though it can do persistence), then Beanstalkd is simple and works without any hassles. No bells and whistles like Rabbit, but sometimes that’s a good thing.

Varnish: If you want to do HTTP caching, Varnish can’t be beaten. It’s super-fast, very flexible thanks to its C-based configuration and makes a massive difference to site speed if you can cache.

Git: It’s not server-side, but it’s a crucial part of the development process. Used in conjunction with GitHub it’s unstoppable.

All of these things form the infrastructure around your web application and if you know how to use them well, then you have an amazing set of tools in your repertoire to use to solve pretty much any problem you might have on the server-side. If any of these are new to you, my advice is to take a look and see how (or if) you might be able to use them. I imagine a lot of the more seasoned developers will already have come across most of these.

Thank you to the developers of these awesome tools, without which I’d feel pretty much lost. If anyone reading this has any suggestions of their own, leave them in the comments. I’m always interested in adding more tools to my toolbox.


Jul 7
Really looking forward to getting stuck into these amazing sauces from St. John and Dolly Smith’s. The Brinjal is incredible (well actually they all are!)



Jun 14

Hybrid SQL / NoSQL with PostgreSQL and Backbone.js

I’m going to come right out and admit it up-front - I love PostgreSQL! Since using it at Cosm I have developed a massive respect for its reliability, consistency and performance. However, there’s also a lot to be said for the flexibility that a schema-less NoSQL document store like CouchDB or Mongo provides - I just can’t bring myself to give up that sweet, sweet relational integrity (especially when the vast, vast majority of applications don’t ever need the scale benefits that can come with dropping the relational aspect). Fortunately there’s a way of getting both with good old PostgreSQL which I’ll get to in a moment.

Read More


May 13

Spiced Squash and Chickpeas

I thought I’d open this blog with one of my favourite recipes of late. It’s quite easy and absolutely delicious. It was evolved from an original recipe in the highly recommended River Cottage Veg Every Day by Hugh Fearnley-Whittingstall.

Read More