Service-oriented Songkick

For a few months now, we’ve been transitioning Songkick to a service-oriented architecture (SOA). This is the first in what will hopefully be a series of articles on what that means, how we’re doing it, and what benefits it’s bringing us. But first, some history.

In the beginning

Songkick has, for its five-year history, been a Rails app. (Well, there was a prototype in PHP, but you didn’t hear that from me, right?) It was still a Rails app by the time I joined, two years into the project in 2009. And I mean it was a Rails app and nothing else. Although the system consisted of a database, a website, message queues, background processing, file storage, daily tasks like email notifications, and so on, it was all one big project.

Oh sure, we told ourselves all the non-web components were separate projects, but they were included in the Rails app via git submodules. They all shared code from app/models and lib. They all changed together. If you changed our background job processor you had to bump a submodule in the Rails app, run all the tests and deploy the entire thing to all the machines.

Oh and did I mention the build took two hours spread across a handful of Jenkins (then Hudson) build slaves? It’s a wonder we ever shipped anything.

Time for some house-cleaning

If you’ve worked on any early-stage, rapidly growing product you probably recognize this scenario. You’ve been adding features and tests all over the place; you’re not sure which ones have value, but you keep all of them anyway and focus on releasing as fast as possible. We went through two major versions of the product like this, and it’s fine when your team and the codebase are relatively small. Everyone knows where everything is, so it’s not that hard to maintain.

But in the medium and long term, this doesn’t scale. The Big Ball of Mud makes it increasingly hard to experiment, to bring new hires up to speed, or to deal with sudden scaling issues. We needed to do something.

Step 1: get organized

We began this process in mid-2010 by extracting the shared parts of our codebase into a couple of libraries, songkick-core and songkick-domain. Core mostly contains fairly generic infrastructure stuff: APIs for locating and connecting to databases and message queues, managing per-environment config, logging/diagnostics support, etc. Domain contains all the shared business logic, which, given our history, means all our ActiveRecord models and observers, plus anything else we needed to share between the website and the background processes: the web routes, third-party API clients, file management, factory_girl definitions, shared cucumber steps, and so on.
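
To make that split a bit more concrete, here’s a rough sketch of the kind of per-environment config lookup Core gives the other projects. The module and method names below (SongkickExample::Config, database_config) are invented for illustration; they aren’t songkick-core’s actual API.

    # Illustrative only: a Core-style helper that reads per-environment
    # settings from a YAML file and hands back connection details.
    require 'yaml'

    module SongkickExample
      module Config
        # Pick the environment from the usual places, defaulting to development.
        def self.environment
          ENV['SONGKICK_ENV'] || ENV['RAILS_ENV'] || 'development'
        end

        # Load config/database.yml and return the block for the current environment.
        def self.database_config(path = 'config/database.yml')
          YAML.load_file(path).fetch(environment)
        end
      end
    end

    # A background process (or the Rails app) can then do something like:
    #   ActiveRecord::Base.establish_connection(SongkickExample::Config.database_config)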

This was a really useful step forward, since it let us take all our background process code out of the Rails app and deploy it separately. Each project just included Core and Domain as submodules, and through them got access to all our infrastructure and business logic. Happy days. It was a great feeling not to have all that stuff gunking up our website’s codebase, and it meant the website didn’t need re-testing and re-deploying quite so often.

Step 2: encourage small projects

One great way to keep development sustainable is to favour small components: libraries with focused responsibilities that you can easily reuse. Encouraging this style of development means making it easy to develop, integrate, and deploy many small components rather than one big ball. The easier this is, the more likely people are to create such libraries.

Unfortunately, despite our restructured codebase, this was nowhere near easy enough. Using git submodules meant that any time Core or Domain changed, we had to bump those submodules in every downstream project, then re-test and re-deploy them all. We needed something more dynamic that would ease this workload.

The first thing we tried was Rubygems. We started packaging Core as a gem and a Debian package, which is how we distribute all the libraries we rely on. We thought that by using semantic versioning we could force ourselves to pay better attention to our API design. This turned out to be wishful thinking: Core is a component on which everything depends, and it has to change fairly frequently. It’s the sort of thing that should be deployed from git by Capistrano, not through formal versioning and apt-get. The fact that it was now a globally installed library also made it really hard to test changes and roll them out incrementally. Long story short, we ended up at version 0.3.27 before giving up on this system.
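
For the curious, a gemspec in that style looked roughly like the sketch below. The exact fields and dependencies are invented for illustration; the point is that every change meant hand-bumping a semantic version and rebuilding a package.

    # Hypothetical songkick-core.gemspec, for illustration only.
    # Standard Rubygems packaging, with a version we bumped by hand
    # for every change, which is how we crawled up to 0.3.27.
    Gem::Specification.new do |s|
      s.name    = 'songkick-core'
      s.version = '0.3.27'    # MAJOR.MINOR.PATCH, bumped manually
      s.summary = 'Shared infrastructure: config, databases, queues, logging'
      s.authors = ['Songkick']
      s.files   = Dir['lib/**/*.rb']

      s.add_dependency 'activesupport', '~> 2.2'
    end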

(I can already hear everyone saying we should have used Bundler. Another consequence of the time we started the project is that we run Rails 2.2 on Ruby 1.8.7 and Rubygems 1.3.x, and making Bundler work has proved more trouble than it’s worth. Upgrading Rails and Ruby is, let’s say, Somewhat Difficult, especially with the volume of code and libraries we have, and at a startup there’s always something more urgent to do. These days we have a bunch of apps and services running on modern Ruby stacks, but it’s still not pervasive. Part of this process is about decoupling things so we can change their runtimes more easily.)

Step 3: tools, tools, tools

So we needed a migration path to get to a more sustainable model. In 2011 we built a dependency tracker called Janda (don’t ask) to make it easier to manage and encourage lots of small projects. It was based on a few key ideas borrowed from Bundler and elsewhere:

  • Every project declares which others it depends on
  • Circular dependencies are not allowed
  • Dependencies can be loaded from a global location or vendored inside the project
  • A project cannot load code from anything not in its dependency graph
  • Versioning is done with git
  • Builds are run by checking dependencies out into the project itself, and the system tracks which versions of components have been tested together
  • We only deploy one version of each project to production at any time
  • The deployment system makes sure the set of versions we deploy are mutually compatible, based on build results

This gave us several important things: a system for dynamically locating and loading dependencies, which let us stop using submodules and manually updating them; a dependency-aware build and deployment system that made it easy to check what needed testing as a result of every change; and a framework imposing some light restrictions on how code could be structured.
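
Janda is an internal tool, so the sketch below is not its real code; it’s just a minimal, self-contained illustration of two of the rules above, namely that every project declares its dependencies and that circular dependencies are rejected. The project names and the DEPENDENCIES hash are made up for the example.

    # Illustrative sketch, not Janda's actual implementation: projects
    # declare what they depend on, and we refuse any graph with a cycle.
    DEPENDENCIES = {
      'songkick-website' => ['songkick-domain', 'songkick-core'],
      'songkick-domain'  => ['songkick-core'],
      'songkick-core'    => []
    }

    # Depth-first walk that raises if it revisits a project already on the
    # current path, i.e. a circular dependency.
    def check_acyclic(project, graph, path = [])
      if path.include?(project)
        raise "Circular dependency: #{(path + [project]).join(' -> ')}"
      end
      graph.fetch(project, []).each do |dep|
        check_acyclic(dep, graph, path + [project])
      end
    end

    DEPENDENCIES.each_key { |project| check_acyclic(project, DEPENDENCIES) }
    puts 'Dependency graph is acyclic'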

Building this tool exposed dozens of places in our code where we had implicit and circular dependencies we weren’t aware of. To make our software work with this system, we had to get it into better shape through refactoring. This process itself led to several new libraries being extracted so they could be safely shared and tracked. It was a big step forward, and helped us ship code faster and with more confidence.

Step 4: break the dependencies

That probably sounds like a weird thing to say after spending all that effort on a dependency tracker. But in truth it was always going to be an interim measure; we want to be using the same Ruby toolchain as everyone else, because it’s just easier that way. Plus, we have mounting pressure in other areas. Domain is still a big project, full of dozens of classes that know too much about each other. Every ActiveRecord model we have is free to interact with the others. It’s hard to change Domain without breaking something downstream, and it’s making it harder for us to split our monolithic database into chunks that can scale independently. All familiar scaling woes.

So, since late last year we’ve been working on the current stage of growing our codebase: replacing all our couplings to ActiveRecord, and the Domain project as a whole, with web services. We have a handful of services that expose JSON representations of various facets of our domain. One service handles concert data, one handles user accounts, one deals with uploaded media, and so on. Long-term, the aim is to get to a stage where we can change the internals of these services – both their code and their datastores – independently of each other, and independently of the apps that use them.
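
To give a flavour of what one of those JSON-speaking services looks like from a client’s point of view, here’s a minimal sketch using Ruby’s standard library. The host, path and field names are invented for this example rather than taken from our real concerts service.

    # Minimal client-side sketch: fetch a concert's JSON representation
    # over HTTP instead of loading it through the shared Domain code.
    # The URL and fields below are invented for illustration.
    require 'net/http'
    require 'json'
    require 'uri'

    def fetch_concert(concert_id)
      uri = URI.parse("http://concerts-service.example.com/concerts/#{concert_id}")
      response = Net::HTTP.get_response(uri)
      raise "Concerts service returned #{response.code}" unless response.is_a?(Net::HTTPSuccess)
      JSON.parse(response.body)
    end

    # Where the Rails app used to call Concert.find(123) on an ActiveRecord
    # model, a client can now work with plain data:
    #   concert = fetch_concert(123)
    #   concert['displayName']   # => "Pulp at Brixton Academy"
    #   concert['date']          # => "2012-08-31"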

These services put an explicit, stable boundary layer into our stack that makes it easier to work independently on the components on either side of the line. They reduce coupling, because apps now make HTTP calls to stable, language-agnostic APIs rather than loading giant globs of Ruby code, and they simplify deployment: if you change a service, you don’t need to restart all its clients, since there’s no code for them to reload.

Enough pontificating, show us the code!

We’re going to get into the details of how we’re implementing this in later articles. There’s a lot we can talk about, so if you have any questions you should drop us a line on Twitter.
