Ingredients for a healthy Android codebase

Getting started in Android development is pretty straightforward: there are plenty of tutorials and plenty of documentation provided by Google. But Google will teach you to build a tent, not a solid, sustainable house. As it’s still a very young platform with a very young community, the Android world has been lacking direction on how to properly architect an app. Recently, some teams have started to take the problem more seriously, under the shiny tagline “Clean architecture for Android”.

At Songkick, we had the chance to rebuild the Android client from scratch 7 months ago. The previous version was working very well, but the codebase had not been touched for almost 3 years, leaving us with old practices, old libraries, and Eclipse. We wanted to set a good direction, so we spent a week designing the general architecture of the app, trying to apply the following principles from Uncle Bob’s clean architecture:

Systems should be

  • Independent of Frameworks. The architecture does not depend on the existence of a particular library. This allows you to use such frameworks as tools, rather than having to design your system around their limited constraints.
  • Testable. The business rules can be tested without the UI, Database, Web Server, or any other external element.
  • Independent of UI. The UI can change easily, without changing the rest of the system. A Web UI could be replaced with a console UI, for example, without changing the business rules.
  • Independent of Database. You can swap out Oracle or SQL Server, for Mongo, BigTable, CouchDB, or something else. Your business rules are not bound to the database.
  • Independent of any external agency. In fact your business rules simply don’t know anything at all about the outside world.

…and this is what we ended up with:

[Diagram: the app’s layered architecture]

Layers

Data layer

The data layer acts as a mediator between data sources and the domain logic. It should be a pure Java layer. We divide the data layer into different buckets following the repository pattern. In short, a repository is an abstraction layer that isolates business objects from the data sources.

For example, a repository can expose a searchArtist() method, but the domain layer will not (and should not) know where the data comes from. One day we could swap the data source from a database to a web API and the domain layer would not see the difference.
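
As a minimal sketch – the class and method names here are illustrative, not our exact code – a repository might look like this:

// A pure Java abstraction over the data sources; no Android imports.
// Observable is rx.Observable from RxJava (see “Communication” below).
public interface ArtistRepository {
  Observable<List<Artist>> searchArtist(String query);
}

// One possible implementation, backed by the REST API. A database-backed
// implementation could be swapped in without the domain layer noticing.
public class ApiArtistRepository implements ArtistRepository {
  private final SongkickApi api; // hypothetical API client

  public ApiArtistRepository(SongkickApi api) {
    this.api = api;
  }

  @Override
  public Observable<List<Artist>> searchArtist(String query) {
    return api.searchArtist(query);
  }
}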

When the data source is the Songkick REST API, we usually follow the format of the endpoint to know where data access belongs. That way we have a UserRepository, an ArtistRepository, an EventRepository, and so on.

Domain layer

The role of the domain layer is to orchestrate the flow of data and offer its services to the presentation layer. The domain layer is application specific: this is where the core business logic belongs. It is divided into use cases. A use case should not be directly linked to any external agency, and it should also be a pure Java layer.
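
In sketch form – again with illustrative names – a use case might look like this:

// Pure Java: orchestrates repositories and offers a single operation
// to the presentation layer.
public class SearchArtistUseCase {
  private final ArtistRepository repository;

  public SearchArtistUseCase(ArtistRepository repository) {
    this.repository = repository;
  }

  public Observable<List<Artist>> execute(String query) {
    return repository.searchArtist(query);
  }
}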

Presentation layer

At the top of the stack, we have the presentation layer which is responsible for displaying information to the user.

That’s where things get tricky, because of one class: Activity.

When I started developing for Android, I found that an Activity is a very convenient place where everything can happen:

  • it’s tied to the view lifecycle
  • it can receive user inputs
  • it’s a Context so it gives access to many data sources (ContentResolver, SharedPreferences, …)

On top of that, most of the samples provided by Google have everything in an Activity, so what could go wrong? If you follow that pattern, I can guarantee that your Activity will be huge and untestable.

We decided to treat our activities/fragments as views and to make them as dumb as possible. The view-related logic lives in presenters that communicate with the domain layer. Presenters should only contain simple logic related to the presentation of the data, not to the data itself.
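
A minimal sketch of that split, with hypothetical names (ArtistViewModel is explained in the next section):

// The Activity/Fragment implements this interface and stays dumb.
public interface ArtistSearchView {
  void showArtists(List<ArtistViewModel> artists);
  void showError();
}

// The presenter owns the view-related logic and talks to the domain layer.
public class ArtistSearchPresenter {
  private final SearchArtistUseCase useCase;
  private ArtistSearchView view;

  public ArtistSearchPresenter(SearchArtistUseCase useCase) {
    this.useCase = useCase;
  }

  public void attach(ArtistSearchView view) { this.view = view; }
  public void detach() { this.view = null; }
}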

Models vs. View models

This architecture moves a lot of logic away from the presentation layer, but there is one last thing we haven’t considered: models. The models we get from the data sources are very rarely what we want to display to the user. It’s very common to do some extra processing just before binding the data to the view. We’ve seen apps with 300 lines of code in onBindViewHolder(), resulting in very slow view recycling. This is unacceptable: why would you add that overhead on the main thread? Why not move it to the same background thread you used to fetch the data?

In the Songkick Android app, the presentation layer barely knows what the original model is. It only deals with view models. A view model is the view representation of the content fetched by the data layer. In the domain layer, each use case has a transformer that converts models to view models. To respect the clean architecture rules, the presentation layer provides the transformer to the domain layer, and the domain layer uses it without really knowing what it does.

So say that you have the following Artist model:

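Something along these lines – the exact fields are an assumption:

// Model as fetched by the data layer.
public class Artist {
  private final long id;
  private final String name;
  private final String onTourUntil; // null when the artist is not on tour

  public Artist(long id, String name, String onTourUntil) {
    this.id = id;
    this.name = name;
    this.onTourUntil = onTourUntil;
  }

  public long getId() { return id; }
  public String getName() { return name; }
  public String getOnTourUntil() { return onTourUntil; }
}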

If we just want to show the name and whether the artist is on tour, our ArtistViewModel is as follows:

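A sketch of what that could look like:

// View model: exactly what the view needs, computed off the main thread.
public class ArtistViewModel {
  private final String name;
  private final boolean onTour;

  public ArtistViewModel(String name, boolean onTour) {
    this.name = name;
    this.onTour = onTour;
  }

  public String getName() { return name; }
  public boolean isOnTour() { return onTour; }
}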

So that we can efficiently bind it to our view:

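For example, in a hypothetical adapter:

// Binding is now trivial: no model processing happens on the main thread.
@Override
public void onBindViewHolder(ArtistViewHolder holder, int position) {
  ArtistViewModel artist = artists.get(position);
  holder.name.setText(artist.getName());
  holder.onTourBadge.setVisibility(artist.isOnTour() ? View.VISIBLE : View.GONE);
}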

Communication

To communicate between these layers, we use RxJava (sketched below) by:

  • exposing Observables in repositories
  • exposing methods to subscribe/unsubscribe to an Observable that emits ViewModels in the use case
  • subscribing/unsubscribing to the use case in the Presenter
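
A rough sketch of that wiring, assuming RxJava 1.x (the transformer is the model-to-view-model converter described above, provided by the presentation layer):

// In the use case: fetch on a background thread, map models to view
// models with the transformer, and deliver results on the main thread.
public Observable<List<ArtistViewModel>> execute(String query) {
  return repository.searchArtist(query)
      .map(transformer) // a Func1<List<Artist>, List<ArtistViewModel>>
      .subscribeOn(Schedulers.io())
      .observeOn(AndroidSchedulers.mainThread());
}

// In the presenter (lambdas for brevity):
subscription = useCase.execute(query)
    .subscribe(artists -> view.showArtists(artists),
               error -> view.showError());

// ...and when the view goes away:
subscription.unsubscribe();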

Structure

To structure our app we are using Dagger in the following way:

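In sketch form, assuming Dagger 2 (the module names are invented for this example; the same split works with scoped Dagger 1 graphs):

// Application scope: one stateless repository shared by every Activity.
@Module
public class AppModule {
  @Provides @Singleton
  ArtistRepository provideArtistRepository(SongkickApi api) {
    return new ApiArtistRepository(api);
  }
}

// Activity scope: a fresh use case and presenter per Activity/Fragment.
@Module
public class ArtistSearchModule {
  @Provides
  SearchArtistUseCase provideUseCase(ArtistRepository repository) {
    return new SearchArtistUseCase(repository);
  }

  @Provides
  ArtistSearchPresenter providePresenter(SearchArtistUseCase useCase) {
    return new ArtistSearchPresenter(useCase);
  }
}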

Repositories are unique per application as they should be stateless and shared across activities. Use cases and presenters are unique per Activity/Fragment. Presenters are stateful and should be linked to a unique Activity/Fragment.

We are also trying to follow the quote by Erich Gamma:

“Program to an interface, not an implementation”

  • It decouples the client from the implementation
  • It defines the vocabulary of the collaboration
  • It makes everything easier to test

Testing

Most of the pieces in this stack are pure Java classes, so they should be ready for unit testing without Robolectric. The only bit that needs Robolectric would be the Activity/Fragment.

We usually prefer testing the presentation layer with pure UI tests using Espresso. The good thing is that we can just mock the data layer to expose observables emitting entities from a JSON file and we’re good to go:

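A sketch of the idea – the fixture helper here is hypothetical:

// The test build wires in a fake repository that replays canned JSON.
public class FakeArtistRepository implements ArtistRepository {
  @Override
  public Observable<List<Artist>> searchArtist(String query) {
    // readFixture(...) is a hypothetical helper that parses a JSON file
    // into entities using the app’s own deserialisation code.
    return Observable.just(readFixture("artists.json"));
  }
}

// The Espresso test then runs against deterministic data
// (Espresso’s static imports assumed):
onView(withId(R.id.search_box)).perform(typeText("Mogwai"));
onView(withText("Mogwai")).check(matches(isDisplayed()));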

Of course there are drawbacks to only testing the domain and presentation layers without checking that they are compliant with the external agencies, but we generally found tests to be much more stable and accurate with this pattern. End-to-end tests are also valuable, and we could imagine adding a separate category that runs through some important user journeys by providing the default sources to our data layer.

Conclusion

We’ve now been running the new app for 4 months and it has proved very stable and very maintainable. We’re also in a great place, with good test coverage across both unit and UI tests, and the codebase has scaled well as we’ve added new features.

Although it works for us, we are not saying that everyone should go for this architecture. We’re just at the first iteration of “Clean architecture” for Android, and are looking forward to seeing what it will be in the future.

Here’s a link to the talk I gave about the same topic: https://youtu.be/-oZswd1j5H0 (slides: https://speakerdeck.com/romainpiel/ingredients-for-a-healthy-codebase)

References

Uncle Bob’s clean architecture: http://fernandocejas.com/2014/09/03/architecting-android-the-clean-way
https://github.com/android10/Android-CleanArchitecture
Martin Fowler – The repository pattern
Erich Gamma – Design Principles from Design Patterns

Move fast, but test the code

At Songkick we believe code only starts adding value when it’s out in production, and being used by real users. Using Continuous Deployment helps us ship quickly and frequently. Code is pushed to Git, automatically built, checked, and if all appears well, deployed to production.

Automated pipelines make sure that every release goes through all of our defined steps. We don’t need to remember to trigger test suites, and we don’t need to merge features between branches. Our pipeline contains enough automated checks for us to be confident releasing the code to production.

However, our automated checks are not enough to confirm if a feature is actually working as it should be. For that we need to run through all our defined acceptance criteria and implicit requirements, and see the feature being used in the real world by real users.

In a previous life we used to try to perform all of our testing in the build/test/release pipeline. Not only was this slow and inefficient, depending on lots of different people being available at the same time, but we often found that features behaved very differently in production. Real users do unexpected things, and it’s difficult to create truly realistic test environments.

Our motivation to get features out to real users as quickly as possible drove our adoption of Continuous Deployment. Having manual acceptance testing within the release pipeline slowed us down and made the process unpredictable. It was hard to define a process that relied on so many different people. We treated everyday events, such as meetings and other work priorities, as exceptional, which made things even more delay-prone and frustrating.

Eventually we decided that the build and release pipeline must be fully automated. We wanted developers to be able to push code and know that if Jenkins passed the build, it was safe for them to deploy to production. Attempting to automate all testing is never going to be achievable, or desirable. Firstly, automated tests are expensive to build and maintain. Secondly, testing, as opposed to checking, is not something that can be automated.

When we check something we are comparing the system against a known outcome: for example, checking that a button launches the expected popup when clicked, or that a date displays in the specified format. Things like this can be, and should be, automated.

Testing is more involved and relies on a human making a judgement. Testing involves exploring the system in creative ways in order to discover the things that you forgot about, the things that are unexpected or difficult to completely define. It’s hard to predict how time and specific data combinations will affect computer systems; testing is a good way to uncover what actually happens. Removing the constraint of needing fully defined expected outcomes allows us to explore the system as a user might.

In practical terms this means running automated checks in our release pipeline and performing testing before code is committed, and post release. Taking testing out of the release pipeline removes the time pressures and allows us freedom to test everything as deeply as we require.

Songkick’s Test and Release Process

Small, informal meetings called kick-offs help involve everyone in defining and designing the feature. We discuss what we’re building and why, plan how to test and release the code, and consider ways to measure success. Anything more complicated than a simple bug fix gets a kick-off before we start writing code. Understanding the context is important in helping us do the right thing: if we know there are deadlines or business risks associated with a feature, we’re likely to act differently than in a situation that has technical risks.

Coming out of the kick-off meeting we know how risky we consider the feature to be. We will have decided on the best approach to testing and releasing the code. As part of developing the feature we’ll also write or update our automated checks to make sure we don’t break the feature further down the line. Our process is intentionally flexible to allow us to treat each change appropriately depending on risk and need to ship.

Consider a recently released feature to store promoter details against ticket allocations as an example. The feature kick-off meeting identified risks and we discussed what and how to test. We identified ways to break the work down into smaller pieces that could be developed and released independently, each hidden behind a feature flipper to keep it invisible to real users.

Developers and testers paired together to decide on specific areas to test. The tester’s testing expertise and the developer’s deep understanding of the code fed into an informal collection of test ideas based on risk. Usually these are represented in a visual mind map for easy reference.

The developers, guided by the mind map, tested the feature and added automated unit and integration tests as they went. Front-end changes were overseen by a designer working closely with one of the developers to come up with the best feasible design. Once we had all the pieces of the feature, the whole team jumped in to do some testing and to update our automated acceptance tests.

The feature required a bit of data backfilling, so the development team were able to use the functionality in production, in the ways we expect real users to use it. Of course we found some bugs, but by working with small releases we were able to quickly locate the source of each problem. Fast release pipelines allow fixes to be deployed within minutes, making the cost of most bugs tolerably low.

Once the feature had been fully released and switched on for all users we used monitoring to check for unexpected issues. Reviewing features after a week or two of real world usage allows us to make informed decisions about the technical implementation and user experience. Taking the time to review how much value features are adding allows us to quickly spot and respond to problems.

Testing a feature involves many experts. Testers are on hand to aid the developers in their testing, often by creating a mind map of test ideas to guide it. We try to use our previous experience of releasing similar features to focus the testing on areas that are typically complex or easy to break. Designers and UX people get involved to make sure the UX works as hoped, and that the design looks good on all our supported devices and browsers. Product managers make sure the features actually do what they want them to do. High-risk features get additional deep testing from the test team, and in certain cases we throw in some focused performance or security testing.

Most of our bugs come from forgetting use cases or not understanding existing functionality in the system. Testing gives us a chance to use the system in an investigative way to hopefully find these bugs. Moving testing outside of our release pipeline gives us space to perform enough testing for each feature whilst maintaining a fully automated, and fast, release pipeline.

Apple tvOS Tech Talks, London 2016

Apple tvOS Tech Talks
London 2016
by Michael May

As part of Apple’s plan to get more apps onto the Apple TV platform, they instigated one of their irregular Tech Talks World Tours. It came to London on January 11th 2016, and I got a golden ticket to attend the one-day event.

The agenda for the day was

Apple TV Tech Talks Kickoff
Designing for Apple TV
Focus Driven Interfaces with UIKit
Break
Siri Remote & Game Controllers
On-Demand Resources & Data Storage
Lunch
Media Playback
Leveraging TVML for Media Apps
Best Practices for Designing tvOS Apps
Break
Tuning Your tvOS App
Making the Most Out of the Top Shelf
App Store Distribution
Reception

All sample code was in Swift, as you might expect, but they made a point of saying that you can develop tvOS apps in Objective-C, C++, and C too. I think these matter especially for the gaming community, where frameworks such as Unity are so important (despite Metal and SpriteKit).

I won’t go through each session, as I don’t think that would serve any useful purpose (the videos will be released, so I am told). Instead I’ll expand on some of my notes from the day, as they were the points I found interesting.

The day started with a brief intro session that included a preamble about how TV is so entrenched in our lives and yet so behind the times. This led into a slide that simply said…

“The Future of TV is Apps”

That’s probably the most bullish statement of intent that I’ve heard from Apple, so far, about their shiny new little black box. I think that if we can change user behaviour in the coming months and years then I might agree (see my piece at the end).

Then they pointed out that, as this is the very first iteration of this product, there are no permutations to worry about – the baseline for your iOS app might be an iPhone 4S running iOS 8 but for tvOS it’s just the latest and greatest – one box, one OS.

This is a device for which you can assume

  • It is always connected (most of the time)
  • It has a high speed connection (most of the time)
  • It has a fast dual-core processor
  • It has a decent amount of memory
  • It has a decent amount of storage (and mechanisms for maintaining that)

They then went on to explain that the principles for a television app are somewhat different from a phone app. Apple specifically called out three principles that you should consider when designing your app.

  • Connected
    Your users must feel connected to the content of your app. As your app is likely some distance from the user, with no direct contact between finger and content, this is a different experience from touching the glass of an iPhone UI.
  • Clear
    Your app should be legible and the user should never get lost in the user interface. If the user leaves the room for a moment then comes back, can they pick up where they left off?
  • Immersive
    Just like watching a movie or TV series, your app should be wholly immersive whilst on-screen.

If you had said these things to me casually, I would probably have said, “well, yeah, obviously”, but having them spelled out gives you pause for thought:

“If I did port my app, how would I make an experience that works with the new remote and also makes sense on everything from a small flat-screen in a studio flat to an insanely big projector in a penthouse?”

Add to that the fact that the TV is a shared experience – from watching content together to different people using your app at different times – and it’s clearly not the intimate experience we have learned to facilitate on iOS. It should still be personal, but it’s not personal to the same person all the time. Think of Netflix with its user picker at startup, or the tvOS AirBnB app with its avatar picker at the bottom of the screen.

Next was the Siri Remote and interactions via it. This is one complex device packed into a deceptively small form factor – from the microphone to the trackpad, gyroscope and accelerometer, this is not your usual television remote. We can now touch, swipe, swing, shake, click and talk to our media centre. The exciting thing for us as app developers is that almost all of this is open for us to use, either out of the box (for apps) or as custom interactions from raw event streams (particularly useful for games).

As you might expect from Apple, they were keen to stress that there are expectations for certain buttons that you should respect – specifically, the menu and play/pause buttons. I like that they are encouraging conformity – it’s very much what people expect from Apple – but I found it a bit silly when they demonstrated how one might use the remote in landscape as a controller for a racing game. This, to me, felt a bit like dogma. If you want this to become a great gaming device, accept the natural limitations of the remote and push game controllers as the right choice here. Instead they kept going on about the remote and controllers being first-class citizens in all circumstances.

Speaking to an indie game developer friend about the potential of the device, he said that he would really like three things from Apple, at least, before hopping on board:

  • Stats on Apple TV sales to evaluate the size of the market
  • A games pack style version that comes with two controllers to put the device on a par with the consoles
  • Removal of the requirement to support the remote as an option in games. Trying to design a game that must also work with the remote is just too limiting and hopefully Apple will realise this as they talk to more games companies.

A key component of the new way of interacting with tvOS (versus iOS) is the inability to set the focus for the user. Instead you guide the “focus engine” as it changes the focus for the user, in response to their gestures. This gives uniformity, again, and also means that apps cannot become bad citizens and switch the focus out from under the user. One could imagine the temptation to do this being hard to resist for some kinds of apps – breaking news or the latest posts in a social stream, perhaps.

Instead you use invisible focus guides between views and focusable properties on views to help the engine know what the right thing to do is. At one point in the presentations the speaker said

“Some people think they need a cursor on the Apple TV…they are wrong”

It seems clear to me that the focus engine is designed specifically to overcome this kind of hack, and that it is a much better solution. If you’ve ever tried to use the cursor remote on some “Smart” TVs then you’ll know how that feels. If not, imagine a mouse with a low battery after one too many happy-hour cocktails.

With the expansive but still limited resources of the Apple TV hardware, there will be times when there simply is not enough storage for everything that the user wants to install. The same, in fact, already holds true for iOS. Putting aside my rant about how cheap memory and storage are, and how much Apple cash in on both by making them premium features, their solution is On-Demand Resources (ODR).

With ODR you can mark resources as being one of three types, which changes when, and if, they are downloaded, and how they may be purged under low resource conditions. Apple want you to bundle your resources (images, videos, data, etc, but not code) into resource packs and to tag them. You tag them as one of:

  • Install
  • Prefetch
  • Download only on demand

Install resources come bundled with the app itself (splash screen, on-boarding, first levels, etc). Prefetch resources are downloaded automatically, but after launching the app. Download-only-on-demand resources are, as you might expect, fetched on demand by the app. On-demand resources can be purged, using various heuristics as to how likely a purge is to affect the user/app – things like last accessed date and priority flags.

Although not talked about that much, as far as I can tell, to me TVML is one of the big stories of tvOS. Apple have realised that writing a full-blown native app is both expensive and overkill for some. If you’re all about content then you probably need little more than a grid of content to navigate, a single content drill-down view and some play/pause of that streaming content. TVML gives you an XML markup language, powered by a JavaScript engine, that vends native components in a native app. It can interact with your custom app code too, through bridges between the JavaScript DOM and the native wrapper. This makes a lot of sense if you are Netflix, Amazon Prime Video, Mubi, Spotify or, as they pointed out, Apple Music and the tvOS App Store.

It’s highly specific, but it’s specific to exactly the type of content company Apple so desperately need to woo – companies likely wondering if they can afford to commit time and effort to an untested platform. As we’ve seen with watchOS 2, developers are feeling somewhat wary of investing a lot of time in new platforms when they also have to maintain their existing ones, start moving to Swift, adopt the latest iOS 9 features, and so on.

I think this is a big deal because what Apple are providing is what so many third parties have been offering for years, to differing degrees of success. This is their Cordova, their PhoneGap or, perhaps most closely, their React Native. This is a fully Apple approved, and Apple supported, hybrid app development solution that your tranche of web developers are going to be able to use. If this ever comes to iOS it could open up apps to developers and businesses that just cannot afford a native app team, or the services of an app agency (assuming your business is all about vending content and you can live with a template look and feel). I think this could be really big in the future and in typical Apple fashion they are keeping it very low key for now.

They kept teasing that we were all there to find out how to get featured (certainly people were taking more photos there than anywhere else), but before that they spoke about tuning your apps for the TV. This included useful tricks and tips for the well-documented frustrations of trying to enter text with the tvOS remote (make sure to mark email fields as such – Apple will offer a recently used email list if you do), and examples of using built-in technologies to share data instead of asking the user to do work.

To the delight of my friends who work there, they demonstrated the Not On The High Street app and its use of Bonjour to discover the user’s iPhone/iPad and push the product they want to sell into the basket of the app on that platform. From there the user can complete their purchase very quickly – something that would be fiddly to do on the TV (slow keyboard, no Apple Pay, no credit card scanner).

Next came another feature that I think could hint at new directions for iOS in the future – the top shelf. If the user chooses to put your app in the top row of apps then, when it’s selected, that app gets to run a top shelf extension that populates the shelf with static or dynamic image content. This is the closest thing to a Windows Phone live tile experience that we’ve seen so far and, as I say, I think it could signpost a future “live” experience for iOS too. A blend of a Today Widget and a Top Shelf Widget could be very interesting.

Finally came the session they had been promising: App Store Distribution. The key take-aways for me were:

  • Don’t forget other markets (after the US the biggest app stores are Japan, China, UK, Australia, Canada and Germany)
  • Keep your app title short (typing is hard on tvOS)
  • Spend time getting your keywords right (and avoid wasting space with things like plurals)
  • Let Apple know 3-4 weeks before a major release of your app (appstorepromotion@apple.com)
  • Make your app the very best it can be and mindful of the tvOS platform

Then it was on to a reception with some delicious canapés and a selection of drinks. That wasn’t what made it great, though. What made it great were all the Apple people in the room, giving time to everyone who wanted it. This was not the Apple of old, and it was all the better for it. The more of this kind of interaction they can facilitate, the stronger their platform will be for us.

The Future of TV is Apps?

I think the future of consumer electronics is a multi-screen ecosystem where the user interface and, of course, the form factor itself follow the function they serve.

Clearly, the television could become a critical screen in this future. I believe that, even as we get new immersive entertainment and story-telling options (virtual reality, 3D, and who knows what else), the passive television experience will persist. Sometimes all you want to do is just sit back and be entertained with nothing more taxing than the pause button.

A TV with apps allows this but also, perhaps, makes this more complex. When all I want to do is binge on Archer, a system with apps might not be what I want to navigate. That being said, if all I want to do is binge on Archer, and this can be done with a simple “Hey Siri, play Archer from my last unplayed episode”, then it’s a step ahead of my passive TV of old. It had better know I use Netflix and it had better not log me out of Netflix every few weeks like the Fire TV Stick does.

If I then get a notification (that hunts for my attention from watch to phone to television to who knows what else) that reminds me I have to be in town in an hour and that there are problems on the Northern Line so I should leave extra time, I might hit pause, grab my stuff and head out. As I sit on the tube with 20 minutes to kill, I might then say “Hey Siri, continue playing Archer”.

Just as I get to my appointment I find my home has noticed a lack of people and gone into low power mode, via a push notification. If I want, I can quickly reply with my expected arrival home time, so that it can put on the heating in time and also be on high alert for anyone else in my house during that period.

I suspect most of these transactions are being powered by apps, not the OS itself, but I do not expect to interact with the apps in most cases anymore. Apps will become simply the containers for the means of serving me these micro-interactions as I need/want them.

One only has to look at the Media Player shelf, Notification Actions, Today Widgets, Watch Apps, Glances, Complications, 3D Touch Quick Actions, and now the tvOS Top Shelf to see that this is already happening and will only increase as time goes on. Your app will power multiple screen experiences and be tailored for each, with multiple view types and multiple interactions. Sometimes these will be immersive and last for minutes or hours (games, movie watching, book reading, etc) but other times these will be micro-interactions of seconds at most (reply to a tweet, check the weather, plan a journey, start a music stream, buy a ticket, complete a checkout). Apps must evolve or die.

That situation is probably a few years off yet, but in the more immediate term, if we want the future of TV to be apps (beyond simply streaming content) then users will need to be persuaded that their TV can be a portal to a connected world.

From playing games to checking the weather to getting a travel report, these are all things for which an app-powered TV could be very useful. It’s frequently on, always connected, and has a nice big screen on which to view what you want to know. Whether users find this easier than picking up their iPhone or iPad remains to be seen.

I think Apple see the Apple TV as a Trojan horse. Many years ago, Steve Jobs introduced the iMac as the centre of your digital world; a hub into which you plugged things. I think the Apple TV is the new incarnation of that idea – except the cables have now gone (replaced with the likes of HomeKit, AirPlay and Bonjour), the storage is iCloud and the customisation is through small, focused apps, and not the fully fledged applications of old.

It’s early days and if the iPhone has taught us anything it’s that the early model will rapidly change and improve. Where it actually goes is hard to say, but where it could go is starting to become clear.

Is the future of the TV apps? Probably so, but probably not in the way we think of apps right now. The app is dying, long live the app.

Recent talks on Songkick Engineering

Since I joined Songkick a little over four years ago, our development team has done some amazing things. Our technology, process and culture have improved an enormous amount.

We’ve always been eager to share our progress on this blog and elsewhere, and we often talk about what we’ve learned and where we are still trying to improve.

Here are some recent talks given by members of our team discussing various aspects of how we work.

How we do product discovery

A few weeks ago, I gave a talk at the Future of Web Apps conference on how we do product discovery at Songkick. I had such an overwhelming response to it that I thought it might be useful to share it with the rest of the world.

Apologies for the whack formatting, but SlideShare doesn’t support Keynote files and I didn’t have time to redo the slides in PowerPoint to include the notes in a better way.

I’d love to hear how you guys go about product discovery and any tips / tricks on how to do it better.

Testing iOS apps

We recently released an update to our iPhone app. The app was originally developed by a third-party, so releasing an update required bringing the app development and testing in-house. We develop our projects in a continuous build environment, with automated builds, unit and acceptance tests. It allows us to develop fast and release often, and we wanted the iPhone project to work in the same way.

This article covers some of the tools we used to handle building and testing the app.

Build Automation

We use Git for our version control system, and Jenkins for our continuous integration server. Automating the project build (i.e. building the project to check for compilation errors) seemed like a basic step and a good place to start.

A prerequisite to this was to create a Mac Jenkins Build Slave, which is outside the scope of this blog post (but if you’re interested, I followed the “master launches slave agent via SSH” instructions on the Jenkins site).

A quick search of the Jenkins plugins page revealed an Xcode plugin which allows for building Objective-C applications. Setting up the plugin was a snap: search for and install the “XCode integration” plugin from the Jenkins server plugin page, point the plugin at your project directory on the build slave, enable keychain access, and save.

Now, for every commit I made to the project, this task would automatically run and send me a rude email if compilation failed. In practice I found that this was an excellent way of reminding me of any files I had forgotten to check in to Git; the project would compile on my laptop but fail on the CI server due to missing classes, images, etc.

Unit testing

I looked briefly into the unit testing framework Apple provides, which ships with Xcode. I added a unit test project to the Songkick app, and looked into creating mocks using OCMock, an Objective-C implementation of mock objects.

We already have fairly extensive API tests covering specific iPhone-related user flows (such as signing up, tracking an artist, etc), and due to time constraints we opted to concentrate on building acceptance tests and revisit unit tests if we had time.

Acceptance Testing

There are a bunch of acceptance testing applications available for iOS apps. Here are a few of the tools I looked into in detail:

Frank

Frank is an iOS acceptance testing tool which supports a Cucumber-style test syntax. I was interested in Frank as we already make use of Cucumber to test our Ruby projects, so the familiarity of the domain-specific language would be a benefit.

I downloaded the project and got a sample test up-and-running fairly quickly. Frank ships with some useful tools, including a web inspector (“Symbiote”) which allows for inspecting app UI elements using the browser, and a “Frank console” for running ad-hoc commands against an iPhone simulator from the command line.

Frank seems to be a pretty feature-rich application. The drawbacks for me were that Frank could not be run on real hardware (as of March 2013, this appears to now be possible), and that Frank requires recompiling your application to make a special “Frankified” version to work with the testing framework.

Instruments

Apple provides an application called Instruments to handle testing, profiling and analysis of applications written with Xcode. Instruments allows for recording and editing UIAutomation scripts – runnable JavaScript test files for use against a simulated iOS app or a real hardware install.

Being able to launch your app with Instruments, perform some actions from within the app, and have those actions automatically converted into a runnable test script was a really quick and easy way of defining tests. Instruments also supports running scripts via the command line.

The drawback of test scripts created with Instruments is that they can be particularly verbose, and Instruments does not provide a convenient way of formatting and defining individual test files (outside of a single UIAutomation script per unique action).

Tuneup_js

Designed to be used as an accompaniment to UIAutomation scripts created using Instruments, Tuneup_js is a JavaScript library that helps to ease the pain of working with the long-winded UIAutomation syntax.

It provides a basic test structure for organising test steps, and a bunch of user-friendly assertions built on top of the standard ones supported by Instruments.

I found that recording tests in Instruments, and then converting them into the Tuneup_js test syntax was a really quick way of building acceptance tests for iOS apps. These tests could then be run using a script provided with the Tuneup_js package.

Scenarios

I settled on using Instruments and Tuneup_js to handle acceptance testing: Instruments because of the ability to quickly record acceptance test steps, and Tuneup_js because it could wrap recorded test steps into repeatable tests and allowed for a nicer test syntax than offered out of the box with UIAutomation. What was missing was a way to run the test files in an easily repeatable fashion, against the iOS simulator as well as hardware devices.

I couldn’t find an existing application to do this, so I wrote Scenarios (Scenar-iOS, see what I did there?) to handle the task. Scenarios is a simple Ruby console app that performs the following steps:

  • Cleans any previous app installs from the target test device
  • Builds the latest version of the app
  • Installs the app on the target test device
  • Runs Tuneup_js-formatted tests against the installed app
  • Reports the test results

Scenarios accepts command-line parameters, such as the option to target the simulator or a hardware device (with the option of auto-detecting the hardware, or supplying a device ID). Scenarios also adds a couple of extra functions on top of the UIAutomation library:

  • withTimeout – Can be used for potentially long-running calls (e.g. a button click to log in, where the API call may be slow):
    withTimeout(function(){
      app.mainWindow().buttons()["Login"].tap();
    });
  • slowTap – Allows for slowing down the speed at which taps are executed. Instruments can run test steps very fast, and sometimes it helps to slow down tests to see what they are doing, and to create a more realistic simulated user experience:
    app.toolbar().buttons()["Delete"].slowTap();

Scenarios ships with a sample project (app and tests) that can be run using the simulator or hardware. Here’s a video of the sample running on a simulator:

Jenkins Pipeline

Now that I had build and acceptance tests in place, it was time to hook them up to Jenkins. I created the following Jenkins projects:

  • “ios-app” – runs the build automation
  • “ios-app-acceptance-tests-simulator” – runs the app (via Scenarios) on a simulator
  • “ios-app-acceptance-tests-iPhone3GS” – runs the app (via Scenarios) on an iPhone 3GS

Committing a code change to the iOS app Git repo caused the projects in the Jenkins pipeline to build the app, run the acceptance tests against the simulator, and finally run the acceptance tests on an iPhone 3GS. If any stage of the pipeline failed, I received an email informing me I had broken something.

Manual testing with TestFlight

As well as the automated setup, we also made use of the excellent TestFlight service, which enables over-the-air distribution of apps to testers. We had 12 users and 16 devices set up in TestFlight, and I was releasing builds (often daily) over the air. It enabled us to get some real-user feedback on the app, something that build and acceptance tests cannot replace.

Jenkins also has a TestFlight plugin, which enables you to automatically deploy a build to TestFlight as part of the pipeline. Very cool, but as we were committing code changes often throughout the day (and only wanted to release to TestFlight once a day), we decided to skip this step for the time being.

Overall, I think that the tools (both open-source and proprietary) available today for automated testing of iOS apps are feature rich (even if some are still in their infancy), and I’m pretty happy with our development setup at Songkick.

Migrating to a new Puppet certification authority

At Songkick all our servers are managed using Puppet, an open-source configuration management tool. We use it in client-server mode and recently needed to replace the certification authority certificates on all our nodes. I couldn’t find much information on how to do this without logging onto every machine, so I’ve documented my method.

What is this Puppet CA anyway?

If you’re using puppet in its typical client-server or agent-master setup, then when the puppet master is first started it will create a certification authority (CA) that every client connecting to it must trust and be trusted by. This usually happens transparently, so often people aren’t aware that this certification authority exists.

The CA exists to establish trust between the agents and the master, so that an attacker cannot set up malicious puppet masters and tell puppet agents to do their bidding, and so that malicious clients cannot see configuration data for other clients. Agents should only connect to masters that have certificates signed by their CA, and masters should only send configuration information to clients that have certificates signed by the same CA.

There’s a more comprehensive explanation of Puppet SSL written by Brice Figureau which goes into far more detail than we have space for. The main thing to understand is that the CA is an important part of maintaining security and that you can only have one CA across a set of machines that access the same secured resources.

Why would I want to migrate to a new CA?

  • Your current CA certificate is about to expire. By default, CA certificates have a validity period of 5 years, so fairly early adopters of puppet will need to replace them.
  • You’ve had multiple CAs in use and need to consolidate on one.
  • You believe that your certificates and private keys are in the hands of people who could cause mischief with them.
  • You have fallen foul of bugs relating to the fact that you use a CA created in an older version of puppet.

It was in fact the second of these reasons that applied to Songkick; we’d previously been using multiple puppet masters, each with their own CA. We wanted to start using exported resources, stored in the same PuppetDB instance for all nodes. This meant that each master needed to be trusted by the same CA that signed the PuppetDB instance; hence we needed to consolidate on one CA.

How do I do it?

Set up new puppet master(s)

Set up at least one new puppet master server, with a new CA certificate.

If you have a lot of existing hosts managed by puppet, then it’s worth considering enabling the autosign option, even if only temporarily, as you’ll have a number of certificate requests to approve manually otherwise.

Configure agents to connect to the new master(s)

We’re assuming here that you’re managing the puppet agent configuration through puppet itself, and that changes to the puppet configuration cause an automatic restart of the puppet agent.

Change the configuration of your puppet agents, to connect to the new master(s) and use a different ssldir:

[main]
server = <new server hostname> 
ssldir = /var/lib/puppet/ssl2

Be careful not to apply this change to your newly created puppet master.

Your clients should reconfigure themselves, restart, and on startup connect to your new puppet master, forgetting their old SSL configuration, including the CA certificates.

If you have autodiscovery records for puppet in DNS, e.g. an A record for ‘puppet’ or the SRV records, then you should leave them in place for now. Agents that have not been migrated to the new CA may need it.

It is a good idea to test this on a handful of nodes and check that it works in a completely automated fashion before applying to every node.

Tidying up (part 1)

Once every node has checked in with the new master and been issued with a new certificate, it’s time to start the process of tidying up. It’s a good idea to revert back to using the default ssldir, so that when agents bootstrap themselves with the default config, they do not then switch to the new ssldir and thus forget their old certificates. This will cause the master to refuse to talk to them, as this looks like a spoofing attempt.

On each client, we mirror the new ssldir to the old one:

file { '/var/lib/puppet/ssl': 
  source => 'file:///var/lib/puppet/ssl2',
  recurse => true, 
  purge => true, 
  force => true, 
}

Be careful not to apply this change to your newly created puppet master.

Tidy up (part 2)

Once that’s shipped everywhere, we remove the ssldir configuration, falling back on the default ssldir, and remove the above resource definition that copies the ssldir.

Tidy up (part 3)

You can now update your autodiscovery DNS entries, to point to the new servers and remove the autosign configuration, if desired.

Finally, we ship a change to the clients that removes the temporary /var/lib/puppet/ssl2 directory.

And that’s it, everything has been migrated to the new CA, with no need to do anything outside of puppet.

Testing your database backups: the test environment database refresh pattern

When did you last try restoring your database backups? A month ago? A week ago? A year ago? Never? When was the last time you refreshed the data in your test environments? When I joined Songkick, one of the first things I asked was when we had last tested a restore of our database backups. The answer, pleasingly, was at 03:00 UK time that morning – and, not coincidentally, that’s also when we last refreshed the data in our test environments.

Here’s how we get the warm and fuzzy feeling of knowing that our backups contain data that can be restored and makes sense.

  1. Every morning, our database servers run their scheduled backups, copying the resulting images to a backup server in the data centre.
  2. Overnight those backups get copied to the office, giving us an offsite copy.
  3. In the small hours, when most of us are asleep, each of the database servers in our staging environment retrieves the backups, erases its local data files and then restores the production backups over the top of them.
  4. We perform sanitisation on the data, to make it suitable for use in a testing environment.
  5. And finally, and most importantly, we use the databases in our testing.

By doing this, we identified one case where our backups seemed to work and produced plausible-looking images, but MySQL failed to apply InnoDB log records during recovery. It was inconvenient to discover this problem in our staging environment, but far less inconvenient than discovering it when we needed the backups to put our production system back into operation.

Here are some practical tips based on our experience implementing and managing this system at Songkick:

Back all databases up at the same time

If your system is composed of services backed by independent databases on different machines, it’s possible that there’s some implicit consistency between them. For example, a common situation at Songkick is to have an accounts service responsible for storing user accounts, and another service that stores user data keyed against those users; there’s an expectation that those databases have some degree of consistency.

If you back them up at different times, you’ll find inconsistencies, such as a service holding a reference to a user that doesn’t yet exist. If the ID of the user is exposed to other services and that ID can be reused, you may find that newly created users in your test environment have existing data associated with them, and this can cause significant problems in testing.

It’s worth noting that, in the case of a production restore, these issues would need to be diagnosed and solved in the heat of the moment. By finding them in your test environment, you’re giving yourself the space to solve them earlier, under less pressure.

Design the backups to be regularly exercised

Some types of backups are more amenable to being restored regularly in test environments. For example, our initial MongoDB backups were snapshots of our MongoDB database path. These proved difficult to restore, because they included local databases which contained information on replica set membership. This meant that on startup, our staging MongoDB server would forget its existing replica set membership and try to talk to the production servers instead.

We switched to using mongodump to take a logical export of the database, simply so that we could restore it on the primary member of our existing staging replica set and update the entire replica set.

Sanitisation tips

After we’ve restored the databases, there are certain things we do to make them safe and usable in our testing environments.

  • Remove or obfuscate email addresses. We’re not fond of accidentally emailing people with test events we’ve created in staging, so we change people’s email addresses to be unusable, so that can’t happen. We leave people’s email addresses alone if they work at Songkick, so we can test email features by emailing ourselves.
  • Remove or obfuscate payment tokens. If it’s uncool to accidentally email people, accidentally charging them is positively hostile. Anything that’s used for payment needs to be removed.
  • Fix or replace information about the environment. It’s best to avoid keeping references to your technical environment in the same database as your application data, but sometimes that’s tricky to work around. For example, our MogileFS installation needs to be kept in sync with our production one, to avoid problems with missing media. This means that we need to manually update the database to substitute the hostnames of the mogilefs servers.

Write code that can withstand the database going away

Unless you’ve put some work in, almost no database driver will gracefully handle the disappearance of a database server and then its re-appearance some time later. If the restore in your test environment is the first time you’ve tried this, you may find that you need to manually restart services, even after the database re-appears on the network.

The solution will vary depending on the database client being used, but often it’s a case of catching an exception, or changing some options when you establish the connection.

By making your applications reconnect to the database with no manual input, you are again fixing a problem that will eventually occur in production – a much more stressful time for it to be diagnosed and fixed.

Summary

Testing your database backups by restoring them automatically and regularly in your test environments is a great way to battle-harden your backups and applications and to make sure that your test environment looks like the real production environment.


If you’ve liked what you’ve read, why not head over to our jobs page? We’re looking for a Systems Engineer to add more touches like these to our infrastructure.

Safely dealing with magical text

Boy, what a week it’s been. A remote-code-execution bug was discovered in Ruby on Rails, and we’ve all been scrambling to patch our servers (please patch your apps before reading any further, there is an automated exploit out there that gives people a shell on your boxes otherwise).

What the Ruby community, and those of other dynamic languages, must realize from recent Rails security blunders is that very similar problems can easily exist in any non-trivial web application. Indeed, I found a remote-execution bug in my own open-source project Faye yesterday, 3.5 years into the life of the project (again: patch before reading on).

There are a lot of lessons to be had from recent Rails security blunders, since they involve so many co-operating factors: excessive trust of user input, insufficient input validation and output encoding, the behavioural capabilities of Ruby objects and certain Rails classes, ignorance of cryptography and the computational complexity of data transport formats. In this post I’d like to focus on one in particular: safely encoding data for output and execution.

Ugh, do I have to?

I know, I know, booooooring, but so many people are still getting this really badly wrong, and it continues to punish end users by exposing their data to malicious manipulation.

Robert Hansen and Meredith Patterson have a really good slide deck on stopping injection attacks with computational theory. One core message in that paper is that injection exploits (including SQL injection and cross-site scripting) involve crafting input such that it creates new and unexpected syntactic elements in code executed by the software, essentially introducing new instructions for the software to execute. Let’s look at a simple example.

Learn you a query string

I found the code that prompted me to write this post while updating some Google Maps URLs on our site this afternoon. Some of this code was constructing URLs by doing something like this:

def maps_url(lat, lng, zoom, width, height)
  params = [ "center=#{lat},#{lng}",
             "zoom=#{zoom}",
             "size=#{width}x#{height}" ]

  "http://maps.google.com/?" + params.join("&amp;")
end

maps_url(51.4651204, -0.1148897, 15, 640, 220)

# => "http://maps.google.com/?center=51.4651204,-0.1148897&amp; ...
#                             zoom=15&amp; ...
#                             size=640x220"

You can see the intent here: whoever wrote this code assumes the URL is going to end up being embedded in HTML, and so they have encoded the query string delimiters as &amp; entities. But this doesn’t fix the problem entities are designed to solve, namely: safely representing characters that usually have special meaning in HTML. What is telling is that the comma in the query string should really also be encoded as %2C, but isn’t.

So although the ampersands are being encoded, the actual query data is not, and that means anyone calling this function can use it to inject HTML, for example:

link = '<a href="' +
           maps_url(0, 0, 1, 0, '"><script>alert("Hello!")</script>') +
           '">Link text</a>'

# => '<a href="http://maps.google.com/?center=0,0&amp; ...
#                                      zoom=1&amp; ...
#                                      size=0x"> ...
#         <script>alert("Hello!")</script> ...
#         ">Link text</a>'

By abusing the maps_url() function, I have managed to inject characters with special meaning — <, >, etc. — into the output and thereby added new HTML elements to the output that shouldn’t be there. By passing unexpected input I’ve created a lovely little cross-site scripting exploit and stolen all your users’ sessions!

Note that you cannot cleanly fix this by using an HTML-escaping function like ERB::Util.h() on the output of maps_url(), because this would serve to re-encode the ampersands, leaving strings like &amp;amp; in the href attribute.

Stacks of languages

Meredith Patterson of the above-linked paper gave another presentation at 28C3 called The Science of Insecurity. I’ve been telling absolutely everyone to watch it recently, so here it is.

This talk describes how we should think of data transfer formats, network protocols and the like as languages, because in fact that’s what they are. It covers the different levels of language power – regular languages, context-free languages and Turing-complete languages – and how use of each affects the security of our systems. It also explains why, if your application relies on Turing-complete protocols, it will take an infinite amount of time to secure it.

When you build HTML pages, you are using a handful of languages that all run together in the same document. There’s HTML itself, and embedded URLs, and CSS, and JavaScript, and JavaScript embedded in CSS, and CSS selectors embedded in CSS and JavaScript, and base64 encoded images, and … well this list is long. All of these are languages and have formal definitions about how to parse them, and your browser needs to know which type of data it’s dealing with whenever it’s parsing your code.

Every character of output you generate is an instruction that tells the browser what to do next. If it’s parsing an HTML attribute and sees the " character, it truncates the attribute at that point. If it thinks it’s reading a text node and sees a <, it starts parsing the input as an HTML tag.

Instead of thinking of your pages as data, you should think of them as executable language.

Back to reality

Let’s apply this idea to our URL:

http://maps.google.com/?center=51.4651204,-0.1148897&amp;zoom=15&amp;size=640x220

Outside of an HTML document, the meaning of this list of characters changes: those &amp; blobs only have meaning when interpreting HTML, and if we treat this query string verbatim we get these parameters out:

{
  'center'   => '51.4651204,-0.1148897',
  'amp;zoom' => '15',
  'amp;size' => '640x220'
}

(This assumes your URL parser doesn’t treat ; as a value delimiter, or complain that the comma is not encoded.)
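
You can reproduce the hash above with a naïve parser that splits only on & (a quick sketch, not any particular library’s parser):

qs = "center=51.4651204,-0.1148897&amp;zoom=15&amp;size=640x220"

Hash[qs.split("&").map { |pair| pair.split("=", 2) }]

# => {"center"   => "51.4651204,-0.1148897",
#     "amp;zoom" => "15",
#     "amp;size" => "640x220"}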

We’ve seen what happens when we embed HTML-related characters in the URL: inserting the characters "> chops the <a> tag short and allows injection of new HTML elements. But that behaviour comes from HTML, not from anything about URLs; when the browser is parsing an href attribute, it just reads until it hits the closing quote symbol and then HTML-decodes whatever it read up to that point to get the attribute value. It could be a URL or any other text value; the browser does not care. At that level of parsing, it only matters that the text is HTML-encoded.

In fact, you could have a query string like foo=true&bar="> and parsing it with a URL parser will give you the data {'foo' => 'true', 'bar' => '">'}. The characters "> mean something in the HTML language, but not in the query string language.
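
Ruby’s built-in CGI.parse makes the same point (it wraps each value in an array):

require "cgi"

CGI.parse('foo=true&bar=">')

# => {"foo" => ["true"], "bar" => ["\">"]}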

So, we have a stack of languages, each nested inside the other. Symbols with no special meaning at one level can gain meaning at the next. What to do?

Stacks of encodings

What we’re really doing here is taking a value and putting it into a query string inside a URL, then putting that URL inside an HTML document.

                                +-------------------------+
                                | "51.4651204,-0.1148897" |
                                +------------+------------+
                                             |
    +----------------------------------------|--------+
    |                                +-------V------+ |
    | http://maps.google.com/?center=| centre_value | |
    |                                +--------------+ |
    +------------------------+------------------------+
                             |
                       +-----V-----+
              <a href="| url_value |">Link</a>
                       +-----------+

At each layer, the template views the value being injected as an opaque string — it doesn’t care what it is, it just needs to make sure it’s encoded properly. The problem with our original example is that it pre-emptively applies HTML encoding to the data because it anticipates that the value will be used in HTML, but does not apply the encoding relevant to the task at hand, namely URL construction. This is precisely backwards: considering the problem as above, we see that we should instead:

  1. Decide what type of string we’re creating — is it a URL, an HTML doc, etc.
  2. Apply all encoding relevant to the type of string being made
  3. Do not apply encodings for languages further up the stack

In other words, we should make a URL-constructing function apply URL-related encoding to its inputs, and an HTML-constructing function should apply HTML encoding. This means each layer’s functions can be recombined with others and still work correctly, because their outputs don’t make assumptions about where they will be used. So we would rewrite our code as:

require "cgi"
require "erb"

def maps_url(lat, lng, zoom, width, height)
  params = { "center" => "#{lat},#{lng}",
             "zoom"   => zoom,
             "size"   => "#{width}x#{height}" }

  query = params.map do |key, value|
    "#{CGI.escape key.to_s}=#{CGI.escape value.to_s}"
  end
  "http://maps.google.com/?" + query.join("&")
end

url = maps_url(51.4651204, -0.1148897, 15, 640, 220)

# => "http://maps.google.com/?center=51.4651204%2C-0.1148897& ...
#                             zoom=15& ...
#                             size=640x220"

html = '<a href="' + ERB::Util.h(url) + '">Link</a>'

# => '<a href="http://maps.google.com/?center=51.4651204%2C-0.1148897&amp; ...
#                                      zoom=15&amp; ...
#                                      size=640x220">Link</a>'

Now we see that we get two valid pieces of data: url is a valid URL with all its query parameters correctly encoded but no HTML entities present, and html is a valid HTML fragment with its attributes correctly entity-encoded.

Also, note how we have treated all incoming data as literal (i.e. not already encoded for the task at hand), and we have not hand-written any encoding ourselves (e.g. hand-writing entities like &amp;). You should deal with data assuming it contains the literal information it represents and use library functions to encode it correctly. There’s a very good chance you don’t know all the text transformations required by each layer.
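
To show that the layers now recombine safely, here is a hypothetical html_link() helper (not part of the original code) fed the hostile input from earlier:

def html_link(url, text)
  '<a href="' + ERB::Util.h(url) + '">' + ERB::Util.h(text) + '</a>'
end

html_link(maps_url(0, 0, 1, 0, '"><script>alert("Hello!")</script>'), "Link text")

# => '<a href="http://maps.google.com/?center=0%2C0&amp; ...
#                                      zoom=1&amp; ...
#                                      size=0x%22%3E%3Cscript%3Ealert%28%22Hello%21%22%29%3C%2Fscript%3E"> ...
#         Link text</a>'

The payload is URL-encoded inside the query string and the ampersands are entity-encoded at the HTML layer, so none of it can introduce new HTML elements.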

Thinking in types

At this point you’re probably thinking that I’ve made something quite simple seem very complicated. But thinking in terms of types of strings, treating your output as a language stack and following the numbered list above is a good discipline to follow if you want to make sure you handle data safely.

There are some systems that do this for you: Rails 3, for example, HTML-escapes any value you insert into an ERB template by default. I’m working on a more general version of this idea: Coping is a templating language that checks your templates conform to the language you’re producing, and doesn’t let input introduce new syntactic elements.

If you’re feeling very brave, I recommend taking the Coursera Compilers course. Although it doesn’t seem immediately relevant to web devs, many concepts from parser theory, type checking and code generation can be applied to security and are well worth learning.

Above all, learn from other people’s security failures and consider where you may have made similar mistakes.

Introducing Aspec: A black box API testing DSL

Caltrak is the service that stores Songkick users’ tracked artists and cities. It has no other service dependencies. You put data into the Caltrak box, then you get it back out.

For instance, you might make two POST requests to store artist trackings, and then want to retrieve them, which would look like this:

# create and retrieve artist trackings
POST /users/7/artists/1    204
POST /users/7/artists/2    204
 GET /users/7/artists      200    application/json   [1, 2]

Did you understand basically what that was saying? I hope so, because that’s an executable spec from the Caltrak tests.

It’s pretty simple. Every line is both a request and an assertion. Every line says “If I make this request then I expect to get this back”.

This works because the behaviour of this service can be entirely described through the REST API. There are no “side effects” that are not visible through the API itself.

Here is a longer portion from the aspec file.

# no users have pending notifications
   GET /users/with-pending-notifications                200  application/json  []

# users with events on their calendar have pending notifications
  POST /users/764/metro-areas/999                       204
  POST /users/764/artists/123                           204
  POST /events/5?artist_ids=123&metro_area_id=999       204
  POST /events/5/enqueue-notifications                  204
   GET /users/with-pending-notifications                200  application/json  [[764, "ep"]]

# users are unique in the response
  POST /users/764/artists/123                           204
  POST /users/764/artists/456                           204
  POST /users/764/metro-areas/999                       204
  POST /events/5?artist_ids=123,456&metro_area_id=999   204
  POST /events/5/enqueue-notifications                  204
   GET /users/with-pending-notifications                200  application/json  [[764, "ep"]]

Some aspects:

  • Each line has the format Verb, Url (with Params), Status, Content Type, Body, separated by whitespace. These are the only things that can be asserted about the service responses.
  • Each “paragraph” is a separate test. The database is cleared in-between.
  • Lines beginning with # are comments.
  • Aspec stubs time, so that the first line of the test occurs precisely on the epoch and each subsequent line occurs 2s after that. This allows us to test responses with creation timestamps in them.
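
Given that format, each line is cheap to execute mechanically. Here is a rough sketch of how a single line could be run with Rack::Test (illustrative only, and not Aspec’s actual implementation):

require "json"
require "rack/test"

# Illustrative sketch: runs one spec line against a Rack app.
class LineRunner
  include Rack::Test::Methods

  attr_reader :app

  def initialize(app)
    @app = app
  end

  def run(line)
    verb, url, status, type, body = line.strip.split(/\s+/, 5)
    send(verb.downcase, url)

    raise "expected status #{status}" unless last_response.status == status.to_i
    return unless type

    raise "expected type #{type}" unless last_response.content_type.to_s.start_with?(type)
    raise "expected body #{body}" unless JSON.parse(last_response.body) == JSON.parse(body)
  end
end

# MyService here is a stand-in name for your own Rack app.
LineRunner.new(MyService.new).run('GET /users/7/artists 200 application/json [1, 2]')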

Motivation

When we began developing Caltrak, I wasn’t happy with the process of writing tests for this service.

I wanted the test framework to expose the simple nature of the API. You could make something almost as simple in RSpec or Cucumber with judicious use of helpers and so on, but you would end up with a DSL that obscured the underlying REST API.
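
For comparison, the first example above might look something like this in plain RSpec with Rack::Test (a sketch; Caltrak::App is a stand-in name):

require "json"
require "rack/test"

describe "artist trackings" do
  include Rack::Test::Methods

  # Caltrak::App is a stand-in for the real Rack application.
  def app
    Caltrak::App.new
  end

  it "creates and retrieves artist trackings" do
    post "/users/7/artists/1"
    expect(last_response.status).to eq(204)

    post "/users/7/artists/2"
    expect(last_response.status).to eq(204)

    get "/users/7/artists"
    expect(last_response.status).to eq(200)
    expect(last_response.content_type).to include("application/json")
    expect(JSON.parse(last_response.body)).to eq([1, 2])
  end
end

It says the same things as the three-line example at the top of this post, with the signal spread across far more ceremony.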

In an Aspec file, there is no syntax that does not express real data sent to or received from the service. You’re basically writing down the actual HTTP requests and responses with lots of parts omitted. It is technical, but it is very readable. I think it is better documentation than most service tests.

Also, there is no context that is not immediately visible, as there might be with nested RSpec contexts, for example, where in a large test file the setup may be very distant from the test and assertion.

Implementation

NB This project is very immature. Use at your own risk.

Aspec assumes your project uses Rack, and uses Rack::Test to talk to it. The code is published on GitHub and there is a tiny example API project.

It is very similar to RSpec in operation. You write a .aspec file, and put an aspec_helper.rb next to it.

Then run

aspec aspec/my_service.aspec

I’d be interested in hearing your thoughts on this testing style.