At Songkick we believe code only starts adding value when it's out in production and being used by real users. Using Continuous Deployment helps us ship quickly and frequently. Code is pushed to Git, automatically built, checked, and, if all appears well, deployed to production.
Automated pipelines make sure that every release goes through all of our defined steps. We don’t need to remember to trigger test suites, and we don’t need to merge features between branches. Our pipeline contains enough automated checks for us to be confident releasing the code to production.
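As a rough illustration, the whole pipeline boils down to a step runner like the sketch below. The step scripts are hypothetical placeholders, not our actual tooling; the point is that every release runs every defined check, and any failure stops the deploy.

```python
#!/usr/bin/env python3
"""Sketch of a build -> check -> deploy pipeline runner (illustrative only)."""
import subprocess
import sys

# Hypothetical step commands; a real pipeline (e.g. Jenkins) defines its own.
STEPS = [
    ["./build.sh"],                 # build the freshly pushed code
    ["./run_unit_tests.sh"],        # automated checks: unit level
    ["./run_acceptance_tests.sh"],  # automated checks: acceptance level
    ["./deploy_to_production.sh"],  # only reached if every check passed
]

for step in STEPS:
    print("running:", " ".join(step))
    if subprocess.run(step).returncode != 0:
        sys.exit("step failed, aborting release: " + " ".join(step))

print("all checks passed; deployed to production")
```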
However, our automated checks are not enough to confirm whether a feature is actually working as it should. For that we need to run through all our defined acceptance criteria and implicit requirements, and see the feature being used in the real world by real users.
In a previous life we used to try to perform all of our testing in the build/test/release pipeline. Not only was this slow and inefficient, since it depended on lots of different people being available at the same time, but we often found that features behaved very differently in production. Real users do unexpected things, and it's difficult to create truly realistic test environments.
Our motivation to get features out to real users as quickly as possible drove our adoption of Continuous Deployment. Having manual acceptance testing within the release pipeline slowed us down and made the process unpredictable. It was hard to define a process that relied on so many different people. We treated everyday events such as meetings and other work priorities as exceptions, which made things even more delay-prone and frustrating.
Eventually we decided that the build and release pipeline must be fully automated. We wanted developers to be able to push code and know that if Jenkins passed the build, it was safe for them to deploy to production. Automating all testing, however, is never going to be achievable or desirable. Firstly, automated tests are expensive to build and maintain. Secondly, testing, as opposed to checking, is not something that can be automated.
When we check something we are comparing the system against a known outcome: for example, checking that a button launches the expected popup when clicked, or that a date displays in the specified format. Checks like these can, and should, be automated.
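As a sketch, the date example above could be automated like this; the formatting helper and its expected output are invented for illustration:

```python
import unittest
from datetime import date

def format_event_date(d: date) -> str:
    """Hypothetical helper: render an event date for display."""
    return d.strftime("%a %d %b %Y")  # e.g. "Sat 01 Mar 2014"

class DateDisplayCheck(unittest.TestCase):
    def test_date_displays_in_specified_format(self):
        # A check compares the system's output against a known outcome.
        self.assertEqual(format_event_date(date(2014, 3, 1)), "Sat 01 Mar 2014")

if __name__ == "__main__":
    unittest.main()
```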
Testing is more involved and relies on a human making a judgement. Testing involves exploring the system in creative ways in order to discover the things that you forgot about, the things that are unexpected or difficult to completely define. It's hard to predict how time and specific data combinations will affect computer systems; testing is a good way to uncover what actually happens. Removing the constraint of needing fully defined expected outcomes allows us to explore the system as a user might.
In practical terms this means running automated checks in our release pipeline and performing testing both before code is committed and after release. Taking testing out of the release pipeline removes the time pressure and gives us the freedom to test everything as deeply as we require.
Small, informal meetings called kick-offs help involve everyone in defining and designing the feature. We discuss what we're building and why, plan how to test and release the code, and consider ways to measure success. Anything more complicated than a simple bug fix gets a kick-off before we start writing code. Understanding the context is important in helping us do the right thing: if we know that there are deadlines or business risks attached, we're likely to act differently than in a situation that has technical risks.
Coming out of the kick-off meeting we know how risky we consider the feature to be, and we'll have decided on the best approach to testing and releasing the code. As part of developing the feature we'll also write or update our automated checks to make sure we don't break it further down the line. Our process is intentionally flexible, allowing us to treat each change appropriately depending on risk and the need to ship.
Consider a recently released feature to store promoter details against ticket allocations as an example. The feature's kick-off meeting identified the risks, and we discussed what to test and how to test it. We identified ways to break the work down into smaller pieces that could be developed and released independently, each hidden behind a feature flipper to keep it invisible to real users.
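A feature flipper can be a simple guard around the new code path. This sketch uses invented names, not our actual implementation, to show the idea: the feature stays invisible to real users while the team can still exercise it in production.

```python
# Illustrative flipper; the names and structure are hypothetical.
FLIPPERS = {
    # Off for everyone except the team while the pieces are being built.
    "promoter_details": {"on_for_everyone": False, "allowed": {"team@songkick.com"}},
}

def feature_enabled(name: str, user_email: str) -> bool:
    flipper = FLIPPERS.get(name)
    if flipper is None:
        return False  # unknown features default to invisible
    return flipper["on_for_everyone"] or user_email in flipper["allowed"]

# In the page that renders ticket allocations:
# if feature_enabled("promoter_details", current_user.email):
#     render_promoter_details(allocation)
```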
Developers and testers paired together to decide on specific areas to test. The tester's testing expertise and the developer's deep understanding of the code fed into an informal collection of test ideas based on risk, usually represented in a visual mind map for easy reference.
The developers, guided by the mind map, tested the feature and added automated unit and integration tests as they went. Front-end changes were overseen by a designer working closely with one of the developers to come up with the best feasible design. Once we had all the pieces of the feature, the whole team jumped in to do some testing and update our automated acceptance tests.
The feature required a bit of data backfilling, so the development team were able to use the functionality in production in the ways we expect real users to use it. Of course we found some bugs, but by working with small releases we were able to quickly locate the source of each problem. Fast release pipelines allow fixes to be deployed within minutes, making the cost of most bugs tolerably low.
Once the feature had been fully released and switched on for all users, we used monitoring to check for unexpected issues. Reviewing features after a week or two of real-world usage allows us to make informed decisions about the technical implementation and user experience. Taking the time to review how much value features are adding allows us to quickly spot and respond to problems.
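A post-release check can be as lightweight as comparing an error rate against a threshold. The metric names and the fetch function in this sketch are purely illustrative, not tied to any particular monitoring system:

```python
ERROR_RATE_THRESHOLD = 0.01  # flag anything above a 1% error rate

def feature_looks_healthy(fetch_metric) -> bool:
    """Illustrative post-release check; metric names are invented.

    fetch_metric is assumed to return a counter value from whatever
    monitoring system is in place.
    """
    errors = fetch_metric("promoter_details.requests.errors")
    total = fetch_metric("promoter_details.requests.total")
    if total == 0:
        return True  # no traffic yet, nothing to judge
    return errors / total <= ERROR_RATE_THRESHOLD
```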
Testing a feature involves many experts. Testers must be on hand to aid the developers in their testing, often by creating a mind map of test ideas to guide it. We try to use our previous experience of releasing similar features to focus the testing on areas that are typically complex or easy to break. Designers and UX people get involved to make sure the UX works as hoped and the design looks good on all our supported devices and browsers. Product managers make sure the features actually do what they want them to do. High-risk features get additional deep testing from the test team, and in certain cases we throw in some focused performance or security testing.
Most of our bugs come from forgetting use cases or not understanding existing functionality in the system. Testing gives us a chance to use the system in an investigative way and hopefully find these bugs. Moving testing outside of our release pipeline gives us the space to test each feature as deeply as it needs, whilst maintaining a fully automated, fast release pipeline.