A Day with Cypress Part 2

For those of you who are here because you read Part 1 of A Day with Cypress, welcome. For those who haven’t yet read it, why not head back and join me from the beginning of this journey? If that’s not for you and you like to go rogue, welcome, but don’t expect a ‘last time on…’ style intro. You’re coming in hot!

In Part 1 we stopped before getting to points 2 and 3 of the list of objectives I’d set myself on that fateful day. So we’ll rectify that in Part 2 by covering item 2 in the list.

  1. To put a Cucumber implementation on top to author the scripts
  2. To look again at stubbing calls at runtime
  3. To look at using snapshots instead of explicit assertions

One of the fantastic features of Cypress, and a real USP when comparing it to a Selenium-based library, is that it runs within the browser and has access to the JavaScript that your application is built from, at test time. This provides users with a whole host of possibilities to solve testability challenges, for which we would otherwise need to find alternative and potentially clunkier solutions. Injecting test data is one of those testability challenges. I’m going to guess that most software testers can recall a time in their career when they’ve hit their heads against a brick wall trying to get good test data into their application.

It would appear the team at Cypress have experienced this themselves because it’s one of the most useful parts of their library. Cypress comes with the ability to intercept and replace some requests at test time and above all it’s really simple to start using. Let’s look at an example.

Testing the UI or Through the UI

Being cheeky, I’ve picked the Royal Mail Find a Postcode feature to demonstrate this. When using the feature, as you type into the text box, each letter you type initiates an XHR request which, based on the postcode you’ve begun to enter, returns a suggested list of addresses to show to the user.

As the user types in to the text box, we see that each letter typed sends a new request for suggested addresses.

When it comes to testing this feature, I might consider the following sample of identified risks:

  1. When I type, the expected request isn’t sent to the service
  2. When we get a response, it doesn’t contain useful information
  3. The UI doesn’t render useful information appropriately

Risk 2 doesn’t need the UI to explore and verify. I could quite happily send requests to the end point, using a tool like Postman or writing some code, and process the response. This will give me the same level of confidence in what I discover as I would have if I had tested ‘through’ the UI. Amongst other things, by removing the UI from the process, I reduce the complexity of the setup and reduce the risk that something in the UI blocks my testing.

However, risks 1 & 3 are tightly coupled to the UI. If the UI isn’t mapping the typed text to the request to the service correctly, then that is a UI issue. Similarly, if the UI doesn’t render the appropriate information to the user based on the response from the service, then that is also a UI issue. So, based on the assumption (as explained above) that I don’t need to test through the UI to test risk 2, then I can make the logical assumption that I don’t actually need the service to test the UI.

In summary:

If I only want to test the UI, then I may not need the integrated service to be up and running. However, I do need the information that is going to be processed by the UI.

In setting this principle, we achieve another benefit and that is one of determinism. For any check that might go into our pipeline, we want to reduce the potential for noise (unwanted or useless information) as much as possible. Checks should fail because we’ve introduced a problem, not because our test data and setup aren’t deterministic.

There are many ways we can achieve this:

  1. Using hand rolled local stubs which we proxy a locally hosted version of the UI to
  2. Using a hosted service such as GetSandbox and configuring the application or environment to point to a different service end point or address
  3. Seeding data in to the integrated service and therefore completing a fully top-to-bottom information loop.

Cypress allows us to do option 1, and really easily. All we need to do is add steps 1 & 2 below to a script and configure them to intercept the appropriate service call and replace the response.

  1. Start up the Cypress routing server with cy.server()
  2. Define a route that you want to intercept with cy.route()
  3. Execute the actions necessary to get the UI to send the request
  4. Assert on the expected outcomes

All good in theory but I know you’d like to see an actual example.

Testing the Address Finder

To configure the steps correctly we need two additional pieces of information:

  1. The request that’s going to be sent to the service
  2. An example response that we want to modify to suit our needs

In this case, if the user types ‘s’ in to the text box, the following request is sent to the service:


The request also contains a number of parameters, but I’ve omitted them for this example. If you want to explore more, you can find this yourself using the Network tab of your favourite browser’s Dev Tools.

The response from that request contains an array of objects. Each object looks like this:

  {
    "Text":"South, Pannels Ash",
    "Description":"Pentlow, Sudbury, CO10 7JT"
  }

Through a little reverse engineering, we can see that the Suggested Addresses list displays the Text and Description properties of each of the returned objects. (Ideally, if you were working on this feature, you wouldn’t have to guess this.)

So taking a really simple test case:

Given the user is on the ‘find a postcode’ page
When the user enters ‘s’ into the text box,
Then we should be shown a suggested addresses list that contains the expected addresses for ‘s’.

As discussed, this isn’t great if you’re using a fully integrated system and can’t control the information that you get back. So instead we should author the test case to suit our testing needs, then tell Cypress to help us achieve the setup needed.

Our test case now becomes:

  Given the user is on the 'find a postcode' page
  When the user enters 's' into the text box
  Then we should be shown a suggested address
    of 'Buckingham Palace'
    with text 'Where the Queen Lives'

Let’s look at the code, starting with the step definition for the When step.

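When('the user enters {string} into the text box', (postcode) => {
  // Start the routing server; stubbed responses are delayed by 1000ms
  cy.server({
    delay: 1000
  })
  // The cy.route(), typing and cy.wait() calls described below follow here
})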


Let’s look at what we’ve got. Firstly there’s a call to cy.server, which we’ve configured to have a 1000ms delay before any response is provided. In this case I’m using the delay to emulate a slow round trip from request to response. It’s not necessary for this example but it helps to demonstrate a pattern explained later on.

Once the server is up and running, we then need to specify a route that we’d like to intercept. To set up a route for our needs, we first need to help it identify the request we want to manipulate. There are three ways to pattern match a URL for a request: exact strings, regex and glob. In our example I’ve asked the route to ignore the top level domain and any parameters.

Now that we know how to intercept the request, we need to know what to do with it. cy.route is a very extensible method, giving the user a number of options to satisfy their testing needs. We can inline responses directly into the method, but unless you’re replacing the response with an empty object {}, your code is going to look quite clunky. Your second option is to pass in a callback which will be called when the route is intercepted, allowing you to access the response data and manipulate it as needed. Option three is the one I’ve used, and Cypress calls it Fixtures. A fixture, in its basic terms, is data in a file. This is a powerful technique because it helps you manage potentially complex JSON objects away from your code but makes it really easy to use them.

As you can see, I’ve configured cy.route to replace the response with the contents of the fixture ‘bPSuggestedAddress.json’. This file lives in the folder /fixtures and contains the following object:

      "Text":"Buckingham Palace",
      "Description":"Where the Queen Lives"

Dead simple I’m sure you’ll agree. We’ve also given this route an alias. These are nice little shortcuts that can be later referenced in code, and they also turn up in the Cypress test runner, so are great for marking up steps to easily find them.

Continuing the step definition, we type the letter ‘s’ into the appropriate text box and then ask the step definition to wait until the response has been intercepted. Remember that alias I just mentioned? Well, because I’ve set up the route with one, I can now ask Cypress to wait until that alias has completed. Don’t be afraid to do this. Remember, this isn’t some arbitrary wait which could be indefinite; instead it waits only until the aliased route has responded. In this case we’ve put in a delay of 1000ms for the server to intercept our request and send back the desired response, so that is roughly how long the wait lasts. If we hadn’t put in a delay, that response would be sent back immediately.
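Pulling that description together, the body of the When step might look something like the sketch below. Only the 1000ms delay and the fixture file name come from the text above; the URL glob, the alias name and the text box selector are placeholders of mine, not the real values from the page. I’ve also written the body as a function that takes cy as an argument, purely so the sketch can be read and run on its own; in a real step definition you’d use the global cy directly. Bear in mind that newer Cypress releases replace cy.server() and cy.route() with cy.intercept(), so treat this as matching the version used in this post.

```javascript
// Placeholders: swap these for the values you find in the Network tab.
const SUGGESTIONS_URL = '**/find-address*' // hypothetical glob, not the real end point
const SEARCH_BOX = '#postcode-search'      // hypothetical selector for the text box

// Body of the When step. `cy` is passed in only to keep the sketch standalone.
function enterPostcodeWithStubbedSuggestions (cy, postcode) {
  // 1. Start the routing server; every stubbed response is delayed by 1000ms
  cy.server({ delay: 1000 })

  // 2. Intercept the suggestions request, answer it with the fixture,
  //    and give the route an alias (the alias name is an assumption)
  cy.route('GET', SUGGESTIONS_URL, 'fixture:bPSuggestedAddress.json')
    .as('suggestedAddresses')

  // 3. Type into the text box, which makes the UI send the request
  cy.get(SEARCH_BOX).type(postcode)

  // 4. Wait for the aliased route to respond before the Then step runs
  cy.wait('@suggestedAddresses')
}
```

Inside the step definition this would be called as enterPostcodeWithStubbedSuggestions(cy, postcode).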

And that’s the step completed. The last step in the script is:

Then we should be shown a suggested address
    of 'Buckingham Palace'
    with text 'Where the Queen Lives'

And here’s how I’ve implemented it:

Then('we should be shown a suggested address of {string} with text {string}',
  (address, text) => {
    // The results container should have three children, including a header and a footer
    cy.get('body > div.pca > div:nth-child(4)').children().should('have.length', 3)
    cy.get('body > div.pca > div:nth-child(4) > div.pcaheader')
    cy.get('body > div.pca > div:nth-child(4) > div.pcafooter')
    // The expected parent id is cut off after 'cp-' in this listing
    cy.get('#cp-search_results_item0').parent().should('have.id', 'cp-…')
    cy.get('#cp-search_results_item0').should('contains.text', address + text)
    cy.get('#cp-search_results_item0 > span').should('have.text', text)
  })

These are all specifics related to how the ‘suggested addresses’ component works, so don’t worry too much about them, but I think we can both agree that it looks quite clunky, and I definitely know that there are some parts of the expected implementation that I’ve not covered. Which begs the question, how can we do that better? In Part 3 of this series, I’ll take you into the realms of snapshot testing. If you’re familiar with component-based snapshots with Jest and Enzyme, then you’ll have a good idea of where this will go.

So there you go, you’ve now learnt how to use the basics of request and response manipulation with Cypress. It’s not a tool that you’ll want to use all of the time, but in the right situation and the right context, it’s an extremely powerful addition to your tool box. Also, because it’s so simple and quick, it’s a great tool to actually aid testing (not checking). That means you can use it to help you when you’re exploring the application and want to run ad-hoc experiments on it. If some of those experiments end up as pipeline checks then great, but don’t feel pressured into assuming that every piece of automated testing code you write has to go into the pipeline. That’s another post for another time.

Thanks again for reading and I do hope that you found this useful. Please leave your comments below.

I’ve also considered turning these into vlogs, so if that’s something you think would be worthwhile, please get in touch and let me know.

A day with Cypress


Thanks everyone for the feedback I’ve had since the publication of this post. It’s been very interesting reading the various opinions and questions.

Following up on that feedback I wanted to make a couple of things clear about the post:

  1. This is not a ‘best practices’ guide. I think there are some good practices below, but as with all good practices, they need the right context. If you’re new to writing code, it’s probably better to get a good grounding in the basics before applying much of what I cover, otherwise you may end up heading in the wrong direction.
  2. The introduction of Cucumber to Cypress is to satisfy my own technical curiosity and should not be a first step when adopting or looking into Cypress. I’ve long lost my enthusiasm for the gherkin syntax and avoid it where I can. It can have its place and can be useful, but again that’s in the appropriate context.

My advice would be to take the post at face value. It is a description of what I did, rather than a recipe for introducing Cypress to your testing tool set.

If you’ve got any questions on the above and how it applies to the post below, feel free to get in touch either here or @StooCrock on Twitter.

The Fates Align

Sometimes you just have to take advantage of the trials the universe throws at you. Yesterday, just before heading out the door and getting on my bike, I did something I would normally not do: I paused and decided to check whether my train was running. There are strikes on at the moment and whilst I was expecting my usual 7am train to be off the schedule, I thought I’d better check that the schedule hadn’t changed further.

Am I glad I did! There ended up being multiple issues on the line, meaning it was highly unlikely I was going to get on a train for a few hours. I was already dressed and raring to go so I thought to myself, let’s spend some time with Cypress.io, because that’s what you do at 6:45am right?! That few hours turned into a full blown WFH day and this is what I did with it.

I’d used Cypress a little in the past, and some of the teams in my platform use it, so I knew what I was going to get up to. This was going to be a refresher course but with stretch goals:

  1. To put a Cucumber implementation on top to author the scripts
  2. To look again at stubbing calls at runtime
  3. To look at using snapshots instead of explicit assertions

With a test application at hand, I started putting together a simple set of acceptance checks after installing the cucumber plugin and deciding how to organise my scripts. As with most Cucumber implementations, it’s the quirks which slow you down, especially when the library used is still maturing. In this case I went for The Brain Family Cucumber plugin, mostly because it’s linked to from the Cypress site 🙂

I’m not going to repeat the setup instructions, but instead will say it was pretty easy to get things to a point where I could write a feature file and my first step definition.

I didn’t think too much about the organisation of the project at this point, but I’m sure it’ll be something I consider going forward, so I stuck with the default suggestions.

My general approach to writing any code is to make it work first, then make it better later. In this case, that means using the checks themselves to help me make incremental refactors. I more often than not have a good idea of ‘what good looks like’, but I’m happy to get there organically, changing my mind where necessary, as I learn and adapt.

Cypress & Page Objects

With Cypress, the first opportunity for refactoring rears its head when it comes to abstracting framework implementation details away from the checks themselves. Cypress doesn’t have its own implementation of Page Objects, unlike other frameworks such as Nightwatch.js.

The Cypress documentation is clear for me, but I can understand that it might be a little daunting if you’re early in your code-writing and/or automation journey.

Cypress say:

> Can I use the Page Object pattern?


The page object pattern isn’t actually anything “special”. If you’re coming from Selenium you may be accustomed to creating instances of classes, but this is completely unnecessary and irrelevant.

The “Page Object Pattern” should really be renamed to: “Using functions and creating custom commands”.

I interpret this as: there’s no point in us providing something which JavaScript inherently does, so do what you think is right.

Great, so where does that leave us? First up there are selectors: lots of little bitty pieces of text that we pass into a function to help us find something on the page. We tend to abstract these out to Page Objects, but Page Objects taken at face value (they encapsulate a whole page) have a tendency to become unwieldy very quickly. For some pages that’s a whole lot of selectors. I prefer to abstract what’s unique about the page into its own object (normally its base-level constituent parts), and then, where it makes sense, have logical components of that page separated into their own individual objects.

An example…

Let’s create an example using the Cypress.io home page.

If you inspect the home page you’ll find that it’s made up of discrete sections stitched together. There’s a navigation bar, a hero image, a host of ‘sections’ and a footer. Let’s assume that in this instance, the home page has to have each of these sections in order for it to be a well formed page. So, if I’m going to check that the home page is constructed correctly, I could, in this instance, decide that as long as those components exist, then what’s in them is initially irrelevant. My check therefore only goes one or two levels deep, at most.

Let’s start by writing the Cucumber Scenario for this check and implementing the step definitions for it without any abstractions.

Firstly this is our scenario, which we’ve put in a file called cypressHome.feature

Feature: The cypress home page

Users can visit the cypress home page

Scenario: Opening the cypress home page in a browser
Given I open the Cypress home page
Then it has a navigation bar
And it has a hero image
And it has an end to end section
And it has a footer

And now the associated step definitions which we’ve put in a file called cypressHomeSteps.js

const url = 'https://cypress.io'

Given('I open the Cypress home page', () => {
  cy.visit(url)
})

Then('it has a navigation bar', () => {
  cy.get('.navbar')
})

Then('it has a hero image', () => {
  cy.get('#hero')
})

Then('it has an end to end section', () => {
  cy.get('#end-to-end')
})

Then('it has a footer', () => {
  cy.get('#footer')
})

The selectors we’re using seem like good candidates to abstract away into a component. Let’s call it cypressHomePage.js and put it in a components folder in the project tree.

In this file we’re going to export an object that we can then reference in the step definition file. Here’s what it now looks like.

module.exports = {
  NAVBAR: ".navbar",
  HERO: "#hero",
  END2END: "#end-to-end",
  FOOTER: "#footer"
}

To reference this object, we can import it into our step definitions file. The path to the file is important, so make sure that you adjust it for your folder structure.

Once we’ve imported the object and updated the step definitions, our modified file now looks like this:

const url = 'https://cypress.io'
const { NAVBAR, HERO, END2END, FOOTER } = require("../../components/cypressHomePage");

Given('I open the Cypress home page', () => {
  cy.visit(url)
})

Then('it has a navigation bar', () => {
  cy.get(NAVBAR)
})

Then('it has a hero image', () => {
  cy.get(HERO)
})

Then('it has an end to end section', () => {
  cy.get(END2END)
})

Then('it has a footer', () => {
  cy.get(FOOTER)
})

Now we have a defined set of selectors in a component which can be reused across the rest of our step definitions if needed. The benefit is that, should those selectors change, we only need to change them in one place. A secondary benefit is that the references are semantically meaningful, making them easier to read, and in the case of some crazy length selector text, they take less space 🙂

In this example, I’ve included the URL as a const. This is purely to help highlight some changes later in the post. A better practice, and one built into Cypress, is to have the URL in the cypress.json file as the value of the property ‘baseUrl’. When navigating to the page, you would then use cy.visit('/'), as the URL would be automatically included. This becomes more powerful as you require the use of more pages in your application, as they’d be referenced as a friendlier and more readable cy.visit('/myaccount') etc.

Taking another step

All very basic so far, but that’s the best thing about it. Having singular steps for each section is nice to a point. It’s explicit, so we know exactly what is going on without the need to dig deeper into the code base. But it’s a little wasteful, so let’s refactor again. This step might be a little controversial but I like it.

Remembering our scenario from earlier, I’m suggesting we make a little change:

Feature: The cypress home page

Users can visit the cypress home page

Scenario: Opening the cypress home page in a browser
Given I open the Cypress home page
Then it has the necessary sections

I can hear the audible gasps! Some might say that I’ve broken a fundamental rule of the Gherkin syntax by removing the explicit expectation from the scenario and condensing it into something a little more fluffy. Whether you want to do this or not is up to you, but go with me for this example.

If we head back to our step definitions, we can now reduce that down to these:

const url = 'https://cypress.io'
const { NAVBAR, HERO, END2END, FOOTER } = require("../../components/cypressHomePage");

Given('I open the Cypress home page', () => {
  cy.visit(url)
})
Then('it has the necessary sections', () => {
  cy.get(NAVBAR)
  cy.get(HERO)
  cy.get(END2END)
  cy.get(FOOTER)
})

The changes are in but I’ve not really done anything different… yet. When building a framework, we want to try and keep our checks and the library we’ve chosen as separate as possible. Up until now our scenario step definitions have been tightly coupled to Cypress, but we can do something about that by extending our component.

What we’ll do is create a function in our component that we’ll call from the scenario step definition. We’ll call this function ‘hasTheNecessarySections’. Because the function lives within the same object as the properties we created earlier for each of the sections, we’ll need to reference each with the this. prefix (this.NAVBAR, for example). We’ll also do what we suggested earlier, and do the same thing for navigating to the page in the first place, by adding another function named navigateTo.

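module.exports = {
  NAVBAR: ".navbar",
  HERO: "#hero",
  END2END: "#end-to-end",
  FOOTER: "#footer",
  URL: 'https://cypress.io',
  navigateTo: function() {
    cy.visit(this.URL)
  },
  hasTheNecessarySections: function() {
    cy.get(this.NAVBAR)
    cy.get(this.HERO)
    cy.get(this.END2END)
    cy.get(this.FOOTER)
  }
}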
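
As an aside on that this. prefix: it only works because the functions are declared with the function keyword. The small standalone sketch below (the object and its function names are made up for illustration, they’re not part of the component) shows what happens if you’re tempted to use arrow functions instead.

```javascript
const component = {
  FOOTER: '#footer',

  // Declared with `function`: `this` is the object the function is called on
  viaFunction: function () {
    return this.FOOTER
  },

  // Arrow function: `this` comes from the surrounding scope, not the object,
  // so the FOOTER property is not reachable through it
  viaArrow: () => {
    return this === undefined ? undefined : this.FOOTER
  }
}

console.log(component.viaFunction()) // '#footer'
console.log(component.viaArrow())    // undefined
```

So if a selector unexpectedly comes back as undefined after a refactor, an arrow function in the component is a good first place to look.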

And in our step definitions we’ll change how we import the component and call the new functions in the appropriate steps.

const cypressHomePage = require("../../components/cypressHomePage");

Given('I open the Cypress home page', () => {
  cypressHomePage.navigateTo()
})

Then('it has the necessary sections', () => {
  cypressHomePage.hasTheNecessarySections()
})

And that’s it. It’s a simple but effective way of abstracting away implementation details from our checks, using component based objects even though Cypress doesn’t support them natively.

Taking it even further

At the beginning of the post I mentioned the 3 areas I wanted to look into. I’ve covered most of goal 1, demonstrating how to use Cypress with a Cucumber plugin and a bonus section on abstracting implementation details from your scenarios. One area I’ve not covered is the use of tables in scenario steps. For those familiar, tables allow you to enter test data that a step should iterate over, with related outputs if necessary. I’ll cover the use of them in a later post.

What I haven’t done is touch on goals 2 or 3. I was planning to, but this post got a bit on the long side. Next time I’ll demonstrate how to use the cy.server() and cy.route() APIs to stub data from APIs, allowing us to make best use of the snapshot capabilities provided by the Cypress Snapshot library. Using this library we’ll remove the need to explicitly call cy.get() for each section of the home page and replace them with a single call to a saved snapshot. In doing so we’ll be able to cover the whole of the page (if we want to), not just the elements in the markup, but the appropriate data as well.

I hope this has been useful for you. Leave a comment if it has or if there’s anything you disagree with.

Thanks for reading.

Modelling for Testers

In this post, we’ll go on a short journey through the concept of formal modelling and how you can apply modelling techniques to help you test more effectively.

Firstly, what is a model?

Traditionally used in mathematics and science, a model helps communicate and describe something. A very basic model that we’ve all come across is the one used to calculate the approximate volume of any cardboard box: width × height × length. If we stick some random numbers into the formula, the output is the volume of a randomly generated, virtual cardboard box. If we take a cardboard box and measure each dimension, we can calculate the approximate volume of a real box – that’s pretty useful, especially if you’re a company like Amazon!

Models don’t have to be formulae; they can be diagrams, information flows or physical objects. It took me a little while to make the connection, but that Airfix kit you might’ve built as a child (or grown up) is called a model because it is a model of the real thing.

Using Provided Models to Aid Testing

Not only are there different ways to model something, there are also different uses for them. Requirements are written models that describe the expected behaviour of a system. This type of model is used to communicate how something should work, so that it can be made. These are really useful models for us as testers as they give us an expected outcome. Remember that Airfix kit? Well, the instructions that come with it are a formal model that you use to help build your physical model.

We can test the model before a line of code is written, by asking questions of the model and testing its limitations. By doing this early, we can extend the model to be more specific or we can help correct the model where it is deficient. This helps increase the likelihood that what we build is actually what was required.

When it comes to testing the output of the requirements, we can reuse those models to determine if what we built meets the expectations of the model. Traditionally we might call these Test Cases: for a given requirement (formula) and set of data (dimensions) we expect the output of the software to be deterministic (volume). This is what we testers call checking and is a prime candidate for automation (but that’s a different topic!). It may even be better still to use this as a way to drive the writing of the system’s code – I’m sure you’ve come across TDD and its many variants.
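
To make the formula, dimensions and volume idea concrete, here’s a minimal sketch of the cardboard box model used as a check. The function names are mine and the numbers are made up; the point is only that the model supplies the expected value and the check compares it with whatever the software actually produced.

```javascript
// The model: volume of a box from its three dimensions
function boxVolume (width, height, length) {
  return width * height * length
}

// A check: for known dimensions the model gives us the expected output,
// which we compare with what the software under test actually returned
function checkVolume (actual, width, height, length) {
  const expected = boxVolume(width, height, length)
  return { expected, actual, pass: actual === expected }
}

console.log(checkVolume(24, 2, 3, 4)) // { expected: 24, actual: 24, pass: true }
console.log(checkVolume(20, 2, 3, 4)) // { expected: 24, actual: 20, pass: false }
```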

Creating Our Own Models to Aid Testing

As a tester, even if you hadn’t made the connection yourself before, you should now realise that you’re using models all of the time in your daily routine. Models provided to us are great for testers, but modelling techniques are even more useful.

You may ask, if the models have been provided for me, why would I need to model anything myself? Great question! As testers, our primary responsibility is to provide information about the product, focussing on the risks to its value. If the scope of the testing activities that we partake in is guided only by the models that we’ve been provided, we run the risk of only reporting on what we thought we knew and not the actual behaviours of the system.

To counteract this risk, we testers can look to use modelling techniques ourselves, to explore and describe the system as it is, not just what it was supposed to do. Here’s a really simple modelling exercise that you can do right now.

Now It’s Your Go!

Firstly, pick a web site to model against. If you work on one, pick that because everything you do in this example will add real value to your testing efforts. If not, pick something relatively simple, maybe your favourite online store.

Next, pick a part of the site to model. Keep it a single page for now, so if it’s an online store, use the search page or product details page and use the mobile version as it’ll reduce the work we need to complete. I’m going to pick Alan Richardson’s Blog to demonstrate the exercise.

Head over to your chosen page in Chrome and once it’s finished loading, open the developer tools (instructions are here) and click the network tab. If they’re not there already, add the Domain and Method columns to the network table. Order the list by Domain, either descending or ascending.


Clear the log in case there’s anything in there already, then refresh the page. This will list out all of the network calls that the page makes client side and it’s these we’re going to model.

In your favourite diagramming tool or on a large piece of paper, stick the page we’re modelling in the centre. I’m going to use a mind map for now.

Head back to Chrome and take note of each of the domains your page calls out to. Create a node in the diagram for each domain and that’s the first part of your model finished.


You now have a useful, visual record of all of the domains that your site calls out to and you can sit with your friendly neighbourhood developer or architect to determine if there’s anything that looks odd.
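
If you’d rather build that first level of the model in code than on paper, a rough sketch could look like the one below. The URLs are invented stand-ins for whatever you copy out of the Network tab, and groupByDomain is just a name I’ve picked.

```javascript
// Hypothetical request URLs, as you might copy them from the Network tab
const requests = [
  'https://blog.example.com/',
  'https://blog.example.com/css/site.css',
  'https://cdn.example.net/js/app.js',
  'https://analytics.example.org/collect?v=1'
]

// Build the first level of the model: one node per domain,
// each holding the requests that were sent to it
function groupByDomain (urls) {
  const model = {}
  for (const url of urls) {
    const domain = new URL(url).hostname
    if (!model[domain]) model[domain] = []
    model[domain].push(url)
  }
  return model
}

console.log(Object.keys(groupByDomain(requests)))
// [ 'blog.example.com', 'cdn.example.net', 'analytics.example.org' ]
```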

A More Useful Model

So that exercise didn’t take too long did it! The great thing is, it won’t take much longer to make it even more useful. Follow these steps to add to your diagram and make that conversation even more interesting.

For each domain you’ve added to your model, refer back to the network tab in Chrome and make a note of the Type of request made and the address up to any query. You may find it’s useful to group some types: images and gifs are a good example. You can also see from my example, I’ve called out that there’s a redirect because I’m using the mobile view in Chrome.
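
As a sketch of that step, assuming you’ve noted each request down as a URL plus the value from the Type column (the entries and the list of image types here are made up for illustration):

```javascript
// Hypothetical entries noted down from the Network tab
const entries = [
  { url: 'https://cdn.example.net/img/logo.png?v=3', type: 'png' },
  { url: 'https://cdn.example.net/img/spinner.gif', type: 'gif' },
  { url: 'https://analytics.example.org/collect?v=1&tid=abc', type: 'xhr' }
]

// Keep the address up to (not including) any query string
function addressWithoutQuery (url) {
  const { origin, pathname } = new URL(url)
  return origin + pathname
}

// Group similar types under one label, e.g. all images together
function typeGroup (type) {
  return ['png', 'gif', 'jpeg', 'webp'].includes(type) ? 'image' : type
}

const nodes = entries.map((entry) => ({
  address: addressWithoutQuery(entry.url),
  type: typeGroup(entry.type)
}))

console.log(nodes)
```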


Don’t stop there, keep adding more information to your model. There’s lots of information that you can add to spark off a conversation. Here are some ideas on what you can add to your model as you explore the software even deeper:

  • highlight a domain or address that you don’t recognise for questioning later;
  • call out a request to the same address requesting the same payload more than once;
  • annotate with the size of a response if it falls outside some boundaries
    • a really small response (single bytes)
    • a really large response (varies but something over a few hundred KB is worth questioning);
  • a response that takes a long time, start with anything >300ms (set the Waterfall column to show Total Duration);
  • a response that returns a 4** or 5** error code;
  • any further redirects
  • or if the response body mentions an error.
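
Several of the ideas in that list can be turned into simple rules. The sketch below is one way to do it, assuming each request has been noted down with a method, URL, status, size and duration. The 300ms threshold comes from the list; the byte limits are my own reading of ‘single bytes’ and ‘a few hundred KB’, so tune them for your product.

```javascript
const SLOW_MS = 300
const TINY_BYTES = 10          // assumption: my reading of "single bytes"
const LARGE_BYTES = 300 * 1024 // assumption: my reading of "a few hundred KB"

// Returns the annotations worth adding to the model for one request.
// `seen` remembers earlier requests so repeats can be called out.
function annotate (entry, seen) {
  const notes = []
  const key = entry.method + ' ' + entry.url
  if (seen.has(key)) notes.push('repeated request')
  seen.add(key)
  if (entry.sizeBytes < TINY_BYTES) notes.push('very small response')
  if (entry.sizeBytes > LARGE_BYTES) notes.push('very large response')
  if (entry.durationMs > SLOW_MS) notes.push('slow response')
  if (entry.status >= 400) notes.push('error status ' + entry.status)
  if (entry.status >= 300 && entry.status < 400) notes.push('redirect')
  return notes
}

// Hypothetical log, hand-copied from the Network tab
const log = [
  { method: 'GET', url: 'https://blog.example.com/', status: 200, sizeBytes: 18000, durationMs: 120 },
  { method: 'GET', url: 'https://cdn.example.net/js/app.js', status: 200, sizeBytes: 900000, durationMs: 450 },
  { method: 'GET', url: 'https://cdn.example.net/js/app.js', status: 200, sizeBytes: 900000, durationMs: 40 },
  { method: 'GET', url: 'https://blog.example.com/old', status: 301, sizeBytes: 200, durationMs: 15 },
  { method: 'GET', url: 'https://api.example.com/user', status: 500, sizeBytes: 2, durationMs: 80 }
]

const seen = new Set()
const annotated = log.map((entry) => ({ url: entry.url, notes: annotate(entry, seen) }))
console.log(annotated)
```

An empty notes array just means nothing stood out against these particular rules, not that the request is beyond question.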

Now what?

That’s entirely up to you! The important thing to remember is that whilst you’re building this model you’re actually exploratory testing. You’re learning about the product and you’re questioning it. In this simple example we’re only questioning the various calls that the product is making client side, but they are good questions.

Amongst many other examples, this specific activity can help find performance issues through:

  • responses that take too long or are blocking
  • responses that are repeated
  • requests that are pointless and can be canned

It can also help find security issues

  • if you’re making calls that should be https but they’re not
  • any redirects that are happening that you don’t expect
  • any other calls that don’t make sense or may be sending information that they shouldn’t
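
The same hand-noted data can raise those security questions for you. This is only a sketch: the entries, the redirectedTo field and the list of expected domains are all assumptions of mine, and anything it flags is a prompt for a conversation rather than a verdict.

```javascript
// Collects questions to take to your developer or architect.
// `redirectedTo` is only present on entries where a redirect happened.
function securityQuestions (entries, expectedDomains) {
  const questions = []
  for (const entry of entries) {
    const url = new URL(entry.url)
    if (url.protocol !== 'https:') {
      questions.push('Not https: ' + entry.url)
    }
    if (entry.redirectedTo) {
      questions.push('Redirect: ' + entry.url + ' -> ' + entry.redirectedTo)
    }
    if (!expectedDomains.includes(url.hostname)) {
      questions.push('Unrecognised domain: ' + url.hostname)
    }
  }
  return questions
}

// Hypothetical data
const observed = [
  { url: 'https://blog.example.com/' },
  { url: 'http://cdn.example.net/js/app.js' },
  { url: 'https://blog.example.com/m', redirectedTo: 'https://m.example.com/' },
  { url: 'https://tracker.example.org/pixel.gif' }
]

console.log(securityQuestions(observed, ['blog.example.com', 'cdn.example.net']))
```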

You can use the model to decide which calls you should manipulate to help you understand how doing so might impact the user experience:

  • try to block a call
  • slow a call down
  • intercept a call and manipulate the response

And we’ve only covered client side requests. By using the same technique, you can take a capability and slowly break it down, level by level, building your knowledge of the product. Share your knowledge with your team to help identify problems and then use it to give you ideas on where to focus your testing based on dependencies and areas of brittleness.

I haven’t even mentioned how you can use these models to confidently share your test coverage. That’s a post for another time.

Why I like Testing in Production


This post is not an internal environment vs production environment face off, nor is it an attempt to convince you to change your beliefs on testing in production.

It is an attempt to show that testing in production can be another tool in the tester’s arsenal, to be used when the context fits and when the team you’re working with either has the capability to do so now or the desire to build that capability for the future.

If it does encourage you to investigate what testing in production might look like for you, share your story, I’d love to hear it.

But you should never test in production…

There is a school of thought which prescribes that testing of an application should only be completed in internal integrated environments.

I’ve recently seen a post discuss control – that new processes should never be tested in production. I’ll often hear about increased risk to the business and customers through shipping unfinished features. Occasionally someone will accuse me of treating our customers as guinea pigs.

Not forgetting data, I’m told that there’s a risk that my testing will be skewing production analytics for customer engagement, account tracking and stock levels if I’m testing in production.

These are all valid risks in their own context and each carries a varying degree of impact should it be realised. There is no wrong in any of these arguments.

Where would you prefer to test?

Ask yourself, given zero risk, would you ever test in production for any reason?

My answer for this is, given zero risk, I would test everything I could in production. I would test new features, integrations of new features with old features, integrations of my application with other applications and dependencies. I would also conduct all of my non-functional testing in production: performance, load, security, etc. Why would I use an approximation of the live environment if I could use the real thing?

But of course zero risk doesn’t exist, so I’m going to take my utopia and start to break it down until I find a level of risk that is suitable for the context in which I would like to test. As part of that exercise, I would need to be clear on what I mean by testing in production.

I define testing in production to be an encapsulation of two distinct states.

  1. Testing in production of an as yet un-launched, hidden version of the application that customers cannot see or use
  2. Testing in production following the launch of a new version of the application to customers

Both activities offer their own value streams but solve very different problems.

Everyone can benefit from and should think about spending some time with number 2. Your application is live, your customers are using it. Assuming you could learn something new about what you’ve already shipped or even test out some of your earlier made assumptions, why wouldn’t you want to do that in production? Run a bug bash in production, keep it black box (only things customers can do) if you’re particularly worried about it and observe. You may find something that’s slipped through your net and if you do, you’ve proven its worth.

Testing hidden features

It’s option 1 that I find most interesting. I’ve recently read an article introducing testing in production from the Ministry of Test Dojo: Testing in Production the Mad Science Way. The article discusses two distinct models that you can implement to provide you with the means to test in production.

We’ve implemented a variation on the circuit breaker method referenced in the article. In doing so, we have the ability to use feature flags to determine which code path the application should go through and therefore, what behaviours the customer has access to.

In its default state a feature flag is set to off. This means that the customer sees no change despite the new code having been deployed to production. Once the code is there, our circuit breakers allow us to turn features on client side. This means that testers can go to production, set the feature flag to on for the feature they want to test and happily test against production code for the duration of their session. Once testing is complete and the feature is ready to go live, we can change the configuration of the feature flag for all customers, safe in the knowledge that we can turn it off again if something were to go wrong. The deployment of configuration is quick and we have two mechanisms to propagate a rollback to our customers: either slowly as sessions expire, or by forcing it through on their next page load. When rolling forward we only do so as sessions expire.
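
To make the flag behaviour concrete, here is a minimal sketch of how such a check might look. It is not our implementation: the names (`FlagConfig`, `SessionOverrides`, `isEnabled`, `newCheckout`) are hypothetical, and I’m assuming that a tester’s session override takes priority over the deployed configuration.

```typescript
type FlagState = "on" | "off";

// Deployed configuration, the same for every customer.
interface FlagConfig {
  [flag: string]: FlagState;
}

// Overrides a tester sets client side, living only for their session.
interface SessionOverrides {
  [flag: string]: FlagState;
}

// Assumption: a session override wins; otherwise the deployed
// configuration decides; a flag nobody has configured is off.
function isEnabled(
  flag: string,
  config: FlagConfig,
  overrides: SessionOverrides = {}
): boolean {
  const override = overrides[flag];
  if (override !== undefined) {
    return override === "on";
  }
  return config[flag] === "on";
}

// Code deployed, flag off: customers see no change.
const deployed: FlagConfig = { newCheckout: "off" };
const customerSees = isEnabled("newCheckout", deployed);

// A tester turns the feature on for their own session only.
const testerSees = isEnabled("newCheckout", deployed, { newCheckout: "on" });
```

Launching the feature is then a configuration change from "off" to "on", and rolling back is the same change in reverse, with no new code deployed either way.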

In shipping this changed code we make the assumption that we’re able to determine that the introduction of the new code has not had a detrimental impact on the current feature set and customer experience. We derive this confidence through testing during the development phase of the new feature and through our automated checking suite which runs in our pipeline. We also have a third line of defence, a set of automated checks for our core journeys, created by a central team who own our path to live.

This mechanism takes time to mature and we’ve definitely made a few mistakes along the way. With perseverance we’ve been able to ship fairly large changes to our production site, with no impact to customers, test those changes and then turn them on when we’re confident we’re ready to.

Whilst we can mitigate concerns such as customer impacting stock levels by being careful to only use non-low stock items, there are still some structural areas which we do not test in production such as peak load tests and updates to our feature switch and rollback/roll forward mechanisms. Anything else will be considered on a case by case basis, discussed during 3 Amigos within the team(s) and agreed on before actioning.

My thoughts

For some contexts, I prefer testing in production over testing in internal integrated environments because it provides me with these key benefits:

  1. The likelihood of my testing being blocked by an issue with a dependency is greatly reduced
  2. The data is at peak scope, complexity and usefulness
  3. Any bug that I find in the application under test is an actual and real issue 
  4. Any issues that I find with the environment will be having real impact to our customers and/or the business

In my experience, these benefits derive from flaws with the practices put in place to build and support internally integrated environments. 

Internally integrated environments do provide their own benefits. There are scenarios and processes which I would be reluctant to test in production and I’ve outlined some of those above. This article also does not discuss inside out testing techniques such as those on code – unit tests and component tests.

How a Spanish joke reminded me of Software Testing.

Before the hustle and bustle of the work day set in, a colleague of mine and I were discussing a picture someone had drawn on a post-it and left on his monitor.

Screen Shot 2017-12-08 at 09.11.22

The message was clear, someone wanted to thank him for his efforts recently, which is a lovely gesture. What wasn’t so clear was the little picture in the bottom corner. Within the context of the message it was easy, the picture represented someone in the water. However, take the message away and the picture becomes less clear. I described it as one of those balsa airplanes with a rubber band propeller (indicated below), flying above the clouds.

Screen Shot 2017-12-08 at 09.18.12

This led my colleague to introduce me to a wonderful joke that he was told when he was growing up in Spain. The joke goes like this:

Teacher: “Listen carefully: Four crows are on the fence. The farmer shoots one. How many are left?”
Little Johnny: “None.”
Teacher: “Can you explain that answer?”
Little Johnny: “One is shot, the others fly away. There are none left.”
Teacher: “Well, that isn’t the correct answer, but I like the way you think.”

Little Johnny: “Teacher, can I ask a question?”
Teacher: “Sure.”
Little Johnny: “There are three women in the ice cream parlor. One is licking, one is biting and one is sucking her ice cream cone. Which one is married?”
Teacher: “The one sucking the cone.”
Little Johnny: “No. The one with the wedding ring on, but I like the way you think.”

In the first part, the teacher’s context was that of an arithmetic problem. The teacher was likely expecting Little Johnny to say 3, but Little Johnny may have been considering a less abstract interpretation of the information provided. Little Johnny hears the word ‘shot’ and his internal model converts that to a gun, a loud noise and the jittery nature of birds. In that context it’s perfectly understandable that Little Johnny gave the answer he did.

What was the trigger for this misunderstanding? I think it arose because the teacher, unknowingly at the time, left their question ambiguous for their audience. This is something Little Johnny takes advantage of in the second part, setting a trap for the teacher to lead to the inevitable punch-line. Little Johnny provides very specific information to describe the scene, building a picture or model in the head of the teacher and, in doing so, introducing biases of perspective that the teacher will use to answer the question. Little Johnny then comes in from left field with a question, which the teacher attempts to answer with the deliberately limited information Little Johnny has provided, and laughter ensues.

Why is this an important lesson for Software Testers?

Each and every day, the life of a tester is filled with information that is provided either knowingly incomplete or, more dangerously, obliviously incomplete.

It’s very much the role for us Software Testers to remember Little Johnny and the teacher when we’re communicating across the disciplines of the teams we work with. As we learn what it is that the product is expected to do, we must force ourselves to remember that it is very likely that our interpretation of the information is incorrect or incomplete based on our own biases and perspective.

We can counter this by asking questions, even if we feel like it might come across as dumb to do so – I’d almost argue that this is exactly when we must ask questions.

The 5 Whys technique is a useful tool for understanding the primary objectives and drivers of an activity. With it we can challenge the assumptions made by ourselves and others and take them from the world of implicit to explicit.

Specification by Example is another technique that is practised throughout the software industry to provide a consistent language to describe behaviours and expectations; however, I find that it’s rarely used to its full potential. Yes, GWT (Given, When, Then) scenarios can provide a suite of regression checks, but the real power is in the conversation that can be had between a group of people to, again, make the implicit explicit. This will be the subject of another post, so stay tuned!

Even if we think we have a complete picture, the reality will be that we don’t. Rarely have I met anyone who can keep a whole network of related systems, dependencies, contracts and expectations in their head or even down on paper, in a sufficiently useful way, to remove any risk of misunderstanding or gaps in understanding.

That’s why for us Software Testers, our most useful tool can be our ability to explore the landscape in front of us, with a specifically chosen context, to build up a more complete understanding of the actual with respect to what we think we know.