Monday, July 20, 2015

Confirmation bias in testing ...

In my previous post I explored some of the mistakes I made when looking at the New Horizons mission to Pluto, introducing the concept of confirmation bias.

Like many people, although I was vaguely aware of the concept, it took James Bach's RST course to formally introduce me to it.  As I discussed in the previous article, our eyes will sometimes see what our mind and emotions want us to see.

In fact, it could be said that the very profession of testing exists because of this effect.

Take a developer who is writing a registration page for a new system.  He goes through it, provides his name, email, date of birth, and is taken to a "Welcome Stuart Cook!" message.  He concludes "it works for me" and moves on to his next exciting coding task.

Can we really say at this point that registration has been "tested"?  The developer enters some data, he sees what he wants to see - he gets a "welcome" message - and to him this confirms that everything is working.

We have all occasionally fallen into this trap (I know I have) - one test does not cover everything.  Being aware that confirmation bias gives us a tendency to jump to "that works" is a key part of the RST course.

Entering some details, and getting "Welcome Stuart" is a good start in testing.  But we're not there yet.

Is it for instance just a message?  How do we know for sure that the account has been created?  We could check the database, and maybe do a login, to confirm we can use the account.  If Stuart hasn't made the login page yet, then maybe we will have to settle for checking the database.  You were going to check the database, weren't you?
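To make that concrete, here's a minimal sketch of a check that goes past the welcome message and into the data.  Everything in it is an assumption for illustration only - the /register endpoint, the form field names, the SQLite file and the users table stand in for whatever your real application has:

```python
import sqlite3
import requests

REGISTER_URL = "http://localhost:8080/register"  # hypothetical endpoint
DB_PATH = "app.db"                               # hypothetical database file


def test_registration_actually_creates_an_account():
    details = {"first_name": "Stuart", "last_name": "Cook",
               "email": "stuart.cook@example.com",
               "date_of_birth": "1980-01-01", "password": "S0me-Passw0rd!"}
    response = requests.post(REGISTER_URL, data=details)

    # The confirmation-bias check: the friendly message on its own.
    assert "Welcome Stuart Cook!" in response.text

    # The deeper check: did a row actually land in the users table?
    conn = sqlite3.connect(DB_PATH)
    try:
        row = conn.execute("SELECT 1 FROM users WHERE email = ?",
                           (details["email"],)).fetchone()
    finally:
        conn.close()
    assert row is not None, "welcome message shown but no account stored"
```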

A minor thing - but what if we create an account for someone who isn't called Stuart - would we still get "Welcome Stuart" as a message?

Then there's the case of "what if we provide really long, diverse fields for answers to the registration questions - what happens then?", plus of course "what if I provide junk answers to the registration page - does it still make my account?".
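One way to act on those questions is to parameterise a single test over a spread of names and inputs.  Again this is only a sketch - the register() helper, the endpoint URL, the field names and the status codes I accept are all assumptions, not any particular system's behaviour:

```python
import pytest
import requests

REGISTER_URL = "http://localhost:8080/register"  # hypothetical endpoint


def register(**fields):
    """Post the registration form; the field names are illustrative."""
    return requests.post(REGISTER_URL, data=fields)


@pytest.mark.parametrize("first_name, last_name", [
    ("Stuart", "Cook"),                        # the original happy path
    ("Siobhán", "O'Connor"),                   # accents and punctuation
    ("李", "王"),                               # non-Latin characters
    ("X" * 500, "Y" * 500),                    # much longer strings
    ("<script>alert(1)</script>", "Robert"),   # junk / hostile input
])
def test_registration_with_varied_names(first_name, last_name):
    response = register(first_name=first_name, last_name=last_name,
                        email=f"user{abs(hash(first_name))}@example.com",
                        date_of_birth="1980-01-01", password="S0me-Passw0rd!")
    # We may not yet know the "right" answer for every row - the point is to
    # look at what actually happens rather than confirm what we expect.
    assert response.status_code in (200, 201, 400), response.text
```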

All these heuristic ways of testing a registration page were covered back in my Back To Basics series, which is worth a recap.

Sometimes as testers we're seen as being a bit tricky and difficult to work with.

When Stuart the developer checks the scenario with his own name, he says "it works" and moves on.

But when we as testers do the same test we say (or rather we should say) "I have some confidence that, when I provide details similar to the situation described, the system should work as expected for the scenario provided".

Isn't that written in a lot of safety language?  It makes some in software development feel uncomfortable.  Let's take some time to explore the phrase I used, and why I used it ...

"when I provide details similar to the situation described"

I provided the following details - they include first/middle/last name of a certain length using certain characters, a certain date of birth, and a certain password complexity.  If the system is presented with data which includes:

  • Characters I didn't include
  • Strings which are much longer/shorter
  • A date of birth which is significantly different from the one I used
  • A password which radically differs in structure


Then I will be less confident of the outcome of the system, as these inputs differ considerably from the scenario that I used.

Are you comfortable with that?
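If it helps, one way to deliberately push inputs well away from the original scenario is property-based testing.  Here's a rough sketch using the Hypothesis library, with the endpoint and field names assumed purely for illustration:

```python
import requests
from hypothesis import given, settings, strategies as st

REGISTER_URL = "http://localhost:8080/register"  # assumed endpoint


@settings(max_examples=50, deadline=None)
@given(
    name=st.text(min_size=0, max_size=300),
    dob=st.dates().map(str),
    password=st.text(min_size=0, max_size=100),
)
def test_registration_copes_with_unfamiliar_inputs(name, dob, password):
    response = requests.post(REGISTER_URL, data={
        "first_name": name,
        "last_name": name,
        "email": "probe@example.com",
        "date_of_birth": dob,
        "password": password,
    })
    # A deliberately weak property: whatever the inputs, the system should
    # answer with a sensible success or client-error status, never a 500.
    assert response.status_code < 500
```

The further the generated inputs drift from the one scenario I ran by hand, the more such a check earns back some of that lost confidence.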

"the system should work for the scenario provided"

If I've tested the "account creation" of a registered user, I cannot then say I've also tested the "account rejected" scenario.  This is pretty basic, but again with our confirmation bias, if we see an account has been created we can jump to "well this works", whereas there are some scenarios where you really DO NOT want it to work.
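Here's a small sketch of probing that "should NOT work" side.  The endpoint, the field names, and the assumption that the system rejects bad requests with a 400 are all illustrative:

```python
import requests

REGISTER_URL = "http://localhost:8080/register"  # assumed for illustration


def test_duplicate_registration_is_rejected():
    details = {"first_name": "Stuart", "last_name": "Cook",
               "email": "stuart.cook@example.com",
               "date_of_birth": "1980-01-01", "password": "S0me-Passw0rd!"}
    first = requests.post(REGISTER_URL, data=details)
    second = requests.post(REGISTER_URL, data=details)
    assert first.ok
    # A scenario where we really do NOT want "it works":
    assert second.status_code == 400, "the same account was created twice"


def test_registration_without_an_email_is_rejected():
    details = {"first_name": "Stuart", "last_name": "Cook", "email": "",
               "date_of_birth": "1980-01-01", "password": "S0me-Passw0rd!"}
    assert requests.post(REGISTER_URL, data=details).status_code == 400
```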

"I have some confidence"

What a weird statement to start with!  Surely you're sure, aren't you?  Okay - I take the above points, but you've created a Stuart Cook user - surely you're 100% confident that, if provided with the same information, you can create a Stuart Cook account?

Actually no - indeed if you've already provided details for a Stuart Cook account, then surely providing the exact same details will mean an error - you can't create two identical accounts after all.  And if you vary the inputs, then there is a level (however minor) of uncertainty.

Okay - so let's say we delete the Stuart Cook account after you've made it.  Surely you're 100% confident you can make it again?

Nope.  It's something I've learned recently, working on highly available test rigs with load balancers: sometimes you can get the most frustrating kinds of errors.  You try something and it fails - so you get a developer over, and run it 4 times; each time it passes.  Your developer walks away in disgust, and the problem comes back.


Rather than being transferred to the tester funny farm, it turns out that the problem you're experiencing happens only on one server, and just by pure luck, you've been pointing to the wrong server every time you've been testing on your own!
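A cheap way to take that luck out of the equation is to run the same check directly against each node behind the balancer.  The node addresses below are made up - a sketch of the idea rather than any real rig:

```python
import requests

# Hypothetical hosts sitting behind the load balancer; substitute your own.
APP_NODES = ["http://app-node-1:8080", "http://app-node-2:8080"]


def test_registration_page_loads_on_every_node():
    failures = []
    for node in APP_NODES:
        response = requests.get(f"{node}/register", timeout=5)
        if response.status_code != 200:
            failures.append((node, response.status_code))
    # If only one node is broken, this names it, instead of the check failing
    # "sometimes" depending on where the balancer happened to route us.
    assert not failures, f"registration page broken on: {failures}"
```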

So what's the cure?

Well, the first step is to know that confirmation bias is a thing, and a very powerful motivator.  We, our developers and our project managers are going to want us to "race to the finish", so there's a lot of pressure to "tick the box and say it's okay".

That statement I used is correct:

"I have some confidence that when I provide details similar to the situation described that the system should work as expected for the scenario provided"

So what's the solution?

Firstly, try different scenarios - make sure you cover all the scenarios you'd expect.

Secondly, vary the inputs to try a broad range of situations.

This was covered, as I mentioned, in my Back To Basics Testing series, but here we're looking at the "why".  Why do we need to do this?  Because the more we've covered, the more confidence that statement gives.

But we can never be 100% confident - testing is always a trade-off: trying different inputs and expecting different outputs, whilst being aware all the time that the project clock is ticking, and we can't take forever to do this.

Good testing is about choosing those cases wisely, casting a net across the possibilities, and using experience to find potential problems.

3 comments:

  1. Awesome article. I would happily purchase a poster containing that quote for the office wall! You're absolutely right about the trade-off, and for a number of changes your quote, when given to a stakeholder, can be deemed "good enough" for release.

  2. Thanks to you I've been listening to the Deceptive Mind lectures and learned more about all the different biases. I'm working to be more aware of them and actively taking steps such as using heuristics to avoid the traps.

    At the same time, I wonder how to balance all the unhappy path scenarios with value. What if the UI I'm testing doesn't support Unicode 5 characters? Maybe no user would ever try them, or would care they aren't supported. On the other hand, what if a user enters an unsupported character, that causes an unrecoverable error and the user loses data? This balancing act is proving hard for me - combining the risk analysis with the critical thinking, I guess.

  3. Great article, thanks.

    Lisa, getting that balance right is, to me, the holy grail of sw testing. To me it sounds like you can't, or rather feel uncomfortable, moving forward on your own - so get help? I mean this in the best possible way.
    There are usually roles like product or account managers who both know the customers better than testers do and know the road map ahead. Having a product for the local market has different risks than if you start shipping it to China. All of a sudden your risks have changed with no or little code changes.
    In short we can't know all business risks but it's our job to be aware and find the information. It also increases visibility of testing which can only be a good thing.
