
Monday, 5 June 2017

Usage of words

I came across the following tweet:


I tried to reply on Twitter, but the message I wanted to convey did not come across in the way I intended.

Disclaimer: I am not, nor have I ever been, a member of any cult.

Marlena's words came across very strongly and appear to be based upon a negative experience when encountering discussions on the use of these words.  Others stepped in with their own experiences, and the main message seems to be that these words have been used to derail important discussions.  I find that a shame, since to me the distinction between these words has been useful in talking to executives and others outside the testing world about the risks of unfocused automation and testing.

My concern with Marlena's statement is the suggestion that the distinction is of low value, a mere semantic argument.  Semantics and the meaning of words are vital for society to flourish, and arguments over meaning have been going on for a long time.  People have argued over what certain words mean, and over time the meaning of some words changes.  Some are taken over to deride or insult people, and sometimes these words are reclaimed by those being insulted.  For example, the word "Queer" is to some a hostile word and to others a badge of honor.

I worked in Israel for a while and often got strange looks when running workshops: replying to a question, I would say 'smallish'.  It was a while before I figured out that 'ish' is Hebrew for 'man' and I was saying 'small man'.  Culturally, words can have different meanings and cause confusion, and the same can be said of the words 'checking' and 'testing'.  Using these words in the right situation and context to inform and have a discussion can be useful; using them to make a point or win an argument is less useful.  If they are used in an attempt to show superior intellect, then the discussion is already lost.

I use the distinction between the words when discussing the testing effort: how much checking has been done against how much testing.  How much effort have we spent confirming explicit knowledge, the information we feel we know, against the effort spent uncovering tacit information, the things we do not know?  Knowing the difference between these two can be vital in mitigating risk.  If all the effort and money is being spent on checking with very little testing, there is a risk that something we do not know could be dangerous; unless we spend more effort on testing to uncover what we do not already know, unknown risks remain.  Equally, if the product is mature and changes are minor, it may make sense to put more effort into checking.

For me, having these meanings helps to inform and tell a story.  I do not use them to score points or to be a member of a cult; I use them because they have value in my context.  I do not really care whether you use these words or not.  I have explained how I use them and the usefulness I find in them.  Yes, I will discuss with people why I feel the distinction has value, but at the same time I respect others' opinions and viewpoints.  To me it is a useful tool for communicating with teams around the world.


Monday, 21 December 2015

Test Execution Model - updated

This is a follow-on to the post I recently wrote about the automation pyramid talk that Richard Bradshaw gave at MEWT.

Following on from this, some in the testing Twitterverse asked if Richard and I would do a video for Whiteboard Testing, the new initiative created by Richard.  Instead of one video we decided to do two.  The first looked at the history of the test automation pyramid and you can watch it here: A look at the test automation pyramid.  The second video was a chance for me to present my own model, which I have called the test execution model, and can be seen here: John Stevenson Test Execution Model.


After the video was posted I had a message from Dan Ashby, who wanted to discuss the test execution model and what he felt was missing.  We arranged a chat, discussed the model, and broke down a few assumptions we were both making.  The outcome of this chat was the need to add a few extra parts to complete the test execution cycle.  With thanks to Dan, I would like to present the next generation of the model.



The changes that Dan suggested for the model are as follows:


  • As you test you turn some of your tacit knowledge into explicit knowledge which then can be added to your checks and become known known information.  This is represented by the flow from the testing arrow to the checking arrow at the top of the diagram.
  • As you create more checks these in turn become oracles which you can use as heuristics for your future tests.  This is represented by the flow from the checking arrow to the testing arrow.
This then gives a cycle of test execution where the testing feeds into the checking, which in turn can feed back into the testing.  It should be noted that this model is for test execution and does not apply only to testing the product deliverable.  Testing is carried out throughout the development cycle, and as such testing activities can occur during architecture and design discussions and during coding.  Testing activities can and should occur anywhere during the development and deployment of the product, and this model should be used in that context with regards to test execution.
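
To make the cycle concrete, here is a minimal sketch in Python of how the flow might play out in practice.  Everything in it (the discount function, the values, the test name) is hypothetical, invented for illustration rather than taken from the model itself.

def apply_discount(price, code):
    """Hypothetical production function under test."""
    return round(price * 0.9, 2) if code == "SAVE10" else price

def test_discount_rounds_to_two_decimals():
    # While exploring, a tester noticed that discounted prices round to
    # two decimal places. That tacit observation, made explicit, becomes
    # a check: known known information, runnable on every build.
    assert apply_discount(19.99, "SAVE10") == 17.99

# The check now doubles as an oracle: future exploratory sessions can use
# "every price-affecting feature respects the same rounding rule" as a
# heuristic for new tests, closing the loop back into testing.

The point of the sketch is the direction of flow: an exploratory observation hardens into a check, and the check then serves as an oracle for the next testing session.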


Thursday, 29 October 2015

The Laws of Sport and Automation

I have had this idea in my mind and on my backlog for quite a while.  It was only after speaking at MEWT in Nottingham that I felt I really should get around to writing it. 

There are many debates in the software development world about ‘test automation’ and how we can ‘automate all of the testing’.  In the context of this article I am ignoring the difference between testing and checking; more details of this discussion can be found here - Testing and Checking Refined.  However, some of my ideas and concepts will touch on the difference between checking and testing.

Many have put forward arguments about automating what we know: if we have requirements defined up front, then it should be possible to automate them.  My counter is that many sports have well-defined upfront requirements (laws) for how the game should be regulated.  For example, the laws of football (soccer to those outside of Europe) can be found online here: FIFA Laws of the Game.  If these requirements are defined upfront, why do we not have automated referees?  I asked this question on Twitter and some of the responses gave reasons based on physical limitations, such as battery power, and so forth.  My line of thought is about how these requirements can be, and are, interpreted.

Looking deeper into the football laws of the game, it can be seen that there are many ambiguous statements which, given the state of AI at the time of publication of this article, I feel are impossible to automate.  For example, page 39 states the following as a reason for a player to be cautioned:
“unsporting behavior”
What does this mean?  Page 125 attempts to define it with a list of what constitutes unsporting behavior.  One item in particular I found interesting; it is based on the human nature of trying to con or cheat:
“attempts to deceive the referee by feigning injury or pretending to have been fouled (simulation)”
This, I feel, would be a common sense decision made by the referee.  How could an automated system know whether it is fake or not?  Then again, how would the referee know?  It is a common sense decision, made depending on a multitude of factors and contexts.

How about this one? 
“acts in a manner which shows a lack of respect for the game”
What would count as a lack of respect?  Consider a player who, in the last second of the game, lets in a goal that allows the opposition to win the title.  The player shows human emotion and frustration; there is a fine line between emotion and respect, or the lack of it.

My issue with this automation debate is that, at this time, it is not possible to automate the common sense and the multiple contexts involved in the decision making that a referee has to go through.

For example, if a team is winning 20-0, a machine would continue to officiate the game according to the strict letter of the law, whereas a human referee would allow some flexibility in interpreting the rules.  They will allow some empathy to be applied to the game.  Is it yet possible to automate empathy?
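
To make the distinction concrete, here is a minimal sketch in Python of a naive ‘automated referee’.  The sensor fields and rule names are hypothetical; only the quoted phrase comes from the laws themselves.  Measurable laws reduce to algorithmic decision rules, while judgement-based laws have no computable predicate.

from dataclasses import dataclass

@dataclass
class Incident:
    ball_fully_crossed_line: bool   # from hypothetical goal-line sensors
    player_fell: bool               # from hypothetical motion tracking

def ball_out_of_play(incident):
    # Machine-decidable: a specific observation plus a fixed rule.
    return incident.ball_fully_crossed_line

def feigned_injury(incident):
    # "attempts to deceive the referee by feigning injury" -- no sensor
    # reading distinguishes a genuine fall from simulation; the decision
    # depends on intent, context and common sense.
    raise NotImplementedError("requires human judgement")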

James Christie made a valid point on Twitter that the reason they are called laws and not rules in the majority of sports is that:
“rules are detailed and specific whilst laws can be based on vague principles, which require interpretation and judgment.”
This makes sense, since most countries have courts where lawyers debate how the laws of the land can or should be interpreted, and then a jury, judge or set of judges makes a decision based upon the arguments presented.  This is another case where the requirements are listed and known but, given current AI limitations, would be impossible to fully automate.  Even though we know that human beings are flawed in the judgements they make, would an automated judgement machine, if it were possible to produce at all, be any less flawed?

Returning to the laws of sport and how ambiguous those laws are, we can look at the laws of Rugby Union.

Looking at the beginning of the laws on page 21 there is guidance on how the laws should be applied:
“The Laws must be applied in such a way as to ensure that the Game is played according to the principles of play. The referee and touch judges can achieve this through fairness, consistency, sensitivity and, at the highest levels, management.”
How would you automate sensitivity in this context?

According to the Oxford English Dictionary, sensitivity in this context is defined as:
“A person's feelings which might be easily offended or hurt”
Add “fairness” into that equation and we are journeying down the automation rabbit hole.
Looking at the laws regarding fair play, the guidance the document provides for foul play (Law 10), section m, gives the following:
“Acts contrary to good sportsmanship. A player must not do anything that is against the spirit of good sportsmanship in the playing enclosure”
What constitutes “the spirit of good sportsmanship”?  How do you distinguish between intentional and unintentional behavior?  Again, I am uncertain whether this kind of decision could be automated.

If we look at the laws of Rugby League we can see similar issues in how difficult the laws can be to interpret.  Rugby League was one of the early adopters of video technology to assist the referee during the game; this is what Michael and James in their article would define as tool-assisted testing.  In this case a video referee can review certain decisions via the use of video technology.

Looking at the definition of a forward pass.
“is a throw towards the opponents’ dead ball line”
How do you define this in the context of a fast-moving game?  Section 10, which offers some guidance, draws a distinction between deliberate and accidental forward passes.  How do you make a distinction between these two actions?  And would an automated system be able to deal with factors such as the momentum of the player and the wind moving the ball?  Yes, it could process information quicker than a human, but would it be right?
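
The momentum factor alone shows how awkward this is: a ball released backward out of the hands can still travel forward relative to the ground, because it inherits the runner's speed.  A quick sketch in Python, with made-up numbers:

# Velocities in metres per second; positive = toward the dead ball line.
player_speed = 8.0   # runner sprinting toward the opposition line
pass_speed = -5.0    # ball released backward relative to the player

ball_speed_over_ground = player_speed + pass_speed  # 3.0, still forward

# Relative to the player the pass is backward; relative to the ground
# (and to a camera) the ball drifts toward the dead ball line. Which
# frame of reference should an automated system judge in?
print(ball_speed_over_ground > 0)  # True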

This is not to say that referees are infallible; there are many instances in sport of them making mistakes.  However, people are aware of this and can accept it.  Would people be so willing to accept a machine making similar mistakes, given our bias that machines are not fallible?

Many sports are implementing some level of automated system to aid the referees.


It is interesting to note that each of these automated systems has had some controversy regarding its accuracy and success, especially the cricket system.


To conclude: when people discuss test automation and attempt to automate as much as possible, there is a need to step back and think critically.  Automation in software development has its place and is a useful tool; however, it should not be thought of as an alternative to testing as performed by a human being.  Even when you think you have the requirements nailed down, they are words, and as such are open to a multitude of interpretations.  Using a mixture of automation, tool-assisted testing and human testing, in a ratio that adds value to the quality of the product being delivered, is a more thoughtful approach than the mantra that we can “automate all the testing effort.”  Going forward we need to be thoughtful about what machines can and cannot do.  This may change as technology progresses, but as of the publication of this article there are big limitations in automation.

Friday, 16 October 2015

MEWT4 Post #1 - Sigh, It’s That Pyramid Again – Richard Bradshaw

This is the first in a series of posts I plan to write after attending the fourth MEWT peer conference in Nottingham on Saturday 10th October 2015.

Before I start I would like to say thank you to all the organizers for inviting me along, and a BIG MASSIVE thank you to the AST for sponsoring the event.



Abstract: 

Earlier on in my career, I used to follow this pyramid, encouraging tests lower and lower down it. I was all over this model. When my understanding of automation began to improve, I started to struggle with the model more and more.

I want to explore why and discuss with the group, what could a new model look like?

___________________________

During the session Richard explained his thoughts about the test automation pyramid created by Mike Cohn in his book Succeeding with Agile, and how the model has been misused and abused.



Richard talked about how the model has been adapted and changed over the years, from adding more layers...



...to being turned upside down and turned into an ice-cream cone.


Duncan Nisbet pointed out that this really is now an anti-pattern - http://c2.com/cgi/wiki?AntiPattern.  The original scope of Mike's diagram was to demonstrate the need for fast feedback from your automation, and as such it focused the automation effort at the bottom of the pyramid, where feedback should be fast.  The problem Richard has been experiencing is that this model does not show the testing effort or the tools needed to get this fast feedback.  It also indicated that as you move up the pyramid, less automation effort was needed or should be done.  The main issue for Richard was how the pyramid has been hijacked and used to argue that the priority of effort should be on automation, rather than focusing on the priority of both in given contexts.

Richard presented an alternative model in which both testing and automation, along with the tools required, could be shown on the ice-cream cone diagram.



In this diagram the sprinkles on top were the tools and the flakes the skills.  He then adjusted the model in real time, suggesting it would be better as a cross-sectional ice-cream cone, with testing throughout the cone and the tools across all areas of the original pyramid.  Many attendees liked this representation of the model, but some thought that it still encouraged the idea that you do less of certain testing activities as you move down the ice-cream cone.

At this stage I presented a model I had been using internally to show the testing and checking effort. 



Again, people thought this indicated that we need to do less as we move up the pyramid, and it came back to the original point being made by Richard: that the pyramid should die.

After MEWT I thought about this problem and tweeted an alternative representation of the diagram. After a few comments and some feedback the diagram ended up as follows:



With this model the pyramid is removed.  Each layer has the same value and importance in a testing context.  It shows that the further up the layers you go, the more the focus should switch from checking to testing, while lower down the focus should be on automating the known knowns.  All of this is supported by tools and skills.  As a model it is not perfect and it can be wrong for given contexts; however, for me it provides a useful starting point for conversations with those that matter.  It especially highlights that we cannot automate everything, nor should we try to do so.

In summary, the talk given by Richard was one of the many highlights of the day at MEWT and inspired me to look further into the test automation pyramid model and its failings.  I agree with Richard that the original model should die, especially given the way it is often misused.  Richard provided some useful alternatives which could work, and hopefully as a group we improved upon the original model.  Richard did clarify that his ice-cream cone model with sprinkles is not his final conclusion or his final model, and he will be writing something more on this in the near future.  His blog can be found here - http://www.thefriendlytester.co.uk/.

Now it is over to you, please provide your feedback and comments on this alternative model.

Wednesday, 29 January 2014

Using games to aid tester creativity

Recently Claire Moss blogged about potty training and how it came about from Disruptus, a card game I introduced to the Atlanta Testing meet-up while I was in the USA.  This reminded me that I was going to blog about how I use this tool in a workshop and in my day-to-day testing to improve my own and my team's testing ideas.  The workshop is on creative and critical thinking in testing, and I intend to deliver it at the London Tester Gathering in Oct 2014 - early bird tickets available.

The workshop is based upon a series of articles I have written on creative and critical thinking, part 1 here.  As part of the workshop I talk about using tactile tools to aid your creative thoughts; having objects you can hold and manipulate has been shown to improve creativity (kinesthetic learning).  One part of the workshop introduces the game of Disruptus, which has very simple rules.  There are about 100 flash cards with drawings or photographs on them, and you choose a card at random.  The game even includes some spare blank cards for you to create your own flash cards.  An example of some of the cards can be seen below:



You then have a selection of action cards which have the following on them:
  •  IMPROVE
    • Make it better: Add or change 1 or more elements depicted on the card to improve the object or idea
    • EXAMPLE From 1 card depicting a paperclip: Make it out of a material that has memory so the paperclip doesn’t distort from use.
  • TRANSFORM
    • Use the object or idea on the card for a different purpose.
    •  EXAMPLE From 1 card depicting a high heel shoe: Hammer the toe of the shoe to a door at eye level and use the heel as the knocker.
  • DISRUPT
    • Look at the picture, grasp what the purpose is, and come up with a completely different way to achieve the same purpose.
    •  EXAMPLE From 1 card depicting a camera: Wear special contact lenses that photograph images with a wink of the eye.
  • CREATE 2
    •  Using 2 cards take any number of elements from each card and use these to create a new object or idea.
  •  JUDGES CHOICE
  •  PLAYERS CHOICE
For the purpose of this article I will only be looking at the first three.  You can either choose which action card you wish to use, or use the dice provided with the game.  The rules are simple: you describe how you have changed the original image(s) in accordance with the action card, and a judge decides which idea is best to determine the winner.  When I do this we do not have winners; we just discuss the great ideas people come up with.  To encourage creativity, there are no bad ideas.

The next step in the workshop is applying this to testing.  Within testing there are still a great many people producing and writing test cases which are essentially checks.  I am not going to enter into the checking vs testing debate here; however, this game can be used if you are struggling to move beyond your ‘checks’ and are repeating the same thing each time you run your regression suite.  It can be used to provide ideas to extend your ‘checks’ into exploratory tests.

Let us take a standard test case:
Test Case:  Log in to the application using a valid username/password
Expected result:  Login successful, application screen is shown.
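
For contrast, here is roughly what that test case looks like when expressed as an automated check.  This is a minimal sketch using Selenium-style browser automation; the URL, element locators and credentials are all placeholders rather than a real system.

from selenium import webdriver
from selenium.webdriver.common.by import By

def test_login_with_valid_credentials():
    driver = webdriver.Chrome()
    try:
        driver.get("https://example.com/login")  # placeholder URL
        driver.find_element(By.ID, "username").send_keys("valid_user")
        driver.find_element(By.ID, "password").send_keys("valid_pass")
        driver.find_element(By.ID, "login-button").click()
        # The expected result, encoded as an algorithmic decision rule.
        assert "Application" in driver.title
    finally:
        driver.quit()

Run as-is, a check like this will only ever confirm the one path it encodes, which is exactly why the action cards below are useful for extending it.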
Now let us go through each of the action cards and see what ideas we can come up with to extend this into an exploratory testing session.

  •  IMPROVE - Make it better: (Add or change 1 or more elements depicted on the card to improve the object or idea.)

Using the action described above can you think of new ways to test by taking one element from the test case?

Thinking quickly for 1 minute I came up with the following:
    • How do we start the application?  Are there many ways?  URL?  Different browsers?  Different OS?
    • Is the login screen good enough or can it be improved (disability issues/accessibility)?
    • What are valid username characters?
    • What are valid password characters?
    • Is there a help option to know what valid username/passwords are?
    • Are there security issues when entering username/password?
Can you think of more?  This is just from stepping back for a minute and allowing creative thoughts to appear.  (Remember, there are no bad ideas.)

Let us now look at another of the action cards.
  • TRANSFORM - Use the object or idea on the card for a different purpose.
What ways can you think of from the example test case above to transform the test case into an exploratory testing session?

Again we could look at investigating:
    • What alternatives are there to logging in to application? Fingerprint, Secure token, encrypted key?
    • Can we improve the security of the login code?
    • What security issues can you see with the login, and how can you offer improvements to prevent these issues?
It takes very little time to come up with many more ways in which you can transform the test case into something more than a ‘check’.

Now for the next (and, for the purpose of this article, final) action card:
  • DISRUPT - Look at the picture, grasp what the purpose is, and come up with a completely different way to achieve the same purpose.
I may have already touched upon some of the ideas on how to disrupt in the previous two examples.  That is not a bad thing: if an idea appears in more than one area, it could be an indication that the idea is well worth pursuing.

Some ideas on disrupting could be:
    • Do we need a login for this? 
    • Is it being audited?
    • Is it an internal application with no access to the public?
I hope from this article you can see how such a simple game can help to improve your mental ability and testing skills.  As Claire mentioned in her article:
“Since software testing is a complex mental activity, exercising our minds is an important part of improving our work.”
This is just a small part of the workshop.  I hope you have enjoyed the article; if so, I hope to see some of you soon when I run the full workshop.

PS – I intend to run a cut-down version of the workshop at the next Atlanta Testing Meet Up whilst I am here in the USA.  Keep a watch here for announcements in the near future.




Tuesday, 15 October 2013

Are you ‘Checking’ or ‘Testing’ (Exploratory) Today?

Do you ask yourself this question before you carry out any test execution?

If not, then this article is for you.  It starts with a brief introduction to the meaning of checking and testing in the context of exploration, then asks you to think about the question, evaluate the testing you are doing, and determine from your answer whether what you are doing has the most value.

There have been many discussions within the testing community about what the difference is between ‘checking’ and ‘testing’ and how it fits within the practice of test execution.

Michael Bolton started the debate with his 2009 article on ‘testing vs checking’, in which he defined them as follows:

  • Checking Is Confirmation
  • Testing Is Exploration and Learning
  • Checks Are Machine-Decidable; Tests Require Sapience

At that time there were some fairly strong debates on this subject.  In the main, I tended to agree with the distinctions Michael made between ‘checking’ and ‘testing’, and I used them in my approach to testing.

James Bach, working with Michael, then refined the definitions in the article ‘Testing and Checking Refined’:

  • Testing is the process of evaluating a product by learning about it through experimentation, which includes to some degree: questioning, study, modelling, observation and inference.
  • (A test is an instance of testing.)
  • Checking is the process of making evaluations by applying algorithmic decision rules to specific observations of a product.
  • (A check is an instance of checking.)

From this they stated that there are three types of checking:

  • Human checking is an attempted checking process wherein humans collect the observations and apply the rules without the mediation of tools.
  • Machine checking is a checking process wherein tools collect the observations and apply the rules without the mediation of humans.
  • Human/machine checking is an attempted checking process wherein both humans and tools interact to collect the observations and apply the rules.

The conclusion, to me, appeared to be that checking is a part of testing, but we need to work out which would be best to use for the checking part: a machine or a human?  This question leads to the reason for putting this article together.
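
As a concrete contrast, consider a small hypothetical example of my own (not from either article).  The machine check below applies an algorithmic decision rule to a specific observation, while the testing questions around it have no such rule.

def basket_total(prices):
    """Hypothetical function: sum a basket and round to 2 decimals."""
    return round(sum(prices), 2)

def test_total_of_known_basket():
    # Machine check: a tool can collect the observation and apply the
    # rule with no human mediation. The answer is known in advance.
    assert basket_total([1.10, 2.20]) == 3.30

# Testing questions that no decision rule settles for us:
#   - is rounding to two decimals even right for this market?
#   - what happens with 100,000 items, or a negative price?
# Answering those takes experimentation, study and judgement.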

James on his website produced a picture to aid visualisation of the concept:


© - James Bach - Testing and Checking refined

Since, in my world, checking forms a part of testing (as I have interpreted the article by James), we need to pause for a moment and think about what we are really doing before performing any testing.

We need to ask ourselves this question:

Are we ‘checking’ or ‘testing’ (exploratory) today?

Since both form a part of testing, and both could, depending on the context, be of value and importance to the project, it is vital that we understand what type of testing we are doing.  If we ask ourselves that question and our answer is ‘checking’, we then need to work out what type of checking we are doing.  If the checking being performed falls under the category of machine checking, we need to ask why we, as thinking humans, are doing it rather than getting a machine to perform it.  Things that could fall under this category include validation or verification of requirements or functions for which we already know what the answer will be.  If this is the case then you need to ask

WHY:

  • are you doing this?
  • is a machine not doing this?
  • can a machine not do this for me?

The problem with a person carrying out machine checking manually is that it uses valuable testing time that could be spent discovering or uncovering information that we do not know or expect.  When people working in software development talk about test automation, this is what I feel they are talking about.  As rightly stated by James and Michael, there are many other automation tools that testers can use to aid their exploratory testing or human checking, and these could also be classified as test automation.

So even though checking is a part of testing and can be useful, it may not be the best use of your time as a tester.  There are various reasons why testers carry out machine checking manually rather than automating it, such as:

  • Too expensive
  • Too difficult to automate
  • No time
  • Lack of skills
  • Lack of people
  • Lack of expertise. 

However, if all you are doing during test execution is machine checking, what useful information are you missing out on finding?

If we go back to the title of this article: are you ‘checking’ or ‘testing’ today?

You need to ask yourself this question each time you test, and evaluate which you feel would have the most value to the project at that time.  We cannot continue with the same ‘must run every test check manually’ mentality, since this only addresses the stuff we know and takes no account of the risk or priority of the information we have yet to discover.

The information that may be important or useful is hidden in the things we do not expect or currently do not know about.  To find it we must explore the system and look for the interesting details yielded by this type of (exploratory) testing.

I will leave you with the following for when you are next carrying out machine checking:

...without the ability to use context and expectations to “go beyond the information given,” we would be unintelligent in the same way that computers with superior computational capacity are unintelligent.
J. S. Bruner (1957) Going beyond the information given.


Further reading