I have had this idea in my mind and on my backlog for quite
a while. It was only after speaking at
MEWT in Nottingham that I felt I really should get around to writing it.
There are many debates in the software development world about
‘test automation’ and how we can ‘automate all of the testing’.
In the context of this article I am ignoring
the difference between testing and checking; more details of that discussion
can be found here:
Testing and Checking Refined.
However, some of my ideas and concepts will touch on the difference between
checking and testing.
Many have put forward arguments about automating what we
know: if we have defined requirements up front, then it should be possible to
automate these. My counter to this is that in many sports there are well
defined upfront requirements (laws) for how the game should be regulated. For example, the laws of football (Soccer to
those outside of Europe) can be found online here:
FIFA Laws of the Game. If this is the case and these requirements
are defined upfront, then why do we not have automated referees? I asked this
question on Twitter and some of the responses gave reasons based on physical
limitations, such as battery power, being unable to run and so forth. My line of thought is on how these
requirements can be, and are, interpreted.
Looking deeper into the football laws
of the game, it can be seen that there are many ambiguous statements which, given the
current state of AI at the time of the publication of this article, I feel are
impossible to automate. For example, page 39 states the following as a reason
for a player to be cautioned:
“unsporting behavior”
What does this mean?
Page 125 attempts to define this with a list of what constitutes
unsporting behavior. One of these in particular I found interesting; it is based
on the human tendency to try to con or cheat:
“attempts to deceive the referee by feigning injury or
pretending to have been fouled (simulation)”
This, I feel, would be a common sense decision made by the referee. How could an automated system know if it is fake or not? Then again, how would the ref know? It is a
common sense decision, made depending on a multitude of factors and
contexts.
How about this one?
“acts in a manner which shows a lack of respect for the
game”
What would count as a lack of respect? Consider a player who, in the last second of the game,
lets in a goal that allows the opposition to win the title. The player shows human emotion
and frustration. There is a fine line between showing emotion and showing a lack
of respect.
My issue with this automation
debate is that at this time it is not possible to automate the common sense and
the multiple contexts that a referee has to go through in
their decision making process.
For example, if a team is
winning 20-0, a machine would continue to officiate the game in accordance with the
strict letter of the law, whereas a
human referee would allow some flexibility in the interpretation of the rules. They would allow some aspects of empathy to
be applied to the game. Is it yet possible
to automate empathy?
James Christie made a valid point on Twitter that the reason
they are called laws and not rules in the majority of sports is that:
“rules are detailed and specific whilst laws can be based
on vague principles, which require interpretation and judgment.”
This makes sense, since most countries have courts where
lawyers debate how the laws of the land can or should be interpreted. Then a jury, judge or set of judges makes a decision
based upon the arguments presented. This is another case where the requirements are
listed and known but, given current AI limitations, would be impossible to fully automate.
Even though we know that human beings are flawed in the judgements that they
make, would an automated judgement machine be any less flawed, if it were at all
possible to produce?
Returning to the laws of sport and how ambiguous those
laws are, we can look at the
laws of Rugby Union.
At the beginning of the laws, on page 21, there is
guidance on how the laws should be applied:
“The Laws must be applied in such a way as to ensure that
the Game is played according to the principles of play. The referee and touch
judges can achieve this through fairness, consistency, sensitivity and, at the
highest levels, management.”
How would you automate sensitivity in this context?
“A person’s feelings which might be easily offended or
hurt”
Add “fairness” into that equation and we are now journeying
down the automation rabbit hole.
Looking at the laws regarding fair play, the guidance
that the document provides for foul play (Law 10), section m, states the
following:
“Acts contrary to good sportsmanship. A player must not
do anything that is against the spirit of good sportsmanship in the playing
enclosure”
What constitutes “the spirit of good sportsmanship”? How do
you distinguish between intentional and unintentional behavior? Again, I am uncertain if this kind of decision
could be automated.
If we look at the
laws of Rugby League we can see similar issues
in how difficult it can be to interpret the laws. Rugby league was one of the early adopters of video
technology to assist the referee in the game. This is what Michael and James, in their article,
would define as tool assisted testing. In this case a video referee can review
certain decisions using video footage.
Consider the definition of a forward pass:
“is a throw towards the opponents’ dead ball line”
How do you define this in the context of a fast moving game? Under section 10, which offers some
guidance, there is a distinction between deliberate and accidental forward
passes. How do you make a distinction
between these two actions? Also, would an automated system be able to deal with factors
such as the momentum of the player and the wind moving the ball? Yes, it could process information quicker
than a human could, but would it be right?
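To make the gap between the literal wording and the judgement call concrete, here is a minimal Python sketch of what an automated check might look like. Everything in it is an assumption made for illustration: the names `PassCheck` and `check_pass`, the coordinate convention, and the tolerance parameter come from neither the laws nor any real officiating system. It checks only where the ball ended up relative to where it was released, and it deliberately ignores player momentum and wind.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PassCheck:
    forward: bool
    # None means the check has no way to decide; this part needs a human.
    deliberate: Optional[bool]


def check_pass(release_x: float, catch_x: float, tolerance_m: float = 0.0) -> PassCheck:
    """Apply only the literal wording of the law to a single throw.

    Assumed convention: positions are in metres along the pitch, and a
    larger value is closer to the opponents' dead ball line.
    """
    # The measurable part: did the ball end up closer to the dead ball line?
    forward = (catch_x - release_x) > tolerance_m
    # The judgement part: intent cannot be derived from two positions.
    return PassCheck(forward=forward, deliberate=None)


result = check_pass(release_x=10.0, catch_x=12.5)
print(result.forward, result.deliberate)
```

The check can answer the part that is a measurement, but the `deliberate` field stays empty because nothing in the inputs says anything about intent. That is the part the referee supplies.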
This is not to say that referees are infallible; there
are many instances in sport of them making mistakes. However, people are aware
of this and can accept that fact. Would
people be so willing to accept a machine making similar mistakes, given our
bias that machines are not fallible?
Many sports are implementing some level of automated system
to aid the referees.
It is interesting to note that each of these automated systems
has had some controversy regarding its accuracy and success, especially
the cricket system.
To conclude, when people discuss test automation and attempt to automate as much as possible, there is a need to step back and think critically.
Automation in software development has a place and is a useful tool; however,
it should not be thought of as an alternative to testing as applied by a human
being. Even when you think you have the requirements
nailed down, they are words and as such are open to a multitude of interpretations.
Using a mixture of automation, tool assisted
testing and human testing, in a ratio that adds value to the quality of the product
being delivered, is a more thoughtful approach than the mantra that we can “automate all the testing effort”. Going forward we need to be thoughtful about what machines can and cannot do. This may change as technology progresses, but as of the publication of this article there are big limitations in automation.