The expected result was 42. Now what was the test?: The Laws of Sport and Automation

I have had this idea in my mind and on my backlog for quite a while. It was only after speaking at MEWT in Nottingham that I felt I really should get around to writing it.

There are many debates in the software development world about ‘test automation’ and how we can ‘automate all of the testing’. In the context of this article I am ignoring the difference between testing and checking; more details of that discussion can be found in Testing and Checking Refined. However, some of my ideas and concepts will touch on the difference between checking and testing.

Many have put forward arguments about automating what we know: if we have defined requirements up front, then it should be possible to automate them. My counter to this is that many sports have well-defined upfront requirements (laws) for how the game should be regulated. For example, the laws of football (soccer to those outside of Europe) can be found online in the FIFA Laws of the Game. If these requirements are defined upfront, then why do we not have automated referees? I asked this question on Twitter, and some of the responses gave reasons based on physical limitations, such as battery power, being unable to run and so forth. My line of thought is on how these requirements can be, and are, interpreted.

Looking deeper into the football Laws of the Game, there are many ambiguous statements which, given the state of AI at the time of publication of this article, I feel are impossible to automate. For example, page 39 gives the following as a reason for a player to be cautioned:

“unsporting behavior”

What does this mean? Page 125 attempts to define it with a list of what constitutes unsporting behavior. One of these I found particularly interesting, since it is based on the human tendency to try to con or cheat:

“attempts to deceive the referee by feigning injury or pretending to have been fouled (simulation)”

This, I feel, would be a common sense decision made by the referee. How could an automated system know if it is fake or not? Then again, how would the ref know? It is a common sense decision, made depending on a multitude of factors and contexts.

How about this one?

“acts in a manner which shows a lack of respect for the game”

What would count as a lack of respect? Take a player who, in the last second of the game, lets in a goal that allows the opposition to win the title. The player shows human emotion and frustration. Is there not a fine line between emotion and respect, or the lack of it?

My issue with this automation debate is that, at this time, it is not possible to automate the common sense and multiple contexts that a referee has to go through in their decision-making process.

For example, if a team is winning 20-0, a machine would continue to officiate the game in accordance with the strict letter of the law, whereas a human referee would allow some flexibility in the interpretation of the rules and apply some empathy to the game. Is it yet possible to automate empathy?

James Christie made a valid point on Twitter: the reason that in the majority of sports they are called laws and not rules is that

“rules are detailed and specific whilst laws can be based on vague principles, which require interpretation and judgment.”

This makes sense, since most countries have courts where lawyers debate how the laws of the land can or should be interpreted. Then a jury, judge or set of judges makes a decision based upon the arguments presented. This is another case where the requirements are listed and known but, given current AI limitations, would be impossible to fully automate. Even though we know that human beings are flawed in the judgements they make, would an automated judgement machine be any less flawed, if it were possible to produce at all?

Returning to the laws of sport and how ambiguous those laws are, we can look at the laws of Rugby Union.

At the beginning of the laws, on page 21, there is guidance on how the laws should be applied:

“The Laws must be applied in such a way as to ensure that the Game is played according to the principles of play. The referee and touch judges can achieve this through fairness, consistency, sensitivity and, at the highest levels, management.”

How would you automate sensitivity in this context?

According to the Oxford English Dictionary, sensitivity in this context is defined as:

“A person's feelings which might be easily offended or hurt”

Add “fairness” into that equation and we are now journeying down the automation rabbit hole.

Looking at the laws regarding fair play and the guidance that the document provides for foul play (Law 10), section m says the following:

“Acts contrary to good sportsmanship. A player must not do anything that is against the spirit of good sportsmanship in the playing enclosure”

What constitutes “the spirit of good sportsmanship”? How do you distinguish between intentional and unintentional behavior? Again, I am uncertain whether this kind of decision could be automated.

If we look at the laws of Rugby League we can see similar issues in how difficult the laws can be to interpret. Rugby league was one of the early adopters of video technology to assist the referee in the game. This is what Michael and James, in their article Testing and Checking Refined, would define as tool assisted testing. In this case a video referee can review certain decisions via the use of video technology.

Consider the definition of a forward pass:

“is a throw towards the opponents’ dead ball line”

How do you define this in the context of a fast-moving game? Section 10, which offers some guidance, draws a distinction between deliberate and accidental forward passes. How do you make a distinction between these two actions? Also, would an automated system be able to deal with factors such as the momentum of the player and the wind moving the ball? Yes, it could process information quicker than a human could, but would it be right?
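
To make the interpretation problem concrete, here is a minimal Python sketch of two ways an automated system might encode the forward pass definition. Everything in it is an assumption for illustration: the Pass type, its field names and the numbers are hypothetical and do not come from the Rugby League laws, and wind is ignored entirely.

```python
from dataclasses import dataclass


@dataclass
class Pass:
    """Hypothetical, simplified model of a single pass (not from the laws)."""
    player_speed: float  # m/s, positive = towards the opponents' dead ball line
    throw_speed: float   # m/s relative to the passer's hands, negative = thrown backwards
    flight_time: float   # seconds the ball spends in the air


def forward_relative_to_ground(p: Pass) -> bool:
    """Interpretation 1: the ball ends up further forward over the ground."""
    # The ball inherits the passer's momentum, so the two speeds add up.
    ground_speed = p.player_speed + p.throw_speed
    return ground_speed * p.flight_time > 0


def forward_relative_to_passer(p: Pass) -> bool:
    """Interpretation 2: the ball leaves the hands in a forward direction."""
    return p.throw_speed > 0


# A player sprinting forward who throws the ball backwards out of the hands.
# The numbers are invented for the example.
sprinting_pass = Pass(player_speed=8.0, throw_speed=-2.0, flight_time=1.0)

print(forward_relative_to_ground(sprinting_pass))
print(forward_relative_to_passer(sprinting_pass))
```

With these made-up numbers the same pass is forward under the first reading and not forward under the second, so a machine has to be told which interpretation to apply before it can check anything. Neither function says anything about whether the pass was deliberate or accidental.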

This is not to say that referees are infallible; there are many instances in sport of them making mistakes. However, people are aware of this and can accept that fact. Would people be so willing to accept a machine making similar mistakes, given our bias that machines are infallible?

Many sports are implementing some level of automated systems to aid the referees:

Tennis has been using Hawkeye since 2002
Football has started to implement goal-line technology
Cricket uses the Umpire Decision Review System

It is interesting to note that each of these automated systems has had some controversy regarding its accuracy and success, especially the cricket system.

To conclude, when people discuss test automation and attempt to automate as much as possible, there is a need to step back and think critically. Automation in software development has a place and is a useful tool; however, it should not be thought of as an alternative to testing as applied by a human being. Even when you think you have the requirements nailed down, they are words, and as such are open to a multitude of interpretations. Using a mixture of automation, tool assisted testing and human testing, in a ratio that adds value to the quality of the product being delivered, is a more thoughtful approach than the mantra that we can “automate all the testing effort.” Going forward, we need to be thoughtful about what machines can and cannot do. This may change as technology progresses, but as of the publication of this article there are big limitations in automation.

4 comments:

Thomas Ponnet, 29 October 2015 at 08:53
That was an interesting Twitter discussion and a good blog you wrote here. I'd like to pick out one example you gave:
" “attempts to deceive the referee by feigning injury or pretending to have been fouled (simulation)”

This I feel would be a common sense decision made by the referee. How could an automated system know if it is fake or not? Then again how would the ref know? It is a common sense decision, being made depending on a multitude of factors and contexts."

You used one of my pet hate words, "common sense". It can be common sense to throw people into a volcano to save yourself from the anger of the gods, a perfectly valid approach...

I had a good discussion with Duncan Nisbet the other day explaining about the different levels of tacit knowledge and we spoke about situations where it's problematical making the tacit explicit. Your example is one of those.
What is the scenario when someone is feigning injury? First the referee or the machine has to detect that there is a feigned or real injury. How does that detection work? You'd have visual information (player rolling on the grass), audio (player screaming their heart out), cues in the facial expression, other stakeholders giving right or wrong information (other players, other referees, etc.) and quite a few more besides. That is a lot of information to process in the short time span in which the referee, automated or not, has to take a decision. It's one example where all the experience and accumulated tacit knowledge can't be made explicit and therefore can't be programmed to the level that would be required.

Nice, thanks for making me think.
Christian, 28 November 2015 at 02:29
Having been a basketball referee for 15 years and being a tester, I really like your post and have actually applied some of the ideas you mentioned. Be it refereeing or testing, tools can only support, not take the whole job over. On the lighter side: for basketball there is not just a rule book, but an official book of interpretation. Automating the rules would in this context lead to wrong results as they might have to be interpreted, often in strange ways.

Thursday, 29 October 2015
