Ever heard of the Lovelace Test?
As part of my work in the Data Analytics for Resource Efficiency team and my interest in Machine Learning, I’m interested to know when machines can be considered intelligent. Over the years, the Turing Test has been regarded by many (definitely not by all) as THE definitive test for consciousness in a computer, but recently a new test, the Lovelace Test, has been proposed.
I had heard about the programming language Ada back at university, and I knew that it was named after Ada Lovelace. Other than her being a mathematician and being considered the first computer programmer, I knew relatively little about this seemingly interesting lady.
Did you know that Ada Lovelace has even had a day dedicated to her since 2009? Ada Lovelace Day celebrates the achievements of women in science, technology, engineering and maths (STEM) and aims to raise the profile of women in STEM (you can read more about STEM in general in one of my other blog posts here: https://hssmi.org/inspiring-generation-tomorrow/). It is held every year on the second Tuesday of October (this year’s event will therefore take place on Tuesday 10 October 2017).
Augusta Ada King, Countess of Lovelace, lived in the first half of the 19th century, and even though the first modern computers would not be invented until the 1940s, she wrote what can be considered the first computer algorithm for Charles Babbage’s Analytical Engine.
From a young age, Lovelace was privately schooled in mathematics and science and had a passion for machines and new inventions. In 1833, Mary Somerville, scientist and Ada’s mentor, introduced her to Charles Babbage. Babbage wanted to build a mechanical calculating engine after finding erroneous calculations in manually created astronomical tables. Lovelace was fascinated by Babbage’s concept for his Analytical Engine, which would be programmed with punched cards and built on an earlier design he called the Difference Engine. In 1842, she translated an article by Luigi Menabrea describing the Analytical Engine, adding her own observations and computer algorithms as she, according to Babbage, “understood the machine so well” (the extent of Ada’s contribution is, however, disputed). Her remarks on the potential uses of the engine must have seemed quite visionary at the time: Ada understood that a machine could perform many more operations than just calculations on numbers. Although the Analytical Engine was never built in their lifetimes, the design included all the key aspects of a modern computer.
The Imitation Game
Alan Turing, amongst other things a mathematician and computer scientist, came across Ada Lovelace’s transcripts at a young age. From his paper “Computing Machinery and Intelligence” (1950) it can be understood that Turing did not agree with Lovelace’s statement about computers not being able to think independently:
“Our most detailed information of Babbage’s Analytical Engine comes from a memoir by Lady Lovelace (1842). In it she states, “The Analytical Engine has no pretensions to originate anything. It can do whatever we know how to order it to perform” (her italics). This statement is quoted by Hartree (1949) who adds: “This does not imply that it may not be possible to construct electronic equipment which will ‘think for itself,’ or in which, in biological terms, one could set up a conditioned reflex, which would serve as a basis for ‘learning.’ Whether this is possible in principle or not is a stimulating and exciting question, suggested by some of these recent developments. But it did not seem that the machines constructed or projected at the time had this property.”
I am in thorough agreement with Hartree over this. It will be noticed that he does not assert that the machines in question had not got the property, but rather that the evidence available to Lady Lovelace did not encourage her to believe that they had it. It is quite possible that the machines in question had in a sense got this property. […] In any case there was no obligation on them to claim all that could be claimed.”
“A variant of Lady Lovelace’s objection states that a machine can “never do anything really new.” […] A better variant of the objection says that a machine can never “take us by surprise.”[…]”
Further on in his paper, Turing proposed a thought experiment he called the “imitation game”, which is now widely known as the Turing Test. Here, a human being and a machine are questioned by an interrogator without the interrogator knowing which is which. The communication is supposed to take place in writing or in a similar form. The goal of the game is for the interrogator to determine which of the two is the human being and which is the machine. For the machine, Turing assumes “that the best strategy is to try to provide answers that would naturally be given by a man.”
Turing believed that “in about fifty years’ time it will be possible, to programme computers, with a storage capacity of about 10^9, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning.”
Although the test was never intended to be conducted in practice, many competitions have taken place since Alan Turing’s paper. In 2014, the chatbot Eugene Goostman, with the persona of a 13-year-old Ukrainian, led 33% of the competition judges to believe within five minutes that it was human. Even though the organisers claimed that the Turing Test had been passed for the first time, in an event they said involved the most simultaneous comparison tests to date, critics have been quite vocal in pointing out that creating a chatbot cannot be equated with creating artificial intelligence.
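The 33% figure is usually read against Turing’s prediction quoted above: if an average interrogator has no more than a 70 per cent chance of making the right identification, the machine is taken for a human at least 30 per cent of the time. Here is a minimal sketch of that arithmetic in Python. It assumes, as a simplification, that every judge who was not fooled identified the machine correctly; the function name and the way the threshold is applied are illustrative and not taken from Turing’s paper or the competition rules.

```python
def meets_turing_prediction(fooled_percent: int, max_correct_percent: int = 70) -> bool:
    """Check a competition result against Turing's "70 per cent" figure.

    Assumption: judges who were not fooled identified the machine correctly,
    so the correct-identification rate is 100 minus the fooled rate.
    """
    correct_percent = 100 - fooled_percent
    return correct_percent <= max_correct_percent


# Eugene Goostman, 2014: 33% of the judges believed it was human.
goostman_result = meets_turing_prediction(33)
print(goostman_result)
```

Under this reading the 2014 result clears the numerical bar, so the critics’ objection is less about the number and more about what the test actually measures.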
In their paper, Bringsjord et al. write:
“Unfortunately, attempts to build computational systems able to pass [the Turing Test] have devolved into shallow symbol manipulation designed to, by hook or by crook, to trick. The human creators of such systems know all too well that they have merely tried to fool those people who interact with their system into believing that these systems really have minds. And the problem is fundamental: The structure of the [Turing Test] is such as to cultivate tricksters.”
“Turing’s strategy for building a machine capable of passing [the Turing Test] is not to program a machine from scratch, injecting knowledge (and, yes, trickery) into it. His strategy […] is instead to first build what he calls a “child-machine,” and to then teach it in much the same way that we teach our own youth.”
Bringsjord et al. have therefore suggested an alternative to the Turing Test, the Lovelace Test, which is based on creativity.
Lovelace Test, Part 1
In 2001, Bringsjord, Bello and Ferrucci proposed the Lovelace Test, which builds upon the idea of story generation agents. To pass the test, an artificial agent needs to produce a creative output in such a way that the agent’s designer cannot explain how the agent created it. The argument is that, because the output is new and surprising, the machine must have some degree of consciousness.
So, is this test a sufficient measure for machine intelligence?
The answer seems to be no.
Based on the original Lovelace Test, Riedl proposed another version of the test in 2014 – the Lovelace 2.0 Test.
Lovelace 2.0 Test
In Riedl’s test, a human evaluator specifies a set of constraints for an artificial agent. In an example given in his paper, the agent is asked to “create a story in which a boy falls in love with a girl, aliens abduct the boy, and the girl saves the world with the help of a talking cat.”
The Lovelace 2.0 Test looks for what is defined by Riedl as “computational creativity”, which is…
“[…] the art, science, philosophy, and engineering of computational systems that, by taking on particular responsibilities, exhibit behaviors that unbiased observers would deem to be creative.”
He concludes that…
“The Lovelace 2.0 Test is designed to encourage skepticism in the human evaluators. […] the evaluator is given the chance to craft a set of constraints that he or she would expect the agent to be unable to meet. Thus if the judge is acting with the intent to disprove the intelligence, the judge should experience an element of surprise if the agent passes the test. The ability to repeat the test with more or harder constraints enables the judge to test the limits of the agent’s intelligence.”
So the point of Riedl’s Lovelace 2.0 Test is to quantify and compare the creativity of different artificial agents.
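To make the procedure a little more concrete, here is a rough sketch of the repeat-with-harder-constraints idea in Python. Everything in it is a stand-in: the agent and judge interfaces, the toy agent that can only handle two constraints, and the keyword-matching judge are hypothetical and only illustrate the loop. In the actual test the judge is a human evaluator, not a substring check.

```python
from typing import Callable, List

# Hypothetical interfaces, for illustration only.
Agent = Callable[[List[str]], str]        # constraints -> artifact (e.g. a story)
Judge = Callable[[str, List[str]], bool]  # (artifact, constraints) -> satisfied?


def rounds_passed(agent: Agent, judge: Judge, rounds: List[List[str]]) -> int:
    """Run the agent on increasingly hard constraint sets; stop at the first failure."""
    passed = 0
    for constraints in rounds:
        artifact = agent(constraints)
        if not judge(artifact, constraints):
            break
        passed += 1
    return passed


def toy_agent(constraints: List[str]) -> str:
    """Stand-in agent: it can only work the first two constraints into its 'story'."""
    return "A story in which " + " and ".join(constraints[:2]) + "."


def keyword_judge(artifact: str, constraints: List[str]) -> bool:
    """Stand-in for the human evaluator: plain substring matching."""
    return all(c in artifact for c in constraints)


# Each round adds one more constraint from the example in Riedl's paper.
rounds = [
    ["a boy falls in love with a girl"],
    ["a boy falls in love with a girl", "aliens abduct the boy"],
    ["a boy falls in love with a girl", "aliens abduct the boy",
     "the girl saves the world with the help of a talking cat"],
]

score = rounds_passed(toy_agent, keyword_judge, rounds)
print(score)
```

The number of rounds an agent survives is what would make two agents comparable in this sketch; the toy agent here gives up as soon as a third constraint is added.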
If the Lovelace Test does not seem to be satisfactory, is the Lovelace 2.0 Test perhaps a better measure?
Again, probably not.
Defining evaluation standards, in particular for creativity, is not easy. On the one hand, there is no commonly agreed definition of creativity itself; on the other, artificial agents are normally tailored to a specific problem domain.
Coming back to the question “Can machines take us by surprise?”: based on the tests above, we don’t seem to be there yet. As a software programmer, I do sometimes experience an element of surprise when I don’t get the result I was expecting, but that is mostly due to my incorrect use of syntax rather than actual machine consciousness and creativity.
- Figure: https://hotforsecurity.bitdefender.com/wp-content/uploads/2015/10/ada-lovelace-1024×536.png