For the opera named after the test, see under the composer, Julian Wagstaff.
The Turing test is a proposal for a test of a machine's capability to demonstrate
intelligence. Described by Professor Alan Turing in the 1950 paper "Computing Machinery and Intelligence," it proceeds as follows: a human judge
engages in a natural language conversation with one human and one machine, each of which tries to appear human; if the judge cannot
reliably tell which is which, then the machine is said to pass the test. In order to keep the test setting simple and universal
(to explicitly test the linguistic capability of the machine instead of its ability to render words into audio), the conversation
is usually limited to a text-only channel such as a teletype machine, as Turing suggested, or, more recently, IRC or instant messaging.
History
The test was inspired by a party game known as the "Imitation Game", in which a man and a
woman go into separate rooms, and guests try to tell them apart by writing a series of questions and reading the typewritten
answers sent back. In this game, both the man and the woman aim to convince the guests that they are the other. Turing proposed a
test employing the imitation game as follows: "We now ask the question, 'What will happen when a machine takes the part of A in
this game?' Will the interrogator decide wrongly as often when the game is played like this as he does when the game is played
between a man and a woman? These questions replace our original, 'Can machines think?'" (Turing 1950) Later in the paper he
suggested an "equivalent" alternative formulation involving a judge conversing only with a computer and a man.
Turing originally proposed the test in order to replace the emotionally charged and (for him) meaningless question "Can
machines think?" with a more well-defined one. The advantage of the new question, he said, was that it "drew a fairly sharp line
between the physical and intellectual capacities of a man."
Objections and replies
Turing himself suggested several objections which could be made to the test. Below are some of the objections and replies from
the article in which Turing first proposed the test.
- 'Heads in the Sand' Objection: "The consequences of machines thinking would be too dreadful. Let us hope and believe
that they cannot do so." This objection is a fallacious appeal to consequences,
confusing what should not be with what can or cannot be.
- Mathematical Objections: This objection uses mathematical theorems, such as Gödel's incompleteness theorem, to show that
there are limits to what questions a computer system based on logic can answer. Turing replies that humans are too often
wrong themselves to be justified in being pleased at the fallibility of a machine.
- Mechanical Objections: A sufficiently fast machine with sufficiently large memory could be programmed with a large
enough number of human questions and human responses to deliver a human answer to almost every question, and a vague random
answer to the few questions not in its memory. This would simulate human response in a purely mechanical way. Psychologists have
observed that most humans have a limited number of verbal responses.
- Data Processing Objection: Machines process data bit by bit. Humans process
data holistically. In this view, even if a machine appears human in every way, to treat it as
human is to indulge in anthropomorphic thinking. (Recent advances in parallel computing and fuzzy logic based systems raise
interesting questions regarding this specific objection [citation needed].)
- Argument From Consciousness: This argument, suggested by Professor Geoffrey
Jefferson in his 1949 Lister Oration entitled "The Mind of Mechanical Man", states that "not until a machine can write a sonnet or compose a concerto
because of thoughts and emotions felt, and not by the chance fall of symbols, could we agree that machine equals brain." Turing
replies by saying that we have no way of knowing that any individual other than ourselves experiences emotions, and that
therefore we should accept the test. Also, although people are capable of feeling emotion, few can actually write a sonnet or
compose a concerto.
- Theological Objection: This states that thinking is a function of man's
immortal soul and therefore a machine could
not think. Turing replies by saying that he sees no reason why it would not be possible for God to grant a computer a soul if He
so wished.
- Lady Lovelace Objection: One of the most famous objections states that computers are incapable of originality. This is
largely because, according to Ada Lovelace, machines are incapable of independent learning.
Turing contradicts this by arguing that Lady Lovelace's assumption was affected by the context from which she wrote, and if
exposed to more contemporary scientific knowledge, it would become evident that the brain's storage is quite similar to that of a
computer. Turing further replies that computers could still surprise humans, in particular where the consequences of different
facts are not immediately recognizable.
- Informality of Behaviour: This argument states that any system governed by laws will be predictable and therefore not
truly intelligent. Turing replies by stating that this is confusing laws of behaviour with general rules of conduct, and that,
on a broad enough scale (such as is evident in man), machine behaviour would become increasingly difficult to predict. (Later
research on recursive algorithms has found that, in any case, deterministic systems are capable of a chaotic
diversity of behaviour.)[citation needed]
- Extra-sensory perception: Turing seems to suggest that there is
evidence for extra-sensory perception. However, he feels that conditions could be created in which this would not affect the test
and so may be disregarded.
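The machine imagined in the Mechanical Objections item above can be sketched in a few lines. This is only an illustration under stated assumptions: the stored question/response pairs, the vague replies, and the names `CANNED_RESPONSES`, `VAGUE_REPLIES`, `normalize` and `respond` are all hypothetical, and a machine of the kind the objection describes would need a vastly larger table.

```python
import random

# Hypothetical stored question/response pairs (illustrative only).
CANNED_RESPONSES = {
    "how are you?": "Fine, thanks. And you?",
    "what is your name?": "I'd rather not say just yet.",
}

# Vague replies used for the few questions not held in memory.
VAGUE_REPLIES = [
    "Hmm, I'm not sure.",
    "That's an interesting question.",
    "Why do you ask?",
]


def normalize(question: str) -> str:
    """Lower-case and collapse whitespace so trivially different spellings match."""
    return " ".join(question.lower().split())


def respond(question: str, rng: random.Random) -> str:
    """Return the stored answer if one exists, otherwise a vague random reply."""
    key = normalize(question)
    if key in CANNED_RESPONSES:
        return CANNED_RESPONSES[key]
    return rng.choice(VAGUE_REPLIES)
```

Calling `respond("  How ARE you? ", random.Random(0))` returns the stored answer, while any unstored question yields one of the vague replies. Nothing in the program models the meaning of the conversation, which is the point of the objection.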
Discussion of relevance
There has been some controversy over which of the alternative formulations of the test Turing intended (Moor 2003).
It has been argued that the Turing test is so defined that it cannot serve as a valid definition of machine intelligence or "machine thinking" for at least three reasons:
- A machine passing the Turing test may be able to simulate human conversational behaviour, but this may be much weaker
than true intelligence. The machine might just follow some cleverly devised rules. A common rebuttal in the AI community has been
to ask, "How do we know humans don't just follow some cleverly devised rules?" Two famous examples of this line of argument
against the Turing test are John Searle's Chinese room
argument and Ned Block's Blockhead argument.
- A machine may very well be intelligent without being able to chat like a human.
- Many humans that we'd probably want to consider intelligent might fail this test (e.g., the young or the illiterate).
On the other hand, the intelligence of fellow humans is almost always tested exclusively based on their speech.
Another potential problem, related to the first objection above, is that even if the Turing test is a good operational
definition of intelligence, it may not indicate that the machine has consciousness, or
that it has intentionality. Perhaps intelligence and consciousness, for example, are such
that neither one necessarily implies the other. In that case, the Turing test might fail to capture one of the key differences
between intelligent machines and intelligent people.
In the words of science popularizer Larry Gonick, "I personally disagree with this
criterion, on the grounds that a simulation is not the real thing." (Gonick assumes that he can tell them apart.)
These criticisms are directed to the Turing Test so defined, but other interpretations of Turing's "new question" have been
discussed. Sterrett argues that two distinct tests can be extracted from Turing's 1950 paper, and that, pace Turing's
remark, they are not equivalent. The test that employs the party game and compares frequencies of success in the game is referred
to as the "Original Imitation Game Test" whereas the test consisting of a human judge conversing with a human and a machine is
referred to as the "Standard Turing Test". Sterrett agrees that the Standard Turing Test (STT) has the problems its critics cite,
but argues that, in contrast, the Original Imitation Game Test (OIG Test) so defined is immune to
many of them, due to a crucial difference: the OIG Test, unlike the STT, does not make similarity to a human performance the
criterion of the test, even though it employs a human performance in setting a criterion for machine intelligence. A man can fail
the OIG Test, but it is argued that this is a virtue of a test of intelligence if failure indicates a lack of resourcefulness. It
is argued that the OIG Test requires the resourcefulness associated with intelligence and not merely "simulation of human
conversational behaviour". The general structure of the OIG Test could even be used with nonverbal versions of imitation games
(Sterrett 2000).
Still other writers (Genova (1994), Hayes and Ford (1995), Heil (1998), Dreyfus (1979)) have interpreted Turing to be
proposing that the imitation game itself is the test, without specifying how to take into account Turing's statement that the
test he proposed using the party version of the imitation game is based upon a criterion of comparative frequency of success in
that imitation game, rather than a capacity to succeed at one round of the game.
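The criterion of comparative frequency can be made concrete with a small sketch. It assumes that each round is recorded as a single boolean (whether the interrogator decided wrongly) and that the two error rates are compared directly, with no statistical significance test; Turing's paper specifies neither, and the function names are hypothetical.

```python
from typing import Sequence


def wrong_rate(decisions: Sequence[bool]) -> float:
    """Fraction of rounds in which the interrogator decided wrongly.

    Each entry is True if the interrogator misidentified the players in that round.
    """
    if not decisions:
        raise ValueError("at least one round is required")
    return sum(decisions) / len(decisions)


def machine_does_as_well(man_rounds: Sequence[bool],
                         machine_rounds: Sequence[bool]) -> bool:
    """Compare how often the interrogator errs in the two versions of the game.

    man_rounds: rounds of the party game in which a man plays A (the baseline).
    machine_rounds: rounds in which a machine takes the part of A.
    """
    return wrong_rate(machine_rounds) >= wrong_rate(man_rounds)
```

The verdict depends on rates accumulated over many rounds in both conditions, not on the outcome of any single round.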
Predictions and tests
Turing predicted that machines would eventually be able to pass the test. In fact, he estimated that by the year 2000,
machines with 10^9 bits (about 119.2 MiB) of memory
would be able to fool 30% of human judges during a 5-minute test. He also predicted that people would then no longer consider the
phrase "thinking machine" contradictory. He further predicted that machine learning
would be an important part of building powerful machines, a claim which is considered to be plausible by contemporary researchers
in artificial intelligence.
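The memory figure can be checked with a short calculation, assuming 8 bits per byte and 2^20 bytes per MiB:

```python
bits = 10**9            # storage capacity in Turing's estimate
n_bytes = bits / 8      # 8 bits per byte
mib = n_bytes / 2**20   # 1 MiB = 2**20 bytes

print(f"{n_bytes:,.0f} bytes = {mib:.1f} MiB")  # 125,000,000 bytes = 119.2 MiB
```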
By extrapolating an exponential growth of technology over several decades,
futurist Ray Kurzweil predicted that
Turing-test-capable computers would be manufactured around the year 2020, roughly speaking. See the Moore's Law article and the references therein for discussions of the plausibility of this argument.
As of 2007, no computer has passed the Turing test as such. Simple conversational programs such
as ELIZA have fooled people into believing they are talking to another human being, such as in an
informal experiment termed AOLiza. However, such "successes" are not the same as a Turing Test.
Most obviously, the human party in the conversation has no reason to suspect they are talking to anything other than a human,
whereas in a real Turing test the questioner is actively trying to determine the nature of the entity they are chatting with.
Documented cases are usually in environments such as Internet Relay Chat where
conversation is sometimes stilted and meaningless, and in which no understanding of a conversation is necessary. Additionally,
many Internet Relay Chat participants use English as a second or third language, making it even more likely that they will
assume that an unintelligent comment by the conversational program is simply something they have misunderstood, and fail to
recognize the very non-human errors it makes. See ELIZA effect.
The Loebner prize is an annual competition to determine the best Turing test
competitors. Its organizers award an annual prize for the computer system that, in the judges' opinions, demonstrates the "most
human" conversational behaviour (with the learning AI Jabberwacky winning in 2005 and 2006, and A.L.I.C.E. before that), and they offer an additional prize for a system that
in their opinion passes a Turing test. This second prize has not yet been awarded. The creators of Jabberwacky have proposed a
personal Turing Test: the ability to pass the imitation test while attempting to specifically imitate the human player, with whom
the AI will have conversed at length before the test [1].
Trying to pass the Turing test in its full generality is not, as of 2005, an active focus of much mainstream academic or
commercial effort. Current research in AI-related fields is aimed at more modest and specific goals.
There is an ongoing $10,000 bet at the Long Bets Project between Mitch Kapor and Ray Kurzweil on whether a computer
will pass a Turing Test by the year 2029. The bet specifies the Turing Test in some detail.
Terminology
In Turing's paper, the term "Imitation Game" is used for his proposed test as well as the party game for men and women. The
name "Turing test" may have been invented, and was certainly publicized, by Arthur C. Clarke in the science-fiction novel
2001: A Space Odyssey (1968), where it is applied to the computer HAL 9000.
Variations of the Turing test
A modification of the Turing test, where the objective or one or more of the roles have been reversed between computers and
humans, is termed a reverse Turing test.
Another variation of the Turing test is described as the Subject matter
expert Turing test, in which a computer's response cannot be distinguished from that of an expert in a given field.
As brain and body scanning techniques improve it may also be possible to replicate the essential data elements of a person to a computer system.[citation needed] The Immortality test variation of
the Turing test would determine if a person's essential character is reproduced with enough fidelity to make it impossible to
distinguish a reproduction of a person from the original person.
The Minimum Intelligent Signal Test, proposed by Chris McKinstry, is another variation of Turing's test in which only binary responses are permitted.
It is typically used to gather statistical data against which the performance of artificial intelligence programs may be measured.
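Because every response is binary, results from such a test reduce naturally to simple statistics. The sketch below is an illustration only, not McKinstry's published procedure: it assumes each item has a reference yes/no answer (for example, one agreed on by human respondents) and scores a program by its rate of agreement. The function name `agreement_rate` is hypothetical.

```python
from typing import Sequence


def agreement_rate(reference: Sequence[bool], responses: Sequence[bool]) -> float:
    """Fraction of yes/no items on which the responses match the reference answers."""
    if len(reference) != len(responses) or not reference:
        raise ValueError("need two non-empty answer lists of equal length")
    matches = sum(r == a for r, a in zip(reference, responses))
    return matches / len(reference)
```

Under this scoring, a program answering at random would be expected to agree on about half of the items, which gives a baseline against which other programs can be compared.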
Another variation of the reverse Turing test is implied in the work of psychoanalyst Wilfred Bion (1979), who was particularly
fascinated by the "storm" that resulted from the encounter of one mind with another. Carrying this idea forward, R. D. Hinshelwood
(2001) described the mind as a "mind recognizing apparatus," noting that this might be some sort of "supplement" to the Turing
test. To make this more explicit, the challenge would be for the computer to be able to determine if it were interacting with a
human or another computer. This is an extension of the original question Turing was attempting to answer, but would, perhaps, be
a high enough standard to define a machine that could "think" in a way we typically define as characteristically human.
Another variation is the Meta Turing test, in which the subject being tested (for example a computer) is classified as
intelligent if it itself has created something that the subject itself wants to test for intelligence.
Practical applications
CAPTCHA is a form of reverse Turing test. When, for example, logging on to a website, the user is presented with a word or number in a distorted
graphic image and asked to enter it. If the value entered does not match what is expected, then the user is rejected. This is
intended to prevent automated systems from using the site. The assumption is that software sufficiently sophisticated to read the
distorted image accurately either does not exist or is not available to the average user, so any system that is able to do so
must be a human being.
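The challenge-and-check flow described above might be sketched as follows. This is a minimal illustration, not any particular CAPTCHA product: the names `new_challenge` and `verify` are hypothetical, rendering the text as a distorted image is omitted, and ignoring case and surrounding whitespace is an assumption.

```python
import hmac
import secrets
import string

# Characters the challenge text is drawn from (an arbitrary choice).
ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(length: int = 6) -> str:
    """Generate the random text that would be rendered as a distorted image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def verify(expected: str, entered: str) -> bool:
    """Accept the user only if the entered value matches the expected one.

    Case and surrounding whitespace are ignored (an assumption), and the
    comparison runs in constant time.
    """
    return hmac.compare_digest(
        expected.upper().encode(), entered.strip().upper().encode()
    )
```

A site would keep the expected value on the server, show the user only the distorted image, and reject the request when `verify` returns `False`.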
References in popular fiction
In addition to HAL 9000 in 2001: A Space Odyssey, mentioned above, Merlin's Ghostwheel project
in Roger Zelazny's Amber is said to be capable of passing the Turing Test.
In episode 2x13 of the Sci-Fi Channel series EUReKA the computer at Global Dynamics uses Fargo's voice to fool Taggart, only to be found out
when Taggart asks about the looks of Deputy Lupo's new relationship.
The Turing Test is referenced in XKCD #329.
In the Philip K. Dick novel Do Androids Dream of Electric Sheep? and in the Ridley Scott movie Blade Runner, replicants are subjected to a
Voigt-Kampff test, intended to discover whether a person is a real human or a robot.
References
- Alan Turing, "Computing Machinery and Intelligence". Mind, vol. LIX, no. 236, October 1950, pp. 433-460. Online version: [2] (with copyright permission)
- A.P. Saygin, I. Cicekli, and V. Akman (2000), 'Turing Test: 50 Years Later', Minds and Machines 10(4): 463-518. (reprinted in
The Turing Test: The Elusive Standard of Artificial Intelligence edited by James H. Moor, Kluwer Academic 2003) ISBN
1-4020-1205-5. (Thorough review. Online version at [3] )
- B. Jack Copeland, ed., The Essential Turing: The ideas that gave birth to the computer age (2004). ISBN
0-19-825080-0
- H. L. Dreyfus. What Computers Can't Do, Revised Edition, New York: Harper Colophon Books. (1979) ISBN 0-06-090613-8
- J. Genova. Turing's Sexual Guessing Game, Social Epistemology, 8(4): 313-326. (1994) ISSN
0269-1728
- Larry Gonick, The Cartoon Guide to the Computer (1983, originally The Cartoon Guide to Computer Science). ISBN 0-06-273097-5.
- Stevan Harnad (2004) The Annotation Game: On Turing (1950) on Computing, Machinery, and Intelligence, in Epstein, Robert and
Peters, Grace, Eds. The Turing Test Sourcebook: Philosophical and Methodological Issues in the Quest for the Thinking
Computer. Kluwer.
- Patrick Hayes and Kenneth Ford. 'Turing Test Considered Harmful', Proceedings of the Fourteenth International Joint
Conference on Artificial Intelligence (IJCAI95-1), Montreal, Quebec, Canada. pp. 972-997. (1995)
- John Heil. Philosophy of Mind: A Contemporary Introduction, London and New York: Routledge. (1998) ISBN 0-415-13060-3
- Ray Kurzweil, The Age of Intelligent Machines (1990). ISBN 0-262-61079-5.
- James Moor, ed., "The Turing Test: The Elusive Standard of Artificial Intelligence" (2003). ISBN 1-4020-1205-5
- Roger Penrose, The Emperor's New Mind (1990). ISBN 0-14-014534-6.
- S. G. Sterrett, "Turing's Two Tests for Intelligence" Minds and Machines v.10 n.4 (2000) ISSN 0924-6495 (reprinted in The
Turing Test: The Elusive Standard of Artificial Intelligence edited by James H. Moor, Kluwer Academic 2003) ISBN
1-4020-1205-5
- S. G. Sterrett "Nested Algorithms and the 'Original Imitation Game Test'," Minds and Machines (2002). ISSN 0924-6495
- W. R. Bion (1979) "Making the best of a bad job." In W. R. Bion (1987) Clinical Seminars and Four Papers. Abingdon: Fleetwood
Press.
- R.D. Hinshelwood (2001) "Group Mentality and Having a Mind: Reflections on Bion's work on groups and on psychosis." In
PsycheMatters at www.psychematters.com/papers/hinshelwood2.htm
- Saygin, A.P. & Cicekli I (2002) Pragmatics in human-computer conversations, Journal of Pragmatics, Volume 34, Issue 3,
March 2002, Pages 227-258. Abstract and links to pdf (if permitted): [4]
This entry is from Wikipedia, the leading user-contributed encyclopedia. It may not have been reviewed by professional editors (see full disclaimer)