The last blog of this series covered Turing’s famous “Computing Machinery and Intelligence” paper from 1950 and introduced his main ideas like The Imitation Game, nowadays better known as the Turing Test. It also covered the main objections that Turing himself imagined being used against his theory.
As a short recap, Turing strongly believed in the computational nature of our minds, meaning that it should be entirely possible for a digital computer to mimic the human mind and, by doing so, develop a (digital) mind of its own.
In this blog, I’m going to cover, in more depth, the strongest objections, as well as the newer proposals for potentially better tests, that came in the decades after Turing’s original paper was published.
As well as some fun stuff I found along the way! Let’s jump into it!
Minds, brains, and programs
Written in 1980 by John Searle, this paper is considered one of the strongest objections to the Turing Test.
It’s more famously known as the “Chinese room” argument. But before describing how the thought experiment goes, let me first explain some terminology that is helpful to have in mind (whatever that is).
Searle distinguishes between 2 types of AI:
- Weak AI (a useful tool that helps us, humans, study the mind)
- Strong AI (a correctly programmed computer is the digital equivalent of the mind itself)
We can all agree about weak AI. Without a doubt, AI is making our lives easier and is helping us better understand the world around us, minds included (there are certainly bad sides to technology, but those are social problems more than technological ones).
Strong AI, on the other hand, implies that correctly programmed digital computers have cognitive states, and from that it follows that programs explain cognition. This is something that Turing strongly believed in.
And here is how Searle refuted this idea:
Imagine you have a person inside a room. That person doesn’t speak Chinese* and has no understanding of it whatsoever. Outside of the room is a native Chinese speaker, who writes down something in Chinese and passes the paper to the person inside the room.
The person inside the room has been given some rules (think of computer instructions) on how to react to every input (Chinese characters) that he/she gets. So the person sees “squiggle squiggle” and, blindly following the rules without any understanding, forms an output that is also in the correct “squiggle squiggle” form and passes it outside of the room.
From the point of view of the native speaker, if the rules were properly engineered, it seems like he/she is communicating with another native Chinese speaker. But we know that that’s not the case and that the result was created by blind symbol manipulation without any understanding of the Chinese language.
*- I’m aware that there is no such thing as “Chinese”; that would be equivalent to me saying ‘I speak Slavic’ instead of ‘I speak Serbian’. For my Chinese friends out there, suppose we implicitly mean Mandarin Chinese when we say Chinese.
So the person in this thought experiment took the role of the computer, and we see that, without any understanding, it is able to imitate a native Chinese speaker. We are allowed to make this substitution since the “hardware implementation” does not matter: the mind exists independently of its implementation details, as per the theory we’re trying to refute.
And that is precisely what the Turing Test is all about — imitation.
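To make the “blind rule following” a bit more concrete, here is a minimal sketch of the room as a lookup table. The rule book entries and the fallback reply are my own illustrative assumptions (they are not from Searle’s paper), and a real rule book would obviously have to be vastly larger.

```python
# Hypothetical rule book: the entries are illustrative, not from Searle's paper.
RULE_BOOK = {
    "你好吗": "我很好",       # "How are you?" -> "I am fine"
    "你会说中文吗": "会",     # "Do you speak Chinese?" -> "Yes"
}

# Hypothetical reply used when no rule matches: "Please say that again"
FALLBACK = "请再说一遍"


def room_operator(symbols: str) -> str:
    """Return the reply the rule book prescribes for the incoming symbols.

    The function only compares and copies strings; nothing in it
    represents what any of the symbols mean.
    """
    return RULE_BOOK.get(symbols, FALLBACK)


if __name__ == "__main__":
    print(room_operator("你好吗"))
```

From the outside, the replies look fluent for every input the rule book covers, yet the only operations involved are string comparison and copying.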
Now there are some BUTs here that may pop into your mind. It’s one thing to make a thought experiment where the Chinese output that the person produces is perfect, but it’s a different thing whether that’s at all possible to do following a finite list of instructions.
My belief is the following. If and when we engineer a machine whose behavior is truly indistinguishable from a human’s, it will probably have to use some “novel” technology, and it will produce some form of cognition in the process.
I don’t believe that it can be done using the symbol manipulation that Turing proposed and that we still use today, including the learning methods that I’m aware of. We may get to ‘hey, close enough’ in a 20-minute conversation, which would, from an economic standpoint, be damn useful! But that’s not it. That’s not the real AI that we’re trying to build.
Reiterating, what Searle was trying to say is that dualism* is the wrong idea to start with. You have to duplicate the causal powers of the brain in order to achieve cognition; running a program is just not enough. This idea, although it conflicts with the mathematician inside of me who’s prone to abstraction and simplification, resonates with me.
*- theory that we can split the mind and the brain in the same way that we split hardware and software. The opposite theory is that you can’t split them — mind requires this specific implementation that we call the brain.
I’ll leave you with an interesting formulation that Searle made:
No one supposes that computer simulations of a five-alarm fire will burn the neighborhood down or that a computer simulation of a rainstorm will leave us all drenched. Why on earth would anyone suppose that a computer simulation of understanding actually understood anything?
Now let’s see some of the most famous proposals on how to replace the Turing Test with something better.
Total Turing Test (TTT)
The name comes from Stevan Harnad. The earliest mention of TTT I’ve found is in his paper Minds, Machines and Searle (1989), and he covered it at least one more time in his Other Bodies, Other Minds paper (1991).
The basic idea of TTT is this: TT only tests for the linguistic capacities of humans, whereas TTT also includes our “robotic” capacities, such as moving, interacting with physical objects in our environment, seeing, and all the other sense modalities (hearing, etc.) we have.
Or as Harnad puts it:
It is hard to imagine, for example, that a TT candidate could chat with you coherently about the objects in the world till doomsday without ever having encountered any objects directly — on the basis of nothing but “hearsay,” so to speak. Some prior direct acquaintance with the world of objects through sensorimotor (TTT) interactions with them would appear to be necessary in order to ground the candidate’s words in something other than just more words.
If by the word “world” he also means a simulated world, I’m happy. Now, this may sound obvious, and you may ask yourself why Turing didn’t include all of this. Why did he abstract these “details” away?
Surely 1 reason was this:
Bodily appearance clearly matters far less now than it might have in Turing’s day
Back in the day, people weren’t accustomed to seeing robots everywhere. Nowadays, in the post-Star Wars era, we’ve gotten really used to them. We see them in cinemas, on television, in our favorite movies and series, and we are quite comfortable with the idea that robots can have intelligence.
Turing thought that if the judge were to see the robot, it would need to look exactly like a human (to have the same skin as we do, etc.), which was an unfair and redundant constraint according to Turing, and rightfully so. But we don’t hold the same biases anymore.
There is 1 more assumption that Turing implicitly made and Harnad made that explicit in his paper:
It is unlikely that our chess-playing capacity constitutes an autonomous functional module, independent of our capacity to see, move, manipulate, reason, and perhaps even to speak. The TT itself is based on the pre-emptive assumption that our linguistic communication capacity is functionally isolable.
The hypothesis that Harnad is making is that it may not even be possible to pass the TT without passing TTT, because in order to develop linguistic capability we’d have to develop the robotic capability in the first place:
It is highly unlikely that our linguistic capacities are independent of our robotic capacities: Successfully passing the teletype version of the Turing Test alone may be enough to convince us that the candidate has a mind (just as written correspondence with a never-seen pen-pal would), but full robotic capacities (even if only latent ones, not directly exhibited or tested in the TT) may still be necessary to generate that successful linguistic performance in the first place.
A further benefit of introducing TTT is that we’re more likely to converge to a “right” solution in the sense that the machine will more likely have a mind if it passes the TTT.
But an interesting question is: what’s the evolutionary advantage of having a mind in the first place? How did evolution produce the mind? Natural selection “knows” as much as the judge in the Turing Test. Better-performing individuals have a greater chance of survival, and that’s it.
Then why aren’t we all mindless robots with amazing capabilities?
Harnad and Turing both acknowledge that there is no way to know whether others (humans or robots) have a mind (an age-old problem, known in philosophy as the “other minds” problem), and so Harnad accepts TTT as the best thing we can do at the moment.
You know that one:
“If it looks like a duck, walks like a duck, quacks like a duck…it’s a duck”
But there is always this one guy that still doesn’t buy it. And it’s Searle again.
Searle’s paper, discussed in the previous section, had already addressed this idea under the name “The Robot Reply (Yale)”. His rebuttal goes like this:
By adding cameras and other sensors and actuators (fancy name for anything that moves objects in the physical world like a robotic hand) to a robot you don’t really add anything new in the sense of understanding.
The processing part of the robot still only sees a stream of meaningless formal symbols (this time coming from the camera or some other input sensor and not from a native Chinese speaker) which it just mindlessly manipulates and then sends back to actuators in the form of yet another meaningless stream of symbols.
Harnad’s counter-argument is that this time Searle cannot do the job of a transducer (which converts, in the example of the camera, photons into “symbols”), and thus the “systems reply” holds: although the person in the Chinese room experiment has no understanding, the system as a whole does.
I think I’m with Searle on this one but I still do think that from an engineering standpoint we want to go for TTT and develop amazing robots (even if we don’t unlock a true cognition by doing so).
Now if anyone is pushing robots beyond our imagination that’s Boston Dynamics — I guess most of you heard of them — take a look at this video:
A while back I listened to Marc Raibert, Boston Dynamics’ CEO, and he said that, at that point in time, they had 0 machine learning in their systems.
Now I don’t know whether that’s true, but if it is, it’s utterly amazing how much we can accomplish using only clever engineering and a bunch of domain expertise.
Fun fact: There is also a TTTT (Total Total Turing Test), where you also have to perfectly imitate the neuromolecular structure of humans.
And now for the final test!
Lovelace Test (LT)
Proposed in 2001 by Bringsjord and his colleagues, LT builds on Lady Lovelace’s famous argument, which we covered in the last blog and which roughly states the following:
Computers can’t create, they can only do stuff that we program them to do i.e. only the things that we know how to do.
So the informal version of LT is pretty simple:
A machine will pass the Lovelace test if it manages to surprise us.
Now, “surprise” is a really vague term, so they formalize it: if the architect of the AI agent can’t explain, by any possible means and given enough time, how the agent produced some result, we say that the machine has passed the LT.
They went on to dissect the state-of-the-art creative systems that were available back then, like BRUTUS (developed by the authors of LT themselves!), which generated “creative” stories, and COPYCAT, and they showed that these fall to Searle’s Chinese Room argument (that guy again!).
They further took into account that Turing mentioned learning as a potential way to overcome Lovelace’s argument. They ended up equating learning with neural nets, the only learning method they tackled, and argued that these, once fully trained, can be considered a knowledge base and thus still don’t originate anything.
That’s an interesting way to look at neural networks, but what they missed, in my opinion, is that the “creativity” happened during training. We really can’t explain how the knowledge base was built in the first place, especially if we use some kind of stochasticity (randomness) in the optimization procedure*, which is a usual thing to do.
*- neural networks are usually “trained”, i.e. the weights of their “neurons” are slightly adjusted, step by step, in what we call the optimization procedure, so that they get better at the task of interest.
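Here is a minimal sketch of what I mean by stochasticity in the optimization procedure. It is a toy linear model (not a neural network) trained with stochastic gradient descent, and the dataset, learning rate and number of steps are all arbitrary assumptions of mine. The only thing that changes between runs is the random seed, which controls the initial weights and the order in which samples are visited.

```python
import random


def make_data():
    """Toy dataset: y = 3x + 1 plus small noise (identical for every run)."""
    noise_rng = random.Random(0)
    xs = [i / 10.0 for i in range(-10, 11)]
    return [(x, 3.0 * x + 1.0 + noise_rng.gauss(0.0, 0.1)) for x in xs]


def train(seed, steps=500, lr=0.05):
    """Fit y = w*x + b with stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    data = make_data()
    # Random initialization of the two parameters.
    w, b = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
    for _ in range(steps):
        x, y = rng.choice(data)   # random sample order
        err = (w * x + b) - y
        w -= lr * err * x         # gradient of 0.5 * err**2 w.r.t. w
        b -= lr * err             # gradient of 0.5 * err**2 w.r.t. b
    return w, b


if __name__ == "__main__":
    for seed in (0, 1, 2):
        print(seed, train(seed))
```

Every seed lands close to the underlying line, but the exact parameters differ from run to run. In this two-parameter toy the path is still easy to trace; the point is only that the final “knowledge base” depends on random choices made along the way.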
As a final desperate attempt to figure out how we can “engineer” these LT-passing machines they considered something called the oracle machines.
These are fictional machines (consider them a helpful imaginary construct) that can solve problems that are proven to be unsolvable by Turing machines* (like the halting problem). We say that they can go past the so-called Turing Limit.
*- without going deeper into the theory of what Turing machines are, consider your computer as one specific implementation of these Turing machines.
The most popular of these hypothetical machines are analog chaotic neural nets, trial-and-error machines, and Zeus machines.
In an attempt to make it less abstract, let me briefly explain the Zeus machine. It enumerates all of the natural numbers in, say, 1 second, and that’s what makes Zeus go beyond a Turing machine. It outputs 0 in 1/2 second, 1 in 1/4 second, 2 in 1/8 second, etc., and if you’re familiar with the series 1/2 + 1/4 + 1/8 + … you know that it adds up to 1. So 1 second elapses and you end up with all of the natural numbers “printed to your screen” (better crank up those fps). Khm. Yeah.
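A quick sanity check of that timing, as a small sketch with exact fractions (the function name is mine): after the first n numbers have been printed, the elapsed time is 1 - 1/2^n seconds, which gets arbitrarily close to 1 but never exceeds it.

```python
from fractions import Fraction


def zeus_elapsed(n_outputs: int) -> Fraction:
    """Total time in seconds after the first n_outputs numbers are printed.

    Number k (counting from 0) takes 1 / 2**(k + 1) seconds to print.
    """
    return sum(
        (Fraction(1, 2 ** (k + 1)) for k in range(n_outputs)),
        Fraction(0),
    )


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 50):
        print(n, zeus_elapsed(n), float(zeus_elapsed(n)))
```

Of course, a real computer can only ever evaluate a finite prefix like this, which is exactly why the Zeus machine is a thought experiment.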
So using this one you can solve otherwise computationally unsolvable problems. But even Zeus will fail in the Chinese Room experiment. Imagine Searle mindlessly manipulating Chinese symbols only this time he does it unbelievably fast, that’s it. But where does understanding emerge?
They wrap up the paper saying, well, this didn’t go according to plan. 😃 They suggest that a possible way forward may be hidden in agent causation, which would imply that certain mental activities (like making a decision, i.e. free will) happen without an ordinary physical causal chain behind them.
But I think we went too far into metaphysics at this point. Maybe they’ve “proved” that pure computation cannot explain how the mind works and we’ll have to seek answers in quantum computing or in the microtubules that Roger Penrose believes may hide some hints.
But let's step back and revisit the original question.
Have machines managed to surprise us?
Well, in 1997 Deep Blue beat the reigning world chess champion, Garry Kasparov, in a match for the first time in history. Chess engines have gotten much better ever since. They are so good now that Magnus Carlsen, the current world chess champion, compares to them the same way you and I compare to Magnus (unless you’re a grandmaster).
Take a look at this amazing video with Garry Kasparov where you’ll hear him explain how the whole story with Deep Blue went and much more:
But I hear you say “but Deep Blue used the brute force approach”, so the “architect” would be able to tell you exactly what happened, and thus Deep Blue wouldn’t pass the LT by definition. And I’d agree.
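To illustrate why a brute-force search is explainable, here is a tiny minimax sketch over a made-up game tree. This is only the textbook idea and an assumption on my part about what “brute force” boils down to; Deep Blue’s real search was far more elaborate (pruning, a handcrafted evaluation function, custom hardware).

```python
def minimax(node, maximizing=True):
    """Return (value, path) for a game tree given as nested lists.

    Leaves are numbers (scores for the maximizing player); inner nodes
    are lists of children. `path` holds the child index chosen at each level.
    """
    if not isinstance(node, list):
        return node, []
    results = [minimax(child, not maximizing) for child in node]
    values = [value for value, _ in results]
    best = max(values) if maximizing else min(values)
    index = values.index(best)
    return best, [index] + results[index][1]


# Hypothetical two-move game: we pick a branch, then the opponent picks a leaf.
TOY_TREE = [[3, 5], [2, 9]]

if __name__ == "__main__":
    print(minimax(TOY_TREE))
```

Every decision can be traced back to a comparison of numbers: the first branch is chosen because the opponent can hold us to 3 there, versus 2 in the second branch. That full traceability is what keeps this kind of system from passing the LT.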
But what about AlphaGo or better yet AlphaZero and AlphaStar?
I don’t know. I’d like to know whether it’s possible for David Silver and his crew to explain exactly why their algorithm played “Move 37”, for example. That’s the famous move by AlphaGo that made all of the best Go players in the world go what??? And it won the game, of course.
I listened to David’s talk here, and I got the impression that he hasn’t got the faintest clue (not to say anything bad about David, he’s one of my heroes).
I know, that’s still not good enough for LT. Could the DeepMind team behind this marvelous feat of engineering explain exactly how it came to be? I’ve got some friends at DeepMind, so I guess I’ll have to ask, or hopefully somebody more knowledgeable than I am could comment on this blog. I’d really like to know.
I want to finish this blog series by recommending a movie. I know. Plot twist. You really expected to get some definite answers here, right?
I really can’t recommend it highly enough (please go and watch it!)
And as a fun fact, the algorithm that played Go against Lee Sedol lost 100–0 against the next iteration of the very same algorithm, and this went on in cycles.
They stopped developing it further because it would have turned into a massive black hole that would gulp down the whole Milky Way galaxy.
Just kidding, it just wasn’t economical anymore. But what’s interesting is that they didn’t see any diminishing returns; the algorithm just kept on improving.
Then came AlphaZero, which was even better and trained solely using self-play: no human intervention, no domain knowledge injection from human experts. And the very same algorithm generalizes to chess, Go and shogi. MuZero is their latest flagship (afaik), and it’s even better than AlphaZero and can generalize to Atari games.
Nothing hints of AGI (artificial general intelligence) more than the stuff that DeepMind develops. These algorithms are, at the moment, deployed only in these constrained environments, but still, I’m damn impressed.
Final thoughts v2.0
Phew, we went all the way from 1950 to 2020, I hope you liked the journey!
I just want to wrap this journey up by mentioning that there are some concrete tests that we used, and still use, to test machine intelligence.
- The Loebner prize: the one most similar to the original Imitation Game, but the competitors ended up using pure trickery to win, so the community lost interest and the competition died off.
- Alexa prize: organized by Amazon, obviously, this is another test where the goal is to have a natural voice conversation with a bot for X minutes. But it’s only for students. It’s like a really weak form of TTT.
- Hutter prize: totally non-intuitive at first glance, the goal here is to do a lossless compression of some knowledge base, such as Wikipedia content. Marcus Hutter says this correlates with intelligence.
- Winograd schema challenge: here you’re trying to solve problems that require common-sense knowledge.
Go ahead and explore these at your own pace.
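If the compression idea behind the Hutter prize sounds strange, here is a small sketch of the intuition: the more structure a compressor can find in the data, the smaller the lossless output. I’m using general-purpose zlib and made-up sample strings purely as stand-ins; the actual prize uses its own dataset and far more specialized compressors.

```python
import random
import string
import zlib


def compression_ratio(text: str) -> float:
    """Compressed size divided by original size, using zlib (DEFLATE)."""
    raw = text.encode("utf-8")
    return len(zlib.compress(raw, 9)) / len(raw)


def roundtrip(text: str) -> str:
    """Compress and decompress to show that no information is lost."""
    packed = zlib.compress(text.encode("utf-8"), 9)
    return zlib.decompress(packed).decode("utf-8")


# Hypothetical samples: highly regular text vs. random letters of equal length.
REGULAR = "the cat sat on the mat. " * 100
_rng = random.Random(0)
RANDOM = "".join(_rng.choice(string.ascii_letters) for _ in range(len(REGULAR)))

if __name__ == "__main__":
    print("regular:", compression_ratio(REGULAR))
    print("random: ", compression_ratio(RANDOM))
```

The regular text compresses far better than the random one because the compressor can exploit its regularities. The argument behind the prize is that squeezing out the last few percent requires modeling the content itself, not just surface repetition.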
Also, take a look at this video by Lex Fridman, who probably has the best tech/AI podcast in the world at the moment. I’ve been following him since 2018. It inspired me to finally take Turing’s paper “Computing Machinery and Intelligence” off the backlog and read it!
In one of the next blogs, I’ll cover Francois Chollet’s “On the Measure of Intelligence”, which contains one of the most beautiful tests that I’ve recently seen.
Take care, stay safe, and never stop learning and improving! ❤️
If there is something you would like me to write about — write it down in the comment section or send me a DM, I’d be glad to write more about maths, ML, deep learning, software, landing a job in a big tech company, preparing for ML summer camps, electronics (I actually officially studied this beast), etc., anything that could help you.
Much love ❤️