In fact, the gap between imagination and reality is exceptionally wide. Although it would be a foolhardy prophet who predicted that this gap will never be closed, the human race is not on the threshold of replicating itself in machinery. To imagine otherwise is to underestimate the complexity of the mental processes that humans employ as a matter of routine. The truly awesome aspect of this complexity is not the intellectual output of a William Shakespeare, a Ludwig van Beethoven or an Albert Einstein but the routine calculations performed by the mind of a five-year-old child. They are no less marvellous because they happen below the horizon of consciousness.
Take vision. The cinematic convention for the view through a robot’s eyes is either that of a distorted fish-eye lens or the cross-hairs of a gunsight. We, as viewers, understand the convention because we have brains that have evolved in order to interpret what our eyes see, but such a view would be useless to a robot. The first requirement of any visual system is that it be able to determine where an object ends and the background begins. However, the world is not like a child’s colouring book, with comforting black lines to delineate the boundaries, and the light falling on a human retina does not produce a ready-made picture. It merely generates a series of electrical impulses, which to be of any use must then be interpreted by the brain.
The best analogy for what a robot would ‘see’ is a rectilinear grid of numbers in which each number represents the brightness of a small portion of the visual field, with larger values indicating brighter cells within the grid. We may reasonably expect the robotic brain to be able to interpret any significant difference in the values of adjacent cells as indicative of a boundary, but this is where the process of interpretation becomes vastly more complicated. A large number next to a small number could be the result of a light object against a dark background, a dark object against a light background, a dark and a light object touching each other, the edge of a shadow, two different shades of the same colour on the same surface, and many other combinations, all of which the human brain can distinguish so easily that we fail to appreciate how difficult an achievement this is.
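The grid-of-numbers picture can be made concrete with a minimal sketch. The grid values, the threshold of 40 and the function name find_boundaries are all invented for illustration; nothing here is taken from a real vision system.

```python
# Hypothetical 4x6 brightness grid: larger numbers mean brighter cells.
GRID = [
    [12, 11, 13, 90, 92, 91],
    [10, 12, 11, 93, 90, 94],
    [11, 13, 12, 91, 95, 92],
    [12, 10, 14, 92, 91, 93],
]


def find_boundaries(grid, threshold=40):
    """Return (row, col, direction) wherever a cell differs sharply
    from its right-hand or lower neighbour."""
    edges = []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            # Compare with the neighbour to the right, if there is one.
            if c + 1 < len(row) and abs(value - row[c + 1]) >= threshold:
                edges.append((r, c, "right"))
            # Compare with the neighbour below, if there is one.
            if r + 1 < len(grid) and abs(value - grid[r + 1][c]) >= threshold:
                edges.append((r, c, "below"))
    return edges


for edge in find_boundaries(GRID):
    print(edge)
```

On this grid the sketch reports a vertical boundary between the third and fourth columns of every row. What it cannot report is the cause: a light object, a dark object, a shadow and a change of shade would all yield exactly the same list.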
And the difficulties are just beginning. Once the objects and background in the visual field have been delineated and distinguished, a robotic brain then needs to identify these objects, which means that it needs to identify fundamental properties such as colour and composition. At first glance, the problem appears to be a trivial one. Compare a lump of coal and a snowball, one black, the other white. If larger numbers represent brighter regions of the visual field, then large numbers must, intuitively, indicate the presence of a light material such as snow (and small numbers the presence of a dark material such as coal). Not always. More light bounces off a lump of coal outdoors than off a snowball in a typical indoor setting, because the brightness of a given object is merely a measure of the amount of light it reflects. Nevertheless, the human visual system can see a bright outdoor object as black and a dark indoor object as white. Unlike a robotic system, it is not easily fooled by trompe l’oeil confabulations, perceiving the world as it is rather than the world as it is represented on the surface of the retina.
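A little arithmetic shows why the raw numbers mislead. In the sketch below, the light reaching the eye is treated simply as the light falling on an object multiplied by the fraction that the object reflects. The four figures are rough, assumed values chosen for illustration, not measurements.

```python
# Rough, illustrative figures (assumptions, not measurements).
SUNLIGHT = 100_000        # light falling on an outdoor scene, arbitrary units
ROOM_LIGHT = 300          # light falling on an indoor scene, same units
COAL_REFLECTANCE = 0.05   # fraction of incoming light that coal reflects
SNOW_REFLECTANCE = 0.90   # fraction of incoming light that snow reflects


def reflected_light(illumination, reflectance):
    """Light leaving a surface: the incoming light times the fraction reflected."""
    return illumination * reflectance


coal_outdoors = reflected_light(SUNLIGHT, COAL_REFLECTANCE)
snow_indoors = reflected_light(ROOM_LIGHT, SNOW_REFLECTANCE)

print(coal_outdoors > snow_indoors)  # True
```

Under these assumptions the coal sends well over ten times as much light to the eye as the snowball does, so a system that reads large numbers as "white" would get both objects wrong.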
The next problem is the estimation of depth and relative size. Without a basis for comparison, even the human system can fail here. A postage stamp in your hand and a building on the horizon that is the same shape can produce images of the same size on the retina, and it is only by experience that we learn which is which. When I was working in the barren, featureless landscape of the Sahara Desert, I found it almost impossible to decide whether a distant object was a stick driven into the sand a quarter of a mile away or an oil rig ten miles away.
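The ambiguity follows from simple geometry: the eye registers only the angle that an object subtends, and that angle depends on the ratio of size to distance. The sketch below assumes a stick 1.5 metres tall and an oil rig 60 metres tall. Both heights are hypothetical, chosen so that the rig is forty times taller and forty times further away.

```python
import math

METRES_PER_MILE = 1609.344


def angular_size(height_m, distance_m):
    """Angle in radians subtended by an object of the given height
    seen face-on from the given distance."""
    return 2 * math.atan(height_m / (2 * distance_m))


# Hypothetical heights: the rig is 40 times taller and 40 times further away.
stick = angular_size(1.5, 0.25 * METRES_PER_MILE)
rig = angular_size(60.0, 10 * METRES_PER_MILE)

print(math.isclose(stick, rig))  # True
```

Since the two angles are equal, nothing in the image alone can separate the two cases. Some further clue, such as a familiar object nearby, is needed.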
Finally, having solved the problems of shape, brightness, colour and size, our artificial vision module will also need to assign identifying tags (names) to the objects it detects and be aware of their purpose. This is difficult enough for simple geometric shapes and letters of the alphabet, but it is an almost intractable problem to construct any kind of artificial template for the recognition of human faces. Yet, unless a person has sustained damage to one or more parts of their brain that deal with vision, they will still be able to recognize people whom they haven’t seen for twenty years or more, despite the subtle changes that will have taken place in the intervening period. The template would also have to take into account what a face looks like under a huge range of different lighting conditions, a compensation that a human viewer performs effortlessly.
In other words, a seeing robot simply cannot be built with just the fish-eye viewfinder of movie convention, and it should not come as a surprise to learn that the human visual system is not built this way either. And there is one more factor to consider: an early evolutionary adaptation of our sensory input systems ensures that we pay attention only to signals that are changing. This adaptation is most obvious in relation to our senses of hearing and smell (how often have you totally forgotten an annoying racket emanating from a nearby construction site, or a nauseating smell that made you feel sick when you first encountered it?), but even our vision shuts down if absolutely nothing is happening in the visual field, to give our video-processing circuitry a break. There is more to seeing than meets the eye.
Once we have mastered the problems associated with artificial vision, the next challenge is locomotion. The optimum engineering solution to the difficulties that arise in moving an object around is to set that object on wheels, especially if it is heavy. For this reason, it is tempting to think that a robot would be much better off with wheels than with legs, except that wheels are of limited use on rough or uneven terrain.
However, a mere two legs is not the intuitive choice. As one leg moves, the other has to maintain the body’s balance, which involves constant monitoring and feedback in order to make instant fine adjustments. All of this requires processing power. While four legs may be a better technical solution (creatures with four legs can move far more quickly, and far less effort is required to maintain balance), there is an unexpected advantage in bipedalism that offsets these assets: walking on two legs frees the hands.
The human hand, even more than the hand of other primates, is a machine exquisitely adapted to its purpose. And the evolution of that hand, over the last million years, has driven the evolution of the brain, to the extent that a significant amount of grey matter is now dedicated to the operation of the hand. While the human brain has not increased much in size during this period, there has been a considerable increase in the size of the cerebral cortex, especially the frontal lobes, which are associated with such so-called ‘executive functions’ as self-control, planning, reasoning and abstract thought. However, as with every part of human behaviour that we take for granted, building a robot with these functions presents challenging engineering problems.
We often hear on the news about the latest in robotic hands, and what they are capable of doing, but if you think about the range of grips that the hand has available, you will quickly realize just how versatile the human hand is, and how much further researchers in the field have to go to design a robotic version of equivalent versatility. There is the grip between thumb, index finger and middle finger used to hold a pen, the grip between thumb and index finger for turning a key in a lock, the grip between two fingers to hold a cigarette or a spliff, the grip that employs all five fingertips to pick up a flat disc like a beermat, and the way we grip a glass of beer or a hammer.
In this last case, the grip is the same, but the amount of pressure applied by the fingers differs considerably, illustrating another characteristic of the human hand: the ability to vary the amount of pressure applied depending on the object to be picked up. And then there is the dazzling dexterity required to manipulate a pair of chopsticks effectively, a skill at which, incidentally, most Chinese demonstrate only moderate competence. The point to bear in mind here is not that the human hand is a masterpiece of engineering, which it is, but that achieving this level of dexterity requires a colossal amount of processing power.
Imagine that we have finally succeeded in designing suitable visual, locomotor and manipulation systems. There is still another problem lurking on the horizon. An intelligent system cannot treat every object that it encounters as a unique entity unlike anything else it has ever seen. It has to have some means of deciding whether a new object belongs in a previously seen category or whether it should be assigned to a new category, and in making that distinction it has to have some way of distinguishing between essential and incidental properties. At the risk of repetition, this is another skill that humans are good at, but designing a comparable artificial system is a massive engineering challenge.
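One naive way to frame the categorization problem is sketched below, on the assumption that each object can be reduced to a short list of numbers. The features (wheels, seats, hue), the stored prototypes, the weights and the novelty threshold are all hypothetical. A weight of zero marks a property as incidental, so colour is ignored here.

```python
import math

# Hypothetical features for each object: (wheels, seats, hue).
PROTOTYPES = {
    "bicycle": (2, 1, 0.2),
    "car": (4, 5, 0.6),
}

# Hand-picked weights: wheels and seats are essential, hue is incidental.
WEIGHTS = (1.0, 1.0, 0.0)


def weighted_distance(a, b, weights):
    """Euclidean distance in which each feature counts according to its weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def categorize(obj, prototypes=PROTOTYPES, weights=WEIGHTS, novelty_threshold=0.5):
    """Return the name of the nearest known category, or None if the object
    is too far from every prototype and needs a category of its own."""
    best_name, best_distance = None, float("inf")
    for name, prototype in prototypes.items():
        d = weighted_distance(obj, prototype, weights)
        if d < best_distance:
            best_name, best_distance = name, d
    return best_name if best_distance <= novelty_threshold else None


print(categorize((4, 5, 0.0)))  # a car of a different colour: car
print(categorize((3, 1, 0.9)))  # three wheels, one seat: None
```

The sketch works only because the weights were chosen by hand. Deciding which properties deserve a weight of zero is precisely the part that humans do without effort and that remains the engineering challenge.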
One of the most interesting commentaries on robotics is Isaac Asimov’s I, Robot, a collection of linked short stories, in particular the book’s three laws of robotics:
• A robot may not injure a human being or, through inaction, allow a human being to come to harm.
• A robot must obey orders given it by human beings except where such orders conflict with the first law.
• A robot must protect its own existence as long as such protection does not conflict with the first or second law.
Asimov showed remarkable insight by including the third law, because self-preservation is not an automatic property of an intelligent system. However, with the first and second laws, the author fell into the trap of echoing the ancient fear, illustrated, inter alia, by the rampaging golem of Jewish legend, Faust’s bargain with the devil, the sorcerer’s apprentice, Frankenstein’s monster, and the computer in 2001: A Space Odyssey, that artificial intelligent systems would one day become so smart and so powerful that they would turn on their creators.
Unfortunately, Asimov was unable to step outside his own thought processes and recognize them as artifacts of his mind rather than universal and scientifically verifiable laws. The human capacity for evil is never far from our thoughts, and it is disarmingly simple to imagine evil to be an inescapable aspect of our existence, just as it is almost instinctive to think that a self-aware system must possess an ego, as envisaged by the title of the book (and ego, or intention, is a necessary component of evil).
On the other hand, although machines built originally by humans are unlikely to turn on their creators, we have no way of knowing whether other civilizations on other planets have developed machines that are programmed to kill. Even now, it is possible that a civilization in a not too distant star system, having picked up I Love Lucy on its radio telescopes, many years after the original broadcasts, has dispatched a fleet of murderous robots, like the Cylons in Battlestar Galactica, to exterminate the perpetrators of this outrage.