A recently developed predictor-corrector robot successfully catches balls about 80% of the time by solving differential equations that get updated with corrective trajectory data. About 3 trillion calculations are carried out over a one-second trajectory.1 The results are isomorphic to humans catching the same balls; the difference lies in computational efficiency. Over the same period, the visual data from our eyes only has time to propagate down neuronal chains no longer than about 200 neurons, given the slowness of biological signal conduction. Assuming (very coarsely) that 3 trillion calculations divided by a chain depth of 200 implicates the parallel firing of 15 billion neurons, we'd use up nearly 75% of our neocortical brain cells just to catch balls.
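The back-of-envelope arithmetic can be checked in a few lines; the figures (3 trillion operations, a 200-neuron sequential depth, and a rough 20 billion neocortical neuron count) are the estimate's own coarse assumptions:

```python
# Rough estimate: how many neurons must fire in parallel if the brain
# matched the robot's 3 trillion calculations with chains of at most
# 200 sequential neuron firings over a one-second trajectory.
robot_ops = 3e12           # calculations per one-second trajectory
chain_depth = 200          # max sequential neuron firings in that second
neocortex_neurons = 20e9   # rough neocortical neuron count (assumption)

parallel_neurons = robot_ops / chain_depth       # 15 billion
fraction = parallel_neurons / neocortex_neurons  # ~0.75

print(f"{parallel_neurons / 1e9:.0f} billion neurons, "
      f"{fraction:.0%} of the neocortex")
# → 15 billion neurons, 75% of the neocortex
```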

Evidently the approximation is extremely crude: humans playing catch at a public park can simultaneously carry on deep discussions about Shakespeare or Bugs Bunny, use their peripheral vision to check out the attractive passersby in the background, cognize the weather, keep track of time, and so on, while the robot can only catch. To be this efficient, something else has to be going on in human brains. Repeatedly catching (or missing) things over a lifetime culls, grows, and maintains topologically related neuronal prediction circuits. You recognize the initial conditions of a ball tossed your way, even by surprise, and predefined spatial-temporal neocortical sequences fire off, directing your arm, hand, and eyes to where they need to be, sans quadrature of any differential equation. Jeff Hawkins elaborates on this deeply in his book "On Intelligence," the basis of his AI company Numenta, now partnered with IBM.

Current deep learning artificial intelligence systems, generating both today's successes and hype fully two decades after Kasparov's defeat, still lack the innate, strong, general intelligence depicted by HAL in the movie "2001: A Space Odyssey." This isn't to say that current deep learning AI systems don't solve increasingly useful problems. Google Translate, for example, performs real-time translation between arbitrary language pairs with relatively good accuracy, a feat that physicist and AI thinker Douglas Hofstadter wouldn't have thought possible without capturing the full complexity of being human. The machines of Google's Professor Geoff Hinton, Professor Fei-Fei Li at the Stanford Vision Lab, and Professor Yann LeCun at Facebook's AI lab produce intelligent behavior without harboring any actual intelligence.

There's a big difference between intelligence and intelligent behavior. When you stop feeding today's systems input, their intelligent behavior goes away, whereas your mind, even as you lie in the darkness of your bedroom falling asleep, continues computing unabated: thinking, musing, possibly stumbling into an epiphany or two. Pacemaker-like neurons keep your brain active, tying the complexities of your life together.

One sure-fire method to get machines to this point is through copying the salient spatial-temporal topological features of the mammalian neocortex, the only intelligent, auto-associative, predictive memory pattern processing machine that we know of. The idea has traction given the recent partnership between IBM and Numenta to develop such machines.

Approximately the size and thickness of a dinner napkin and six neuronal layers deep, the human neocortex is a recent mammalian evolutionary addition. It serves as the seat of your intelligence. Your dog has one, albeit smaller; your gecko doesn't. The computing elements of the neocortex are columns of cells running up the six layers of neurons, repeated laterally over and over again, each column doing roughly equivalent work while talking to near and distant neuronal neighbors via dense thickets of axons and dendrites. If you reroute a baby ferret's optic nerve to the auditory region of its neocortex, the ferret will still grow up hearing and seeing, because the auditory region is architecturally and functionally identical to the visual part of the neocortex.

The first few layers of neocortical neurons take input from your senses and begin classifying primitive abstract concepts: colors, horizontal and vertical edges, upward and downward motions, sounds, textures, smells, etc. These first sets of primitives get passed upward to the next higher layers, which classify them into more general primitives: ovals become faces, sounds become voices, and smells become scents. The final layers up the columns classify deep objects: a face becomes a happy wife humming her favorite song while weeding her flowerbed. These processes, involving intricately convoluted backward and forward feedback loops between the layers of neocortical neurons, are fairly well understood up through the first four layers, but less so up the remaining layers across larger and larger areal swaths of the neocortex. When humans and tiny dogs with tinier brains catch balls, we do so based on a lifetime of predictive, auto-associative memory pattern operations that are always going on in our neocortex.
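The climb from primitives to deep objects can be sketched as a toy pipeline, each stage grouping the labels from the stage below into something more abstract. The stages and labels here are invented for illustration, not a model of real cortical circuitry:

```python
# Toy hierarchy: each layer maps a tuple of lower-level primitives to a
# more abstract concept, loosely mirroring the climb from edges to faces
# to whole scenes.
edges_to_shape = {("oval", "two dots", "curve"): "face"}
shape_to_object = {("face", "humming sound"): "happy person"}

def classify(primitives, layer):
    """Return the more abstract label for a tuple of primitives."""
    return layer.get(tuple(primitives), "unknown")

shape = classify(["oval", "two dots", "curve"], edges_to_shape)
scene = classify([shape, "humming sound"], shape_to_object)
print(scene)  # → happy person
```

A real cortex differs in every detail (feedback loops, continuous values, massive fan-in), but the shape of the computation, labels feeding upward into ever more general labels, is the point of the sketch.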

You mold your neocortex throughout your years, strengthening old connections and creating new ones as you improve at playing the piano, while letting other connections atrophy or even die as you neglect, say, your high school French. A new connection between distant neurons carries a lot more weight than merely tuning the strength of an extant pathway, which is all today's neural nets do. The sum result of this activity (you + universe + time) is the standing up of a hierarchical, predictive collection of associative models of irreducible representations of the universe, based on everything you've experienced in the past right up through the evanescent experience of reading the period at the end of this sentence.
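The distinction between tuning an existing pathway and growing a wholly new one can be made concrete with a sparse connection map. This is a cartoon of the idea, not a biological model; the names and numbers are arbitrary:

```python
# Sparse synapse map: (pre, post) -> connection strength.
synapses = {("A", "B"): 0.4}

def strengthen(pre, post, delta=0.1):
    """Tune an existing pathway, as today's neural nets do with weights."""
    synapses[(pre, post)] = synapses.get((pre, post), 0.0) + delta

def grow(pre, post, strength=0.2):
    """Create a brand-new connection between previously unlinked neurons."""
    if (pre, post) not in synapses:
        synapses[(pre, post)] = strength  # new topology, not just a new weight

strengthen("A", "B")     # A-B goes from 0.4 to 0.5
grow("A", "Z")           # a connection that did not exist before
print(sorted(synapses))  # → [('A', 'B'), ('A', 'Z')]
```

Most deep learning systems only ever call the equivalent of `strengthen` over a fixed wiring diagram; the brain also calls `grow`, changing the diagram itself.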

To make this clearer, consider the following thought experiment. Imagine yourself outside your house, about to unlock the door, unaware that some joker has barricaded it shut from the inside. You're preoccupied with preparing dinner while fully expecting things to go as they always have: inserting the key, twisting, hearing the deadbolt shift, rotating the doorknob, pushing the door forward, and going in. Other streams of predictive consciousness are also running in your head while you go through the motions with your key, from shushing the barking dog to seeing your kids watching TV when they should be doing their homework. You push your door forward, but the unexpected resistance stops you, displacing your thought processes with momentary puzzlement even as you hear your kids snickering from behind the door. Anger wells up as entirely new streams of predictive consciousness take charge. Continuous prediction and correction based on current input and past memory is the hallmark of our intelligence. Even our eyes are always making tiny, involuntary movements to compare visual prediction against actuality.
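The door episode is continuous predict-then-compare in miniature. A minimal sketch of that loop, with the sensation sequences invented for illustration:

```python
# Each step: recall the predicted sensation from memory, compare it with
# what actually happens, and escalate only on a mismatch (surprise).
expected = ["key turns", "deadbolt shifts", "knob rotates", "door opens"]
actual   = ["key turns", "deadbolt shifts", "knob rotates", "door stuck"]

for predicted, sensed in zip(expected, actual):
    if predicted == sensed:
        continue  # prediction confirmed: no conscious attention needed
    print(f"surprise: expected {predicted!r}, got {sensed!r}")
    break         # a new stream of processing takes charge
```

Everything that matches glides by unnoticed; only the final mismatch ("surprise: expected 'door opens', got 'door stuck'") interrupts the routine, which is exactly when puzzlement, and then anger, show up in the thought experiment.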

It's in this sense, Jeff Hawkins argues, that in order to make intelligent machines, machine architectures need to be based on memory-processing, neocortical-like neural networks capable of operating with a sense of past, present, and predicted futures, building richer and richer representations of the world from babyhood through adulthood, from catching baseballs with mitts to catching ladies with Corvettes.

So while the recent result of the Stanford Vision Lab is impressive (creating and training a neural net system capable of writing simple English sentences about pictures pulled off the internet), it did so with a 140-parameter mathematical model on a neural net sporting 24 million nodes tied together by 15 billion fixed connections, running on processors producing tens of gigaflops of processing power. The machine only produces intelligent behavior, not intelligence. It's as alive as a flashing cursor signaling the need for more input.

1. Berthold Bauml, Thomas Wimbock, and Gerd Hirzinger, "Kinematically Optimal Catching a Flying Ball with a Hand-Arm-System," Institute of Robotics and Mechatronics, 2010.