Neuromorphic Chips and other Ways to Extend the Life of Moore’s Law
Thanks to Deep Learning, artificial intelligence is getting plenty of media attention. Increases in hardware performance, in particular ever faster and cheaper storage and computing power, are what’s driving those impressive advances.
When a certain Gordon E. Moore, co-founder of Intel Corp., published a paper in 1965 in which he postulated that the number of components and, by extension, the computing power of an integrated circuit (IC) would double every one to two years, he was met with disbelief. The semiconductor industry was in its infancy, and there was scarcely any historical data Moore could draw on for his prognosis; ICs, after all, hadn’t been built prior to 1958. During the 1960s, the established wisdom was that drastic miniaturization of transistors would quickly lead to inefficient and, most importantly, overheated chips.
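The growth Moore postulated is a simple exponential: component counts double every one to two years. A minimal sketch in Python, using hypothetical round figures rather than historical data:

```python
# Moore's Law as a formula: count doubles every `doubling_period` years.
# The starting count of 64 components is a hypothetical round figure,
# not taken from Moore's 1965 paper.

def projected_transistors(initial: int, years: int, doubling_period: float = 2.0) -> int:
    """Component count after `years`, doubling every `doubling_period` years."""
    return int(initial * 2 ** (years / doubling_period))

# Starting from a hypothetical 64-component IC in 1965:
for year in (1965, 1975, 1985, 1995, 2005, 2015):
    print(year, projected_transistors(64, year - 1965))
```

Even with a conservative two-year doubling period, fifty years of this compounding turns 64 components into billions, which is why the claim seemed so implausible in 1965.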
It took Carver Mead, one of the pioneers of semiconductor electronics, to demonstrate that much smaller transistors were not only feasible but also faster, cheaper and cooler than previously thought. Mead understood the regularity Moore had described and introduced it to other experts under the name Moore’s Law.
As early as the 1970s, Mead predicted that we’d be able to squeeze millions of transistors onto a single microchip, at a time when such chips existed only as prototypes in the R&D labs of semiconductor manufacturers. He taught a new subject that would soon be called Very-Large-Scale Integration (VLSI). Mead, in short, came up with the now widely accepted concept of separating chip design from chip fabrication at scale. He had another brilliant idea in the 1980s, but we’ll come back to that shortly.
It’s been half a century since Moore’s Law was formulated. Gordon Moore has become a billionaire, while Carver Mead is still a professor at the California Institute of Technology, or Caltech. Chip manufacturers such as IBM, Intel and AMD are still following the law, turning out ever more powerful processors. Today, everybody carries a supercomputer in their pocket: a device that not too long ago would have needed to be housed in its own air-conditioned building. Yet slowly but surely, the path to IC innovation is literally narrowing. The electrons’ race tracks are becoming thinner and thinner. Before long, they’ll measure just a few atoms across, with undesirable consequences.
At such minuscule distances, silicon loses its most important property: as a semiconductor, it normally acts as an insulator and bars electrons from passing through a transistor that is switched off. So-called quantum tunneling causes small leakage currents in the tiniest circuits, which increase power consumption, impair the chip’s functioning and ultimately raise its temperature. Fabrication also becomes more demanding and expensive, driving manufacturers to get creative about how they sell their product: an 8-core processor with some fabrication defects will be sold as a cheaper 4- or 6-core processor.
Nowadays, chip structures are measured in nanometers. The current CPUs made by Intel and AMD, for instance, use 14 nm structures. And even though the “International Technology Roadmap for Semiconductors” lists 10 nm as the next step, Intel has declared that it will stick with its 14 nm process for the eighth generation of its processors. That leaves only two miniaturization cycles until we hit the physical barrier of 5 nm; the consensus is that below that mark, things can’t get any smaller. Many experts have predicted the end of Moore’s Law before, but this time, by around 2020, exponential growth could finally have run its course.
There’s room for sub-units
Then what? Shrinking the components inside a CPU won’t yield 2x performance improvements anymore, so manufacturers are playing other tricks to tweak their hardware. The current trend of running ever more processor cores in parallel will not be sustainable over the coming decades. You can stack circuits that used to be fabricated in two dimensions on top of each other, but such 3-D processor blocks would need to run at lower clock speeds, or they’ll burn out when powered up.
It’s not clear whether humankind’s insatiable appetite for computing power can be satisfied over the next half-century by the promised doubling of processor speeds every 18 months. We’ll probably see a blend of different optimization strategies. Already, the space freed up by ever smaller components is being used for specialized sub-units. Modern chips integrate dedicated hardware for encoding and decoding audio and video, as well as specialized hardware for processing and decrypting data. Deep Learning gets a boost from specialized hardware, too: neural networks are already run in parallel on the many cores of graphics processing units (GPUs), making them much zippier than on a normal CPU.
Google has announced that the neural algorithms behind Google Translate, AlphaGo & Co. run on special TPUs, short for tensor processing units, which have been expressly designed to implement neural networks and are much faster and more efficient than GPUs. Nvidia, the largest manufacturer of graphics cards, begs to differ on speed, but there is consensus that TPUs cut power consumption roughly in half. Such application-specific integrated circuits (ASICs), however, are one-trick ponies that can only do what they were designed for. They incur relatively high one-time development costs and have to earn that investment back through higher performance and power savings. It’s telling that Google has embarked on this path, raising expectations that we’ll see future generations of tensor ASICs.
It’s getting hot in here
Before we look into the not-too-distant future, we have to briefly return to the past. John von Neumann, the father of the architecture of modern PC chips, described in his 1945 “First Draft of a Report on the EDVAC” how the processing unit and memory have to interact to execute any type of program. Back then, computers were hard-wired to solve one specific problem, meaning it took a lot of work to rewire them for a different calculation.
Von Neumann’s design was groundbreaking and quickly found broad adoption. A von Neumann computer basically consists of a processing unit and a memory that serves as a repository for both program instructions and data. Separating the two units, however, quickly became a problem known as the “von Neumann bottleneck”: every time an instruction is read, the machine can’t process data, and whenever data is read, the CPU has to wait for instructions. Nowadays, we are mostly preoccupied with processing large amounts of data, which struggle to squeeze through the von Neumann bottleneck and in the process quite literally produce heat.
It’s worth noting that von Neumann was inspired by research into how biological neurons work. Two years earlier, a brilliant pair of researchers, McCulloch and Pitts, had described how nerve cells process information digitally: they add up their inputs and “fire” once the sum reaches a threshold. The notion that the brain could be reduced to a large number of interlinked logical operations bordered on heresy back then. Von Neumann, though, was fascinated by it and turned the idea into the first generic computer architecture.
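The McCulloch-Pitts scheme fits in a few lines of code. A caveat: the weights below are a common later refinement (the 1943 model used plain excitatory and inhibitory inputs), and all numbers are illustrative:

```python
# A McCulloch-Pitts-style neuron: binary inputs are summed (here with
# illustrative weights) and the cell "fires" once the sum reaches a threshold.

def mp_neuron(inputs, weights, threshold):
    """Return 1 if the weighted sum of binary inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# With unit weights and threshold 2, the neuron computes a logical AND:
print(mp_neuron([1, 1], [1, 1], 2))  # fires: prints 1
print(mp_neuron([1, 0], [1, 1], 2))  # stays silent: prints 0
```

This is exactly the reduction that seemed heretical at the time: a nerve cell treated as nothing more than a thresholded sum, i.e. a logic gate.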
A new field of research is born
While the initial neuronal model devised by McCulloch and Pitts was very simple, neurophysiological understanding grew over the following decades and, in combination with rapidly growing computing power, led to a new field of research called neuroinformatics. It attracted mathematicians, physicists, biologists and computer scientists who study the nervous system with computers.
Modeling how nervous systems function is a central task of computational neuroscience, based again on the smallest unit, the neuron. Its state is no longer described as binary, as McCulloch and Pitts had it, but as time-dependent and so complex that it’s fair to speak of a neuron’s “behavior.” A complex neuronal model doesn’t just replicate the fact that different inputs are integrated, but also the temporal order in which that happens.
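A minimal sketch of such a time-dependent model is the leaky integrate-and-fire neuron; the constants below are illustrative and not taken from any specific paper:

```python
# A leaky integrate-and-fire neuron: the membrane potential leaks toward
# rest, integrates the input current over time, and emits a spike (then
# resets) whenever it crosses a threshold. All constants are illustrative.

def simulate_lif(input_current, dt=1.0, tau=10.0, v_rest=0.0,
                 v_threshold=1.0, v_reset=0.0):
    """Return the time steps at which the neuron spikes."""
    v = v_rest
    spikes = []
    for step, i in enumerate(input_current):
        # Leak toward the resting potential and integrate the input.
        v += dt * (-(v - v_rest) / tau + i)
        if v >= v_threshold:
            spikes.append(step)
            v = v_reset
    return spikes

# A constant input makes the neuron fire at regular intervals; unlike a
# McCulloch-Pitts unit, the output is a spike *train* unfolding in time.
print(simulate_lif([0.15] * 50))
```

The crucial difference from the binary model above is visible in the output: information now lives in *when* the neuron fires, not just in whether it fires.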
The human brain has hundreds of different types of neurons. Each type has its own morphology and biochemistry, which lead to different behaviors. No wonder, then, that basic research is trying to find models that can replicate this broad spectrum more accurately than the simple neuron models of the past.
Neuromorphic hardware busts the bottleneck
Back to Moore’s Law. Talk of its demise started as early as the late 1980s. Carver Mead suggested taking von Neumann’s rather conservative neuro-inspiration to its logical conclusion. He argued for putting large numbers of biologically more accurate silicon neurons on chips, which could then be networked like their biological role models to perform highly parallel computing tasks. Just like the complex neuronal models, this was to work in analog, not digital, fashion.
Inbound impulses are accumulated spike by spike, just as inside a biological neuron, and then sent along conductive tracks to connected silicon neurons. This could finally break the von Neumann bottleneck. For one, neuromorphic hardware reduces energy demand to a trickle: the human brain consumes only about 20 watts, while a supercomputer needs many orders of magnitude more. That the brain exhibits superior intelligence in such a small space is due to its highly parallel processing and the way it stores information: memory and processing are distributed across all neurons, and information is stored where it is needed.
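The spike-by-spike accumulation and propagation described above can be sketched in software with two toy “silicon neurons”; the leak factor, threshold and synaptic weight are illustrative assumptions, not parameters of any real chip:

```python
# Spike propagation between two connected toy neurons: neuron A integrates
# a constant drive; each spike A emits is delivered, scaled by a synaptic
# weight, to neuron B. All constants are illustrative.

def integrate(v, inp, leak=0.9, threshold=1.0):
    """One time step: leak, add input, fire and reset on threshold."""
    v = v * leak + inp
    if v >= threshold:
        return 0.0, True   # reset potential, emit a spike
    return v, False

v_a = v_b = 0.0
w_ab = 0.6              # synaptic weight from A to B
b_spikes = []
for t in range(40):
    v_a, fired_a = integrate(v_a, 0.3)                    # constant drive into A
    v_b, fired_b = integrate(v_b, w_ab if fired_a else 0.0)  # A's spikes drive B
    if fired_b:
        b_spikes.append(t)
print(b_spikes)
```

Note there is no central controller in this loop: each neuron only updates its own state from the spikes it receives, which is the organizing principle neuromorphic hardware scales up to thousands of interconnected circuits.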
New hardware, new coding
Neuromorphic hardware today encompasses many different architectures. A former student of Carver Mead’s has developed the Neurogrid system at Stanford, which mostly relies on analog circuits. The University of Heidelberg is working on BrainScaleS, a chip fabricated in a relatively old 180 nm process that nevertheless houses 200,000 neurons with a total of 50 million synapses. Inside the chip, digital and analog components work in unison, allowing it to process electric signals at lightning speed: in just one day, it could simulate 30 years of biological brain activity.
The big chip manufacturers, too, have unveiled similar architectures in recent years. The biggest challenge, though, is yet to come: how do you program such neuromorphic hardware? Since nothing of the old architecture is left, you can’t run traditional programs on these chips. There is no central processor executing an instruction set, no controller defining the order in which information is processed. All there is are thousands upon thousands of small circuits, all connected to one another.
Our coding paradigms therefore have to change drastically, and developers have to rethink everything. IBM has embarked on a remarkable effort with its neuromorphic system TrueNorth, designing not only an impressive chip but a whole ecosystem around it: a neuro-simulator, a new programming language with libraries, and even a curriculum for future students.
Searching for the next big mystery
The evolution of computing is thus spawning hardware specialization and a growing need to organize these components along some division of labor. Neural architectures represent a key subset of these new specialists. Among the many approaches, there are on the one side chips optimized to run classic or “deep” neural networks, and on the other side “neuromorphic” hardware. These brain-inspired approaches are still worlds apart. Conventional neural networks rest on a solid theoretical foundation, can solve tough real-world problems and are penetrating markets with every conceivable use case.
Carver Mead’s idea of neuromorphic hardware was, and still is, revolutionary because it challenges, among other things, our notion of how the brain works. And therein lies the rub: we would really have to understand how the brain functions in order to translate that knowledge into truly useful hardware able to solve real problems. There is so far no explanation of how a biological brain accomplishes all those wonderful feats that have fascinated humankind since the dawn of time.
Neuromorphic systems reflect this conundrum, which is why no neuromorphic system so far can solve tough cognitive problems. The once-again impending demise of Moore’s Law hasn’t just inspired the creativity of engineers; it has also helped expand research budgets. In 2013, the European Union poured more than a billion euros into the Human Brain Project to study the brain’s functions, including significant work on neuromorphic hardware. While the project has often been rightly criticized for its size and lack of focus, it does demonstrate one thing: the world is ready to crack the next big mystery. Our chase after Moore’s Law of exponential growth shows just how impatient we are to more fully explain our own existence.