With recent advances, the tech industry is leaving the confines of narrow artificial intelligence (AI) and entering a twilight zone, an ill-defined area between narrow and general AI.
To date, all the capabilities attributed to machine learning and AI have fallen into the category of narrow AI. No matter how sophisticated – from insurance rating to fraud detection to manufacturing quality control, aerial dogfights, or even aiding nuclear fission research – each algorithm serves only a single purpose. This means two things: 1) an algorithm designed to do one thing (say, identify objects) cannot be used for anything else (play a video game, for example), and 2) anything one algorithm “learns” cannot be effectively transferred to another algorithm designed for a different purpose. For example, AlphaGo, the algorithm that defeated the human world champion at the game of Go, cannot play other games, even much simpler ones.
Many of the leading examples of AI today use deep learning models implemented as artificial neural networks. Loosely inspired by the brain’s connected neurons, these networks run on graphics processing units (GPUs) – advanced microprocessors designed to run hundreds or thousands of computing operations in parallel, millions of times every second. The connections between the network’s many layers play the role of synapses, and their weights are the parameters the algorithm must learn. Large neural networks today may have 10 billion parameters. The model cascades information from layer to layer – each layer transforming the output of the one before it – to refine the algorithmic output. For example, in image processing, lower layers may identify edges, while higher layers may identify concepts relevant to a human, such as digits or faces.
(Above: Deep Learning Neural Networks. Source: Lucy Reading in Quanta Magazine.)
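The layer-to-layer cascade described above can be sketched in a few lines. The sizes here (4 inputs, one hidden layer of 8 units, 3 outputs) are arbitrary toy values for illustration, not drawn from any real model; every weight and bias is one of the “parameters” the text refers to:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy network: 4 inputs -> 8 hidden units -> 3 outputs.
layer_sizes = [4, 8, 3]
weights = [rng.standard_normal((m, n)) for m, n in zip(layer_sizes, layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x):
    """Cascade the input through each layer, applying a nonlinearity."""
    for w, b in zip(weights, biases):
        x = np.maximum(0, x @ w + b)  # ReLU activation
    return x

# Every weight and bias counts as one learnable parameter.
n_params = sum(w.size for w in weights) + sum(b.size for b in biases)
print(n_params)  # (4*8 + 8) + (8*3 + 3) = 67
```

A network like GPT-3 follows the same cascade pattern, just with far more layers and billions of parameters instead of 67.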
While it is possible to further accelerate these calculations and add more layers to accommodate more sophisticated tasks, fast-approaching constraints on computing power and energy consumption limit how much further the current paradigm can run. These limits could bring about another “AI winter,” in which the technology fails to live up to the hype, dampening adoption and future investment. This has happened twice in the history of AI – in the 1980s and 1990s – and each time it took many years to recover, waiting for advances in technique or computing capability.
Avoiding another AI winter will require additional computing power, perhaps from processors specialized for AI workloads that are now in development and expected to outperform current-generation GPUs while consuming less energy. Dozens of companies are working on new processor designs that accelerate the operations AI algorithms need while minimizing or eliminating circuitry that would support other uses. Another way to avoid an AI winter is a paradigm shift that goes beyond the current deep learning/neural network model. Greater computing power, a paradigm shift, or both could lead to a move beyond narrow AI toward “general AI,” also known as artificial general intelligence (AGI).
Are we shifting?
Unlike narrow AI algorithms, knowledge gained by general AI can be shared and retained among system components. In a general AI model, an algorithm like AlphaGo, which beat the world’s best player at Go, would also be able to learn chess or any other game. AGI is conceived as a generally intelligent system that can act and think much like humans, albeit at the speed of the fastest computer systems.
To date there are no examples of an AGI system, and most believe we are still a long way from that threshold. Earlier this year, Geoffrey Hinton, the University of Toronto professor who is a pioneer of deep learning, noted: “There are one trillion synapses in a cubic centimeter of the brain. If there is such a thing as general AI, [the system] would probably require one trillion synapses.”
Nevertheless, there are experts who believe the industry is at a turning point, shifting from narrow AI to AGI. Some even claim we are already seeing an early example of an AGI system in the recently announced GPT-3 natural language processing (NLP) neural network. While NLP systems have normally been trained with supervised learning, which requires each piece of training data to be labeled, advances toward AGI will require improved unsupervised learning, in which AI is exposed to large amounts of unlabeled data and must discover the patterns on its own. This is what GPT-3 does; it can learn from any text.
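To make the distinction concrete, here is a minimal sketch of learning from raw, unlabeled text – vastly simpler than GPT-3, but in the same spirit: nobody labels the data, and the model builds next-word statistics purely from the patterns in whatever text it is given. The tiny corpus is an invented stand-in for web-scale data:

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate"  # stand-in for web-scale text

# Count, for each word, which words follow it -- no labels required.
counts = defaultdict(Counter)
words = corpus.split()
for prev, nxt in zip(words, words[1:]):
    counts[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent continuation seen for `word`."""
    return counts[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" -- the most common word after "the"
```

GPT-3 works on the same principle of predicting likely continuations, but with a learned neural representation rather than raw counts, which is what lets it generalize far beyond its training text.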
GPT-3 “learns” based on patterns it discovers in data gleaned from the internet, from Reddit posts to Wikipedia to fan fiction and other sources. Based on that learning, GPT-3 is capable of many different tasks with no additional training, able to produce compelling narratives, generate computer code, autocomplete images, translate between languages, and perform math calculations, among other feats, including some its creators did not plan. This apparent multifunctional capability does not sound much like the definition of narrow AI. Indeed, it is much more general in function.
With 175 billion parameters, the model goes well beyond the roughly 10 billion in the previously largest neural networks, and far beyond the 1.5 billion in its predecessor, GPT-2 – more than a 100x increase in model complexity in just over a year. Arguably, this is the largest neural network yet created, and considerably closer to the one-trillion level Hinton suggested for AGI. GPT-3 hints that what passes for intelligence may be a function of computational complexity – that it arises from the sheer number of synapse-like connections. As Hinton suggests, when AI systems become comparable in size to human brains, they may very well become as intelligent as people. That level may be reached sooner than expected if reports of coming neural networks with one trillion parameters are true.
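The scale comparison is worth working through: the jump from GPT-2’s 1.5 billion parameters to GPT-3’s 175 billion, and how far that still is from Hinton’s one-trillion synapse figure:

```python
gpt2 = 1.5e9       # GPT-2 parameter count
gpt3 = 175e9       # GPT-3 parameter count
trillion = 1e12    # Hinton's synapse estimate for AGI

print(gpt3 / gpt2)      # ~117x jump from GPT-2 to GPT-3
print(trillion / gpt3)  # still ~5.7x short of one trillion
```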
So is GPT-3 the first example of an AGI system? This is debatable, but the consensus is that it is not AGI. Nevertheless, it shows that pouring more data and more computing time and power into the deep learning paradigm can lead to astonishing results. The fact that GPT-3 is even worthy of an “is this AGI?” conversation points to something important: It signals a step-change in AI development.
This is striking, especially since the consensus of several surveys of AI experts suggests AGI is still decades into the future. If nothing else, GPT-3 tells us there is a middle ground between narrow and general AI. It is my belief that GPT-3 does not perfectly fit the definition of either narrow AI or general AI. Instead, it shows that we have advanced into a twilight zone. Thus, GPT-3 is an example of what I am calling “transitional AI.”
This transition could last just a few years, or it could last decades. The former is possible if advances in new AI chip designs move quickly and intelligence does indeed arise from complexity. Even without that, AI development is moving rapidly, evidenced by still more breakthroughs with driverless trucks and autonomous fighter jets.
There’s also still considerable debate about whether or not achieving general AI is a good thing. As with every advanced technology, AI can be used to solve problems or for nefarious purposes. AGI might lead to a more utopian world — or to greater dystopia. Odds are it will be both, and it looks to arrive much sooner than expected.
Gary Grossman is the Senior VP of Technology Practice at Edelman and Global Lead of the Edelman AI Center of Excellence.