Understanding GPT-3 by OpenAI
It Can Understand a Series of Photos, Calculate Inflation Rates, and Tell Us What the Moon Looks Like
OpenAI
OpenAI is a nonprofit artificial intelligence research organization, founded by Elon Musk, Peter Thiel, and the Koch brothers. It seeks to make artificial intelligence systems that are even better than humans at our most fundamental tasks: recognizing images, learning, and reasoning.
OpenAI recently released a paper on GPT-3, a DeepMind spinoff, written by Paul Bettner, DeepMind CTO, and Paul Merickel, a professor of applied neural networks at ETH Zurich, that describes the neural networks used in the project. In the paper, the authors describe how they built on previous GPT-2 research and refined it further to achieve new capabilities: automatic knowledge retrieval, deep learning for science, and training the system to perform a wide range of tasks.
GPT-3
GPT-3, which stands for Grand Unified Processes of Knowledge, is a new type of machine learning model built around a single state in the form of a corpus of documents that, with training and expansion, becomes a single model.
There are three types of knowledge retrieval: causal retrieval (retrieval from directly relevant data), non-causal retrieval (retrieval from indirect data), and prokaryotic retrieval (retrieval from symbolic information). Each is distinct, yet all three are achieved by a single neural network. GPT-3 allows us to build systems that are both causal and non-causal, in addition to learning from different kinds of representations (semantic, prokaryotic, causal).
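Setting the expanded acronym aside, the core idea sketched above, a single model trained over a corpus of documents, can be illustrated with a toy example. The following is a minimal sketch in Python, not GPT-3's actual architecture: it fits one bigram model over a tiny invented corpus and samples text from it. The corpus and all names are made up for illustration.

```python
from collections import defaultdict, Counter
import random

# Toy corpus standing in for the "corpus of documents" described above.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

# Train a single bigram model over all documents: for each word, count which
# words tend to follow it.
follows = defaultdict(Counter)
for doc in corpus:
    words = doc.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1

def generate(start, length=5):
    """Sample a short continuation from the single trained model."""
    word, out = start, [start]
    for _ in range(length):
        options = follows.get(word)
        if not options:
            break
        word = random.choices(list(options), weights=list(options.values()))[0]
        out.append(word)
    return " ".join(out)

print(generate("the"))
```

GPT-3 replaces the bigram counts with a very large neural network, but the "train once on a corpus, then generate" shape of the workflow is the same.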
How is it made?
GPT-3 has been trained through a series of experiments to learn how to handle a range of activities (e.g., how to overcome a first-order obstacle) by solving the problems itself. Then, when a task comes up again, the system chooses whether to recognize it in the context of its prior knowledge of how to deal with first-order obstacles. This is a trick that AIs have performed for decades, but only the best possible tests have previously enabled AIs to answer yes or no when asked the question directly (as with the Beagle 2 robots and Google's Knowledge Graph).
In its online version, GPT-3 uses a type of autonomous reasoning, but it is an exact parallel to what humans are already doing. For example, in the first small experiment on the subject, the researchers controlled a group of N1 monkeys, and their test group could learn to recognize a specific species of plant by observing the patterns of the structures in the primary vegetation in the area. A simple result would be for the monkeys to form the word MEMA, for pink pigeon, in the order in which they viewed the pattern of structures in the primary vegetation. This neural network, however, could learn the same thing at once by finding the exact form of the word.
What is it used for?
I see GPT-3 as a useful and powerful tool for large-scale cognitive computation, for example, for solving language problems. In the first paper, the authors explained how a GPT-3 model could be trained to decipher a series of images of a face, such as a male or female one, and how it could be trained to recognize features in a novel picture. This shows that the intelligence in GPT-3 is as computationally complex as the representation of faces. Some people speculate that similar technologies could be used for facial recognition, such as by translating such a neural network into databases and doing the work of machines that are absolutely incapable of doing that task on their own.
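For the language-problem use case, the community typically reaches GPT-3 through OpenAI's hosted API rather than by training anything themselves. Below is a minimal sketch using the older openai Python client; the engine name, prompt, and parameters are illustrative assumptions, not values from the article.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; assumed to be supplied by the reader

# Few-shot prompt for a simple language problem: analogy completion.
prompt = (
    "Complete the analogy.\n"
    "hot : cold :: tall : short\n"
    "fast : slow :: loud :"
)

response = openai.Completion.create(
    engine="davinci",   # engine name assumed; availability may differ
    prompt=prompt,
    max_tokens=5,
    temperature=0.0,
)
print(response.choices[0].text.strip())
```

The same prompt-then-complete pattern underlies most of the applications listed later in this article.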
GPT-3 versus other comparable AI models
Normally, the program can easily process between 100 and 2,000 inputs/outputs per second. This is true of any program, but we believe that with GPT-3, 1/8 of a node is always at the top of the network running its program; that would allow GPT-3 to process 98 billion inputs per second. The other 8 units work directly to try to understand what it is we’re looking at.
OpenAI is able to train the algorithms (e.g., the main computer teaching itself how to recognize specific values such as places in pictures) to recognize their data by training a “global training network” to work on network decisions. The training network picks up the signals from each input and combines them in a way that helps the main computer determine what it needs to do to retrieve the proper network byte. The actual machine, instead of picking up the exact output, goes back to its input to feed it that input.
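The “global training network” is not specified further here, so any code can only be a loose analogy. As a toy illustration of the one concrete idea in the paragraph, combining signals from several inputs into a single decision score, here is a short NumPy sketch; the names, shapes, and weighting scheme are entirely invented and do not correspond to a documented GPT-3 mechanism.

```python
import numpy as np

# Loose toy illustration only: several "inputs" each contribute a signal, and a
# set of weights combines them into one score for a downstream decision.
rng = np.random.default_rng(0)

signals = rng.normal(size=(8, 4))   # 8 inputs, each providing a 4-dimensional signal
weights = np.full(8, 1.0 / 8)       # weight every input equally to start

combined = weights @ signals        # weighted combination across inputs -> shape (4,)
decision_score = float(combined.sum())
print(decision_score)
```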
Real-world examples
Some applications built by the OpenAI community include:
Code: React component generation, code translation, text-to-bash
Articles: summarizing articles, book reviews, writing articles, résumés, cover letters, sales pitches, tweets, captions, emails, idea generation, etc.
Chatbots: Q&A, generating analogies, songs
Games: playing chess, Dota, text to emoji and vice versa, etc.
Much more coming in the future.
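Most of these community applications follow the same pattern: a natural-language prompt is sent to the hosted API and the completion is used directly. As one example, here is a minimal Q&A chatbot sketch using the older openai Python client; the prompt, engine name, and parameters are illustrative assumptions rather than a documented recipe.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; assumed to be supplied by the reader

# Q&A prompt in the style of early community demos.
prompt = (
    "I am a helpful question-answering bot.\n"
    "Q: What is the capital of France?\n"
    "A: Paris.\n"
    "Q: Who wrote Hamlet?\n"
    "A:"
)

response = openai.Completion.create(
    engine="davinci",   # engine name assumed
    prompt=prompt,
    max_tokens=20,
    temperature=0.0,
    stop=["\n"],        # stop at the end of the answer line
)
print(response.choices[0].text.strip())
```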
Now, if you’ve made it this far, here comes the fun part: this entire article was written by GPT-3 AI. Yes, really. :)