
GPT-3 (Generative Pre-trained Transformer 3) is an autoregressive language model that uses deep learning to produce realistic text, of a quality similar to human writing, from a preceding context. It was created in 2020 by OpenAI, an AI research lab founded with the aim of promoting and developing AI that benefits humanity. The GPT-3 model was trained on massive amounts of data crawled from the Internet, such as books, articles and news, and has more than 175 billion parameters (two orders of magnitude more than its predecessor, GPT-2).

Some of the practical applications that can be built with GPT-3 include completing a text as a human would, classifying a text, or answering questions formulated in natural language. An article from La Vanguardia perfectly exemplifies the impressive behavior of GPT-3, which, having seen just an introductory paragraph, is capable of generating an entire article about AI.

In recent years, the popularity of GPT-3 and OpenAI has grown thanks to the wide media coverage received by several models built on top of GPT-3, such as Codex and DALL-E. The first, Codex, is a model trained on millions of public code repositories from GitHub and is capable of producing functional code of impressive quality in more than a dozen programming languages. The second, DALL-E, is an application that creates images from textual descriptions with a level of realism and abstraction that has impressed the entire world.

In November 2021, OpenAI made its GPT-3 models generally available to users through its API. However, end users do not have access to the underlying model, and OpenAI requires applications built on the API to go through a brief review to ensure they comply with its security policies and requirements.

GPT-3 is a pre-trained language model, which means it already understands natural language to a certain degree. Developers can take the base GPT-3 model and adapt it to solve specific tasks; indeed, much of GPT-3's power lies in how easily it can be specialized. Traditional machine learning models used in NLP (Natural Language Processing) require large amounts of data to reach acceptable levels of accuracy. In contrast, GPT-3-based solutions require only a small amount of data to achieve very high accuracy on many NLP tasks. Additionally, tuning a GPT-3 model is simple and does not require advanced machine learning knowledge. This lowers the barrier to entry for developers and companies looking to build sophisticated NLP solutions. GPT-3 and similar products have become a lever for the rapid and easy development of a wide range of AI solutions.
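In practice, this "small amount of data" approach often takes the form of few-shot prompting: instead of retraining the model, a handful of labelled examples are placed directly in the text of the prompt sent to the API. The following is a minimal sketch of that idea; the helper function, example sentences and labels are all hypothetical, and the resulting prompt would then be submitted to the GPT-3 API (the API call itself is omitted here).

```python
# Sketch of few-shot prompting for sentiment classification.
# The task is "taught" to the model with a few labelled examples
# embedded in the prompt, rather than with a large training set.
# All example texts and labels below are invented for illustration.

def build_few_shot_prompt(examples, query):
    """Assemble a classification prompt from (text, label) pairs."""
    lines = ["Classify the sentiment of each sentence as Positive or Negative.", ""]
    for text, label in examples:
        lines.append(f"Sentence: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The model is expected to continue the prompt after the final "Sentiment:".
    lines.append(f"Sentence: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

examples = [
    ("I loved this film.", "Positive"),
    ("The service was terrible.", "Negative"),
]
prompt = build_few_shot_prompt(examples, "What a wonderful surprise!")
print(prompt)
```

With only two labelled examples, the prompt fully specifies the task; this is why GPT-3-based solutions can reach high accuracy without the large datasets traditional NLP models require.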

Closer to home, while GPT-2's training data was only in English, GPT-3 expanded its training dataset to include other languages, although these represent only 7% of the total data. Despite this limited share of non-English data, GPT-3 performs surprisingly well in smaller languages such as Catalan, yielding outstanding results such as the following:

  • Ask GPT-3: Who is the most famous Catalan poet?
  • GPT-3 answer: The most famous Catalan poet is probably Salvador Espriu.

At the same time, in 2020 the Generalitat de Catalunya launched AINA, a €13.5M project whose main objective is the creation and expansion of a corpus of the Catalan language in order to obtain language models that cover its different variants and registers.

GPT-3 is undoubtedly an impressive achievement in technology and software engineering. However, it carries risks and limitations that we should all be aware of. First, GPT-3 can create texts so realistic that it could lead to an increase in spam and fake content; we should probably have some kind of regulation that obliges content creators to indicate when they use these types of models. Second, GPT-3 is what is called a "black box" model (there is no insight into why it produces a given result), and we should develop explainability tools to better understand how it makes its predictions. Third, the cost of training GPT-3 from scratch is on the order of millions of dollars. This means that few companies have the means to train, maintain and run these types of models, which could lead to a concentration of powerful AI capabilities in the hands of a few giant entities. Finally, training GPT-3 or other large NLP models is computationally expensive and therefore consumes a lot of resources: according to a recent study by Google and the University of California, Berkeley, training GPT-3 produced as many tons of carbon dioxide as 120 passenger cars would emit in a year. We must be aware of the environmental impact of these models and ensure that green energy is used in their training and operation.

The public launch of GPT-3 as an API took place just a few months ago, and we are already seeing developers use it in a variety of impressive applications. We are confident that the next generations of language models will be even more impactful, with the potential to transform natural-language-based business processes and augment human capabilities.

Marina Alonso Poal, Adil Moujahid

Marina Alonso Poal, Senior Data Scientist @ NTT DATA AI Center of Excellence
Adil Moujahid, Technical Manager @ NTT DATA AI Center of Excellence


CIDAI