Marina Alonso Poal, Senior Data Scientist @ NTT DATA AI Center of Excellence

Adil Moujahid, Technical Manager @ NTT DATA AI Center of Excellence


GPT-3 (Generative Pre-trained Transformer 3) is an autoregressive language model that uses deep learning to produce realistic, human-quality text from a given context. It was created in 2020 by OpenAI, an AI research lab founded with the goal of promoting and developing AI that benefits humanity. The GPT-3 model was trained on massive amounts of data crawled from the Internet, such as books, articles, and news, and has more than 175 billion parameters (two orders of magnitude more than its predecessor, GPT-2).

Some of the practical applications that can be built with GPT-3 include completing a text as a human would, classifying a text, or answering questions posed in natural language. An article from La Vanguardia perfectly exemplifies the impressive behavior of GPT-3: having seen just one introductory paragraph, the model is able to generate an entire article about AI.
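To make one of these applications concrete: text classification with GPT-3 typically amounts to writing a few-shot prompt, a handful of labeled examples followed by the item to classify. The sketch below (in Python, with invented example reviews and labels, not taken from the original article) shows the kind of prompt one would send to the model.

    # A few-shot classification prompt of the kind GPT-3 handles well:
    # a few labeled examples, then the new item for the model to label.
    prompt = """Classify the sentiment of each review as Positive or Negative.

    Review: The service was fast and friendly.
    Sentiment: Positive

    Review: The product broke after two days.
    Sentiment: Negative

    Review: I would absolutely buy this again.
    Sentiment:"""

Sent to GPT-3, the expected continuation of this prompt is simply "Positive".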

In recent years, the popularity of GPT-3 and OpenAI has grown thanks to the wide media coverage of several models built on top of GPT-3, such as Codex and DALL-E. The first, Codex, is a model trained on millions of public GitHub code repositories and is capable of producing functional code of impressive quality in over a dozen programming languages. The second, DALL-E, generates images from textual descriptions with a level of realism and abstraction that has impressed the entire world.

A few months after GPT-3's creation, OpenAI opened access to its models by releasing an API, and in November 2021 it removed the waitlist, making the API generally available. However, end users do not have access to the underlying model, and OpenAI requires applications built on the API to go through a brief review to ensure they comply with its safety policies and requirements.
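As an illustration, a call to the API looks roughly like the following. This is a minimal sketch assuming the `openai` Python package as it existed around the API's release; the API key, engine name, and prompt are illustrative placeholders, not a recommendation from the original authors.

    import openai

    # Hypothetical key: obtain a real one from the OpenAI dashboard.
    openai.api_key = "YOUR_API_KEY"

    # Ask GPT-3 to complete a prompt using one of the hosted engines.
    response = openai.Completion.create(
        engine="davinci",      # base GPT-3 engine available at release
        prompt="Write a short introduction to an article about artificial intelligence:",
        max_tokens=64,         # cap the length of the generated text
        temperature=0.7,       # moderate randomness in the output
    )

    print(response.choices[0].text.strip())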

GPT-3 is a pre-trained language model, which means it understands natural language to a certain extent out of the box. Developers can take the base GPT-3 model and adapt it to solve specific tasks; the power of GPT-3 lies precisely in how easily it can be specialized. Traditional machine learning models used for NLP (Natural Language Processing) problems require large amounts of data to achieve acceptable levels of accuracy. In contrast, GPT-3-based solutions need only a small amount of data to reach very high accuracy on many NLP tasks. Moreover, fine-tuning a GPT-3 model is simple and does not require advanced machine learning knowledge. This lowers the barrier to entry for developers and companies looking to build sophisticated NLP solutions, and GPT-3 and similar products have become a lever for the rapid and easy development of a wide range of AI solutions.
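To give an idea of how simple this specialization is, here is a hedged sketch of OpenAI's fine-tuning workflow as it worked at the time with the pre-v1 `openai` Python package: training data is a JSONL file of prompt/completion pairs, which is uploaded and used to launch a fine-tuning job. File names and example content are illustrative.

    import json
    import openai

    openai.api_key = "YOUR_API_KEY"  # hypothetical key

    # Fine-tuning data: a JSONL file with one prompt/completion pair per line.
    examples = [
        {"prompt": "Classify: The service was excellent ->", "completion": " positive"},
        {"prompt": "Classify: The package arrived broken ->", "completion": " negative"},
    ]
    with open("train_data.jsonl", "w") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")

    # Upload the file and launch a fine-tuning job on a base GPT-3 model.
    upload = openai.File.create(file=open("train_data.jsonl", "rb"), purpose="fine-tune")
    job = openai.FineTune.create(training_file=upload.id, model="davinci")
    print(job.id)

In practice a real fine-tuning set would contain many more examples, but far fewer than a traditional NLP model would need.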

Closer to home, while GPT-2's training data was only in English, GPT-3 expanded the scope of its training data set to include other languages, although these represent only 7% of the total data. Despite this limited share of non-English data, GPT-3 performs surprisingly well in smaller languages such as Catalan, yielding outstanding results such as the following:

  • Question to GPT-3: Who is the most famous author of Catalan poetry?
  • GPT-3's answer: The most famous Catalan poet is probably Salvador Espriu.

At the same time, in 2020 the Generalitat de Catalunya launched AINA, a €13.5M project whose main objective is the creation and expansion of a corpus of the Catalan language in order to obtain language models that cover its different variants and registers.

GPT-3 is undoubtedly an impressive achievement in technology and software engineering. Even so, it carries risks and limitations that we should all be aware of. First, GPT-3 can create text so realistic that it could lead to an increase in spam and fake content; we should probably have some kind of regulation that forces content creators to state whether they use these types of models. Second, GPT-3 is what is called a "black box" model (we cannot explain why it produces a given output), and we should develop explainability tools to better understand how it makes predictions. Third, the cost of training GPT-3 from scratch runs into the millions of dollars, which means few companies have the means to train, maintain, and run these types of models. This could lead to a concentration of powerful AI capabilities in the hands of a few giant entities. Finally, training GPT-3 or other large NLP models is computationally expensive and therefore resource intensive. According to a recent study by Google and the University of California, Berkeley, training GPT-3 produced as many tons of carbon dioxide as 120 passenger cars emit in a year. We must be aware of the environmental impact of these models and ensure that we use green energy in their training and operation.

The public release of GPT-3 as an API was just a few months ago, and we’re already seeing developers using it in an impressive variety of applications. We are confident that the next generations of language models will be even more impactful and have the potential to transform natural language-based business processes and increase human capabilities.
