We will probably all remember the impact of ChatGPT's launch. It was late November 2022, and the news and reactions about it were constant. Its spread was so extraordinary that the OpenAI engineers who created the model were themselves surprised by the coverage and attention it received[1].
ChatGPT gained one million users in five days[2] and 100 million in two months, becoming the fastest-growing internet application in history[3]. During that December, ChatGPT was widely credited with unprecedented, innovative and powerful capabilities. The New York Times called it “the best artificial intelligence chatbot ever released to the general public”[4]. The Atlantic named it one of the most important advances of the year, highlighting how “the eruption of generative AI can change the way we think about how we work, how we reason, and what human creativity is”[5]. Vox noted: “ChatGPT is the general public’s first practical introduction to how powerful modern AI has become, resulting in widespread amazement”[6]. Paul Graham, co-founder of the startup accelerator Y Combinator, tweeted: “The most surprising thing about the global reaction to ChatGPT is not just the number of people who are impressed, but who they are: people who don’t get excited about every cutting-edge innovation. Something really big is happening”[7].
ChatGPT is a generative artificial intelligence chatbot based on a Large Language Model (LLM). It generates human-like conversational responses and lets users steer a conversation towards a desired length, format, style, level of detail and language[8]. Since its appearance, other LLM-based chatbots have emerged, such as Gemini, Claude, Llama, Ernie, Grok, DeepSeek and Qwen2.5[9], and a large part of society has integrated these applications into daily life, whether for work, home or education. People use LLMs for a wide range of purposes; probably the main one is obtaining information quickly, as a useful alternative to traditional search engines, since an LLM can provide instant, specific answers without the need to navigate multiple links, advertising and superfluous text[10]. Other common uses are writing assistance, help with programming, text translation, educational support, idea generation, character and dialogue creation, content recommendation, or simply companionship and entertainment[11].
The use of LLMs such as ChatGPT may be limited or suboptimal in some of these applications, for several reasons. On the one hand, these models generate content based on the knowledge they acquire during training on large volumes of data. The ethical and practical implications of this training must be considered: the use of personal data and protected content raises privacy and intellectual-property issues[12], while the need for high-quality data may favor a monopoly by large technology companies[13]. Moreover, when asked for specific facts, these models generate the most probable answer and can make errors known as “hallucinations”, producing incorrect or invented information[14]. This can be especially problematic in fields that require precision, such as academic research or factual writing. In a more domestic context, for example, if we ask for a cooking recipe based on the ingredients we have in the fridge, the generated recipes may seem correct in theory but not work in practice[15]. On the other hand, generalist LLMs are trained on a wide and diverse range of content from many sources, including websites, books, scientific literature, the press, portals such as Wikipedia, code repositories, social-media content and legal documents[16]. For the models to be able to “remember” all this knowledge, the neural networks that form them must be very large, so that they can store the information in their parameters, called weights. Therefore, if we want an LLM to answer questions about literature, cooking or engineering, and to do so in multiple languages, the model will necessarily be very large. GPT-4 is estimated to have 1.8 trillion parameters[17], which would imply an infrastructure with some 3,600 GB of memory to deploy a single instance of the model[18].
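That memory figure follows from simple arithmetic: stored at 16 bits (2 bytes) per parameter, the weights alone occupy parameters × 2 bytes. A minimal sketch of the calculation, assuming fp16/bf16 weights and ignoring the extra memory needed at inference time for activations and the KV cache:

```python
def model_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Rough memory footprint, in GB, of storing the model weights alone.

    bytes_per_param = 2 assumes 16-bit (fp16/bf16) weights; real
    deployments need additional memory for activations and caches.
    """
    return num_params * bytes_per_param / 1e9

# GPT-4's estimated 1.8 trillion parameters at 2 bytes each:
print(model_memory_gb(1.8e12))  # 3600.0 GB, matching the figure above
```

The same function also shows why quantization matters: dropping to 4-bit weights (`bytes_per_param = 0.5`) cuts the footprint fourfold.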
However, if we think about more specific applications that require deep but narrower expert knowledge, the so-called Small Language Models (SLMs) come into play. These are artificial intelligence models designed to process and generate natural language with significantly fewer parameters than LLMs[19]. SLMs excel at specialized tasks, with faster performance and lower power consumption, and researchers are increasingly focused on developing more sophisticated SLMs that balance performance, efficiency and computational constraints, making them attractive for practical, real-world AI implementations[20]. Furthermore, beyond their lower computational requirements for training and deployment, the availability of open models allows developers and researchers to adapt and improve them for specific needs, in contrast to closed commercial models, accessible only via API, which often limit the ability to adapt and customize.
Beyond retaining information, a perhaps even more relevant factor is the ability to understand and reason. These capabilities make it possible to build agents based on LLMs (or SLMs) that can use tools: querying search engines for information retrieval, interacting with APIs to access external services or databases, reading and writing files, performing complex mathematical operations, or calling other linked models[21].
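The tool-use pattern behind such agents can be sketched in a few lines: the model emits a structured action naming a tool and its input, and the runtime executes the matching function and returns the result. The sketch below is purely illustrative; the `mock_model` function stands in for a real LLM/SLM call, and the `calculator` tool is a hypothetical example, not part of any particular agent framework.

```python
import json

def calculator(expression: str) -> str:
    # Hypothetical tool: evaluate a simple arithmetic expression
    # with builtins disabled to limit what eval can touch.
    return str(eval(expression, {"__builtins__": {}}, {}))

TOOLS = {"calculator": calculator}

def mock_model(prompt: str) -> str:
    # Stand-in for a real model call: a real agent would ask an
    # LLM/SLM to decide which tool to invoke and with what input.
    return json.dumps({"tool": "calculator", "input": "21 * 2"})

def run_agent(prompt: str) -> str:
    # Parse the model's structured action, dispatch to the tool,
    # and hand the result back (a real agent would loop here).
    action = json.loads(mock_model(prompt))
    result = TOOLS[action["tool"]](action["input"])
    return f"Tool '{action['tool']}' returned: {result}"

print(run_agent("What is 21 times 2?"))  # Tool 'calculator' returned: 42
```

Real agent runtimes repeat this dispatch in a loop, feeding each tool result back to the model until it produces a final answer.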
In this context of constant growth and evolution of language models, it is important that we consider adopting and adapting small language models when developing tools based on natural language. SLMs offer us the opportunity to customize solutions with lower resource consumption, allowing us to develop more efficient applications adapted to our specific needs. Working with these models forces us to stay attentive to the state of the art and to new trends, but it gives us an adaptability that grants greater autonomy and allows us to add significant value to our technological solutions. In this regard, it should also be remembered that the BSC, within the framework of the Aina project, has developed the Salamandra models, with 2 and 7 billion parameters[22], specially trained on Catalan-language content. We must jointly promote these initiatives to remain autonomous and competitive.
References:
[1] https://www.technologyreview.com/2023/03/03/1069311/inside-story-oral-history-how-chatgpt-built-openai/
[2] https://www.euronews.com/next/2023/11/30/chatgpt-a-year-on-3-ways-the-ai-chatbot-has-completely-changed-the-world-in-12-months
[3] https://www.theguardian.com/technology/2023/feb/02/chatgpt-100-million-users-open-ai-fastest-growing-app
[4] https://www.nytimes.com/2022/12/05/technology/chatgpt-ai-twitter.html
[5] https://www.theatlantic.com/newsletters/archive/2022/12/technology-medicine-law-ai-10-breakthroughs-2022/672390/
[6] https://www.vox.com/future-perfect/2022/12/15/23509014/chatgpt-artificial-intelligence-openai-language-models-ai-risk-google
[7] https://x.com/paulg/status/1598698665337561088
[8] https://en.wikipedia.org/wiki/ChatGPT
[9] https://www.nature.com/articles/s42256-023-00655-z https://arxiv.org/abs/2412.19437 https://arxiv.org/abs/2412.15115
[10] https://www.pocket-lint.com/ways-people-are-using-chatgpt/
[11] https://www.pocket-lint.com/ways-people-are-using-chatgpt/
[12] https://www.businessinsider.com/meta-ai-chatbot-says-trained-on-youtube-videos
[13] https://www.unite.ai/the-ai-monopoly-how-big-tech-controls-data-and-innovation/
[14] https://www.uoc.edu/ca/news/2024/001-errors-chat-gpt
[15] https://www.ara.cat/media/chatgpt-et-dona-respostes-erronies-aixi-sd-utilizzar_1_4772087.html
[16] https://oxylabs.io/blog/llm-training-data
[17] https://semianalysis.com/2023/07/10/gpt-4-architecture-infrastructure/
[18] https://blog.spheron.network/how-much-gpu-memory-is-required-to-run-a-large-language-model-find-out-here
[19] https://www.ibm.com/think/topics/small-language-models
[20] https://www.superannotate.com/blog/small-language-models
[21] https://arxiv.org/html/2402.06196v1 https://fabrity.com/blog/llm-agents-the-next-big-thing-for-genai/
[22] https://langtech-bsc.gitbook.io/aina-kit/models/text-models