Presentation High Impact Project "Smart Tools for Summarising and Generating Audiovisual Content in News"

In an era where information is abundant and news is consumed via multimedia platforms, there is a greater need than ever for efficient summarising tools.

Led by Eurecat with the support of the Computer Vision Center and the Polytechnic University of Catalonia, this transformational project at the intersection of technology and journalism introduces cutting-edge automated tools designed for end-to-end synthesis of video content and dynamic illustration, all driven by the power of deep learning methods.

This project goes beyond simple summarisation, as it recognises the critical role of illustration in enhancing comprehension. We aim to harness advanced deep learning techniques to dynamically generate visual representations which encapsulate the key elements of news stories.

The results of the project enhance the consumer’s experience and also allow content creators to convey complex information with clarity and impact. In short, this presentation will show the results of the creation of innovative tools for new forms of storytelling, revolutionising the way we interact with news in the digital age.

In this session, CIDAI in conjunction with 3Cat will showcase the results and knowledge acquired during the implementation of one of the High Impact Projects in which multimodal generative AI tools have been used.

Programme

9.30 am Registration.

Introduction to the event: Marco Orellana, Manager of CIDAI.

10 am Welcome.
Mr Joan Mas i Albaigès, Director of CIDAI and head of the Digital Area at Eurecat.
Mr Medir Plandolit, Digital News Editor at 3Cat.

10.15 am High Impact Project – Smart Tools for Summarising and Generating Audiovisual Content in News.

Rafael Redondo Tejedo, head of the Image Line in the Multimedia Technologies Unit at Eurecat.
Automatic video summarisation is a tool which can help, for example, in spreading news or increasing the productivity of editing and archiving audiovisual content. It is a complex challenge when addressed at the image level.
However, in the case of news and interviews, the semantic information is mainly in the narration. To meet this challenge, speech transcription models that provide word-level time codes have been combined with large language models which can generate extractive textual summaries.
A graphical interface has been developed for user interaction. It lets the user choose how many of the most significant sentences appear in the summary, add sentences manually using keywords and, optionally, have the most significant segments of a sentence selected automatically, which provides a second level of compression. It also allows silences of a specified length to be deleted to make the final summary video more dynamic.
Development based on open source software is a key part of this project.
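The silence-removal step described above can be illustrated with a minimal sketch. It assumes the transcription yields one start and end time per word, and that a "silence" is any gap between consecutive words longer than a chosen threshold. The names `Word`, `keep_segments` and `max_silence` are hypothetical and are not taken from the project's software.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the video
    end: float


def keep_segments(words, max_silence):
    """Return (start, end) spans to keep, cutting every gap longer than max_silence.

    Assumption: words are sorted by start time and do not overlap.
    """
    segments = []
    for w in words:
        if segments and w.start - segments[-1][1] <= max_silence:
            # Short gap: extend the current segment up to the end of this word.
            segments[-1] = (segments[-1][0], w.end)
        else:
            # First word, or a long silence: start a new segment.
            segments.append((w.start, w.end))
    return segments


# Illustrative data only, not real transcription output.
words = [
    Word("Good", 0.0, 0.3),
    Word("evening", 0.35, 0.8),
    Word("today", 2.5, 2.9),
    Word("we", 3.0, 3.1),
]
segments = keep_segments(words, max_silence=1.0)
print(segments)
```

In this sketch the 1.7-second pause between "evening" and "today" is dropped, while the shorter gaps are kept so that speech inside each segment stays natural. The resulting spans could then be passed to a video editing tool to cut the final clip.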

Bogdan Raducanu, Senior Researcher/Project Director at CVC.
Generative graphic news narrative: the purpose of this project is to develop a smart system that automatically generates images to accompany the news narration, tailored to the overall content and tone of the text.
The method uses an LLM to summarise the news text and a diffusion model to generate the image from the summary. The tool has an extremely versatile graphical interface to ensure that the end result meets the requirements of the professional user.

José Adrián Rodriguez Fonollosa, Director of the TALP Research Centre at the UPC.
To extract a summary of a video from its transcription, we first studied the performance of current speech-to-text conversion systems in terms of quality and speed in Catalan and bilingual documents. We have also analysed the accuracy with which they are able to indicate the start and end time of each sentence.
We have further developed a system for generating extractive summaries that assigns a relevance index to each sentence with the help of a Large Language Model (LLM) and/or keywords. This makes it possible to flexibly and quickly adjust the length of the summary according to the number of sentences or time duration of the resulting video.
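As a rough illustration of how a per-sentence relevance index allows the summary length to be adjusted, the sketch below picks the highest-scoring sentences under either a sentence-count limit or a time limit, then restores chronological order. It is only a sketch under stated assumptions: the relevance scores are taken as given (in the project they would come from an LLM and/or keywords), and `Sentence`, `select_sentences`, `max_sentences` and `max_duration` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    text: str
    start: float      # seconds
    end: float        # seconds
    relevance: float  # assumed to be precomputed (LLM and/or keywords)


def select_sentences(sentences, max_sentences=None, max_duration=None):
    """Greedily keep the most relevant sentences that fit the given limits."""
    chosen = []
    total = 0.0
    for s in sorted(sentences, key=lambda s: s.relevance, reverse=True):
        if max_sentences is not None and len(chosen) >= max_sentences:
            break
        duration = s.end - s.start
        if max_duration is not None and total + duration > max_duration:
            continue  # too long for the remaining budget; try a shorter one
        chosen.append(s)
        total += duration
    # Play the selected sentences in their original order.
    return sorted(chosen, key=lambda s: s.start)


# Illustrative data only.
sentences = [
    Sentence("A", 0.0, 4.0, 0.9),
    Sentence("B", 4.0, 10.0, 0.2),
    Sentence("C", 10.0, 13.0, 0.7),
    Sentence("D", 13.0, 20.0, 0.8),
]
by_count = [s.text for s in select_sentences(sentences, max_sentences=2)]
by_time = [s.text for s in select_sentences(sentences, max_duration=8.0)]
print(by_count, by_time)
```

Because the scores are computed once, changing the target length only requires re-running the selection, which is what makes the adjustment quick. A greedy choice is used here for simplicity; other selection strategies are possible.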

Rafael Bermudez Guijo, Head of Technological Research Projects at 3Cat.
Assessment, applications and conclusions of high-impact projects by 3Cat.

11.45 am Institutional closure.
Mr Lluís Juncà, Director General of Innovation and Digital Economy of the Government of Catalonia.

11.50 am Coffee & Networking.

12.15 pm End of the event.