Presentation of the Impact Project "Intelligent assistance in contextualized description of photos for news"

The project, led by Eurecat in collaboration with 3Cat, once again sets out to build new journalistic assistant tools, in this case for photo description, exploring the unprecedented capabilities of current multimodal language models.

Describing photos in the media is often a routine task, far removed from the intellectual work involved in writing a news story. Captions usually follow editorial criteria, whereas photo descriptions must describe the image in purely visual terms. These descriptions, also called “alternative texts”, are stored in non-visible fields of digital formats such as web pages, and they are often missing because of various factors in current workflows. As a result, people with accessibility limitations cannot be informed in the same way as other readers.

New tools based on multimodal generative artificial intelligence models, optimized for describing photographs for informational purposes, can serve as a journalistic assistant that improves the quality of information, both in the writing of photo captions and in visual descriptions for accessibility purposes.

In this session, the CIDAI, in collaboration with 3Cat, will present the results and knowledge gained during one of the High Impact Projects in which multimodal generative AI tools have been used.

Programme

09:15h Registration

Presenting the event: Mr. Marco Orellana, manager of the CIDAI.

10:00h Welcome

Mr. Joan Mas i Albaigès, Director of CIDAI and Digital Scientific Director at Eurecat
Mrs. Rosa Romà Monfà, President of the Catalan Audiovisual Media Corporation – 3Cat
Ms. Maria Galindo, Secretary of Digital Policies in the Government of Catalonia

10:15h Impact Project: Intelligent assistance in contextualized description of photos for news

10:15h Mr. Medir Plandolit, Head of Digital News, 3Cat

Society needs quality information to guarantee a full democracy, but disaffection is growing: two out of three Catalans are not satisfied with the democratic system. Consumption habits have changed, and social networks are now the main source of information. This makes it difficult to identify reliable information and weakens the connection with traditional media. In this context, AI must be a tool at the service of journalism, freeing professionals from repetitive tasks so they can focus on work of greater value.

10:30h Mr. Rafael Redondo, Head of Image Line, Multimedia Technologies Unit of Eurecat

Multimodal models (MVMs) can handle both image and language processing tasks by integrating visual and textual data. In this way, they can produce coherent descriptions or summaries from different types of input. Guided by prompts and chains of thought, such as specific questions or the broader context of a news article, these models can steer image captioning and description more effectively, supporting the development of new journalistic assistants.

11:00h Project evaluation: Intelligent assistance in contextualized description of photos for news

Mr. Rafa Bermúdez, Research and Cross-cutting Services, 3Cat

Evaluating tools based on artificial intelligence is particularly complex, especially because of their stochastic nature. Rigorous procedures must be defined to reduce distractions that could affect the results, while at the same time avoiding the introduction of biases. For this reason, the evaluation combines methodologies such as directed tests, personal interviews, and the analysis and visualization of results, which allow the most relevant information to be distilled in order to assess the effectiveness and usefulness of the tools.
These techniques have made it possible to identify the most interesting aspects of the proposal and to outline the most promising directions for the tool's further development.

11:30h Closing
11:35h Networking Coffee
12:00h End of the event
