May 1 | 4 pm CET | 10 am GMT-4

PDFs – When a Thousand Words Are Worth More Than a Picture (or Table)

This talk explores the challenges of using PDFs in Retrieval-Augmented Generation (RAG) systems and how multimodal models can enhance text retrieval. Discover how to overcome the semantic limitations of PDFs and learn techniques to improve the quality of extracted data for vector databases.

Date

May 1, 2025

Time

4 pm CET | 10 am EST | 7:30 pm IST | 5 pm KSA

Location

Online – Zoom

Language

English

Duration

1 hour

Transform How Your Models Understand PDFs!

In this hands-on, insight-packed session, you’ll explore how multimodal models can go beyond basic parsing and reverse engineering—breaking down complex tables and figures into meaningful, question-ready chunks. You’ll learn how to craft better input for vector databases and bring robustness to your retrieval layer, the foundation of any Retrieval-Augmented Generation (RAG) system.

The webinar will cover:

  1. Understanding the semantic bottlenecks of PDFs and their impact on retrieval
  2. Reviewing chunk formats for vector database ingestion
  3. Using multimodal models to decompose tables into structured text
  4. Experimenting with retrieval quality using different extraction techniques
  5. Applying the same decomposition approach to figures for better knowledge retrieval

Who should attend?

This talk is ideal for AI practitioners, data scientists, and engineers working with RAG systems and document processing.

During this interactive session, you’ll learn how to enhance retrieval by leveraging multimodal models, improving PDF parsing strategies, and structuring extracted knowledge for better AI-driven insights.

Speaker

Meet the Computer Science and AI expert who will share his knowledge and expertise with you during AI Learning Week.

Caio Benatti Moretti

Caio holds a PhD in Computer Science and has worked as a data scientist in both academia and industry since 2014. Currently a Data Science Consultant at Xebia, he created SlackGPT and is particularly keen on neural networks in their many forms and applications. His enthusiasm even led him to make a neural network fit inside a business card. Caio has been giving seminars on how to empower businesses with LLMs, from use cases to technical tooling. He focuses on how LLMs can augment human productivity and thereby help businesses leverage novel technologies to achieve their goals.

AI Learning Week Agenda

This year’s Academy open week will be held under the banner of AI.
This is a unique opportunity to develop your AI competencies and enter the Xebia Academy world.

We have prepared two tracks:

  • Base Track is for anyone who wants to use AI to the best effect for themselves and their organization.
  • Tech Track is designed for professionals who wish to explore prompt engineering, AI coding, and more.

April 28 | BASE TRACK

AI Strategies for Leaders: Ethics, Governance, and Business Success

Steven van Duin

Analytics Educator

April 29 | TECH TRACK

Practical GenAI: Building LLM-powered Applications

Jeroen Overschie

Machine Learning Engineer

April 30 | BASE TRACK

Prompt Engineering Made Easy: Quick Tips to Get More out of ChatGPT

Lysanne van Beek

Data Science Educator

May 1 | TECH TRACK

PDFs – When a Thousand Words Are Worth More Than a Picture (or Table)

Caio Benatti Moretti

AI consultant | PhD

Reserve your spot

Sign up for the PDFs in RAG webinar. If you are interested in other AI Learning Week webinars, remember to register for each one separately.
