🍰

Six AI recipes for the modern journalist

Rui Barros

AI Recipes

Rui Barros

  • Senior Data Journalist at Público
  • ruibarros.me
  • rui.barros@publico.pt

Slides and code

QR Code

AI in a Newsroom


What a lot of people don't get about LLMs

1.

It's an API 🥂
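Not code from the talk, but a minimal sketch of what "it's an API" means in practice: a plain HTTP request, here assuming an OpenAI-style chat completions endpoint and an illustrative model name and prompt.

```python
# Sketch only: calling an LLM is just another HTTP API call.
# Assumes an OpenAI-style endpoint and an API key in the environment.
import os
import requests

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-4o-mini",  # illustrative model name
        "messages": [{"role": "user", "content": "Summarise this council meeting transcript: ..."}],
    },
)
print(response.json()["choices"][0]["message"]["content"])
```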

2.

AI can have personality and be personalized

Data journalism expert

You are an expert data journalist with deep knowledge of investigative reporting, data analysis, visualization, and storytelling. You've studied the methodologies of renowned data journalists like Amanda Cox, Alberto Cairo, Sarah Cohen, and teams at outlets like ProPublica, The Upshot, and FiveThirtyEight. Your role is to mentor an aspiring data journalist by providing practical, actionable advice on their projects and career development...

"Regular journalism" expert

You are an experienced editor with 15+ years in publishing across magazines, newspapers, and digital media. You've worked at both legacy publications and modern content platforms, understanding how editorial standards have evolved while maintaining core principles of clarity, accuracy, and reader engagement. Your expertise spans: story structure, pacing, and narrative flow; headline writing and subheading optimization; lead paragraph construction and hook development...
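A rough sketch of how such a persona is wired up: the "personality" is just a system prompt sent with every request. This assumes the OpenAI Python SDK; the model name, helper function, and truncated prompt are illustrative, not code from the talk.

```python
# Sketch: personalization = a reusable system prompt.
from openai import OpenAI

client = OpenAI()

DATA_JOURNALISM_MENTOR = (
    "You are an expert data journalist with deep knowledge of investigative "
    "reporting, data analysis, visualization, and storytelling..."  # truncated
)

def ask_mentor(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative
        messages=[
            {"role": "system", "content": DATA_JOURNALISM_MENTOR},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(ask_mentor("How should I structure a story built on wildfire data?"))
```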

3.

LLMs are all about probabilities
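One way to see this for yourself (a sketch, assuming the OpenAI Python SDK and an illustrative model and prompt): run the same prompt a few times with a non-zero temperature and the answers vary, because each token is sampled from a probability distribution.

```python
# Sketch: same prompt, different answers, because output tokens are sampled.
from openai import OpenAI

client = OpenAI()

for _ in range(3):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative
        temperature=1.0,      # higher = more spread in the sampling
        messages=[{"role": "user", "content": "Suggest a headline about rising rents."}],
    )
    print(response.choices[0].message.content)
```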

OK, but what does this have to do with data journalism?

🍰 6 AI recipes for the modern data journalist

Heavily inspired by Building Effective Agents by Anthropic

Prompt Chaining

A workflow where the output of one LLM call becomes the input for the next. This sequential design allows for structured reasoning and step-by-step task completion.

Prompt Chaining

  • Generating an investigative findings summary, then adapting it for different publication formats
  • Creating an investigation outline, validating that it meets journalistic standards, checking whether any data is available, and generating a methodology
  • Generating custom tooltips for a map visualization
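A minimal sketch of this recipe, assuming the OpenAI Python SDK; the llm() helper, model name, and prompts are illustrative stand-ins for your own steps.

```python
# Sketch of prompt chaining: the output of one call feeds the next call.
from openai import OpenAI

client = OpenAI()

def llm(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Step 1: summarise the investigation's findings.
findings = llm("Summarise the key findings from these interview notes: ...")

# Step 2: adapt that summary for different publication formats.
newsletter = llm(f"Rewrite this summary as a 100-word newsletter item:\n{findings}")
social_post = llm(f"Turn this summary into a single social media post:\n{findings}")

print(newsletter)
print(social_post)
```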

Routing

A workflow where user input is classified and directed to a specific handler (a cheaper model? a specific prompt?). This lets you optimize for each kind of input in isolation.

Routing

  • The newsroom receives a large batch of FOIA-requested documents, but each document type requires a different analysis approach.
  • You are building a claim verification tool. Some claims might need a special service like consensus.app, others need to fetch Eurostat data...
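A sketch of the routing recipe under the same assumptions (OpenAI Python SDK, illustrative model, categories, and prompts): one cheap classification call decides which handler prompt, or which model, gets the document.

```python
# Sketch of routing: classify first, then dispatch to a specialised prompt.
from openai import OpenAI

client = OpenAI()

def llm(prompt: str, model: str = "gpt-4o-mini") -> str:  # model name illustrative
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

HANDLERS = {
    "contract": "Extract the parties, amounts, and deadlines from this contract:\n",
    "email": "Summarise who is writing to whom and what they are asking for:\n",
    "spreadsheet": "Describe the columns and flag anything that looks anomalous:\n",
}

def route(document: str) -> str:
    doc_type = llm(
        "Classify this FOIA document as exactly one word: contract, email, or spreadsheet.\n"
        + document[:2000]
    ).strip().lower()
    handler_prompt = HANDLERS.get(doc_type, "Summarise this document:\n")
    return llm(handler_prompt + document)

print(route("CONTRACT No. 42/2025 between the Municipality of ..."))
```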

Parallelization

The user's prompt is passed to multiple LLMs simultaneously. Once all the LLMs respond, their answers are all sent to a final LLM call to be aggregated for the final answer.

Parallelization

  • Have multiple kinds of "editors" reading your piece and giving feedback;
  • Different "experts" "reading" government contracts and giving feedback on their areas of expertise;
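A sketch of parallelization with the same assumptions (OpenAI Python SDK, illustrative model and editor roles): several "editors" read the draft concurrently, and a final call merges their notes.

```python
# Sketch of parallelization: fan out to several reviewers, then aggregate.
from concurrent.futures import ThreadPoolExecutor
from openai import OpenAI

client = OpenAI()

def llm(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

ARTICLE = "..."  # your draft goes here

EDITORS = [
    "You check facts and sourcing. Review this draft:\n",
    "You check structure and pacing. Review this draft:\n",
    "You check headlines and reader engagement. Review this draft:\n",
]

with ThreadPoolExecutor() as pool:
    reviews = list(pool.map(lambda role: llm(role + ARTICLE), EDITORS))

# Final aggregation call.
print(llm("Merge these editor reviews into one prioritised list of fixes:\n\n" + "\n\n".join(reviews)))
```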

Orchestrator

An LLM breaks the task down into subtasks that are dynamically determined based on the input. These subtasks are then processed in parallel by multiple worker LLMs.

Orchestrator

  • The orchestrator drafts a plan and spins up the number of workers it needs to produce the best methodologies for your story.
  • An orchestrator is instructed to create the number of workers it needs to deliver a full data journalism story on wildfires.
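A sketch of the orchestrator-workers recipe, same assumptions as above (OpenAI Python SDK, illustrative model and prompts): one call plans the subtasks, worker calls run them in parallel, and a final call assembles the result.

```python
# Sketch of orchestrator-workers: plan subtasks, run workers, combine.
import json
from concurrent.futures import ThreadPoolExecutor
from openai import OpenAI

client = OpenAI()

def llm(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

plan = llm(
    "Plan a data journalism story on wildfires. "
    "Return only a JSON list of self-contained research subtasks."
)
subtasks = json.loads(plan)  # in practice, validate and retry if this isn't valid JSON

with ThreadPoolExecutor() as pool:
    notes = list(pool.map(lambda task: llm(f"Carry out this research subtask: {task}"), subtasks))

print(llm("Combine these research notes into a story memo:\n\n" + "\n\n".join(notes)))
```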

Evaluator-optimizer

An LLM performs a task, followed by a second LLM evaluating whether the result satisfies all specified criteria; if it doesn't, the feedback goes back for another attempt.

Evaluator-optimizer

  • Build a scraper that collects data from multiple sources. You know the output you want, but the data is so messy that the model will need to "adapt"
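A sketch of that scraper case, same assumptions (OpenAI Python SDK, illustrative model, schema, and record): one call cleans a scraped record, a second call checks it against the target schema, and the critique feeds the next attempt.

```python
# Sketch of evaluator-optimizer: generate, evaluate, feed the critique back.
from openai import OpenAI

client = OpenAI()

def llm(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

raw_record = "<td>Câmara de Lisboa</td><td>12.300,50€</td><td>3/Jan/24</td>"  # illustrative
feedback = ""

for _ in range(3):  # cap the loop so it always terminates
    cleaned = llm(
        "Convert this scraped record to JSON with keys entity, amount_eur, date (ISO 8601).\n"
        f"Record: {raw_record}\nPrevious feedback: {feedback or 'none'}"
    )
    verdict = llm(
        "Does this JSON have exactly the keys entity, amount_eur, date, with a numeric amount "
        f"and an ISO 8601 date? Answer PASS, or explain what is wrong.\n{cleaned}"
    )
    if verdict.strip().upper().startswith("PASS"):
        break
    feedback = verdict

print(cleaned)
```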

Autonomous Agent

LLMs act autonomously within a loop, interacting with their environment and receiving feedback to refine their actions and decisions.
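A sketch of such a loop, assuming the OpenAI Python SDK's tool-calling interface; the search_archive tool, model name, and prompts are illustrative. The model decides which tool to call, sees the result, and keeps going until it can answer.

```python
# Sketch of an autonomous agent: a loop of model -> tool call -> tool result.
import json
from openai import OpenAI

client = OpenAI()

def search_archive(query: str) -> str:
    """Hypothetical tool: search the newsroom archive."""
    return f"3 past stories mention '{query}'"

TOOLS = [{
    "type": "function",
    "function": {
        "name": "search_archive",
        "description": "Search the newsroom archive",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [{"role": "user", "content": "Have we covered housing evictions in Porto before?"}]

for _ in range(5):  # cap the loop
    response = client.chat.completions.create(
        model="gpt-4o-mini", messages=messages, tools=TOOLS  # model name illustrative
    )
    message = response.choices[0].message
    messages.append(message)
    if not message.tool_calls:       # no tool requested: this is the final answer
        print(message.content)
        break
    for call in message.tool_calls:  # run each requested tool and return its result
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": search_archive(**args),
        })
```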

Thank you!