
How to Find Hidden APIs Using AI

Discover how AI can find and document hidden APIs in government portals and data websites in minutes instead of hours. Step-by-step tutorial using Chrome DevTools MCP with real examples.

Claude discovering hidden APIs in a digital garden. Artwork: Anthropic

Two things can turn my daily work into a nightmare: PDF files and public data trapped inside portals. Every data journalist learns to extract data from both, but it always feels like wasted time.

For PDFs, we have Tabula and AI-powered OCR (I have been using olmOCR recently, but the hot new model is DeepSeek OCR; I might write something about it soon!). Data portals usually require bringing the big guns, and suddenly web scraping becomes an art in itself.

Getting data from public institutions that hide it in fancy PowerBIs or paginated tables without a download button involves as much art as skill. The usual approaches:

  • Build a robot that scrapes data from rendered HTML
  • Use browser automation (Playwright, Selenium) to simulate a user
  • Peek into browser dev tools and find the API calls

With everyone using APIs, the latter has become my starting point. Finding hidden APIs (Leon Yin has a great resource about this) is kind of pleasant: you’re about to save some hours and, c’mon, you gotta feel a bit like a hacker.

The problem: APIs are sometimes hard to spot, and they’re usually undocumented. You need to reverse engineer what was going on in their developer’s mind. This can take hours.

With AI and coding agents increasingly part of my workflow, I’ve been asking for help with this task more often. AI, when given access to browser developer tools, can try things over and over and turn its findings into the documentation you wish you’d always had.

In this post, I’ll show you how.

What You Need to Know About APIs

Think of a data portal like a restaurant. Loading all 1,000+ records at once would be slow (or impossible). Instead, portals serve part of the data and let you request more as needed.

You (the client) interact with the page. That triggers a request to the API (the waitress), which fetches data from the server (the kitchen) and serves it back.

Here’s a practical example. There are 1,025 Pokémon. Loading all of them at once would crash most browsers. Instead, PokéAPI lets you request them one at a time:

https://pokeapi.co/api/v2/pokemon/1   # Bulbasaur
https://pokeapi.co/api/v2/pokemon/6   # Charizard
https://pokeapi.co/api/v2/pokemon/25  # Pikachu

Try clicking one of those links. Even if the JSON response looks intimidating at first, you’ll recognize some of the data. This is a REST API in action: change the number, get a different Pokémon.

This is what every data portal does under the hood—they just don’t tell you about it.
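
If you’d rather make the same request from code, here’s a minimal R sketch using the httr and jsonlite packages (both assumed to be installed). It fetches Pikachu and then a page of the paginated list endpoint:

library(httr)
library(jsonlite)

# Fetch a single Pokémon: the same request as clicking the link above
res <- GET("https://pokeapi.co/api/v2/pokemon/25")
pikachu <- fromJSON(content(res, as = "text", encoding = "UTF-8"))
pikachu$name  # "pikachu"

# The list endpoint is paginated: ask for 20 records starting at offset 0
page <- GET("https://pokeapi.co/api/v2/pokemon",
            query = list(limit = 20, offset = 0))
fromJSON(content(page, as = "text", encoding = "UTF-8"))$results$name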

The Problem: Undocumented APIs

PokéAPI has full documentation. Developers tell you exactly what endpoints exist and how to use them.

Government portals? Not so much. The APIs exist, but if you want that data, you’re on your own.

My usual web scraping process: open Chrome DevTools (right-click > Inspect), click the Network tab, interact with the portal, and spot patterns in the XHR/Fetch requests. Then manually document each endpoint, test parameters, and figure out pagination.

I’ve spent entire afternoons doing this. It works, but it’s tedious.
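
To give you an idea of what that manual step looks like: once you spot a promising request in the Network tab, you replay it yourself and poke at the parameters until the pattern is clear. A minimal R sketch of that step (the URL and parameters here are made up for illustration, not a real portal):

library(httr)
library(jsonlite)

# Hypothetical endpoint spotted in the Network tab (illustrative only)
res <- GET(
  "https://portal.example.gov/api/records",
  query = list(page = 2, pageSize = 50, from = "2024-01-01")
)
stop_for_status(res)

# Does changing page or pageSize behave the way you expect?
records <- fromJSON(content(res, as = "text", encoding = "UTF-8"))
str(records)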

AI That Can Use a Browser

If you’ve followed me, you know I’m neither an AI doomer nor evangelist. But AI has been changing how I work, especially with the rise of MCP (Model Context Protocol).

MCPs bridge AI systems with other apps and services. One I use constantly: Chrome DevTools MCP—it lets AI interact with Chrome and use dev tools.

Instead of manually clicking around, AI can:

  • Open any portal URL
  • Interact with filters and search boxes
  • Monitor all network traffic
  • Identify API endpoints
  • Document parameters and responses
  • Write example code

It’s like having a junior developer who’s really good at reverse engineering APIs and never gets bored.

Setup: Claude Code

To do this in the most efficient way, I use Claude Code, a terminal-based Claude that supports custom slash commands and is perfect for creating reusable data journalism workflows. I know the terminal can feel scary, but if you use Claude with a paid subscription, I really encourage you to give it a try. (In a first version of this tutorial, I wrote about how to do this in the Claude app, but the post felt too complex, so I ended up only covering Claude Code. Let me know if you need a Claude app version of it!)

If you are using any other tool, I’d recommend checking out this link.

What You’ll Need

Before we start, make sure you have:

  • Claude Pro subscription ($20/month) - Required for Claude Code access
  • Node.js (v16 or later) - Download from nodejs.org
  • Chrome browser - Any recent version works
  • Basic terminal comfort - Don’t worry, I’ll walk you through each command

If you’re new to the terminal, don’t let that stop you. The commands are copy-paste friendly, and the time savings make the learning curve worthwhile.

1. Install Claude Code

If you have Node.js installed:

npm install -g @anthropic-ai/claude-code

On macOS with Homebrew:

brew install claude-code

After installation, authenticate:

claude-code auth

2. Install Chrome DevTools MCP Server

With Claude Code configured, add the MCP server:

claude mcp add chrome-devtools npx chrome-devtools-mcp@latest

3. Verify It’s Working

Run claude and try this:

Use chrome dev tools, visit this link and tell me if there's a message for me there. Check the console!
https://ruibarros.me/blog/finding-hidden-apis-using-ai

If configured correctly, you’ll see Chrome open and navigate to my blog.

That’s it. You’re ready!

Creating a Reusable Slash Command

Instead of pasting the same prompt every time, let’s create a slash command you can reuse on any portal.

Claude Code lets you create custom commands in .claude/commands/. Here’s how:

1. Create the command file

mkdir -p .claude/commands
touch .claude/commands/discover-api.md

2. Add this to discover-api.md

Navigate to .claude/commands (if you don’t enjoy using your terminal, remember to enable hidden folders: Cmd+Shift+. on macOS). Open discover-api.md in the code editor of your choice and paste the following.

---
description: Reverse engineer and document hidden APIs in any web portal
---

I need you to reverse engineer the API for {{url}}.

Here's what I need you to do:

1. Open the URL in Chrome using Chrome DevTools MCP
2. Take a snapshot of the page structure
3. Interact with search filters, date pickers, pagination controls
4. Monitor all network requests while you interact
5. Identify the API endpoints being called
6. For each endpoint, document:
   - Full URL pattern
   - HTTP method (GET, POST, etc.)
   - Required parameters
   - Optional parameters
   - Response structure
   - Pagination mechanism (if any)
7. Create example requests using curl and {{language:R}}

Focus on endpoints that return data (JSON). Ignore assets like images and css.

When you're done, create two files:

1. `API_DOCUMENTATION.md` - Clean markdown documentation
2. Working code examples in {{language:R}} using appropriate HTTP libraries

Be thorough and document edge cases.

3. Use it

Now you can discover APIs with one command.

claude

# In the Claude Code session:
/discover-api url=https://airindex.eea.europa.eu/AQI/index.html

# Or specify your preferred language:
/discover-api url=https://airindex.eea.europa.eu/AQI/index.html language=Python

The {{language:R}} syntax sets R as the default, but you can override it with any language you prefer (Python, JavaScript, Ruby, etc.).

Real Example: European Air Quality Index

Let me show you this with the European Air Quality Index website, the public site with an air quality map for all of Europe.

I gave Claude the prompt above and watched it work.

What happened:

  1. Opened the portal in Chrome
  2. Took a snapshot of the search interface
  3. Tried different search terms and filters
  4. Monitored Network tab for API calls
  5. Identified the main search endpoint
  6. Tested parameters to understand the API
  7. Generated documentation and code examples

In the end, here’s what Claude produced, in its own words:

Here's what I discovered and documented:

  Key API Endpoints Found:

  1. Station Metadata - Complete list of all monitoring stations across Europe with coordinates,
  types, and classifications
  2. Hourly AQI Data - Time-stamped air quality index values for all stations (±47 hours historical,
  +48 hours forecast)
  3. Location Search - Geocoding service to find cities and places
  4. Modelled Air Quality Maps - ArcGIS Image Service providing raster tiles of predicted air quality
  5. Geographic Boundaries - GeoJSON data for clickable country/region polygons
  6. Content Index - Version tracking for data updates

  Files Created:

  API_DOCUMENTATION.md

  Complete technical documentation including:
  - All endpoint URLs and parameters
  - Request/response formats with examples
  - AQI level definitions and color schemes
  - Data update frequency
  - Error handling best practices
  - Common use cases

  eea_aqi_examples.R

  10 working R examples demonstrating:
  1. Fetching all station metadata
  2. Getting current AQI data
  3. Joining stations with AQI values
  4. Searching for cities
  5. Finding nearest stations to coordinates
  6. Creating time series for stations
  7. Comparing AQI across cities
  8. Exporting data to CSV
  9. Statistical analysis by country/type
  10. Accessing modelled map tiles

  The R code includes:
  - Helper functions for API calls with error handling
  - Distance calculations (Haversine formula)
  - Data visualization with ggplot2
  - Export functionality
  - Comprehensive examples with real-world use cases

And even if some things are yet to be found, it’s already a great starting point! You can now do the same with subpages, or with that specific thing you suspect is possible to get programmatically.
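
To give you a feel for the kind of code that comes out of this, here is a stripped-down sketch of the pattern the generated examples follow: a small helper that wraps the request, error handling, and JSON parsing. The endpoint URL below is a placeholder, not the real EEA one; the real URLs live in the generated API_DOCUMENTATION.md.

library(httr)
library(jsonlite)

# Generic helper in the spirit of the generated examples
fetch_json <- function(url, query = list()) {
  res <- GET(url, query = query, user_agent("data-journalism-research"))
  stop_for_status(res)
  fromJSON(content(res, as = "text", encoding = "UTF-8"))
}

# Placeholder endpoint for illustration, not the real EEA URL
stations <- fetch_json("https://example.europa.eu/aqi/stations.json")
head(stations)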

When API Discovery Works (And When It Doesn’t)

It goes without saying that with great power comes great responsibility. And this command is not magic.

It won’t work if:

  • The portal requires authentication you don’t have
  • APIs are behind CAPTCHA or bot detection
  • Data is server-side rendered (no API calls)
  • The site actively blocks automation

In those cases, you’re back to traditional scraping with Playwright or Selenium. But for modern JavaScript portals with hidden APIs? This works.

But the real takeaway is: any portal with a search interface probably has an API underneath. Now you can find it and find a way to extract the data you need from it!

And, needless to say: just because this data should have a download button doesn’t mean you have the right to take a public website down. Respect rate limits, use this on public websites, check terms of service, and consider just asking the public institution for the data. (The Associated Press stylebook has pretty good rules for web scraping for journalists. I really recommend the read.)

Go back to your ideas list and take a look at the ones you never did because you did not know how to get the data. Maybe this little slash command is the help you always needed!

A data journalism marketplace

I’m building a collection of these workflows for common data journalism tasks. The idea is that you can install commands like this easily, but also share your own commands and tools.

Come build it with me!