Oliver Wehrens
Seasoned Technology Leader. Mentor. Dad.
Oliver Wehrens

How Agents Find Their Own Tools: Agentic Resource Discovery

· Oliver Wehrens - 12 min

TLDR: How do agents find capabilities of web sites? Agentic Resource Discovery (ARD) is a draft spec that lets any domain advertise its AI tools the same way websites advertise themselves to search engines. I built a small demo that discovers a site’s tools four different ways and then actually calls one. You can play with it at adr-demo.4004.fyi.

The problem

More and more websites are offering capabilities. Very often that are MCPs server going with that. But how do agents know about it? If you don’t have out of band communication with the LLM provider, I think there is no way. I by myself waiting for a solution. I have a user case where I want to provide more information in a structure way to agents and I can’t do that on the website, in html for reason. There are discussions out there on how to do that, like llm.txt or such. So far no clear winner.

That’s the question Agentic Resource Discovery tries to answer. It’s a draft spec — v0.9, written by people from Google, Microsoft, and Hugging Face — and it describes how a domain can publish a catalog of its agentic resources (MCP servers, A2A agents, skills, APIs) so an agent can find them instead of being told about them up front.

Announcement from Google, Microsoft. Spec is available at agenticresourcediscovery.org.

I wanted to see if it actually holds together end to end. So I built a demo. It’s live at adr-demo.4004.fyi.

A small example

Say I ask my agent a simple question: “What’s the weather in Berlin?”

Today that only works if I already did the dance — found a weather MCP server, pasted its URL into my config, restarted (out of band communication). If I didn’t, the agent just shrugs. It has no idea such a tool exists, even if a perfectly good one is sitting one DNS lookup away.

Here’s what I would like much more. The agent takes a domain — adr-demo.4004.fyi — and asks it a plain question: do you have any tools? The domain answers with a catalog. The agent spots a weather tool in it, reads how to call it, calls it, and tells me it’s 21°C and windy in Berlin. I never touched a config file.

That’s the whole pitch. The agent finds the tool at the moment it needs it — the same way your browser finds a site’s RSS feed without you registering anything. The rest of this post is just how that question — “do you have any tools?” — gets asked and answered.

What ARD actually is

The thing I like about ARD is how little it tries to do.

It handles discovery, not invocation. It tells an agent where a tool lives and what kind of thing it is. It says nothing about how you call it — that’s left to the tool’s own protocol. In my demo the tool is an MCP server, so the actual call happens over MCP. ARD walks you up to the front door and then steps aside.

A few ideas hold the whole thing up:

  • A thin, type-tagged envelope. Every catalog entry carries a type field that’s just an IANA media type — application/mcp-server-card+json, application/a2a-agent-card+json, and so on. ARD doesn’t care what’s inside. The type tells the agent which protocol to speak next. One catalog can list an MCP server, an A2A agent, and a dataset side by side.

  • Names anchored to domains. Every entry has an identifier like urn:air:weather.example:mcp:weather. The middle bit has to be a real domain. That separates a tool’s name (a stable thing) from its location (a thing that moves), and it gives a trust anchor — you can’t claim urn:air:google.com:… unless you actually control google.com.

  • Discovery built to live outside the model. Most agents pick tools by dumping every tool description into the prompt and reasoning over the pile. That doesn’t scale past a certain point — you run out of context window. ARD’s bigger idea is to push selection into a dedicated search service, so an agent can ask “find me a flight-booking tool with a SOC2 attestation” and get back a short list, without paying for it in tokens.

That last idea splits the world in two. Static discovery is publishers hosting a plain JSON file. Dynamic discovery is registries crawling those files, indexing them, and offering a search API across the whole network. My demo lives in the static half — and that’s where the interesting part is, because that’s where the four mechanisms come in.

The four ways an agent can find your tools

A publisher can advertise the same catalog in four different ways, and a good agent tries all of them. None of them is new technology. See Section 6.1 of the spec.

Web Ingestion (Required): Crawling ai-catalog.json files from discovered URIs. All ARD implementations MUST support this.

1. A well-known URL

https://example.com/.well-known/ai-catalog.json

The canonical path. Same idea as /.well-known/security.txt — a fixed, predictable place where a thing always lives. You host one JSON file and you’re discoverable. No DNS, no extra files, nothing clever.

This is the simplest mechanism. If you only ever do one of these, do this one. One caveat, this means you are able to modify your /.well-known directory. Not every software allows you to do that.

2. A line in robots.txt

# robots.txt
Agentmap: https://example.com/catalog.json

You already have a robots.txt. It already has a Sitemap: line telling search engines where your page index is. ARD adds an Agentmap: line telling agents where your tool index is. Exact same pattern.

<head>
  <link rel="ai-catalog" href="/.well-known/ai-catalog.json">
</head>

This is the same move browsers use to find your RSS feed or your stylesheet — a <link> tag in the page head. It puts discovery right next to the human-facing site. An agent that’s already on your page (because a user sent it there) finds your tools without having to guess. It does not say anything about how the agent gets to your source. Later javascript injected link might work or not - depends on the agent scraper implementation.

4. A DNS record

_catalog._agents.example.com.  3600  IN  TXT  "https://example.com/.well-known/ai-catalog.json"
_search._agents.example.com.   3600  IN  TXT  "https://registry.example.com/api/v1/"

This is the one I find most interesting. Here discovery happens at the naming layer — before the agent makes a single HTTP request to your site. Under an _agents label you publish two kinds of record: one pointing at a static catalog, one pointing at a live, searchable registry.

The spec asks for SVCB records (the modern DNS service-binding type). In practice plenty of DNS providers still can’t author those, so a plain TXT record carrying the URL is the reliable fallback — and that’s what I’d reach for first.

Two things make this mechanism stand out. It’s the only one that works before you touch the website at all. And it’s the only one that can point at a dynamic registry instead of a static file — which is the bridge to the search-first future I mentioned earlier.

Why four?

Not for redundancy’s sake. Different agents reach into different layers — one is crawling HTML, one is resolving DNS, one just knows the well-known path. A publisher who advertises all four is findable by all of them. The spec doesn’t even rank them; if they all point at the same catalog, they’re equal.

What about more than one catalog?

I just said the four mechanisms are equal if they all point at the same catalog. But what if they don’t? And while we’re there — what if a domain simply has more than one catalog?

The spec talks about a catalog. One file, advertised four ways. It quietly assumes there’s exactly one.

Nothing actually says there has to be. A robots.txt can carry more than one Agentmap: line, the same way it can list several sitemaps. You can drop more than one <link rel="ai-catalog"> into a page head. You can publish more than one DNS record. The spec does describe nested catalogs — an entry can itself point to a sub-catalog, say one per department — so hierarchy is clearly intended. Several independent catalogs sitting side by side on one domain, though? That’s neither blessed nor banned. It’s just… not described.

So conceptually it’s allowed. You could split your tools across a few catalogs today and nothing would break.

The open question is what an agent does with that. In my testing I wrote an agent trying the four mechanisms, loads the first catalog it finds, and stops. A different agent might gather every catalog it can reach and merge them. Without the spec saying which, you can’t rely on either. And that matters the moment this leaves the demo stage: if a publisher splits their tools across three catalogs and agents only ever read one, two-thirds of those tools are invisible — and the publisher has no way to know.

For a v0.9 draft, fair enough. But it’s exactly the kind of detail a real standard has to pin down before anyone leans on it.

From discovery to a real tool call

Discovery gets you a file. Here’s the one my demo serves, trimmed down:

{
  "specVersion": "1.0",
  "host": { "displayName": "ARD Demo Publisher" },
  "entries": [
    {
      "identifier": "urn:air:weather.example:mcp:weather",
      "displayName": "Weather MCP Server",
      "type": "application/mcp-server-card+json",
      "url": "https://adr-demo.4004.fyi/mcp/weather-card.json",
      "description": "Live current-weather lookups over MCP.",
      "capabilities": ["WeatherTool", "get_weather"],
      "representativeQueries": [
        "what is the weather in Berlin",
        "current temperature in Tokyo"
      ]
    }
  ]
}

From here the agent does the obvious things. It validates the file against the spec’s own schema — so a malformed catalog gets caught, not trusted. It reads the entry, sees it’s an MCP server, and follows the link to find the actual endpoint and tool list. It connects and asks the server, live, what tools it has — rather than trusting whatever the file claimed.

Then comes the only step that needs a brain. Given a goal like “what’s the weather in Berlin?”, the agent picks the right tool and fills in its arguments.

And then it actually calls the tool — a real round-trip to a real weather API — and prints the answer.

That’s the bit I wanted to prove. ARD carried the agent from a bare domain name all the way to “here’s an endpoint and a tool schema,” and then got out of the way. MCP did the rest. The handoff is clean.

What about a plain REST API?

MCP is great but most of the capabilities sitting on the web today aren’t MCP servers. They’re plain old REST APIs. The nice thing about ARD’s type-tagged envelope is that it doesn’t care. A REST API is just another entry with a different type.

Instead of pointing at an MCP server card, the entry points at an OpenAPI document:

{
  "identifier": "urn:ai:adr-demo.4004.fyi:rest:weather",
  "displayName": "Weather REST API",
  "type": "application/openapi+json",
  "url": "https://adr-demo.4004.fyi/openapi.json",
  "description": "Current weather for a city over plain HTTPS.",
  "representativeQueries": ["what is the weather in Berlin"]
}

Discovery is identical — same four mechanisms, same catalog, same validation. Only the very last step changes. The agent sees application/openapi+json, fetches the OpenAPI doc, and now it knows everything it needs: the base URL, the path, the method, the parameters, the auth.

From there it’s just an HTTP call. No client library, no special transport — the thing every language and every agent already knows how to make:

GET https://adr-demo.4004.fyi/weather?city=Berlin
Accept: application/json
{ "city": "Berlin", "temperature_c": 21.3, "wind_kmh": 12, "humidity": 58 }

The discovery layer is the same whether the thing on the other end speaks MCP, A2A, or plain REST. The type field is the fork in the road — it tells the agent which kind of door it just found, and the agent brings its own key. ARD never has to learn about REST, or MCP, or whatever comes next. It just carries the label.

Since this is still a draft: the exact media type for “this is an OpenAPI-described REST API” isn’t pinned down — the spec leans on each artifact ecosystem to define its own. application/openapi+json is an obvious candidate, but don’t be surprised if the string the world settles on ends up a little different.

Just to make sure. The REST endpoint does not exists in my demo service. Id di not built it.

Try it yourself

The whole thing is running at adr-demo.4004.fyi. Point an agent at the domain, give it a goal, and watch it discover the catalog all four ways, validate it, list the weather tool, pick it, and call it.

It’s a really tiny example — one weather tool — but the idea is clear.

My example agent does this (have not yet setup the DNS discovery).

╔════════════════════════════════════════════════╗
║   ARD Agent — Agentic Resource Discovery demo  ║
╚════════════════════════════════════════════════╝

▸ STEP 1 — Discovery: trying all four ARD mechanisms (spec §6.1)
────────────────────────────────────────────────────────────────
  target         https://adr-demo.4004.fyi

  ✓ 1. Well-Known URI          https://adr-demo.4004.fyi/.well-known/ai-catalog.json
  ✓ 2. robots.txt Agentmap     Agentmap: https://adr-demo.4004.fyi/.well-known/ai-catalog.json
  ✓ 3. HTML <link>             <link rel="ai-catalog"> → https://adr-demo.4004.fyi/.well-known/ai-catalog.json
  · 4. DNS Service Binding     no _catalog._agents.adr-demo.4004.fyi SVCB/TXT records

Discovered the catalog via 3/4 mechanisms.
✓ Loading catalog via "Well-Known URI": https://adr-demo.4004.fyi/.well-known/ai-catalog.json

▸ STEP 2 — Fetch & validate catalog
───────────────────────────────────
  specVersion    1.0
  host           ARD Demo Publisher
  entries        1
✓ Manifest is valid against ai-catalog.schema.json (Draft 2020-12).

▸ STEP 3 — Catalog entries & tools
──────────────────────────────────
  [MCP] Weather MCP Server  urn:ai:weather.example:mcp:weather
      capabilities: WeatherTool, get_weather

Tools (1):
  get_weather [live]
      Get current weather (temperature, wind, humidity) for a city.

▸ STEP 4 — Select a tool for the goal
─────────────────────────────────────
  goal           what's the weather in Berlin?
  method         deterministic fallback
  tool           get_weather  (from Weather MCP Server)
  arguments      {"city":"Berlin"}
reasoning: Keyword overlap (10 matches) between the goal and "get_weather" / its catalog metadata. Arguments extracted heuristically.

▸ STEP 5 — Invoke tool over MCP (Streamable HTTP)
─────────────────────────────────────────────────
  endpoint       https://adr-demo.4004.fyi/mcp

  Current weather in Berlin, State of Berlin, Germany: 26.4°C, wind 8.3 km/h, humidity 33% (observed 2026-06-22T12:45).

✓ Tool call complete.

Thoughts

My demo stops at static discovery, but that _search._agents DNS record is a noteworthy. The four publishing mechanisms are how one domain becomes findable. The dynamic layer — registries crawling those files and exposing a search API — is how the whole ecosystem becomes searchable. That’s the part that actually matters at scale: an agent that asks a registry for what it needs instead of carrying the entire toolbox around in its context.

What I keep coming back to is how simple the four mechanisms are:

  • well-known path
  • line in robots.txt
  • link tag
  • DNS record

We’ve been using every one of these for decades to help machines find things on the web. ARD just points them at a new kind of thing.

I don’t know if ARD specifically is the standard that wins — it’s still a draft, and this space moves fast. But I’m fairly convinced the shape is right. The out of band communication of knowing where tools are has to end. Letting agents discover tools the way browsers and crawlers always have feels like the obvious next step.

If you want to see it work, it’s at adr-demo.4004.fyi.

Reminder: The spec is evolving. Everything stated above is as of 2026/06/22. In some parts the spec is not clear, e.g. it mentions in one place just one discovery, in another two and also 4 discovery mechanisms. Stay tuned.