Introduction to Agent2Agent (A2A) Protocol

By Google Cloud Tech

Summary

## Key takeaways - **A2A: Open Standard for Agent Communication**: The Agent2Agent or A2A protocol is an open standard designed to be the common language for AI agent communication and collaboration regardless of how each agent is implemented. It's like how Langchain made it easy to swap between models, making it simple for agents to communicate in a consistent way. [00:41], [01:00] - **Trip Planning Needs Agent Coordination**: Imagine planning a trip where you'd ideally want a flight agent, a hotel agent, and an activity agent to coordinate, but it's a lot of work to build every single one of these. You might want to use someone else's agent for some of these tasks, but you have no idea how it's implemented or how it works, as it's completely opaque to you. [00:26], [00:34] - **Agent Cards Enable Discovery**: Agent B publishes an agent card, a standard JSON file served at a well-known URI on the agent's domain, which tells agent A everything it needs to know to start a conversation, including its name, what it does, HTTP endpoint URL, specific skills, special capabilities like streaming, and authentication methods. You can think of this functioning like robots.txt for web crawlers or service registries in microservices architectures. [02:47], [03:09] - **Tasks Handle Long-Running Processes**: A task is the job an agent needs to do, with an ID and a status lifecycle including submitted, working, input required, completed, or failed. When agent A sends a request, agent B responds with a task ID and status, allowing agent A to poll for updates or use streaming for real-time progress like task status events or artifact chunks. [04:34], [05:17] - **A2A Complements MCP Protocol**: MCP is all about how an agent connects to its tools, APIs, and resources, serving as the standardized way an agent performs function calling and interacts with its own capabilities or external services. A2A facilitates dynamic communication between different independent agents acting as peers, about how agents collaborate, delegate tasks, and manage shared workflows, so you'll often see both in a sophisticated agentic system. [06:35], [06:50] - **Agents Stay Opaque with Standardized Interfaces**: Every A2A agent transmits standardized information about itself and supports the same public methods so it can be called by any other agent to complete tasks, opening up new orchestration scenarios. Importantly, every agent is opaque, meaning the implementation details never need to be exposed to follow the protocol. [01:07], [02:12]

Topics Covered

Why do AI agents need a common language?
How do agents discover each other?
What powers A2A's secure conversations?
How do agents handle long tasks?
Does A2A compete with MCP?

Full Transcript

Hi there, my name is Holt Skinner.

I'm a developer advocate for Google Cloud AI.

And today we're going to discuss the A2A protocol created by Google.

Special thanks to Ivon Nardini and Lashmi Harikumar for their contributions in putting this together. First off, what is A2A and why is it useful?

AI agents are popping up everywhere from lots of different companies and using all sorts of different frameworks. But how do we get them to work together on complex problems?

Imagine planning a trip.

You'd ideally want a flight agent, a hotel agent, and an activity agent to coordinate, but it's a lot of work to build every single one of these.

You might want to use someone else's agent for some of these tasks, but you have no idea how it's implemented or how it works.

It's completely opaque to you.

The world needs a standard way for all these agents to communicate and collaborate.

This is where agentto agent or A2A protocol comes in. It's an open standard designed to be the common language for AI agent communication and collaboration regardless of how each agent is implemented. Kind of like how Langchain made it easy to swap between models.

ATA can make it easy for agents to communicate in a consistent way.

I like to think of it like Lego blocks.

Every ATA agent transmits standardized information about itself and supports the same public methods so it can be called by any other agent to complete tasks which opens up all sorts of new orchestration scenarios.

ATA facilitates communication between the end user, a client agent and a remote or server agent. A client agent is responsible for creating requests and handling enduser interaction while the remote agent is responsible for acting on those requests in attempt to provide the correct information or take the correct action.

Something important to note is that any given agent can act as both a client and a remote agent depending on the context.

Now a standard is only as useful as its adoption and A2A has already become widely popular in the software development industry.

These are just a subset of the partners who have agreed to support A2A. You can find the most up-to-date list in the ATA documentation.

Let's go over some of the core capabilities of the protocol.

ATA agents can dynamically discover each other, collaborate via standardized tasks, share multimodal content, handle longrunning processes, and do all of this with enterprisegrade security.

Importantly, every agent is opaque, which means the implementation details never need to be exposed to follow the protocol.

A toa is focused primarily on the bridge between agents. Let's go over how the protocol works.

We'll walk through how we build a simple system where agent A needs agent B to do something.

We'll go through the core building blocks of A to A and how they enable collaboration.

In this case, agent A is the client agent and agent B is the remote agent. First, how does agent A even find agent B and know what it can do? Agent B publishes an agent card.

Think of this as its digital business card.

It's a standard JSON file which is served at a well-known URI on the agents domain. This card tells agent A everything it needs to know to start a conversation.

This includes agent B's name, what it does, its HTTP endpoint URL for A2A communication, the specific skills it offers, any special capabilities like streaming, and how to authenticate.

You can think of this functioning like robots.txt for webcwlers or service registries and microser architectures.

Once agent A found agent B, how do they actually talk?

A to A uses standard HTTPS for secure communication.

The envelope for their messages is JSON RPC 2.0.

0 a simple way to call functions on a remote server. Inside these JSON RPC messages, we have key A to A objects.

A message represents one turn in the conversation like agent A asking a question.

It has a role, user or agent and contains parts. A part is the actual content.

It could be plain text, a file, multimodal or structured JSON data.

when agent B gets a request. If the request is simple and completes quickly, it might respond directly with a message containing the answer. But how does agent B actually process a request when agent A calls it? That's the job of the agent executor.

It's a class that you write and it links the generic A2A protocol plumbing handled by the ATA SDK and the specific logic of our agent.

This is what makes the agent into a Lego that can be connected to other agents.

The SDK worries about HTTP, JSON RPC, and event management. With the executor, we focus on what happens when the agent processes its responses. What if agent B's task takes a long time?

We can't just make agent A wait on one request.

This is where the task object comes in.

A task is the job an agent needs to do.

This task has an ID and a status with a life cycle.

Submitted, working, maybe input required if agent B needs more info, and finally completed or failed.

So when agent A sends its initial request, agent B might quickly respond saying, "Got it. I've created task one 123 and it's now working." Agent A then knows this task ID. To get the final summary, agent A can periodically call another Ato method, tasks get asking what's the status of task 123.

Agent B will respond with the latest task status.

And eventually that method will return the task is completed and the summary will be in task.

Now polling can work, but it's not very efficient if you want quick updates.

For that, A2A supports streaming using server sent events or SSE. If agent B's agent card says it supports streaming, agent A can use the message stream method.

Now, the HTTP connection stays open and agent B can push updates to agent A as they happen.

These updates can be the initial task object, task status update events. These are messages like I'm now working on the specific part or the task is now complete.

Then task artifact update events.

If the result is large, like a long summary, agent B can stream it in chunks, so the first paragraph, second paragraph, and so on.

This is much better for user experience.

Think of live progress updates or seeing a document appear as it's being generated. You can find an in-depth tutorial with some sample agents showing how this implementation works at goo.go/2a-tutorial.

We've discussed implementing an A2A agent, which begs the question, is this it?

Do I just need A2A to productionize my agent?

Nope. A toa is just one part of a possible agent stack. This is the stack that Google recommends, but you can use whatever components you prefer.

Now, you might also notice another protocol in this stack which has been very popular recently. MCP.

Does A2A replace MCP or they just going to be competing forever?

Absolutely not.

Let's understand how A2A and MCP relate because they are complimentary, not competing.

MCP is all about how an agent connects to its tools, APIs, and resources.

Think of it as the standardized way an agent performs function calling.

How it interacts with its own capabilities or external services it has direct access to.

ATA on the other hand facilitates dynamic communication between different independent agents acting as peers.

It's about how agents collaborate, delegate tasks and manage shared workflows.

So you'll often see both in a sophisticated agentic system.

Agents use Atogs and they use MCP to interact with their tools to get part of the job done.

Of course, the best way to learn all of this is to try it out for yourself.

For all of the latest information about A2A and the full specification, check out the documentation at goo.gold/2a.

And we have an official Python SDK at pip install aa-dk.

You can find lots of sample code in the A2A samples GitHub repo. And check out all of the repositories in the Google AA GitHub organization right here.

All official ATA code will be added to repo this in this organization. If you have suggestions or questions about the protocol, file GitHub issues in the appropriate repositories and please make pull requests if you want to suggest improvements to the protocol or share your sample implementations.

It's rapidly evolving and we want as much community involvement as possible.

Let us know what your plan to create with the multi- aent world using A to A.

Loading...

Loading video analysis...