Model Context Protocol (MCP) Explained for Beginners: AI Flight Booking Demo!
By KodeKloud
Summary
## Key takeaways - **LLMs can't take action independently**: Large Language Models (LLMs) like ChatGPT can generate text, pictures, or videos, but they cannot perform actions or interact with third-party services on their own. To enable actions like booking a flight, an AI agent is required. [01:23], [02:11] - **AI Agents perform tasks, not just answer questions**: Unlike traditional chatbots that provide single answers, AI agents can handle complex tasks by executing multiple AI calls, interacting with codebases, and using tools. They continue working until the entire task is completed. [04:04], [04:24] - **APIs enable application-to-application communication**: APIs (Application Programming Interfaces) provide a structured way for applications to communicate with each other, allowing them to exchange data and perform actions without needing to scrape websites or understand complex HTML. [07:30], [07:40] - **MCPs standardize API interaction for AI agents**: Model Context Protocols (MCPs) act as a guide for AI agents, providing the necessary context to choose the correct APIs and interact with third-party platforms. They standardize how agents discover and use the capabilities of different applications. [09:47], [09:56] - **Agent-to-Agent (A2A) model facilitates collaboration**: The A2A model allows multiple AI agents to collaborate by defining standards for discovering each other's capabilities, assigning tasks, and communicating. This enables specialized agents, like one for flight booking and another for hotel booking, to work together seamlessly. [12:54], [13:40] - **Hands-on lab for configuring flight MCP**: A free lab environment allows users to practice configuring a flight MCP server and client within a VS Code editor. This demonstrates how an agent can use an MCP to search for and book flights by interacting with flight simulation tools. [15:25], [19:09]
Topics Covered
- AI Agents: Beyond Text Generation to Action
- AI Agents Automate Complex Software Tasks
- How MCPs Enable Agents to Use External APIs
- Agent-to-Agent Model: Scaling AI Capabilities
- Real-World Impact of AI Agents and MCPs
Full Transcript
So, everyone is talking about MCPs, AI
agents, and agentto agent protocol. If
you feel left out, this is the only
video you need to watch to catch up. In
this video, we'll talk about AI agents,
MCPS, and agentto agent model in a super
simplified manner with visualizations
that will make it easy for anyone to
understand. No background knowledge in
AI or AI models or coding or programming
required. In the first part, we'll
explain the why and the what behind
these concepts. And then in part two,
we'll dive into some code and understand
the how behind its implementation. And
you'll also gain access to a hands-on
lab that you can use to practice this as
you watch this video. So we'll start
with something we already know, chat
GPT. So what is chat GPT? It is really
two things, a chat application and a GPT
attached to it. The chat part itself is
just an application like any other chat
app that we know. The GPT part is the
large language model. That's the AI.
There are so many other LLMs that you
have probably already heard of such as
Claude from Enthropic, Deepseek, Gemini,
Llama, etc. But we're not going to get
into more details about specifics of LLM
in this video. We'll just refer to the
AI as LLMs and that LLM could be any of
these models for the remainder of this
video. That's all you need to know about
LLMs. So the way it works is when a user
asks a question in charge, the
application sends a request to the AI
which is the LLM and the AI generates a
response and sends it back to the
application and displays the results on
screen. Now let's say we are building an
application called fly GBT similar to
chat GBT and we wanted to book a flight
for the user. If I ask it a question say
I would like to fly to North London it
should book a flight for me. So we need
a magical something that my application
can interact with that would understand
my request and do as I say based on what
we just discussed that magical thing
would be AI in the form of LLM. But if
you look at the response it's just
returned instructions in the form of
text to me. It did not actually book the
flight for me. You see that's all an LLM
can do natively. An LLM can generate
responses in the form of text, pictures
or videos, but cannot by itself do
anything or take any action. But what
does taking an action mean here? When I
say I would like to fly to North London,
my application should be able to
interact with these third party flight
services such as Joy Air or Dra Air or
Aeroggo and retrieve flight details from
these sites and then also compare that
against say my preferences such as
whether I prefer cheap or luxury
flights, my seat preferences or meal
preferences and based on all of that
information make a decision for me and
not stop until it's retrieved enough
information to be able to make a
decision. and then book the flights for
me and tell me the flight details and
booking reference numbers. So, I'd like
my AI to take action for me. So, we need
something magical, something that can do
that for us. And what is that? Those are
called AI agents. AI agents are able to
interact with third party platforms or
websites, gather information and combine
that with a memory that it has based on
our previous conversations and then
interact with an LLM, which is a real AI
here, to make a decision for me. And
that magical thing is known as an AI
agent. An AI agent can interact with
thirdparty tools, have its own memory,
and interact with an LLM and go back and
forth between these operations multiple
times to eventually be able to have
enough knowledge to make a decision for
me and then also take an action to book
a flight for me and not stop until it's
done that job. That's what an AI agent
does. Now, one of the most common
examples of agents that we work with are
ideides that we work with every day. If
you have worked with cursor, winer or VS
code and used GitHub copilot, they have
this agent mode that makes them work as
an agent. And what does that really
mean? Work as an agent. In the past, in
the non-aggent era, if you ask a
question, it would give you an answer.
That's all. In an agent mode, you can
give it one task, a big task even, such
as build an entire app or troubleshoot
an issue. It goes through this sequence
of multiple AI calls and interacting
with the codebase as well as the
terminal if needed and does not stop
until it's done. what you asked it to
do. That's difference between a chatbot
calling one LM called to an AI agent
that performs a series of different
tasks until it gets things done.
Now, one of the real world use cases of
AI agents is in software development.
You can ask it a question like we
recently noticed that a button was
missing on the UI will help me identify
when and how this changed and share a
plan to revert it. And the AI agent now
scans through the code bases, looks at
the front end and the backend code and
also the git history via the terminal
and finally tells you exactly which
commit caused this change and even how
to revert that change or fix it. So how
do I get started with agents? There are
platforms that have built pre-built
agents that you can call like agent.ai
for example where people have built
hundreds of agents that perform
different kinds of tasks like video
script generators or web design graders
etc. You can integrate these directly in
your application by invoking them
remotely or you could build your own
agents using tools like NA10 without
having to actually code. NA10 gives you
the ability to drag and drop and build
your own agents. Some example workflows
available on NA10 include automating
generating AI videos uh on YouTube,
intelligent email organization with
content classification, etc. Another
option would be for you to build an
agent from scratch using platforms like
Langchain or Langraph. But we won't get
into this in any more detail for now. We
have an entire course that covers these
topics on our platform. So coming back
to this, we said that an agent can
interact with third party platforms this
way. But how does an agent really
interact with a third party platform? It
does that through what are known as
tools. A tool allows the agent to
interact with another platform. Let's
take a closer look at that. So, here the
agent has the ability to interact with
these airlines using a tool for each
one. But how does that tool interact
with an airline? So, here's a quick
heads up. If you know about APIs
already, uh you may want to skip ahead a
few minutes. If you don't or need a
refresher, stay on and allow me to
explain. Well, let's forget about AI and
tools for a second and see how we
interact with these airlines as a human
user. So, as a human user, say I would
go to the airlines website at say
www.irates.com.
com and see that it's returns me a web
page and click around to find a flight
as per my preference and book the flight
and this is known as the UI or the user
interface of uh the airline. So you have
the websites or mobile apps all fall
into this category. But if I were not a
human user instead if I was a third
party website like make my trip or
booking.com or cheap slides I'm an
application trying to communicate with
another application. In the past, what
these thirdparty applications did is
scrape the websites which is basically
saving the website as a text file and
when they scrape the website you get a
junk text like this which is the HTML
but within that junk lies the
information you need which are the
flight details and so they would run
complex algorithms against these to pull
the required flight details from here.
So eventually the airlines realized it's
beneficial for them too to be on these
third party platforms. So the airlines
told them instead of going to
emirates.com you can go to
emirates.com/apiflights
and when you do that we will just send
you the flight details in a structured
format. So you don't have to do any
crazy parsing algorithms and that
interface that applications provide to
other applications is known as an
application programming interface or
APIs.
Now, not only did they say you could
retrieve flight details if you call the
/appi/book flights, then we would book
the flight tickets for you and return
the booking reference number. So, you
can let your customers book the flights
from your own website without even
coming to our site. Now, I'm super
simplifying this. So, if you go to these
URLs, it won't work like this because it
requires authentication and
authorization and other mechanisms. But,
we have a hands-on lab that will help
you learn all about this MCP. So, check
it out using the link in the description
below. I'll also walk you through the
lab at the end of this video. So just to
summarize that the interface that users
use to interact with the site is called
the user interface and the interface
that the applications use to interact is
called as an API. So back to this now
that we know how applications interact
with applications. How do you think
tools interact with airlines? Well
through APIs. So each tool is a piece of
code that interacts with the API of the
respective airlines to retrieve flight
information and those details are then
shared with LLMs to make a decision and
then based on the decision, the agent
uses the tool again to make another API
call to book the flight on the
respective airlines. In this case, the
agent made another API call to the Joy
Air to book the flight on that airline.
Now, if you take a closer look at that
call, you'll see that each call is
different. The first one is /
API/flights. The second one is
/flights-list.
The third one is list flights. And also
their responses are different too. The
first flight returns uh information in a
format that has flight number, origin,
destination. The second one returns uh
information that says flight number from
and to. And the third one uh says
detailed flights and flight and start
and finish etc. So each airline has its
own standard when it comes to their
APIs. There are hundreds of airline
sites and there are millions of other
third party sites. And if I want my
application to interact with all of
them, do I now need to write all of
these adapter codes? Now we are in the
AI world and I shouldn't have to do
this. Well, gone are those days where I
would sit and write programs to connect
to these different flight service
providers one by one. Why can't AI just
do it for me? Only if there was some
magical solution that existed that could
do that for me. And so comes MCPs or
model context protocols. Well, think of
MCPS as a guide for the AIS to choose
the right APIs and interact with the
third party platforms. Well, MCPs
provide agents the context they need to
make the right API calls. What does that
mean? For example, it might look like
this. In this case, the MCP tells the
agent that Joy Air has search flights
and bookflight capabilities. And the
input structure looks like this. And the
output structure looks like this. And
we'll dig deeper into the implementation
of this uh in the part two when it comes
to building an MCP server. So MCP was
introduced by Anthropic, the company
behind Claude, and has since been open
source and is now the default standard
used by everyone to build AI agents.
Now, so if you go to model context
protocol/servers, you can find MCP
servers for a long list of applications.
Now, every agent has an MCP
configuration file located at mcp.com at
some location depending on what agent
you're using. So you must specify the
name of the MCP. In this case, it's
MongoDB. The command and arguments
associated with it uh associated with
running the MongoDB MCP server. In this
case, the arguments are MongoDB's
connection string to reach the database
which is my local database. This allows
the agent to use the MCP server's
abilities to connect to the database and
retrieve information as well as make
modifications to the data. Now, in my
case, it's a local database. The
location of this file depends upon the
tool being used. Cursor for example has
this path specified at the dotcursor
directory in the user's home directory.
For windsurf it's under
thecodium/winsurfs
directory and this is a path for the
configuration file for claude.
So going back to this MCP server works
in a client server model. So instead of
interacting with the API directly, we
now have the MCP server for each of
these airlines. And then you have an MCP
client at the agent that interacts with
these MCP servers.
And so a combination of AI agents that
has memory, has cold-driven behavior,
and has access to AI as an LM with MCP
servers that helps AI agents discover
the capabilities of third party
applications help us build magical
solutions to problems. Now it's time for
us to expand and scale up. This agent
can only book flights. But what if we
want to expand our use case to book
hotels too? So one thing I could do is
expand this agent to add more MCP
servers to also connect to hotels. But
that's going to add bloat to my agent.
Uh now my agent needs to be good at two
things and remember uh my preferences
for two things. I might have amenities,
beds, and other preferences for my hotel
which are different from those for
flights. So we ideally want one agent to
do one thing and do that thing really
really well. And so our next option is
to build a new agent that can do the
hotel booking really well that has its
own integration with MCP servers and has
its own memory with those specific
preferences. And my original agent is
going to call this agent. So that's an
agent to agent call. Now I have one
flight agent that's really good at
finding and booking flights. And I have
another hotel agent that's really good
at booking hotels.
But how does one agent talk to another
agent? How does one agent know what are
the capabilities of another agent? What
format can one agent pass information to
another agent? Well, this is where the
agentto aagent model comes in. The
agentto aagent model was developed by
Google with the goal of making it
possible for agents to be able to
collaborate in a dynamic multi- aent
ecosystem with support and contributions
from a lot of other partners in the
ecosystem. So, how does it work? The
agentto agent model allows one agent to
discover capabilities in the other
agent. For example, the flight agent can
ask the hotel agent, "What can you do?"
The hotel agent responds with its
capabilities that it can search and book
hotels. Then the flight agent gives the
hotel agent a task to search for the
best hotels and then the hotel agent
responds back with the results of that
task. So agentto agent uh model defines
a set of standards that allows agents to
discover each other's capabilities. It
defines a standard to assign task to
another agent and check its status. It
defines a standard on how agents
communicate with each other and also
defines how context and results are
shared back and forth between agents.
We'll see this in more detail in part
two of this video. Well, let's take a
look at some real world use cases of
agents and MCPs. Here's one use case
that we spoke about earlier with
reference to development. Say if I have
an issue with my application, I could
say we recently noticed that a button
was missing. help me identify when and
how this changed and should a plan to
revert. The AI agent interacts with the
get history uh reads the back end and
front end code and then identifies the
exact change or commit that caused this
change. The next use case is using AI
agents and MCPS to build backend
applications. In this case, we are
developing APIs and I'd like my agent to
have access to the MongoDB database so
that during the development of the APIs,
the agent can test these APIs and make
sure the data is available in MongoDB.
So this is a very helpful use case. And
then here's another one that we had
internally. So we have uh three uh data
sources. Stripe, Google, bequery which
is our data infrastructure and then we
have metabase which is a visualization
platform. We had an issue where uh we
were missing an invoice detail from a
particular stripe record and we were not
able to identify which user that was for
and so we t the AI agent provided it
access to these three data sources
through NTPB servers and it was able to
go on a 5 to 10 minute uh
troubleshooting journey and eventually
come back and um tell us uh why it was
missing and the particular transaction
ID associated with that and the amount
associated with that. So uh those are
some examples of real use cases that the
uh MCP servers can be used for. Next
we'll get uh some hands-on experience.
So we'll head over to the lab using the
link given in the description below and
uh let's uh quickly take a look at the
hands-on labs.
All right. In this lab we're going to
walk through simulation of the flight
MCP with client in our code you know in
the VS code editor. So this is a free
lab that's hosted on codecloud. So use
the link in the description below to
gain access to this lab so that you can
walk through it yourself. So once you
open the lab environment, you're given a
a set of instructions on the left side
here and then you have a set of the VS
code editor here on the right. So the
there's some code here that you can
ignore for now. I'll explain all about
it in a little bit. So we'll start with
a quick walk through. So in this lab we
will explore how to configure the flight
NCB server in client. So client is
similar to cursor or or winds surf but
there's another agent here. So if you
click on this agent button here, this is
a VS code extension or plug-in that
behaves just like the cursor or Windsor
or other or GitHub copilot that you
might have worked with. Right? So let's
click okay and go ahead. So in the next
step here it says let's set up client.
So we want to open the client interface
by clicking on the robot icon on the
left side of the MWS code server. So
that's we've just already done that. So
the first step is to set up a client so
that you can chat with it, chat with the
AI.
So let's go to the next step. Okay. So
the instruction here is to configure API
key. So you don't have to bring your own
keys. We we provide you the keys that
are needed. So but here you need to
select use your own API key and then in
the API provider you'll need to select
open API compatible this one. Right.
Right. So now we need to provide a set
of information including the keys and
those details are actually available
here. So it's already available in your
environment. So if you go to your home
directory and bash
profile here you have the keys that are
needed for you to work with any of these
endpoints during this lab. And this is
free for you to use and play around with
as much as you want. So in this case we
need the base URL. So the base URL we're
going to use OpenAI. So I'm going to
copy this base URL from here. Paste it
here. And then I need the API key. So
I'm going to copy the API key for the
OpenAI
uh from here to here. And then I need
the model ID. So the model ID is going
to be OpenAI/DP-4.1.
And that's it. And then you click let's
go.
Okay. So client is set up. So we're just
going to close these messages. And we'll
send a quick test message to check if it
can hear us.
So, it's going to send an API request
and we'll see. Yep, it says I can
receive a message and respond to your
request. Let me know what do you need
help with. Okay. All right. So, this
step is complete. So, we're going to go
ahead click okay and go to the next
step. Okay. So, now we're going to ask
for flight details and observe that it's
not working. So, we have not set up the
MCP server yet, but I'm going to ask it
to share flight details. So I'm going to
say can you check flight details for me
from
Safo to JFK for today and let's see if
it's able to do that.
Okay. So what it's not going to do is
it's understood my request and it's
going to open Google and try and access
you know publicly available flight
information. But that's not really what
we need because
we don't want it to go out and access
the browser. Instead, we wanted to use
our MCP tool that already has that
information, right? So, I'm just going
to reject that request and prevent it
from going out. And let's see how we can
configure the MCP server. It says I'm
unable to access the browser to look up
flight details. Would like to provide
access to a specific API or connect MCP
server. So, that's what I'm going to do.
So, to connect the MCP server,
the steps here are to click on the
server button at the bottom. So here you
have manage MCP servers. I'm going to
click on that and then there's a
settings icon and then here I have
configure MCP server. So I'm going to
click on that and what this does is it
opens up this client MCP settings.json
file. So for client this is the the file
that needs to be updated. Then I'm going
to go here and copy this configuration
and I'm going to paste it here. So what
this means is this is a list of MCP
servers. I'm going to call mine flight
sim or flight simulator. And then
there's this is basically a simple
script that's located at root flights
sim mcp and flights sim-mcp.sh. So as
soon as I put it here, it's already
become available here. As you can see,
it's green. This means this MCP server
is ready to use. So I'm going to click
on done now. But I just want to take a
minute and show you what the location of
this file. So you have this flight sim
MCP here, which is this path. And if you
look into this, there is all the code
that's written to run the this server.
So this is basically a Python file and
if you expand this here you're able to
see this list of uh the code here on the
src and then you can see the prompts and
resources and tools and everything
defined. Don't look at this for now
because we're going to have another
video where we explain these in much
more detail. So let's go back to this
and let me ask the same question again.
Can you help me find flights from SFO to
JFK?
and we'll give it a minute for it to
interact. Okay, so it says there's now
an MTP server available flight sim that
provides details of first search flights
and all that. It's given me this details
and it's done the search and it's
actually got back with some of these
details and it's going to make a call
with some of these details. It's asking
me for my permission. So I'm just going
to say approve.
All right, so it's identified the flight
and uh you can see the response here.
But yeah, here's a more human readable
format. So it says there are these
airlines that are available.
And so I'm going to say book the
cheapest flight for me. And let's see
what it what it does.
Now the capabilities of the flight MCP
tool tells it that it can book the
flight which is this particular one. But
it needs these inputs. So the inputs are
first name and last name. So I'm just
going to give my name and email. Let's
say cloud.com
and they're going to give me a phone
number,
right? And let's see if it can pick
that.
Okay, it's got the passenger details and
it's going it's going to do the booking.
It's asking me for an approval. I'm
going to say approve.
I think there's some error in terms of
validation. I probably did not give the
right information. Yeah, the phone
number is not valid. So let's say let's
say I'm going to give it another number.
Okay, I'm going to prove again.
Yeah, I think there's uh probably still
got the phone number wrong.
Let's just copy and paste that.
Now, while it does that, here's
something else we could do. Okay, so
that task is complete. So, it's able to
book my flight and it's given me all the
flight details and booking numbers and
all of that, which is pretty cool. Okay,
so that's a quick demo of using flight
simulator MCP. So, go ahead and try this
out yourself. In the upcoming video and
lab, we will have we'll see how to build
your own MCP servers. But for now, if
you'd like to play around with it, you
can take a look at this codebase and you
can basically ask client to explain read
and understand
code at this location and explain it to
me.
Now, as an agent, what it's able to do
is it has access to the file and folder
structure. So, it's going to spend some
time reviewing the directory, reading
the files, and understanding its
structure. and it's going to be able to
tell me and explain to me how it's all
set up.
Okay, so there it is. So it's able to
tell me that it uses a fast NCP server.
That's what I've used. And there are
these core features which is search
flight and get flights and all of that.
Then there are resources and API prompts
and it has a modular design. All right.
So yeah, that's quick intro to the MCP
server lab. Take a look at it and let us
know how it goes. Right. In the next
part of this video, we will dive deeper
into how to build your own MCB servers
and clients and the agent to agent
models. So, do subscribe to our channel
to be notified when it's out.
Loading video analysis...