
A Practical Introduction to Agentic Coding

By Association for Computing Machinery (ACM)

Summary

Topics Covered

  • Agentic Coding Grants Full Workspace Autonomy
  • Human-Curated Context Boosts Agent Performance 16%
  • MCP Standardizes External Tool Access
  • Skills Bundle Complete Workflows with Tools
  • Commit Agent Changes to Preserve Versions

Full Transcript


Hi everyone. Thanks for joining us. We're just going to give it a minute so that people can settle in and a few more people can join.

Hi everyone. Welcome to today's ACM Tech Talk. This webcast is part of ACM's commitment to lifelong learning and professional development, serving a global membership of computing professionals and students.

I am the host for this session. My name is Abigail Misy Dobe, and I am an open-source researcher, community builder, and programs manager from Ghana. My work is to create more awareness of the power of open source through community building, advocacy, and research, especially among the youth. I'm excited to be here for this session, and I hope that you are too. It's good to see that we're all interacting with each other in the chat. Please keep the comments coming.

For those of you who may be unfamiliar with ACM or what it has to offer, here's some more information. ACM stands for the Association for Computing Machinery, and it offers educational and professional development resources that bolster skills and enhance career opportunities.

You can see some of the highlights on your screen. ACM provides access to the ACM Digital Library, the world's most comprehensive database of computing literature. In fact, I've personally enjoyed a whole lot of the material on this platform; it's actually my go-to platform for my research work, so definitely do check it out. ACM also runs leading publications and global conferences that draw top experts on a broad spectrum of computing topics, and it provides support for education and research, including curriculum development, teacher training, and the ACM Turing Award and ACM Prize in Computing. Finally, there's the ACM Code of Ethics, a collection of principles and guidelines designed to help computing professionals make ethically responsible decisions in professional practice.

Before we get started, I'd like to quickly mention a few housekeeping items shown on the slide in front of you. If you have questions at any time, please type them using Zoom's Q&A button. We'll organize the questions as Marlene speaks and try to have her get through as many as possible. This session is being recorded and will be archived; after the session, you'll receive an email notification when the recording becomes available. Also, do check out learning.acm.org for updates on this and upcoming webcasts. At the end of this presentation, you'll see a survey open on your screen. Please take a minute to fill it out to help us improve our tech talks.

As you all know, and we're so excited to have you all here, today's presentation is "A Practical Introduction to Agentic Coding," and we have no one better than Marlene to give us this talk. I'm going to do a brief introduction of who Marlene is. I've had the chance to work with Marlene for some years now, and I know she's on top of her game when it comes to this space.

Marlene is a software engineer, an explorer, and a speaker currently based in London. She's a senior developer advocate at Microsoft, focusing on Python and AI. Marlene is a previous director and vice chair of the Python Software Foundation and is currently serving as co-chair of the ACM Practitioner Board. In 2017, she co-founded ZimboPy, a nonprofit organization that gives young Zimbabwean women access to resources in the field of technology. She's also the previous chair of PyCon Africa and an advocate for women in tech on the continent. And it goes on and on. Marlene, thank you so much for all the work that you do in this space. Without further ado, please take it away. The stage is yours.

Amazing. Thank you, Abigail. I'm going to switch over to make sure I'm sharing my screen, and I'll just hide the different bars here. It's great to see everyone. I saw a lot of comments in the chat from people all over the world joining us. I'm in London right now, and it's actually sunny in the UK for once. I saw a couple of people joining from the UK. Can you imagine? It's sunny today. So, it's great to see everyone.

Like Abigail mentioned, I currently work as a senior developer advocate at Microsoft, focusing on Python and AI. For this talk today, I'm going to walk you through a practical guide to agentic coding. Before I get started, I do want to mention that I'm on all the social media sites (I'm chronically online), so if you have any questions that aren't answered during this talk, feel free to reach out to me on any of the platforms listed on my slide. One final thing: you can use the link on my slide to access the slides I'm sharing, or you can wait, because Yan is going to send out the link as well, so you'll have all of the slides available; I know that's a question people tend to ask.

All right, diving straight into things, here's a bit of an agenda for what we're going to cover today. First, we'll look at the evolution of AI and coding. We'll understand what a coding agent is, and what something called an agentic loop is, as well. Then we'll walk through some practical demos of how to work with coding agents to get the most out of them and the best results. And finally, we'll look at some best practices for working with coding agents.

A good question to ask, and hopefully the reason you're here on this call, is: what is agentic coding? Some context: over the past two years or so, we've seen a massive shift in AI-assisted programming. Initially, when AI first appeared in our code, it was through code completion. Maybe you'd be writing a function, and as you typed, an LLM would look at the last two or three lines of code you were writing and suggest which line should come next. You could accept or reject those suggestions, just like autocomplete on your phone.

Then the AI labs, like OpenAI and Anthropic, started to develop LLMs that were specifically made for coding. At this stage, we could go to ChatGPT or to Claude in the browser and chat with it, asking it to generate a function or an entire script. I remember during that time I would go over to ChatGPT and ask it to create a function for me, but I'd have to copy the generated code and paste it into a file. Sometimes I'd take the code ChatGPT had given me, try to run it locally in VS Code, and there would be an error, so I'd have to copy the error from the terminal and paste it back into the chat to get help with the problem. That era was fine, it was great even, because we were seeing the LLMs get better at working with code and solving our problems. But there was a lot of manual work in all that copying and pasting between the chat and our environment.

Then, around the beginning of last year, we really started to see the emergence of true agentic coding. In VS Code, for example, we introduced something called agent mode in GitHub Copilot. When a user switches Copilot over to agent mode, it turns Copilot into an autonomous AI programmer. If I ask Copilot to complete a task for me, it can execute commands in the terminal, complete multi-step tasks using the tools it has, and even debug its own code. So you have a much higher level of autonomy. The specific thing that changed, I would say, is that we've given the LLM access to our full coding workspace. It's not just working in a chat; it has the full context of our workspace. It also has access to specific tools, so when it sees that something needs to change or be added, it can make those changes by taking action with those tools.

In the market today there are a lot of different coding agents, and we're really at an inflection point, where there's a bit of a fight over which coding agents will come out on top. This is great for us because it provides lots of choice. Regardless of where you prefer to use your coding agent, there are plenty of options. If you want to use an IDE, you can use, say, GitHub Copilot in VS Code. If you want to use your CLI, Claude Code has been very popular for that. On the web, you can use something like Codex as a coding agent in the cloud. And you can also build your own custom coding agents in code using different frameworks; LangChain, for example, is a great SDK you can use to make custom coding agents.

Even though we have such a wide variety of coding agents, under the hood they're typically all built the same way. And for us to know practically how to take advantage of these coding agents and get them to produce the best results possible, one good way is to understand how they work under the hood. So let's start by defining what an AI agent is. It's been debated in the past, but a widely accepted definition of an agent today is an LLM that calls tools in a loop to achieve a goal. That's pretty much accepted in the industry. And for coding agents specifically, almost all of them are associated with the agentic loop you can see on the screen. I actually got this image from an article the team at Anthropic published about how Claude Code works under the hood. For the rest of this talk, which is going to be demo-heavy, with us trying things out live, we're going to focus on three parts of the agentic loop: gathering context, taking action using tools, and then verifying the results.
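To make that "LLM that calls tools in a loop" definition concrete, here is a minimal sketch in Python. Everything in it is illustrative: the `fake_llm` planner, the tool registry, and the stop condition are stand-ins for a real model and real tools, not any particular agent framework.

```python
# Minimal sketch of an agentic loop: a model decides on an action,
# the runtime executes a tool, and the loop repeats until the model
# declares the goal reached. The "LLM" here is a deterministic stub.

def fake_llm(history):
    # A real agent would call a model API here; this stub follows a
    # fixed plan: read a file once, then finish.
    if not any(step["role"] == "tool" for step in history):
        return {"action": "tool", "name": "read_file", "args": {"path": "app.py"}}
    return {"action": "finish", "answer": "Task complete."}

TOOLS = {
    # Stand-in for a real workspace read tool.
    "read_file": lambda path: f"<contents of {path}>",
}

def run_agent(prompt, max_steps=5):
    history = [{"role": "user", "content": prompt}]  # gathered context
    for _ in range(max_steps):
        decision = fake_llm(history)
        if decision["action"] == "finish":           # verify / stop condition
            return decision["answer"]
        result = TOOLS[decision["name"]](**decision["args"])  # take action
        history.append({"role": "tool", "content": result})   # feed result back
    return "Step limit reached."

print(run_agent("Summarize app.py"))
```

The three phases discussed in the talk map directly onto the loop body: the history is the gathered context, the tool call is the action, and the finish check is the verification step.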

So let's start with the first part of the agentic loop, which is gathering context. When you go to your coding agent and ask it to complete a task, you're sending a prompt to the agent. The first thing the agent will do is try to gather the context that will allow it to complete the task on your behalf. As someone doing agentic coding, one of your top priorities should be making sure your coding agent has the correct context it needs to complete a task, and there are a number of different ways today to give it access to context.

You'll notice that right now I'm presenting the slides in VS Code, using something called Simple Browser to show them. I've just clicked on the chat button, and it's opened up GitHub Copilot. If you want to install that in VS Code, you can go to the Extensions tab and look for GitHub Copilot Chat; that will install it for you. Once I've opened this chat session, I want to make sure I've clicked on "Agent" here, which makes sure Copilot is ready to act as a coding agent for us; I mentioned this agent mode previously. You also have options for which coding model to use. Let's go ahead and use Claude Opus 4.5. I usually have a wider variety of models to choose from, but for the moment, let's use that.

One of the most common ways to give our agent context is to attach that context to the chat we're using to talk to the agent. Here in VS Code, if I click on this "Add Context" button for Copilot, it opens a list of different context options. There's a GitHub Issues option, and I can literally just click on an issue; Copilot will auto-detect which issues are associated with my repo, and I can attach an issue to the chat and ask Copilot to work on it and fix it. Another option is to attach a pull request for review, or we can click on a file to give to Copilot, and it will use all of this context to know what to do.

One of the top complaints I hear from developers using coding agents today is that they feel the LLMs generate code that is overly complex, or code that doesn't follow the standards they like. Something I like about attaching context is that I've gotten into the habit of creating an example file: a file containing some code, typically the APIs or functions I use over and over again in my coding process. I attach that to the chat and let the agent know: this is the file you need to use as an example, as a reference point, when you're generating new code. The agent looks at that file, knows it should follow the conventions in it, and uses it as a baseline when generating new code. So that's one way we can help the agent in this gathering-context phase.
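As an illustration of that habit, an attached example file might look like the following. The module and function names here are invented for this sketch; the point is that an agent can read off the conventions (type hints, a typed result object, no exceptions for expected failures) and mirror them in new code.

```python
# reference_style.py: a hypothetical example file you might attach to the
# chat so the agent mirrors these conventions in generated code.
from dataclasses import dataclass, field

@dataclass
class ApiResult:
    """Preferred return type: a success flag plus a payload dict."""
    ok: bool
    data: dict = field(default_factory=dict)

def fetch_user(user_id: int) -> ApiResult:
    """Convention: validate inputs and return a typed result;
    never raise for expected failures."""
    if user_id <= 0:
        return ApiResult(ok=False, data={"error": "invalid id"})
    return ApiResult(ok=True, data={"id": user_id})

print(fetch_user(7).ok)
```

Attached alongside a prompt like "follow the conventions in reference_style.py," a file like this gives the agent a concrete baseline instead of leaving style to chance.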

A second way we can do this is by providing the agent with instruction files. If you've been watching the coding space for a while, you've probably heard these file names: for GitHub Copilot it's copilot-instructions.md, for Cursor it's Cursor rules, for Claude it's a CLAUDE.md file, and for agents generally it's AGENTS.md. There's been a push for standardization, and AGENTS.md seems likely to become the standard in the future. Basically, these instruction files are a place where you can tell your agent how to behave; when any of the files in this list is added to the .github folder, it controls how the agent you're working with behaves.

A way to show this practically: I'm going to open up this simple-agent folder. I mentioned on one of the slides that you don't have to use only the pre-built agents; you can use code to build your own custom agent. In this example, I'm using the GitHub Copilot SDK to create a coding session. This line here creates a brand new coding session. I can change the model; I'm just using GPT-4.1 because it's the lowest-energy model to use, but you can really choose any model you'd like to drive your agent. You can also pass through different tools and different MCP servers and decide what permissions you want to give your agent. And then I use this line of code to pass a prompt through to the agent I've created in the code.

Now, to run this code, I'm going to open up a terminal. I have too many terminals open, but that's okay. I'm going to run this Python script the same way I'd run any other Python script: python simple_agent.py. That gets the agent session started and passes our prompt through to the agent. Now, can you see the response? I hope I'm zoomed in enough; I'm going to zoom back out. It says, "Hi, I'm Marlene's AI assistant for the ACM Tech Talk. How can I help?" This is a bit strange, because if we look at our folder, there's nothing in it that mentions the ACM. Absolutely nothing. The reason our agent is behaving this way is that I've already created this .github folder, and in that folder I've created this copilot-instructions.md file. It's a simple Markdown file, and in it I define how the agent should behave. In this case, I've told Copilot that whenever the user says hello to you, whenever they greet you, you should answer by saying, "Hi, I'm Marlene's AI assistant," and so on.

It matters that this copilot-instructions file is at the root of the .github folder. You could replace it with Cursor rules or with a CLAUDE.md, but because it's at the root of the .github folder, it's going to affect any agent that's created, whether through the SDK or here in the chat. So I'm going to say hi again to Copilot here in the chat, and it should give me the same response: "Hi, I'm Marlene's AI assistant for the ACM Tech Talk. How can I help?" Again, the reason is that when this file is at the root of .github, it provides the instructions for the full repository.

So whatever agent you have, it's going to be affected by this. Sometimes, though, you don't want instructions that affect the full repository; maybe you want a specialized agent that you use for specific tasks. In that case, what I'd encourage you to do is create an agents folder inside that same .github folder. Here we're using that agent .md file structure, and we also put the name of the agent we want to create in the filename. Again, it's just a simple Markdown file with instructions in which I define what this agent does. I'm saying: you're a reviewer agent, and this is what you do if a user comes to use you as an agent. This particular agent helps with reviewing code, so I can add context here for the specific requirements I want the agent to keep in mind when it's reviewing code. Again, this sits in the agents folder, and then I save this agent .md file. One thing you'll notice is that once I create this file, I can come here and see that my reviewer agent has been recognized; it's listed as a special agent. So when I switch over to the reviewer agent and type "hi," instead of defaulting to the copilot-instructions, it defaults to the reviewer agent and tells me, "I act as a specialist agent for reviewing code. How can I help?" So this is a great way of using context in different ways: we looked at adding context directly to the chat, at creating repository-level copilot instructions, and at making a specialized agent like our reviewer agent here. So, let's switch back to the overall Copilot agent mode.
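A specialized agent file like the reviewer described here is again plain Markdown in the agents folder. A sketch, with an assumed filename and invented wording (the demo's actual file isn't shown), might be:

```markdown
<!-- .github/agents/reviewer.agent.md -->
You are a reviewer agent. You act as a specialist agent for reviewing code.
When a user asks for a review, check for:
- unhandled errors and missing input validation
- code that diverges from the conventions used in this repository
Reply with a short, prioritized list of findings.
```

The filename supplies the agent's name, so the chat UI can list it as a selectable specialized agent, separate from the repository-wide instructions.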

Great. So now we have a good understanding of context: if we do at least some of those three things, our agent will hopefully have the right context it needs to carry out a task.

The second part of the loop to consider is taking action. Here, once the agent has all this context and has decided what it needs to do, we need to help it take action, and the way we do that is by giving the agent the tools it needs. Typically, coding agents today give you the option to configure which tools your agent has access to. Here, I've clicked on this tools icon, and it gives me a list of all the tools available to the agent. You can see right at the top that it's quite standard for coding agents to ship with built-in tools. The ones you'll see built into almost all coding agents are an editing tool, to help the agent edit files in the workspace; an execution tool, which gives the agent access to the terminal and may also execute code in the background in a sandbox; a read tool, to help the agent read files in the workspace; a search tool; a to-do list tool; and, one of my favorites, a web search tool, so the agent can find information online to help with specific tasks.

These are built in. However, if you're someone doing agentic coding, whether you're a developer, a PM, or a designer, you probably have other tools beyond these built-ins that you want to give your agent access to so it can help you with tasks. For example, I have this Bicep tool that I've added on top of my built-in tools; if I close those, you'll see a couple of the extra tools. I give this Bicep tool to the agent when I want it to help me deploy an app, for example.

One of the most consistent ways we can give our agent access to external tools is through something called MCP, or Model Context Protocol. MCP is an open protocol created by Anthropic that standardizes the way we provide context to LLMs. A provider, say Microsoft for example, will create an MCP server that you can connect to your agent, and it will help the agent access context, including tools, from a specific API. MCP has been controversial recently, but in most interfaces, whether people are using coding agents in their IDE or in the cloud with Claude, for example, MCP is the standard way we provide our agent with external tools.

One way to give your agent access to MCP servers in VS Code is to click on the Extensions tab and then press "Add MCP." When I run that, I get a list of different MCP servers. Say I want to give my model access to Supabase so it can work with databases, or a tool for web search, or whatever tool I want to give my agent; I can just pick it from this list. Some people are concerned about the security of MCP servers, and that's why I'd really recommend using this feature in VS Code: all of these MCP servers are part of the GitHub registry, they've been vetted by GitHub, and we know they come from the original creators. So this is a great way to avert some of those security concerns, though there are others you should think about as well.

Once you install an MCP server, it appears in the list of tools. All of the tools here with this logo, the MCP logo, have been installed as MCP servers, and you can see I have the Excalidraw MCP server installed. As an example, I'm going to ask the agent to use the Excalidraw MCP server, which I showed you I had installed, to draw a quick diagram that will help us understand what MCP is. You'll see that the agent has already connected to the Excalidraw MCP server and is using its read tool; the MCP server is providing the agent with the context it needs. Oh, and it's already generated the diagram as well. What happened is that the agent used the read tool to pull in the context on how to use Excalidraw to draw a diagram, and then it used the tools from the Excalidraw MCP server to generate the diagram. And so here we have a very pretty diagram that shows us exactly how MCP works. That's one example of how we can use MCP to complete a task with a coding agent. Okay, so that's MCP and one way to use it.
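Besides the "Add MCP" flow, VS Code also records MCP servers in a JSON config file in the workspace (commonly .vscode/mcp.json). As a rough sketch, with a placeholder server name and launch command rather than a real registry entry, a workspace-level entry might look like:

```json
{
  "servers": {
    "excalidraw": {
      "command": "npx",
      "args": ["-y", "example-excalidraw-mcp-server"]
    }
  }
}
```

VS Code would then launch the configured server and surface its tools in the same tools list shown in the demo.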

Now, another way to give your agent access to tools is through something called agent skills. Agent skills differ from MCP in that a skill is a defined, reusable workflow. As part of a skill, you'll have a skill.md file describing a workflow the agent can invoke to complete a specific task; say, for example, there's a task I do over and over again. In this case, I have an example skill for you. Again, we go to that .github folder, create a skills subfolder inside it, and there I have an email skill. I like to use agents to draft emails for me, and sometimes send them, because dealing with email is my least favorite part of my job.

A skill.md file is just a Markdown file with some YAML at the top. First comes the name of the skill, then a description of what the skill does. Your coding agent reads this description and knows whether or not to use the skill for a given task; our agent would read this description and know it can use the skill to send an email, for example. Skills are particularly good because you try to bundle everything the agent will need into the folder. Here, I specifically give workflow instructions saying that the agent needs to pass the user's email request through to this agent.py file, and you'll see that agent.py is already included as part of the skill. So a skill is mainly a bundled way to give an agent a complete workflow; it can even include MCP servers as part of the skill for the agent to use to complete a task.
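Putting that together, a skill.md for the email skill described here might look like the following sketch. The name, the description wording, and the step list are illustrative, not the actual file from the demo; only the YAML-frontmatter-plus-instructions shape and the bundled agent.py come from the talk.

```markdown
---
name: send-email
description: Draft and send an email on the user's behalf. Use this skill
  whenever the user asks to write, draft, or send an email.
---

# Workflow
1. Collect the recipient, subject, and the user's request from the conversation.
2. Pass the request to the bundled `agent.py` script in this skill's folder.
3. Show the drafted email to the user for confirmation before sending.
```

The description in the frontmatter is what the agent reads when deciding whether this skill matches the user's request, so it's worth writing it as a clear trigger condition.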

Now, to show you what using a skill looks like in agentic coding, I'm going to show you an example in the terminal. Before I do, I want to set this up as a story so we can understand what we're doing. Let's say I'm a developer working for a toy company called Tailspin Toys. Yesterday I got an email from the product management team asking me to add a search bar and filters to our site. They want some new features added to our website and have given me a short checklist of what to build, including a search bar with basic text search for simple searches and Azure AI Search for more complex ones, plus a sidebar so customers can filter by category and price. This is a helpful list, but maybe I'm really busy and can't do this work on my own, so I want Copilot to help me with the task. To do this, we're going to use Copilot CLI. I'm going to run the command to get Copilot CLI started, and when I do, you'll see things similar to what we saw in VS Code: it connects to a number of different MCP servers, skills, and agents. When we look at the skills that are available, you'll notice there's this Work IQ skill. I'm going to exit out of that, and let me also hide this floating meeting control.

control. [snorts] for Copilot to be able to help us with the task that we saw in the email, it somehow needs to connect

to the uh to our our email and then bring that information here into the terminal. So, in the prompt that I

terminal. So, in the prompt that I showed that I just had in the that I sent through to Copilot just now, I'm going to ask Copilot to use the work IQ

skill to help me get the information from the from my email and pull it into the terminal. So, work IQ is a skill

the terminal. So, work IQ is a skill that was created by Microsoft that helps you connect to the M365 suite. So, in my prompt, I'm telling Copilot, find the

latest email from the Tailspin Toys PM team and then list out the checklist of what we need to build here in the terminal. And you'll see that Copilot

terminal. And you'll see that Copilot has used, it's evoked that work IQ skill and it's passing the question through to IQ saying find the specific email. And

what's going to happen is in the background, this is of course like uh one that's coming from a company, but it will look very similar to what we just saw in VS Code where under the hood, um

the skill is going to be using the graph API to as part of the skill.md file to be able to triage through my emails and

to try and find the right email. And

fortunately, sometimes the demo gods are not on my side, but they today they are.

And uh fortunately it has found the email from the team and it's been able to print out all of the requirements that we need to build for this new these

new search features. Great. So now that we have all of the requirements here in the terminal, maybe we want to also switch out different models. Here in Copilot CLI, I can call the /model command, and it gives me access to all of the models that are available.

One thing to note with agentic coding is that I would say to try and have access to as many models as you can, because the models are improving so quickly that sometimes it's hard to keep up with which one is the best option. So for example, in Copilot there's already GPT 5.4, but last week, just a week ago, people were excited for GPT 5.3 Codex. So I'm going to choose 5.3 Codex. I'll set it to medium, which just means the model is going to think with a medium amount of strength. And then I'm going to send through a prompt: I'm going to ask Copilot, using the checklist that it has, to go ahead and build out those features that we've just seen listed out here in the terminal.

Now, because Copilot CLI is also a coding agent, just like what we saw with GitHub Copilot in VS Code, it's going to do exactly those steps. It's going to analyze the prompt that we give it. It's going to gather as much context as it can, so it's going to look around at the environment. It says it's planning its implementation steps, so it's going to use the tools it needs to read through different files in the code base so that it gets things right before actually starting to work. So again, even in the CLI, we're using that same agentic loop: your coding agent gathers context, uses the tools available to it, and then tries to verify.

This process of building the implementation — the new features — is going to take a few minutes, and so while it works in the background, let's switch over to a new tab. So earlier today I

sent the same request for the search bar and the filter to Copilot, and I asked it to add the changes it made to a PR. So we can see this is the PR that it's made, and it should have the new search bar and filter. Now, I want Copilot to use an MCP server here in the terminal. When I press that /mcp command, you'll see I have a Playwright MCP server, and I would like for Copilot to go ahead and use the server. Just going to hide this when I press exit — it sometimes does that. I want Copilot to use the Playwright MCP server to help me manually test the features that it has built. So, here we can see that — oh, I think it's starting to test from the other terminal, not here. I want to test properly here. So what I'm going to do is ask Copilot to use this MCP server, the Playwright MCP server, to test the search and filter features using Playwright. So, Playwright

is an open-source testing framework that was built by Microsoft, and it allows you to automate the browser and run automated checks in your browser. So when I ask Copilot to use Playwright, it's going to immediately launch the browser. And here you can see it's going to do things like typing in different inputs, navigating to different pages, and clicking on different buttons to test that all of the features are working. You'll see my hands are not on the keyboard — it's all Copilot, it's all Playwright running these different tests. You can see it has just tested the doll, and it's now testing out the categories for which toys are in a specific price range.

I have actually told Copilot, as part of its instructions here in the terminal, to always use Playwright to test a feature, and I think that's why it launched earlier: as part of its process, it knows that not only does it need to generate coding tests, it should also manually test the features that it's created, using Playwright. So we can see that our search bar is working as expected, and we can also see that the filter is working as expected as well.

You'll also note that for each test that Copilot runs, it saves the result to a screenshots folder. Whenever a test runs successfully, it'll take a screenshot and save it to a folder here.

So going back to our terminal: with Playwright, we saw a skill in motion — being able to use the Work IQ skill to pull in information from our email and have that information in the terminal for Copilot to build from. After that, we also saw some verification. (I think that sound is just the other terminal — you see it tried to run Playwright as the test there and I had to close it, so it's asking me for more feedback, but I will come back to it another time.)

For verification, what I would say is: as part of that agentic loop, make sure to provide in your Copilot instructions some direction telling the agent to write unit tests and to verify that the tests pass by running them itself. The problem is that we sometimes see agents writing self-affirming tests. An agent will write a test saying, "if this code works correctly, print yes," and it will just print yes by itself. And so there's no clear verification that the tests pass or that the app is working as expected. This is why I would say using Playwright is a great way to add end-to-end verification, so your agent can make sure things are working as expected.
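To make the self-affirming-test pitfall concrete, here is a tiny hypothetical sketch (the `add_prices` function and both test names are invented for illustration): a useless "test" that merely prints success, versus an assertion that actually exercises the code.

```python
def add_prices(prices):
    """Sum a list of item prices in cents -- the code under test."""
    return sum(prices)

def self_affirming_test():
    # Anti-pattern: nothing here depends on what add_prices returns,
    # so this "test" passes no matter how broken the code is.
    print("yes")

def real_test():
    # A genuine check: the expected values are written independently,
    # and the assertions fail loudly if the code is wrong.
    assert add_prices([1999, 500]) == 2499
    assert add_prices([]) == 0

real_test()
```

Asking the agent, in its instructions, to write and then actually run tests of the second kind is what closes the verification step of the loop.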

All right, great. So far we have covered the entire agentic loop. What will happen is that our agent is going to, like we said, gather context, take action, and verify the results. Once it has the results, it's either going to go back to you as a human and ask whether you like the work that was done, or whether it needs some extra context from you, and then it will repeat the process — or it will complete the process and say it's done if you're happy with the results. So this is pretty much all of the practical things that we've looked at. Let's go over some best practices as a last note.
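The loop just described — gather context, take action, verify, then either finish or loop with more feedback — can be sketched in a few lines of Python. This is a toy skeleton, not Copilot's actual implementation; every helper below is a stand-in.

```python
# Hypothetical skeleton of the agentic loop. The helpers are toy
# stand-ins for reading files, calling tools, and running tests.

def gather_context(task):
    return [f"context for: {task}"]

def take_action(task, context):
    # Pretend the agent "builds" the feature once enough context exists.
    return {"task": task, "attempts": len(context)}

def verify(result):
    # Stand-in for running unit tests / Playwright checks.
    return result["attempts"] >= 2

def refine(result):
    # Stand-in for extra human feedback or self-correction.
    return [f"feedback after attempt {result['attempts']}"]

def run_agent(task, max_iterations=5):
    context = gather_context(task)
    for _ in range(max_iterations):
        result = take_action(task, context)
        if verify(result):
            return result          # verified: hand the work back
        context += refine(result)  # loop again with more context
    return result

print(run_agent("add a search bar"))
```

The real harnesses differ in how they gather context and verify, but they all share this shape.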

For context specifically, we talked about how we have a skill.md file and an agents.md file. There's research that has come out recently showing that it's very important for human beings to write, or at least curate, their skill.md and agents.md files. When a human curated their skill.md file, the performance of the coding agent increased by 16 points. When a human told the agent just to generate its own skill, it actually had a negative impact on the agent's performance — it made the agent overthink, or things like that.

The same goes for the agents.md file in the tests that they ran — this is another group. They found that when a human would go ahead and provide an agents.md, there would be a percentage increase in performance. But if an AI agent was to create the agents.md file, it would have a negative effect: the LLM behind the agent would start to overthink, and it would increase inference cost by 20%. So it's important that, as humans, we still have a very important role to play with the context part there.

A second best practice is version control.

So say, for example, the agent had finished building out that search bar and the filter, and I wanted to make some other changes. I need to make sure that I'm committing the first version of those changes, because once the agent moves forward, it's not going to remember some of the original code. The best way to guard against that is to commit those changes. And agents are very good with GitHub — they're very good with Git — so they'll be able to look back at the commits, reverse anything that's needed, and so on. I'd also say that now, because we can generate code faster, you're able to prototype on different branches. So I would recommend creating a new branch whenever you want to prototype the same feature a different way.

As a final note for version control: pull requests. I work very closely with the GitHub team, and one of the top complaints from open source maintainers is that they're getting so many pull requests from people using AI agents to generate them that the maintainers are getting overwhelmed — some maintainers are completely stopping accepting pull requests. So for open source to survive, I would say let's try to be mindful of the PRs that we are sending to different repos. If it's your own repo, have your agent go wild — it can create as many PRs as it likes. If not, let's think before we have our agents submit a PR.

Also, for critical code — I gave this same talk recently, and someone was saying all code is critical, and however you would define critical code — I would say that you should have a human review that code along with an agent. So maybe you can have that custom reviewer agent, but make sure to also have a human look at the code, because we've seen agents getting it wrong with security issues, or maybe performance issues, and then there are regressions there.

And then a final thing: I mentioned that with these Playwright tests, when the agent tests manually, something I've liked doing is putting into the Copilot instructions that the agent should attach the screenshots it has taken to the PR when I'm adding a new feature. So I'll use a PR like that on the screen. All right. So we've

covered a lot in the last few minutes. We looked at the evolution of AI and coding. We looked at the definition of what coding agents are and how they use this agentic loop. Then we looked through some practical examples of how to give an agent context, how to give agents tools, and how to verify — both using unit tests and with manual verification using Playwright. And then we looked at some final best practices on context, encouraging humans to pass in the context, and on version control as well.

So I think we have some time for questions. Abigail — as a reminder, you can also find the slides there, but Yan will go ahead and send a link to the slides as well. So, yeah, let's get into some questions.

>> Yeah. So first of all, thank you for that excellent and practical overview of agentic coding. We have over 53 questions now, but we're just going to take a couple.

>> I'm going to stop sharing my screen.

[laughter]

>> Okay. [clears throat] So there's one question from Lance White that says, "How smoothly can you build these agents out with locally inferenced models, or having multiple agents running locally?"

>> I'm so sorry, team — I am trying to find a way; let me share this screen instead for the moment. I'm trying to find a way to see the chat as well. But yes, I think it's very possible to use local models. There are some models, like the Qwen models, that are entirely local and have been trained specifically for coding. And I would say the next generation of coding models is definitely going to come from the local models, because you're going to save a lot of money by just using a local agent to run these coding loops. So there are specific coding models. I will say these tend to be pretty big models. There are smaller models that are also sometimes used for coding, but I haven't really tested any that I've been blown away by. I think the Phi models from Microsoft are reasonable, but I wouldn't say they're right at the same level as some of the newer ones here.

>> All right. So, Zoe is also asking, "How much context should one use during a coding session? Can there be too much information?"

>> Yes, absolutely. Oh, and you know what — I should have shared my screen for this, but I won't in this case. One of the things you'll see — and I think it's a feature in most of these coding agents these days — is there'll be a little spinner that tells you when the agent is reaching its context max. The agents definitely get to a point where too much context is going to degrade performance, and there's actually a known amount where, once the agent has that much information, it's going to stop working well. So people are calling this context engineering. With a lot of these pre-built agents like Copilot, as part of the harness they have a way to manage the amount of context that they pass to an agent — or maybe, once it hits that limit, it's going to compact and summarize the context so the agent can continue with the good amount of context that it needs. But yes, it's absolutely a great question. Performance will degrade if you overdo it with the context. All right.
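As a rough illustration of the compaction idea the speaker mentions — summarizing older turns once a context budget is hit — here is a deliberately simplistic sketch. The token counting and "summary" are toys; this is not how Copilot's harness actually works.

```python
MAX_TOKENS = 50  # toy budget; real harnesses track model-specific limits

def count_tokens(messages):
    # Crude stand-in for a real tokenizer: count whitespace-separated words.
    return sum(len(m.split()) for m in messages)

def compact(messages, keep_last=2):
    """When near the budget, summarize older turns and keep recent ones."""
    if count_tokens(messages) <= MAX_TOKENS:
        return messages
    older, recent = messages[:-keep_last], messages[-keep_last:]
    summary = "summary of %d earlier messages" % len(older)
    return [summary] + recent

history = ["long message " * 10 for _ in range(5)]
history = compact(history)
print(len(history))  # prints 3 -- older turns collapsed into one summary
```

A real harness would use the model's tokenizer and have the LLM itself write the summary, but the shape of the mechanism is the same.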

>> So, we have another question, from Dave: how do we give an agent access to one folder containing code without giving it access to the entire computer?

>> So, for example, when I was showing you on this screen, the agent only had access to my IDE and to the workspace I was working in. When you install or use your agent, the folder that you open the agent in is the folder it's going to have access to. If you wanted it to control your full computer, you'd have to download a desktop app — Anthropic, for example, has Cowork — and that's an option for giving it full access to your machine. But what I was showing you just now was really contained to the folder, or the repository, I was working in.

>> All right. So I know that towards the end of your talk you touched on some best practices, but there's a question that gives some context: what best practices do you see for agentic coding in environments where source code, tickets, or internal issues cannot be made directly accessible to the agent due to privacy or compliance constraints?

>> Yeah. So I would say there are a number of different ways to do that. I would definitely look into local models, or into getting a sovereign environment. This is something that Microsoft is pushing at the moment — sovereign AI, where you have an isolated environment to run your agents, with specific models that will protect the privacy. You also have the option to add in some middleware, or some code, that will encrypt the data before it's sent to the LLM. If there are sensitive parts of your data, you can have some code that you run that data through to add encryption before it's passed over to the LLM. This definitely makes things much more complicated, so I would say there are a couple of different resources online that I would look into for that. But as a best practice, if you can, use local models that run entirely on your machine — the Qwen models are fantastic for coding, and they can all be used locally. You don't have to access a cloud, you don't send your data anywhere, and you can use them in an isolated environment.
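A minimal sketch of the "middleware" idea — scrubbing sensitive fields before a prompt ever leaves your environment. The regex patterns and the `JIRA-` ticket format are invented for illustration; a real deployment would use a vetted redaction/DLP library rather than hand-rolled patterns.

```python
import re

# Illustrative patterns only -- real systems need vetted redaction tooling.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "TICKET": re.compile(r"\bJIRA-\d+\b"),  # hypothetical internal ticket id
}

def redact(text):
    """Replace sensitive substrings with placeholders before calling the LLM."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

prompt = "Fix the bug reported by alice@corp.example in JIRA-1234"
print(redact(prompt))  # prints: Fix the bug reported by [EMAIL] in [TICKET]
```

Encryption works the same way structurally — transform the sensitive spans on the way out — but redaction has the advantage that nothing recoverable leaves the boundary at all.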

>> Okay, all right. Thanks for sharing that. So we'll take maybe two more, and then we'll end the session for today. One question, from Maria: when you prompt the agent, does it just append the instructions to your prompt each time?

>> Yes — though it really depends on the harness. Whichever Copilot coding agent you're using, the person who built that agent will have made a decision about how that context is going to be ingested by the agent. Once you attach it to the chat, it will look like it's appending the information to your prompt, and that's typically what they do. But different people optimize this in different ways depending on the coding agent. That's typically the process of what happens.
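Conceptually, the simplest version of this is just prepending the instructions file to every request. This is a generic sketch of that idea — real harnesses layer system prompts, tool schemas, and conversation history in more sophisticated ways, and the instructions text here is invented.

```python
# Toy instructions file contents -- a real agents.md would be project-specific.
AGENTS_MD = """\
# Project instructions
- Always write unit tests.
- Verify features with Playwright.
"""

def build_prompt(user_message, instructions=AGENTS_MD):
    """Naive harness: instructions first, then the user's request."""
    return f"{instructions}\nUser request:\n{user_message}"

print(build_prompt("Add a search bar"))
```

This is also why context limits matter: every turn pays the token cost of those instructions again unless the harness caches or compacts them.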

>> Okay. Wow, there are lots of questions.

>> Find me online afterwards — DM me, send me a message. I'm happy to answer your questions.

>> Yeah. I will take one from Eric: are there differences between MCP and skills? Can agents call a skill that includes an API contained in an MCP?

>> Great question. Yes — I tried to emphasize that your skill can contain an MCP server. Your skill is going to be like a package. People have also thrown around the idea of creating a central skills registry — you know how there's npm, or there's PyPI for Python — a skills packaging place. And the reason for that is you can put so much into your skill that the agent will know exactly what to do with it. So you can include MCP servers in your skills. The MCP servers are basically going to be the tools; the skill itself is the holistic inclusion of the workflow. It will have the instructions, the different servers, the different tools, the different APIs. If you have a script somewhere you want to include, you can include all of that in a skill, including the MCP servers. Yes.
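To make the "skill as a package" idea concrete, here is a hypothetical sketch of what a skill bundles together — instructions, MCP servers (the tools), and helper scripts. The field names and values are invented for illustration and don't reflect any official skill schema.

```python
from dataclasses import dataclass, field

@dataclass
class Skill:
    # Hypothetical shape of a skill bundle; not an official schema.
    name: str
    instructions: str                                 # the skill.md guidance
    mcp_servers: list = field(default_factory=list)   # tools the agent may call
    scripts: list = field(default_factory=list)       # helper scripts to run

# Illustrative bundle loosely inspired by the Work IQ demo earlier.
work_iq = Skill(
    name="work-iq",
    instructions="Use the Graph API to search the user's M365 mail.",
    mcp_servers=["m365-graph"],   # invented server name
    scripts=["triage_email.py"],  # invented helper
)

print(work_iq.mcp_servers)  # the MCP servers are just one part of the bundle
```

Seen this way, the difference is clear: an MCP server exposes tools, while a skill wraps tools, instructions, and scripts into one workflow the agent can pick up whole.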

>> Yeah. So this is a shout-out and a follow-up: Rishav says that this was a nice introduction — so congratulations to you, Marlene — and they also want to know if there are any pointers to where they can learn more about this.

>> Yeah. So, follow me online [laughter] — I try to post there. But I definitely also recommend the Anthropic blog; it is genuinely such a great resource. I think the Anthropic blog is my favorite resource on agents, because they always put out very high-quality content there. GitHub also has some great blog posts on agents and agentic coding. And I typically try to get involved in quite a lot of live streams, so if you see any live streams online, that's another great way to see practical examples, especially if you want to actually watch someone using an AI agent for agentic coding — a live stream on YouTube is great. So, yeah, those are my favorite resources at the moment. Probably the top one is Anthropic's blog, to be honest.

>> So, definitely please try to follow Marlene on social media — she's very active. Follow her, reach out and ask questions, and share some insights that you took out of this session. You could also put it out there on social media to keep the conversation going, sharing what you learned from this session as well.

I would like to thank Marlene again for her informative presentation and insightful answers to the many questions. Unfortunately, we couldn't go through all the questions, but I know she's going to spend some time going through them and trying, as much as she can, to keep sharing online for all of us to keep learning. A special thanks to each and every one of you for taking the time to attend and participate today.

This talk was recorded and will be made available online in a few days. You can also visit learning.acm.org, where you can find announcements of some of our upcoming talks and other ACM activities that you can participate in. Also, please fill out a quick survey, where you can suggest future topics or speakers, which you should see on your screen in a moment.

Now, on behalf of ACM, Marlene, and myself, thank you once again for joining us, and I hope you'll join us again in future. This concludes our talk and our session for today. See you all soon.

>> Thank you, everyone. Appreciate you joining. Have a great day.
