Week 2 - Video 2 - Workflow of a machine learning project
By M. Iftikhar Uddin Khan Sami
Summary
Topics Covered
- The Universal ML Workflow: Collect, Train, Deploy
- First Attempts Never Work: The Iterative Reality of ML Projects
- AI Fails When Training Data Meets the Real World
- Iteration Is the Hidden Step in Every ML Project
- Models Must Evolve After Deployment
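The topics above (collect, train, deploy, then update after deployment) can be sketched in a few lines of Python. This is a toy illustration under loud assumptions and is not shown in the video: each audio clip is reduced to one made-up score, the "model" is a single threshold, and the names train, accuracy, us_clips, and uk_clips are hypothetical.

```python
from statistics import mean


def train(clips):
    """Step 2: learn an A -> B mapping (score -> said_alexa) as one threshold.

    The threshold is the midpoint between the average score of the
    "Alexa" clips and the average score of the other clips.
    """
    pos = [score for score, said_alexa in clips if said_alexa]
    neg = [score for score, said_alexa in clips if not said_alexa]
    return (mean(pos) + mean(neg)) / 2


def accuracy(threshold, clips):
    """Fraction of clips where the thresholded prediction matches the label."""
    correct = sum((score > threshold) == said_alexa for score, said_alexa in clips)
    return correct / len(clips)


# Step 1: collect data. Scores are invented stand-ins for real audio clips.
us_clips = [(0.8, True), (0.9, True), (0.85, True), (0.95, True),
            (0.1, False), (0.2, False), (0.15, False), (0.3, False)]

threshold = train(us_clips)

# Step 3: deploy. New users (here, a different accent) produce different scores.
uk_clips = [(0.45, True), (0.5, True), (0.4, True), (0.55, True),
            (0.05, False), (0.1, False), (0.15, False), (0.2, False)]
acc_before = accuracy(threshold, uk_clips)

# Maintain and update: retrain using the data that came back from deployment.
threshold = train(us_clips + uk_clips)
acc_after = accuracy(threshold, uk_clips)

print(f"accuracy on new users before update: {acc_before:.3f}")
print(f"accuracy on new users after update:  {acc_after:.3f}")
```

In a real project the single retraining step here would be many rounds of iteration, and the update is only possible if data can actually be collected from deployed devices.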
Full Transcript
Machine learning algorithms can learn input-to-output, or A to B, mappings. So how do you build a machine learning project? In this video, you'll learn what
the workflow of a machine learning project is. Let's take a look. As a running example, I'm going to use speech recognition. Some of you may have an Amazon Echo, a Google Home, an Apple
Siri device, or a Baidu DuerOS device in your homes. Some years back, I did some work on Google's speech recognition system, and I also led Baidu's DuerOS project, and
today I actually have an Amazon Echo in my kitchen. Every time I'm boiling an egg, I say, "Alexa, set timer for three minutes," and then it lets me know when the three minutes are up and my eggs are
ready. So how do you build a speech recognition system that can recognize when you say "Alexa" or "Hey Google" or "Hey Siri" or "Hello Baidu"? Let's go through the
key steps of a machine learning project. Just for simplicity, I'm going to use the Amazon Echo, or detecting the "Alexa" keyword, as this running example. If you want to build an AI system or build a
machine learning system to figure out when a user has said the word "Alexa," the first step is to collect data. That means you go around and get some people
to say the word "Alexa" for you, and you record the audio of that. You also get a bunch of people to say other words, like "hello," or lots of other words,
and record the audio of that as well. Having collected a lot of audio data, a lot of these audio clips of people saying either "Alexa" or other
things, step two is to train the model. This means you use a machine learning algorithm to learn an input-to-output, or A to B, mapping, where
the input A is an audio clip. In the case of the first audio clip above, hopefully it will tell you that the user said "Alexa," and in the case of
audio clip two, shown on the right, hopefully the system will learn to recognize that the user said "hello." Whenever an AI team starts to train the
model, meaning to learn the A to B, or input-output, mapping, what happens pretty much every time is that the first attempt doesn't work well. So invariably the team will
need to try many times, or, as we call it in AI, iterate many times. You have to iterate many times until, hopefully, the model looks like it's good enough. The
third step is to actually deploy the model. What that means is you put this AI software into an actual smart speaker and ship it to either a small group of test users or a large group
of users. What happens in a lot of AI products is that when you ship it, you see that it starts getting new data, and it may not work as well as you had initially hoped. For example, I am from
the UK, so I'm going to pick on the British. Let's say you had trained your speech recognition system on American-accented speakers, and you then ship this smart speaker to the UK and
you start having British-accented people say "Alexa." You may find that it doesn't recognize their speech as well as you had hoped. When that
happens, hopefully you can get data back from the cases where it was not working as well as you had hoped, such as British-accented speakers, and then use this data to
maintain and update the model. To summarize, the key steps of a machine learning project are to collect data, to train the model (the A to B mapping), and
then to deploy the model. Throughout these steps there is often a lot of iteration, meaning fine-tuning or adapting the model to work better, or getting data back even after you've
shipped it to hopefully make the product better, which may or may not be possible depending on whether you're able to get data back. Let's look at these three steps and see how they apply to a
different project: building a key component of a self-driving car. Remember, the key steps are to collect data, train the model, and deploy the model. Let's revisit these three steps on the next
slide. Let's say you're building a self-driving car. One of the key components of a self-driving car is a machine learning algorithm that takes as input, say, a picture of what's in front
of your car and tells you where the other cars are. So what's the first step of building this machine learning system? Hopefully you remember from the last
slide that the first step is to collect data. If your goal is to have a machine learning algorithm that can take as input an image and output the positions
of other cars, the data you need to collect is both the images and the positions of other cars that you want the AI system to output. Let's say you
start off with a few pictures like this. These are the inputs A to the machine learning algorithm. You also need to tell it what output B you would want,
so for each of these pictures you would draw a rectangle around the cars in the picture that you want it to detect. On this slide I'm hand-drawing these
rectangles, but in practice you would use some software that lets you draw perfect rectangles rather than these hand-drawn ones. Then, having created this data
set, what was the second step? I hope you remember that the second step was to train the model. Invariably, when your AI engineers start training a
model, they'll find initially that it doesn't work that well. For example, given this picture, maybe the software, on the first few tries, thinks that that is a car, and it is only by iterating many times
that you hopefully get a better result, like figuring out that that is where the car actually is. Finally, what was the third step? It was to deploy the model. Of
course, in the self-driving world it is important to treat safety as number one, and to deploy or test the model only in ways that preserve safety. But when you put the software in cars
on the road, you may find that there are new types of vehicles, say golf carts, that the software isn't detecting very well. So you get data back, say pictures of these golf carts, and use the new
data to maintain and update the model, so that hopefully your AI software continually gets better and better, to the point where you end up with software that can do a pretty
good job of detecting other cars from pictures like these. In this video, you learned the key steps of a machine learning project, which are to collect data, to train the model, and then to deploy
the model. Next, let's take a look at the key steps, or the workflow, of a data science project. Let's go on to the next video.
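The labeled data described for the self-driving example (an image as input A, rectangles around cars as output B) could be represented roughly as follows. This is a minimal sketch, not something from the video: the class names Box and LabeledImage, the helper box_is_valid, the file names, and all pixel coordinates are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Box:
    """One rectangle drawn around a car, in pixel coordinates."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int


@dataclass
class LabeledImage:
    """One training example: the image is input A, the boxes are output B."""
    path: str                                       # hypothetical file name
    width: int
    height: int
    car_boxes: list = field(default_factory=list)   # empty list means no cars


def box_is_valid(box, image):
    """Check that a rectangle has positive size and lies inside the image."""
    return (0 <= box.x_min < box.x_max <= image.width
            and 0 <= box.y_min < box.y_max <= image.height)


# Step 1: collect data. Two made-up frames, one with two cars and one with none.
dataset = [
    LabeledImage("frame_0001.jpg", 1280, 720,
                 [Box(100, 300, 400, 500), Box(800, 320, 1000, 450)]),
    LabeledImage("frame_0002.jpg", 1280, 720, []),
]

all_valid = all(box_is_valid(box, img) for img in dataset for box in img.car_boxes)
total_boxes = sum(len(img.car_boxes) for img in dataset)
print(f"{len(dataset)} images, {total_boxes} labeled cars, all boxes valid: {all_valid}")
```

New pictures that come back after deployment, such as the golf carts mentioned above, would be labeled the same way and appended to the dataset before retraining.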