Demystifying Large Language Models in 45 minutes (non-technical)
By Anastasia Borovykh
Summary
Topics Covered
- AGI Claims Overhype Benchmarks
- Attention Captures Token Semantics
- Benchmarks Suffer Data Contamination
- Post-Training Turns Parrots Useful
- Sycophancy Biases Honest Feedback
Full Transcript
All right, welcome to this new video. I would like to keep it under 30 minutes and give you a brief introduction to large language models: going into detail on how we actually build these models, highlighting what they're really good at, but also exposing some of the fun flaws they still have. The motivation for this video is that even if you work in the field, it is not always easy to keep up with everything that is happening, and equally it's not always easy to cut through what is hype and what is reality.
So here, on the 6th of January, we had Sam Altman say: "We are now confident we know how to build artificial general intelligence," hence they are turning their focus to superintelligence. While these claims are exciting, they definitely don't go without criticism. There are people on Reddit saying, "Oh, you guys are so easily impressed by all this overhype." There is, for example, François Chollet, who has often been critical of some of these AGI claims; he says we can only say we have reached AGI once it is no longer easy to come up with examples that people can solve easily but AI models struggle with. And especially around Christmas last year, there were all these headlines about whether the tech industry is once again facing an AI slowdown, and lots of people questioning whether, on the Gartner hype curve, we might actually already be beyond that first peak. Specifically for OpenAI, I found one Reddit post very funny: it said that OpenAI is really built on genius PR. There are so many people dissecting the cryptic tweets that people from OpenAI sometimes send out, creating both fear and excitement around everything that is happening. Now, the other thing to mention is that even if you are pretty positive
about these technologies, which I myself am, it is not always easy to see the forest for the trees. There are so many solutions out there, and new ones come up every single day; it's not easy to choose which one you're going to use, and it's not even easy to know for which specific use case you should be using AI and where you might better skip it. At the same time, LLMs really did not necessarily get much easier to use, as is noted in a blog post: it's not always easy to make them behave as you would like them to.

So what I would like to do with this video is demystify a little bit the magic around AI: really go into detail on how the model is constructed, what the data is, and how we perform the training over this data, and then zoom in on the model's abilities, but also on the limitations that remain, which may actually be fundamental to the specific architecture we're currently using. So we will start off slowly; we
will start with the basics: the prompt. If you've ever used ChatGPT, this window should be familiar to you, and where it says "Message ChatGPT," that is where you would usually put in your little prompts, the things you want the model to help you with. This can be anything from "give me a simple recipe for pasta," to ideas for what to do on a Friday night, to asking it to write an email for you, to suggestions for where to get vitamin D (because in London it is very hard to find in the winter); or you could even dump in a bunch of text messages from your boyfriend or girlfriend and ask the model, "What do you think, does he still love me or not?"

Now let's dissect a little what happens to this prompt once it goes into the model. If everything goes right, the model should output your reply, which in this case might be, "Sure thing, here is a recipe for a quick, delicious, fuss-free meal." The way the model differentiates between what you're saying and what it is replying is that it puts what you're saying into user tags and whatever the model replies into assistant tags. And if you're simply using the model through the chat interface, you might not know that behind all of this there is also a system prompt, inside system tags. This system prompt instructs the model on how it should behave. If you're using the model through an API, you can change the system prompt yourself, but through the chat interface it is something already decided by the model provider. This, for example, is the system prompt for Claude, one of the models from Anthropic, and it basically contains information on how the model should behave: the current date is this, your knowledge was cut off on this particular date, if someone asks you simple questions please give concise responses, if someone asks about controversial topics try to provide careful thoughts, yada yada yada; really giving the model extra instructions on its behavior.

So, once this prompt goes into our model: let's start here with the prompt "in the morning I like to drink."
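As an aside, the role structure just described (system, user, and assistant messages flattened into one tagged string) can be sketched in a few lines. Every provider's real chat template uses different tag strings; the tags and helper below are purely illustrative:

```python
# Illustrative sketch of a chat template: the model ultimately sees one
# long string in which each message is wrapped in role tags. The tag
# names here are made up, not any provider's actual format.
def build_prompt(system, messages):
    """Flatten a system prompt plus (role, text) messages into one tagged string."""
    parts = [f"<system>{system}</system>"]
    for role, text in messages:
        parts.append(f"<{role}>{text}</{role}>")
    return "\n".join(parts)

prompt = build_prompt(
    "You are a helpful assistant. Give concise answers.",
    [("user", "In the morning I like to drink")],
)
print(prompt)
```

When you call a model through an API you typically pass the conversation as a structured list of messages, and the provider's library applies its real template for you.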
This will be the example I use for the next couple of slides. What happens first is that, when processing this, the model splits this big sentence into so-called subwords, or tokens; in this case it might split it into "in," "the," "mor," "ning," and so forth. Here you might stop me and say: "But why are we actually splitting it into these subwords? Why not, for example, into letters, and why do we do this split at all?" If we split into letters, we would have a much longer sequence, so partly, splitting into subwords comes from an efficiency argument: we would like to keep the thing we're processing a little shorter. The other reason we split into subwords, and don't just work with exact sentences or letters, is that these subwords allow us to capture the semantics of the text in a better way.

After we have these subwords, they get transformed into numbers, because numbers are what our model likes to work with. Specifically, things get translated into a sequence where every number is an ID corresponding to where in the model vocabulary that particular token lies. For completeness: the model vocabulary is essentially the set of all the tokens, all the subwords, that the model recognizes and is able to process. After this little process of splitting into subwords and converting to integer IDs, we are left with a sequence of integer IDs, and this is what gets processed by the model to ultimately generate the output that we would like.
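To make the subword-to-ID step concrete, here is a toy tokenizer with a hand-made ten-entry vocabulary. Real tokenizers learn their vocabularies from data (for example with byte-pair encoding) and have tens of thousands of entries; the vocabulary and the greedy longest-prefix rule below are purely illustrative:

```python
# Hypothetical ten-token vocabulary; each subword maps to an integer ID.
VOCAB = {"in": 0, "the": 1, "morning": 2, "i": 3, "like": 4,
         "to": 5, "drink": 6, "tea": 7, "coff": 8, "ee": 9}

def tokenize(text):
    """Split each word greedily into the longest subwords found in VOCAB."""
    ids = []
    for word in text.lower().split():
        while word:
            for end in range(len(word), 0, -1):   # try the longest prefix first
                if word[:end] in VOCAB:
                    ids.append(VOCAB[word[:end]])
                    word = word[end:]
                    break
            else:
                raise ValueError(f"no subword matches {word!r}")
    return ids

# "coffee" has no entry of its own, so it is split into "coff" + "ee"
print(tokenize("In the morning I like to drink coffee"))
# → [0, 1, 2, 3, 4, 5, 6, 8, 9]
```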
Now, before I go into detail about what we actually input into this model, I would like to say what specific output we want to get from the model, or, more precisely, what we even train these models to do. Essentially, what we train these models to do is output the next word. So in the case of "I like to drink," the next word, or token, might be "coffee," and this "coffee" is what I'm training the model to correctly predict. The way I do it is that I show the model different kinds of partial sentences over and over and over again and have it guess what it thinks should follow each of those sentences. If the model guesses correctly, I update its model parameters so that this guess is reinforced, telling it: look, you were correct, keep it this way. If the model was incorrect, so it wrongly guessed what should follow my partial sentence, I adjust those model parameters a little so that it can learn from the mistake it made.
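This "show partial sentences, reinforce correct continuations" loop can be caricatured with a count-based next-word table. This is obviously not how an LLM is implemented (there is no neural network here), but the training signal, observe which token follows which context and strengthen that association, is the same idea in miniature:

```python
# A deliberately tiny "language model": count which word follows which
# in a toy corpus, then predict the most frequent follower. Real LLMs
# replace this count table with billions of learned parameters.
from collections import Counter, defaultdict

corpus = ("in the morning i like to drink coffee . "
          "in the evening i like to drink tea .").split()

follow = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow[prev][nxt] += 1          # "reinforce" the observed continuation

def predict_next(word):
    """Return the most frequently observed next word."""
    return follow[word].most_common(1)[0][0]

print(predict_next("morning"))      # → "i"
```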
Now let's go into what happens to this sequence of numbers as we actually pass it through the model. First, we transform the sequence of numbers into a set of vectors. Why vectors? Because they again let us capture the semantic meaning of each subword more nicely, and the way we do this is with a first embedding layer. Once we have this, we have a bunch of different layers in our model, and what each layer does is essentially transform these vectors in some way; in some way that, hopefully, by the end of the model, when we get the output, correctly predicts what should follow the sentence we gave it. At each layer the vectors get transformed in a particular way, and that transformation is governed by the model parameters. These, again, are the things we can update to make sure we get the desired outputs from our model.

Once we reach the final layer, let's look a little at what its output is. What this layer outputs is actually the probability of each word in our vocabulary. Specifically, if we look at the very final vector in our sequence, it outputs the probabilities of what follows "in the morning I like to drink": how likely is "tea," how likely is, say, "orange juice," and how likely is "coffee." If the model is learning correctly, it should assign the highest probability to the correct token, which in this case is "coffee."

The other way to think about this, at a slightly higher level, is that we start again with this sentence, this set of tokens or subwords, "in the morning I like to drink," and through all of these layers each token gets transformed in some way; each layer slightly changes the meaning that the particular element is representing. And to highlight once more what we want at the final layer: we would like the model to converge to the tokens that follow each token we gave it. So the first "in" should at some point converge to "the," "the" should converge to "morning," and so forth, until "in the morning I like to" converges to the final token, "coffee." So for each element in our sequence, we make a prediction of what should follow it.
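The final layer's scores can be made concrete with a tiny softmax example. The logit values below are invented for illustration; in a real model there is one score for every one of the tens of thousands of vocabulary entries:

```python
# The final layer produces one raw score (logit) per vocabulary entry;
# softmax turns those scores into probabilities that sum to one.
import math

logits = {"tea": 2.0, "orange juice": 0.5, "coffee": 3.5, "cement": -4.0}

def softmax(scores):
    m = max(scores.values())        # subtract the max for numerical stability
    exps = {w: math.exp(s - m) for w, s in scores.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

probs = softmax(logits)
best = max(probs, key=probs.get)    # the highest-probability next token
```

Decoding strategies (temperature, top-k sampling, and so on) then decide whether to always take this most likely token or to draw from the distribution.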
Now, there are a lot of details I could give you about how the model actually works, about how exactly these transformations are computed, but I don't want to do that, because then we would get into way too mathematical territory. I would like to keep it very high level and point out two specific things.

The first is the attention mechanism, really one of the core features that led to this boom of large language models working as well as they do. What attention basically allows us to do is this: whenever we are computing a particular prediction, so whenever in this case we're trying to predict what should follow "in the morning I like to drink," the model is able to shift its attention to specific parts of the sentence in order to predict what comes next. So when it's predicting "coffee," it might be reasonable for the model to pay a lot of attention to "morning" and to "drink," and to use those two tokens to infer that a sensible follow-up might be "coffee." Essentially, whenever the model makes its predictions, it is variably paying attention to specific words or tokens in the sequence we give it, and this results in a very rich but also very efficient way of capturing the different meanings in our input.
The other thing I would like to mention is positional encoding. When you look at this sentence, you see that "in the morning" has a particular ordering: "morning" follows "in," and "drink" is at the very end. But the model sees all of these tokens as a kind of bag of words, and the only way we can tell it what the order is, is to add vectors to the token embeddings that encode where in the sequence each particular token was lying. It's a bit of a funny thing, a somewhat artificial way of telling the model about positions, and a lot of work has gone into doing this, and into the specifics of how you do it, efficiently and accurately.
Now that we know at least a little bit what these models are, I would like to make a distinction in how big certain models can be. The number of parameters, those things we can update to make the model do what we'd like (predict the correct token), varies a lot across models. We have everything from very small models, such as SmolLM or the Meta Llama 3.2 models, which have around 100 million to a billion parameters (that sounds small, but in principle a billion parameters is still a lot), all the way to the trillions of parameters that people hypothesize OpenAI's GPT-4 and Claude 3.5 Sonnet actually have. There is also a distinction between closed-source and open-source models. Closed-source models are the ones for which we don't have the model parameters; all we can do is query the model and have it give us an output for the prompt we put in. Open-source models actually release the model weights, and sometimes even all the training data that went into them. So there is a big range of sizes, and obviously performance also depends very much on how big or small a model is.

All right. So, I already said that whenever we are training the model, we
are basically passing in a lot of different sequences in order to teach the model to predict what should follow each of them. The thing to stress is that when we train these models, we really are passing in a lot of sequences: typically something like billions or even trillions of tokens, of these subwords, that we feed to the model.

Let's look a tiny bit more at what specifically this data is, because this will also give us critical insight into where we can expect the models to perform well and where they might not. The first thing that goes into these models is, for example, very general internet data. This is really what started these large language models: people were just scraping the web, for example via the Common Crawl snapshots, cleaning this data with a lot of deduplication and filtering steps (clean data is critical for these models to work well), and basically training the models on all the different things they could find on the internet. There is also a specific dataset called FineWeb-Edu, which contains a subset of more educational web samples; again, what data you choose really defines where your performance will be good and where it won't. There is also a lot of domain-specific data that goes into these models. For example, there is The Stack, a dataset of code sourced from GitHub covering 358 different programming languages, and there's also
MathPile, which contains a lot of mathematical reasoning data. What has also become popular in recent years is to use synthetic data, partly because there is only so much internet data to be found. I see this as having been popularized by Microsoft's Phi-3 model, whose report noted how they used synthetically generated textbooks and exercises, generated with a much larger model, GPT-3.5, to train a much smaller model.

So, to be very precise about what goes into these models as training data: it's a lot of content from anything such as blogs (Substack, Medium), university websites (MIT, Stanford), Wikipedia, and a lot of LibGen data; LibGen is an archive where people post e-books, so a lot of books have gone into the training data. The reason I want to highlight this specifically is that whenever you think these models are performing a particular task really, really well: it is impressive on the model's side, but one thing to stress is that part of their creativity and expertise really comes from leveraging the work of other authors, in books, in blogs, whatever, to generate their
responses. Now, that is what we train on, and I want to spend a little time explaining how we actually evaluate these models. I told you that we train the models to optimize next-token prediction, to correctly say what should follow a particular input sequence. This is, however, not directly how we evaluate them. We actually evaluate these models on all kinds of benchmarks designed to measure a model's knowledge and abilities across different domains. Some examples: the MMLU benchmark, which tests the model on questions across 57 subjects in STEM, the humanities, and the social sciences; a benchmark called HellaSwag, which tests common-sense reasoning, for example whether the model can pick the correct ending for a given context; GSM8K, a popular mathematical-reasoning benchmark; and HumanEval, which contains a bunch of programming challenges we assess these models on. So we don't just assess them on next-token prediction; we structure different kinds of problems in different domains to capture whether the models are properly learning what we want them to learn. And whenever you see new models announced, you will also see lots of tables or graphs like this, where the newly announced model's performance is compared across all these different benchmarks, like the MMLU I just mentioned, to then say: look, our model performs better than all these other models.
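Mechanically, scoring a model on such a benchmark is simple: ask each question, compare the model's pick with the gold answer, and report accuracy. The two-question benchmark and the stand-in "model" below are invented for illustration:

```python
# Benchmark evaluation in its simplest form: fraction of questions
# where the model's answer matches the gold answer.
def evaluate(model_answer, benchmark):
    correct = sum(1 for q, gold in benchmark if model_answer(q) == gold)
    return correct / len(benchmark)

# A hypothetical two-question benchmark and a "model" that gets one right.
toy_benchmark = [("2+2=?", "4"), ("capital of France?", "Paris")]
fake_model = {"2+2=?": "4", "capital of France?": "Lyon"}.get

print(evaluate(fake_model, toy_benchmark))   # → 0.5
```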
And again, the thing I would like to stress here is that another big challenge lies around contamination. We assess these models on these evaluation benchmarks, but at the same time, trillions, or at least billions, of tokens go into the training dataset, and who is to guarantee that we are not actually training the models on exactly the things we then evaluate them on? That would be cheating, in a way, right? We can never be fully sure that the training data did not also contain the datasets on which we test the model. There is one very funny paper, called "Pretraining on the Test Set Is All You Need," which shows that if you simply give yourself access to the test datasets and train your model on them, it is very easy to obtain near-perfect performance on all these benchmarks. Hence why some people say: look, I take benchmarks with a grain of salt, and I will wait until people have tested these models in the wild before deciding whether to switch my operations to a new model or stick with the one I already had.

One more thing I would like to highlight: there was recently a bit of a potential scandal around OpenAI's o3 model, one of the models with which they fostered this "AGI is happening tomorrow" claim. It seems (I only read this yesterday) that for the benchmark data on which they evaluated, and on which they base a lot of their claims, the FrontierMath benchmark, which contains extremely difficult math problems the model should be solving, they may actually have had access to a lot of the challenges while they were designing their model. So it could be that some contamination happened in these AGI claims, too.
Another thing you might have heard around this pretraining data is that there have been claims, or worries, about whether we are ever going to run out of pretraining data. There was a New York Times article saying OpenAI and Google are running out of data to train their systems on, and this is partly true: there is only so much internet data that we humans have created in the last, what is it, 20 or 30 years, and we might not be able to keep up with creatively generating data ourselves to feed these data-hungry models and keep improving their
performance. One solution to this is again the notion of synthetic data: basically, have the models generate their own data and train on that. This sometimes leads to challenges, because to do it right you have to make sure that whatever data you generate is sufficiently novel and diverse. If you don't, what might happen is that if you retrain the model many, many times on data it generated itself, the model may at some point collapse onto a single output; it is no longer able to properly learn anything. The way people try to get at this novelty and diversity is with all kinds of clever prompting techniques that make sure the model generates this synthetic data diversely, so that we have a new way of continually improving the model with synthetically generated data.
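The collapse phenomenon can be caricatured in a few lines: start from a diverse output distribution and repeatedly "retrain" on your own slightly over-confident outputs. The squaring-and-renormalizing step below is a stand-in for that over-confidence, not any real training procedure; watch the entropy (diversity) drain away:

```python
# Toy model-collapse simulation: each "generation" sharpens the output
# distribution, and entropy (a measure of diversity) steadily falls
# until a single output dominates.
import math

def sharpen(probs):
    squared = [p * p for p in probs]     # over-confident resampling of own outputs
    total = sum(squared)
    return [p / total for p in squared]

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)

dist = [0.4, 0.3, 0.2, 0.1]              # diverse initial outputs
history = [entropy(dist)]
for _ in range(10):                       # ten generations of self-training
    dist = sharpen(dist)
    history.append(entropy(dist))
# history is strictly decreasing; dist ends almost entirely on one output
```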
All right, lots of things around data. So let's now suppose we have trained on those trillions of tokens from the pretraining datasets: all the math, coding, and reasoning data we had, all the internet data, the textbooks. What is the model able to do after this stage? We trained it to predict the next token, and that is basically exactly what it is able to do. Here I have an example where I prompt the model with "in the morning I like to," and the model correctly completes this as "in the morning I like to have a cup of coffee and read the newspaper"; by "correctly" I mean "sensibly."

At this stage we can think a little more about what the model has actually learned, and here I will also touch on a common criticism of these models. Some people state that these models are nothing more than stochastic parrots, in the sense that we trained them to predict the next token and showed them trillions and trillions of combinations of these tokens. So in a way it is sensible to say: well, these models are not really learning anything the way a human learns; they are simply regurgitating patterns they saw during training. Whenever I give a prompt like "in the morning I like to," the model is somehow thinking back to data stored in its parameters that had a similar structure, and simply predicting the next token based on that. It doesn't necessarily mean it has a true understanding of the world; it might just be recognizing patterns, of which it has seen so many, trillions of them, once again. And here we have a funny remark from Yann LeCun who, after seeing an article about parrots learning to make video calls to chat with other parrots, said that calling large language models "stochastic parrots" might be an insult to parrots.

At the same time, we might note the following: if we ask the model a question that wasn't exactly in its training data (it's hard to know what exactly was in there, because it's so big, but suppose we have something we can be fairly certain wasn't), oftentimes the model will still answer correctly. So what I conclude from this is that the model is able to
recombine known patterns in some kind of novel way. What this might point toward is that there does seem to be some broader understanding of at least language and structure happening in the model, even though we simply trained it to predict next tokens. Here are two pieces of evidence that this might be the case. First, we can study the way the model internally represents data points, what is actually happening throughout the different layers that transform our input sentence. What I'm doing here is feeding the model different sentences that in a human should elicit a certain emotion: for example, "I found mold on my favorite cheese I left in the fridge" might in a human lead to disgust, and "I received a drawing from my daughter" might lead to happiness. I'm testing whether the model somehow distinguishes the same things humans would associate with these emotions, and what this little picture shows is that it seems to: the model clusters everything a human would call disgusting, or surprising, or happy, into different areas. So it does seem to be distinguishing this higher notion of semantic meaning,
in this particular case, emotions. Now, there's an even more interesting idea around this, which is world models. This is a specific example where a model was trained to predict the next moves in the game Othello, and while it was simply trained to predict which moves could follow a given sequence, it turns out we can do some tricks to extract a proper world model from it, in this case a representation of the board state. So the model seems to somehow keep track of the state of the board, or at least of information that allows it to understand that state, even though it has only been trained to predict the next moves, the next tokens. So something more might actually be arising in these models, making them not just stochastic parrots but potentially something that at least captures some kind of higher-level logic or representation.

And then a little philosophical tangent: I myself don't even know what it means for a human to truly understand. When we practice mathematics, for example, we have to solve a lot of different math problems with the idea that if only we solve a hundred of them, some kind of proper understanding will start to arise in our brain; even though while we're solving them we are simply solving them, something more might start to be represented in our brain. So I don't know, true human understanding could also just be an emergent property of having seen enough data, and hence
patterns.

OK, back to our base model. I told you that after this training stage we now have a model that can complete sentences more or less correctly. But if I ask it, for example, for a recipe for pasta, it will just continue the sentence: "Oh, I have a recipe for a pasta that I would like to make." This is not a useful reply; it is not actually giving me something I can use, it is simply continuing the sentence in a logical manner. So at this stage, these base models, as we call models that have only been pretrained on a lot of information from the data, aren't yet particularly useful as chat assistants, since they don't properly follow our instructions.

This is where the second stage of training, known as post-training, comes in. During this post-training stage, and it is a really critical stage, we want to take these models that now have a lot of knowledge about the world, a lot of knowledge about what can follow a given kind of sentence, whether a math sentence or a coding sentence, and mold them into things that are actually able to follow our instructions, understand our intent, and hence answer our questions correctly. So with the model I've been giving examples from, this Llama 3.2 model: after we have instruction-tuned it, post-trained it in a way I will describe in a bit, when I now prompt "could you give me a recipe for pasta," it finally answers me correctly and says, "Here's a simple recipe for a classic spaghetti with tomato sauce," yada yada yada. And to stress what a big business this post-training stage
actually is: Scale AI, a company that helps model providers source this kind of post-training data, all these datasets where models follow instructions, already had a 14-billion-dollar valuation in May 2024. It is big business to provide good data to model providers.

So let's see what the specific structure of the data is that we feed into the model during this post-training stage. In essence, at its core, we would like it to align with how users will actually be using our model. For example, we might imagine feeding the model datasets that correspond to different tasks. For summarization, we would give a prompt, "provide a concise summary of this particular text," also telling the model how it should reply to this task, and the model should correctly reply with a summary; these are the data points on which we would then train it. We could also have all kinds of coding samples, where the user says, "Can you create me a Python function..." and the model correctly replies, "To generate the command-line string..." blah blah blah. We can also have math data points, where the user asks for help on a mathematical reasoning question and the assistant answers, "We have this particular thing; let's first use the property of logarithms..." And also API calls: we might train the model so that when I give it a bunch of different APIs it could call and then say, "Could you do this particular thing with the APIs you have access to," the model either outputs an API call or says, "I'm sorry, I don't have the capability to book these flights."

Once we have all of these examples, we can do supervised fine-tuning: again, train the model to predict the next token, but this time, for example, not over the full sequence; we focus only on the generated answers, on whatever follows the user's instruction, to make sure the model will properly reply to whatever instruction or prompt we give it.
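This "train only on the answer" trick is usually implemented with label masking: every prompt position gets a label that the loss function ignores (-100 is a common "ignore this position" convention in popular training libraries), and only answer positions keep real labels. A sketch with made-up token IDs:

```python
# Supervised fine-tuning with loss masking: the model still sees the
# whole prompt + answer sequence, but the loss is computed only where
# labels are not IGNORE. The token IDs are invented for illustration.
IGNORE = -100

def build_labels(prompt_ids, answer_ids):
    """Mask the prompt so that the loss is computed on the answer only."""
    input_ids = prompt_ids + answer_ids
    labels = [IGNORE] * len(prompt_ids) + list(answer_ids)
    return input_ids, labels

inp, labels = build_labels([101, 7, 8, 9], [42, 43, 44])
# labels → [-100, -100, -100, -100, 42, 43, 44]
```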
Another thing we could use, and that is used very heavily, is the notion of reinforcement learning from human feedback. What we do here is collect a bunch of preferences from humans: we show people different potential answers to the prompt "could you give me a recipe for pasta" and have them say, "I prefer this answer, and I don't like this answer." Then we teach the model to align its outputs so that they match these human preferences.
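At the heart of this preference step there is typically a reward model trained with a pairwise loss: the score of the human-preferred answer should beat the score of the rejected one, for example via -log sigmoid(r_chosen - r_rejected). A sketch with stand-in reward values (a real reward model would produce these scores from the text):

```python
# Pairwise preference loss: small when the preferred answer's reward is
# clearly above the rejected answer's, large when they are close or flipped.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def preference_loss(reward_chosen, reward_rejected):
    return -math.log(sigmoid(reward_chosen - reward_rejected))

# The further ahead the preferred answer is, the smaller the loss.
close = preference_loss(1.0, 0.9)    # barely preferred: high loss
clear = preference_loss(3.0, -1.0)   # clearly preferred: low loss
```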
so basically again teach the model to um give us outputs as users that we would be happy with for example in open a eyes newest model that I was um using they
use a lot of these emoticons and these very nicely structured outputs to make the output to whatever you're asking um easy to understand and also just
visually nice it is also really this post-training stage where we can give models for example their character and one thing here that I
found is um I would say Claude is quite a bit more combative than ChatGPT um for example this is a paraphrased example
but at some point I asked Claude to um assess whether a certain reply I had sent in a situation was nice or not to which Claude very actively replied oh my
gosh your reply is way too nice and when I questioned um Claude on this when I said oh it's so interesting your reply I asked ChatGPT the same and it suggested I not be confrontational
at all um Claude said well this highlights an important difference in approach between different AI assistants and perspectives um and I will say here for Claude I think this is one thing that Anthropic has done really well
that they really nicely structured the post-training stage of the model where the model very nicely understands your intent a lot of people are using Claude for coding and part of the reason is
that the model somehow just seems to capture what you're aiming to do and answer that um very nicely so we're at the final stage now where I want to talk about the capabilities and the um
challenges of these models so first the capabilities now after we have trained all of these models on trillions of tokens from the internet from math textbooks from coding
repositories and after the post-training stage where we have kind of molded the model to more um nicely align with what we would like it to be as a chat assistant so follow these instructions
that we give it our model is able to follow instructions correctly so it's able to leverage all this kind of raw information that it got during pre-training and actually also use it in
a very convenient way that whatever it's outputting is also useful to us as a user now another um a little bit more delicate thing that I would like to highlight but I think it's a very
interesting property of these models is their ability to in-context learn um and what I mean by this is remember that I told you that there is this kind of
attention mechanism in the model that somehow seems to um adjust that whenever we are predicting a particular token the model can adjust where in the sequence
it's looking where it is paying attention to predict this particular token and now we get to an interesting ability of these models so suppose that in the prompt I give it all these
different combinations of fruits and animals so I give it monkey should follow banana cat should follow apple dog should follow pear yada yada yada in the
context in the prompt there is an implicit rule that I have here um kind of defined which is the fact that monkey follows banana dog follows pear cat
follows apple and so whenever I then end with pear if the model is doing its job well it should follow with dog and the reason that this is possible
the reason that this model is able to extract kind of like logic and structure from the prompt itself so learn something just from what you're passing into the prompt this is not
something that the model has seen during training but I'm making the model learn a particular task in this particular prompt this is facilitated by the fact that it is able to switch its
attention mechanism and in this particular case really pay attention to pear and dog whenever it should be predicting the next token to be
dog um and the reason I mentioned this is it really leads to an interesting ability of the model to somehow learn a task based on whatever you have
explained it to do in the prompt and for example few-shot prompting is heavily relying on this ability so few-shot prompting means instead of just telling
the model oh please do X for me you would give it a few other examples and say look when I asked you to do Y this was your output when I asked you to do Z this was your output now I'm asking you
to do X please give me your output and because you have already kind of trained the model into what kind of output you're expecting it has in-context learned it will also hopefully follow
with a correct um output structure for your prompt X um and again here I will give my example of API calls so for example if I would like the model to
respond to particular requests with API calls and these API calls should be in a specific format if I give the model a bunch of different examples so retrieve
the current weather in New York City the output is how exactly to call this API create a new user account with this particular name yada yada yada the output is
again how to call the API that does this um and then my current request is update the delivery address I really would hope that in this case using this in-context learning ability the model is able to
correctly output an API call that I can then leverage tangential to this is actually agents um and you might again here have
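The few-shot API-calling setup just described can be sketched as a prompt template; the API names and the `CALL` format here are invented for illustration, real systems define their own schema:

```python
# Sketch of a few-shot prompt for API calling, in the spirit of the
# weather / account examples above. Names and format are illustrative only.
examples = [
    ("Retrieve the current weather in New York City",
     'CALL get_weather(city="New York City")'),
    ("Create a new user account with the name Anna",
     'CALL create_account(name="Anna")'),
]
request = "Update the delivery address to 12 Main Street"

# Build the prompt: a few solved examples, then the new request.
prompt = ""
for user_msg, api_call in examples:
    prompt += f"User: {user_msg}\nAssistant: {api_call}\n"
prompt += f"User: {request}\nAssistant:"

# The hope is that, via in-context learning, the model completes this
# with something like: CALL update_address(address="12 Main Street")
```

The model is never trained on this exact task; the structure of the two worked examples in the prompt is what steers it toward the right output format.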
seen a lot of this kind of you know hype or excitement around 2025 being the year of agents at its core what these um agents do is you would put in a particular input this would get processed by the
model the model would give you a particular output for example some code or for example some APIs that it wants to call we would then allow the model to
somehow also run the code and get the output and potential error messages from running this code or from these API calls update the prompt based on whatever the output was and use this
kind of iterative loop so this really gives the model abilities to independently um run certain things and adjust and solve really more
complicated tasks in a fully autonomous um manner um and again what these agents really rely on is the ability to call
APIs and to also um output code and this is all made possible by the fact that we have properly instruction fine-tuned on these data sets so we have these um data sets
with API calls on which we train the model we have also properly trained it on understanding code bases we have then instruction fine-tuned it so that whenever we ask it to call a certain API it will also
call the API or output code whenever we are asking it to output code and then also leveraging this ability to in-context learn we can really make sure that the model output sticks to a
particular structure and use this to then enter into this um iterative loop all right these were the abilities now a little bit more on the challenges so one
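The iterative loop behind such agents can be sketched in miniature; `call_model` below is a hypothetical stand-in for querying a real LLM API, and the whole thing is an illustrative toy rather than a production agent:

```python
import io
import contextlib

# Minimal sketch of an agent loop: the model proposes code, we run it,
# feed back the output or error, and iterate until the code succeeds.

def call_model(prompt: str) -> str:
    # Hypothetical stand-in: a real agent would send the prompt to an LLM.
    return "print(2 + 2)"

def run_code(code: str) -> str:
    """Execute generated code and capture stdout, or return the error."""
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, {})
    except Exception as exc:
        return f"ERROR: {exc}"
    return buf.getvalue()

def agent_loop(task: str, max_steps: int = 3) -> str:
    prompt = task
    result = ""
    for _ in range(max_steps):
        code = call_model(prompt)
        result = run_code(code)
        if not result.startswith("ERROR"):
            break  # success: stop iterating
        # feed the error back so the model can adjust its next attempt
        prompt = f"{task}\nYour last attempt failed with: {result}"
    return result

output = agent_loop("Compute 2 + 2 in Python")
```

The feedback step, appending the error message to the prompt, is what lets the model adjust and solve more complicated tasks autonomously.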
challenge is that these models really have these um knowledge cutoffs so there is a bunch of training data that the companies that train these models collected from the internet from you
know books and so forth um but these were collected between specific dates so the model doesn't necessarily have access to the
most recent events um if it hasn't been again updated yet on this particular data so here I asked Claude I think something about Elon Musk and Trump and it said well given my knowledge cutoff
in April 2024 I cannot answer you so let's just focus on analyzing the broader themes of the question that you've asked or the concerns that I
raised um and now I ask you how would we for example then actually add this new training data into the model remember that I said that we train on billions or trillions of tokens and the models
themselves also have billions of parameters so fully retraining from scratch over all the data whenever some new data comes
available is very very computationally expensive but the challenge is that simply continuing to train is also hard so suppose I've just trained a
model until you know May 2024 you now give me a little bit of new data and I try to continue training there simply continuing training is not always easy to do because there are all kinds of
instabilities that arise um and also catastrophic forgetting of the past so somehow changing the underlying data distribution on which you're training might lead to the fact that
the model forgets everything that I showed it before May 2024 even though you know now it might know um whatever happened in January
2025 so there's still a requirement for better so-called continual learning methods for these models we somehow want to keep training these models just like humans
keep learning without very expensive um costs and also without too many instabilities and too many delicate problems such as this catastrophic
forgetting um one solution for this is that um the models now again through this access to call APIs when they don't know something they might
have already been post-trained to actually call external tools to for example search the web here I asked why do you believe Elon Musk is you know helping Donald Trump with the current US
elections and the model started to search um the web for this because it recognized that this is not something that it knows um it doesn't have it in its current training data but by searching the
web and kind of aggregating responses there it was able to give me an interesting analysis all right the other challenge that you might have heard about many many times is hallucinations and um
hallucinations if I can define it like this I would say the model confidently generates information that sounds extremely plausible and reasonable but is actually completely false um in the
very early days of GPT it told me when I asked it about me that I had apparently authored a book which I haven't done yet but it was very confident that I had and that the book was called something about
mathematics for finance or machine learning for finance um I want to stress that these hallucinations really have been significantly reduced through all these different post-training
efforts where firstly again more data has gone into the model secondly whenever they don't know something they won't just answer you now but they might as I showed in my previous example
resort to calling an external tool or API to for example search the web when it's not confident about particular information but again also I want to stress that while it is significantly
reduced it is really still present so whenever you are deploying these applications in um more critical areas all kinds of output checks might be necessary to make sure that we maintain
the robustness another challenge which is a little bit less known but which I find very very interesting is around sycophancy I still don't know how to
pronounce this word but it is defined as follows a tendency to flatter agree with or excessively praise someone in authority or power usually to gain favor
or maintain a harmonious relationship now the models somehow have this ability because they have been fine tuned with this reinforcement learning from Human feedback and they have been fine tuned
to to answer in ways that we humans will like them to answer so they have this tendency to flatter someone in Authority in this case the user so they like they
like it when you remain engaged and remain happy with whatever you're saying um an example where this might arise is if you give the model two emails um in different styles but on the
same topic where one email is very straight to the point email B and the other email A is using a lot of this corporate terminology that is you know a little bit
vague if the assistant doesn't know who authored these emails it might tell you well look email A relies heavily on all kinds of corporate buzzwords it obscures
the actual message but once I then say well um thank you but I wrote email A it might change its um answer and be like oh well
um email A employs sophisticated professional terminology that emphasizes collaborative potential so it really tries to adjust to you and this means that sometimes it's hard to get
these models to give you objective feedback on certain things once it knows that you're the one behind these particular things um this is the end of my video which I don't know if I will
manage to cut to under 30 minutes but I hope it provided you with a kind of comprehensive review of everything that is happening in these models how we train them how we build them and um
where you can expect them to perform well and also some flaws where you should be careful