#038 - Prof. KENNETH STANLEY - Why Greatness Cannot Be Planned
By Machine Learning Street Talk
Summary
## Key takeaways
- **Tyranny of Objectives**: Objectives produce convergent behavior and distract from discovering stepping stones to greatness, prevailing in society, algorithms, and lives as a cultural prison. [06:26], [01:16:40]
- **Chinese Finger Trap Deception**: Like the finger trap, pulling toward an objective tightens the trap; pushing the opposite way, temporarily worsening the metric, escapes to better outcomes. The snake bounty that led to more snakes being bred is the cautionary example. [14:40], [16:29]
- **Novelty Beats Objectives**: Novelty search, which rewards behavioral difference from the past, outperforms objective-driven search on deceptive mazes by escaping local optima without knowledge of the future. [23:29], [02:41:07]
- **Picbreeder's Stepping Stones**: Users evolved skulls and butterflies from blobs through dozens of selections on interesting intermediates, such as doughnuts, that did not resemble the final images, using human-guided NEAT with no target objective. [01:02:20], [02:33:01]
- **Committees Kill Innovation**: Committee voting converges to bland wallpapers; autonomy lets diverse noses for interestingness reveal rare gems, as Picbreeder vs. Living Images shows: everything washes out under consensus. [01:55:21], [02:50:38]
- **Open-Endedness Generates Everything**: Evolution in one run invented flight, photosynthesis, and intelligence without goals by diverging, creating problems and solutions as organisms exploit each other, unlike convergent optimizers. [27:04], [02:44:38]
Topics Covered
- Abandon Objectives for Greatness
- Objectives Create Deception Traps
- Evolution Invents Without Objectives
- Follow Interestingness Gradient
- Trust Human Nose for Interestingness
Full Transcript
hello street talkers and welcome back to the show this is a special edition i've been dreaming about this show for what seems like nearly a year now i first read kenneth's book about a year
ago but anyway i hope you enjoy the episode i'm just sat here sipping a vita coco coconut water okay let's go to achieve your highest
goals you have to be willing to abandon [Music] them
we have a nose for the interesting that's how we got this far that's how civilization came about that's why the history of innovation is so amazing everything washes out when we start ruling by committee
like we have to allow people to follow their passions to their extremes and yet we run society as if this actually makes any sense at all i think the gradient of interestingness
is probably the best expression of like the ideal divergent search you get to this problem that like i don't know how to formalize interestingness what you get to then are proxies for interestingness that not everything
that's novel is interesting but just about everything that's interesting is novel it is in my personality in nature to want to overthrow this i guess we could say
tyranny of objectives okay a little bit of housekeeping before we kick off i just want to make it super clear what the structure of this podcast is so for the first hour and 15 minutes or so i am introducing
kenneth stanley's ideas i'm talking about open-endedness i'm talking about novelty search and abandoning objectives and picbreeder and the neat algorithm the neuroevolution
of augmenting topologies algorithm so if this is interesting to you then um you know feel free to watch the introduction one thing that people have commented on is that sometimes in our shows we have been using clips from the main show in the introduction and you
felt like we were showing you the same thing twice that hasn't happened here i haven't shown anything at all from the main show in the introduction it's all fresh content but if you do just want to skip straight forward to the interview with kenneth
then feel free to do that we've got the table of contents just skip on and it's it's about an hour 15 minutes
in anyway enjoy the show
[Music]
me [Music] professor kenneth stanley is one of my heroes of artificial intelligence my particular area of
interest and research has been in what's called neuroevolution which is a combination of neural networks and evolutionary computation so i'm saying we got to flip this completely like the smart part is the exploration
the dumb part is the objective part because it's freaking easy i personally think that kenneth's area of research which is neuroevolution and ai-generating algorithms
and open-endedness are the most promising paradigms in ai research i don't really get this excited about many other areas of ai research
apart from perhaps francois chollet's neural program search which is actually quite related conceptually there's nothing really insightful or interesting about just doing objective optimization yeah we've got plenty of
good algorithms for doing that and it's not counter-intuitive at all it totally makes sense but it's not going to get us hardly anywhere interesting whatsoever so today we have an incredible conversation with professor stanley
and as you can see he really enjoyed being on the show too this was this was a great time it's one of my favorite experiences of any show so thanks for having me on it's really an honor to be here dr keith duggar being the wonderful character
that he is provided a robust rebuttal to many of the ideas in kenneth's book and actually it turned into a really interesting back and forth and turned on her up please keith is actually quite critical of kenneth's
work i forced keith to read kenneth's book in about june of last year and and uh yeah he was quite critical of it we discussed the chapter seven on education
back in september on our episode on capsule networks i'm gonna have this objective where we give you this test and we see how many numbers you can add together in 30 minutes and then based off of that we're
gonna hire the top whatever 20 percent would we be complaining if their instructors and teachers were manipulating them to perform better and
better on that exam no because they're actually just causing them to perform better and better at the actual job that they're gonna that they're gonna do so the objective is very closely tied to
what they actually need to do to be productive or what society needs them to do for some particular application great manipulate away
nobody's saying to silence diversity people are saying that a combination of let's call it novelty search plus objective optimization is the way to go the reason i'm showing you that clip is
that there is going to be a clash of views on this podcast right kenneth is a great sport for coming on this show and debating keith in this way if i was kenneth i would not have come on the show
you know keith can be a bit of an animal so anyway i hope it will make the show more exciting for you folks and kenneth really is a good sport for coming on so yeah my kudos to uh to kenneth interesting
okay which are some of the key areas that kenneth stanley thinks that we need to explore to develop artificial general intelligence divergence
as opposed to our current obsession with convergent algorithms populations as opposed to putting all your eggs in a single basket diversity preservation stepping stone collection though we know not where
those stepping stones may lead generating new solutions and new problems at the same time in the same run so kenneth and his co-author joel lehman
penned this book why greatness cannot be planned i really recommend that you read the book i read it myself last may and it's one of the few books that i've read in my life where
i've really kind of changed my opinion on some of my outlook um i've been around the sun quite a few times now and i've converged in so many different aspects of my thinking but
this book really was a kind of a wake-up call for me it caused a real stir in my thinking it's a short book you can read it very quickly and i recommend that you do we actually wrote a book called why
greatness cannot be planned um it's not even about just computer science granting agencies are based on objective criteria they say the first thing they want to know is define your objectives tell us how you're going to get there we'll evaluate
your grant and whether we're going to give you money based on the likelihood that this is going to work like how terrible is that when we now understand that sometimes the best way to actually get innovation
is to not follow the objective gradient so this is just totally ridiculous the way we're running things in our society and there's so many things in your life that are objectively driven it's like any time you do anything someone says well what are you trying to
accomplish what will you accomplish what is the payoff going to be why are you doing this like from the moment you choose your major or even where you're going to go to school and so on throughout your entire life
now is it possible to explore a search space intelligently without using an objective to align ourselves towards discovery and away
from the trap of preconceived results kenneth thinks that greatness is possible if we're willing to stop demanding what that greatness should look like
why are the greatest moments and epiphanies in our lives often unexpected serendipitous and unplanned nearly two-thirds of
adults attribute some aspect of their career choice to serendipity because the stepping stones that lead to the greatest outcomes are unknown
not trying to find something can often lead to the most exciting discoveries youtube was first envisioned as a video dating site
flickr was part of an online social game inspired by a pet game to arrive somewhere remarkable we must be willing to hold many paths open
without knowing where they might lead one of the things which we'll get more into today is natural evolution itself and what human innovation looks like
because both of those processes they proceed without any final objective natural evolution it didn't solve a problem it solved innumerable problems and it
continues to solve new problems so what do i mean by problem like i mean the problem of flight the problem of photosynthesis the problem of human level cognition like all of these things were invented
in the same run objectives provide a powerful security blanket they seem to protect us from unknowns but our world has become so saturated with objectives
kenneth says that objectives are good when they're modest but not so if they are ambitious that is to say if they entail
discovery creativity invention innovation or happiness kenneth thinks that objectives distract us from our passions
or from the correct stepping stones in our lives that we should be taking they've become a pillar of our culture but they're also a prison around our potential how many
children do you know who formulate an objective before they go out to play how many great scientists really formulated a hypothesis
before their great idea do you ever tell yourself that you can't do something because it's not justified by a clear purpose have you ever been told to get your head
out of the clouds there is a tyranny of objectives out there in the world we rarely talk about the dominance of objectives in our culture even though they impact us from the very
beginning of our lives objectives dominate our lives at work in a lot of professions the first question you'll hear if you propose a new project
is what's the objective i've got a brilliant idea why don't we try and increase the levels of group think at work we should start with technical design reviews and then we should
do best practices where every single step of the process is mandated and signed off with stakeholders and then all of the code that we write must have peer review
as well um i really think that this will all unify our minds and objectify everything that we're doing at work by the way i'm slightly guilty of this myself but we need to be able to take the piss
out of ourselves a little bit don't we and this is what kenneth meant about ambitious objectives if you're a stepping stone away i mean this is sort of the way we put it in the book then by all means use objectives we've created a culture in the field of
artificial intelligence and machine learning where everything has to be evaluated in order for us to make progress so i can't submit it to a conference if i don't have a very concrete evaluation metric that's very objective
what we may have to do is to acknowledge that some things are actually subjective we're talking about creativity ultimately it is in the eye of the beholder i could argue with you actually that everything in nature is not interesting it's just a matter of
opinion i find it interesting there's no objective proof that everything is interesting including us but we've accepted that it's interesting we accept that intelligence is interesting you know why are we all pursuing it
and so ultimately like creativity is partially a subjective matter and yeah there's some interaction between our subjective view of the kinds of things that we find interesting and some
like modicum of objectivity which is sufficient at least to present to you an experiment and that's something we're very uncomfortable with like we want it to be entirely objective so you may have to grapple with that in order to actually start to really
flourish with creative systems why should we have to justify what's interesting to us we let people spend 30 years of their lives getting up to phd level developing a nose for what's interesting developing strong intuitions
we still don't trust them we use significance calculations to prove that people are doing the wrong thing we ask them to provide us with lots of metrics and key performance indicators of how they're doing you know these metrics basically stop
people from using their education intuitions are banned any kind of subjectivity is distrusted this is completely the wrong direction we should actually be thinking about opening up playgrounds this is where
real innovation happens we view animals through the lens of survival and reproduction evolution's assumed objective people often say that a pursuit is not
well defined enough if it doesn't have an objective the process of setting an objective attempting to achieve it and measuring progress along the way
has become the primary route to achievement in our culture do increasing test scores lead to subject mastery is the key to artificial
intelligence even related to intelligence itself does taking a job with a higher salary bring you closer to being a millionaire if you start trying to do things that are really
ambitious you'll run into it um you know like you might say well okay my goal is to be a billionaire or something okay well now let's measure every decision you make against whether it actually increases your salary and then think about whether that's
going to lead you to becoming a billionaire i mean almost certainly not because the gradient you should be following to get to be a billionaire is probably orthogonal to your salary it's nothing to do with that like even if you double your salary it has nothing to do
with whether you're going to be a billionaire and we get caught in these paradoxes all the time including objective functions in machine learning but also this may be a key uh step on the path to artificial intelligence you
know when you think about it intelligence is one of those so far off goals you know like the kind of like trying to make a billion dollars kind of a thing farther than that for sure because people actually do that um
that like it is one of those things where it's totally unclear what the path should be let's talk about the chinese finger trap the finger trap is a simple puzzle that traps the victim's fingers often
the index fingers in both ends of a small cylinder woven from bamboo the initial reaction of the victim is to pull their fingers outwards
but this only tightens the trap the way to escape the trap is actually to push the ends towards the middle the complete opposite of your intuitions which enlarges the openings and frees the fingers
this is a great metaphor for what we're talking about here right because the solution is deceptive you need to do the opposite of your intuition in order to escape from the trap campbell's law which is
well known in the social sciences says that the more any quantitative social indicator is used for social decision making the more subject it will be to corruption pressures and the more apt it
will be to distort and corrupt the social processes it is intended to monitor in other words social indicators like academic achievement tests
are least effective exactly when the objective is to bring them higher the economy could be stuck in a chinese finger trap right a decrease might be needed to
provide a larger increase in the economy later we might need to see the nhs fail or get worse in order for it to get better the mistake objectives make is assuming that there's a
monotonic trace through their values between now and some good future state sometimes things need to get a little bit worse before they get better there's this notion at the moment
that whether it's profit or educational or scientific achievement that it must increase monotonically every single year
and that is deception so objectives can create perverse incentives campaigns aimed at reducing alcohol abuse or drugs can actually result in
more dangerous drugs being used instead when india was under british rule the british government tried to exterminate venomous snakes by paying citizens for every dead snake they handed over
but it didn't work the way it was intended instead it led to citizens literally breeding cobras just to kill them for the bounty
which is incredible right so ultimately the number of venomous snakes in india increased as a result of attempts to decrease it
many in the uk will remember that objectives took the uk by storm in tony blair's government we had lots of performance metrics around waiting lists and the number of hospital beds that were
required and of course people would just game the metrics they would just take the wheels off the trolley beds so that they counted as extra beds it completely distorts the system and
unfortunately it removes any intrinsic motivation for people to perform better at work objectives can be tyrannical like the interesting things are actually at the high level not the low level
like what are the incentives that actually help something to like move in the direction of learning the thing i wanted to learn um and everybody has intuitions about incentives in education as well
teachers should be allowed more autonomy to follow their battle tested instincts about which methods foster deep learning and understanding of the materials and obviously i mean
the other type of deep learning the real the real one um but the lesson is that silencing diversity um and divergence of approach is a surefire way
to slow down progress and i mean diversity of thought deception is a concept in search but it's also just a general concept in life but it's maybe not recognized how serious the problem is
in other words you're following a path where some performance metric is going up but you're actually going in the wrong direction this happens all the time it's called deception it happens in machine learning too and
so it leads to what's called like local optima and people know about this but what people didn't really recognize and appreciate is just how profound deception is stepping stones are important the path
to greatness is actually unknown stepping stones are portals to the next level of possibility and we have to find these stepping stones if you're a freshman in college there
may be a path to making a million dollars before you reach 30 but what you should do first is probably not so obvious in other words you have
to search for the right stepping stones and if you're lucky and clever enough you might discover the ones that lead to the objective there may be a number of stepping stones you have to cross
and many of them are likely challenging to figure out stepping stones which lead to ambitious objectives tend to be pretty strange in the sense that
they don't resemble the final end state at all for example vacuum tubes led to computers few saw that one coming microwave ovens
came from radar technology it was a lucky accident when percy spencer first noticed the magnetron melted a chocolate bar in his pocket in 1946
the internal combustion engine led to aeroplanes you see the thing is creativity itself is a search problem if we're searching for an objective
then we must be searching through something the space of all possible things in a sense the places that we visited whether it's in our lives or in our minds
are stepping stones to new ideas one of the key concepts in this paradigm of search is deception the measures we use to help us search
actually blind us to the stepping stones we should actually take which is to say quite clearly objectives themselves block discovery
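The idea that a rising measure can lead search away from the best outcome can be made concrete with a toy example. What follows is a minimal sketch, not anything taken from the book or the show: the `fitness` landscape, its peak heights, and the `hill_climb` helper are all hypothetical, chosen only so that a small peak sits between the start and the much higher peak.

```python
def fitness(x: int) -> int:
    """Hypothetical deceptive landscape on integer positions 0..20.

    A small peak at x=5 (height 5) lies between the start and the true
    peak at x=20 (height 20). Reaching the true peak means first walking
    downhill through the valley at x=10.
    """
    if x <= 5:
        return x            # gentle climb to the local peak
    if x <= 10:
        return 10 - x       # fitness falls from 5 down to 0
    return 2 * (x - 10)     # steep climb to the global peak


def hill_climb(start: int, lo: int = 0, hi: int = 20) -> int:
    """Greedy objective-driven search: only accept strictly better neighbours."""
    x = start
    while True:
        neighbours = [n for n in (x - 1, x + 1) if lo <= n <= hi]
        best = max(neighbours, key=fitness)
        if fitness(best) <= fitness(x):
            return x        # no neighbour improves the objective: stuck
        x = best


if __name__ == "__main__":
    stuck_at = hill_climb(0)
    print("greedy search from 0 stops at", stuck_at, "with fitness", fitness(stuck_at))
    print("the best position is 20 with fitness", fitness(20))
```

Starting from the left, every step that improves the objective leads to the small peak, and every step toward the real peak looks like a mistake. The stepping stones through the valley do not resemble the final result when judged by the objective alone.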
it's almost impossible to shortcut the process of innovation take human intelligence for example we know that single-celled organisms did evolve
over billions of years into human beings so all you need to do is accelerate this process right so just select parents that look
increasingly human and voila you've done it before you know it right you've just created humans but you'd be wrong so wrong your pants are on fire in fact
i think it wouldn't work if that was what you were trying to do whatever that would mean like for example like if we started evolution on earth with a single cell where presumably it started and we said all right let's get human level intelligence let's start
let's let's start working on this so let's give them iq tests you know like we're gonna we're gonna just go straight to let's go straight to the heart of the matter let's not go through all these silly things like flatworms and stuff like that well i mean that wouldn't work obviously
the colony would die when you start throwing it at a iq test just ridiculous the problem is that the stepping stones to intelligence do not resemble intelligence put another
way human level intelligence is a deceptive objective for evolution millions of years ago our ancestor was a flatworm it wouldn't score any accolades for its intellect i can tell you that for now
but its one great achievement was its bilateral symmetry the stepping stones closest to the shore fade gradually as they wind into the fog you come upon
a fork where a choice must be made because of the fog you don't know where the path leads even though the mind is a powerful force in searching it's still difficult to see
further than one stepping stone away no matter how intelligent you are we think it's always best to take the road that heads in the direction of your desired destination but when the objective function is a
false compass it's called deception which is a fundamental problem in search and in machine learning because what what do machine learning algorithms do right they basically just take a gradient
of the direction that we're optimizing in and they go in that direction so these algorithms they suffer terribly from this false compass in a way it's a similar concept that some of you folks might be aware of
which is the exploration versus exploitation trade-off in reinforcement learning but of course kenneth stanley is talking about something completely different he sees that kind of exploration as
basically a complete waste of time it's just a random search which isn't really guided by any notion of interestingness in gpt-3 we do a greedy search right for
every next token that we're predicting we have a probability distribution and we just greedily take the ones that appear highest probability now but probably we trace a path which will
give us cycles or give us things which are completely crazy because we don't have that view into the fog so to speak this is what deception is okay so let's look at an example of
deception so we're in a maze we're an agent we take actions in an environment there's a state action space and let's say we need to learn some policy using some kind of algorithm now kenneth
would argue that almost every single problem has deception and the deception is quite simply this i have some kind of a reward which i'm trying to optimize on monotonically and
that reward is my distance to the goal state so clearly i hit a wall i can't intersect the wall i can't go any closer and now in order for me to reach the real goal state i need to
actually make my reward look worse for a significant amount of time now currently we use things like exploration and exploitation to solve these problems but kenneth would argue that exploration is convergent it doesn't really
have any kind of gradient of interestingness so in kenneth's work essentially he relies on this concept of novelty search because novelty is is a proxy for interestingness and novelty is an information
accumulator which you can directly optimize on but you know the key concept we want to get across here is that there is more to deception than just the structure of the maze the search space is not actually the maze itself right the search space is
the uh the space of neural network topologies underneath inside your model would it be possible for us to define a notion of what interestingness or
curiosity actually is it's clear that curiosity is is fundamental to early development and learning and and becoming intelligent as a human is
it's not clear how to actually formalize or motivate curious behavior because it's not really clear how to formalize what's interesting and so i don't always think of it in
terms of an objective okay now the kid has a new objective jump on a log and now the kid has an objective to play with that doorknob or something like that so much as the kid has some notion that certain things are interesting and it's like a
very perceptive notion somehow and we all have this but it's different in all of us too we're not all we don't all agree on what's interesting and so if we want to have curious
systems at all the different range of human intelligence from like a baby up to an adult like a mature expert in some field we have to grapple with this very subjective notion of what is interesting
and that means that we have to grapple with subjectivity which we don't like as scientists like benchmarks and stuff we like being very objective about where we're trying to go and so to deal with subjectivity it
means that at the ground below a lot of our assumptions are going to be something that that can't be motivated objectively something that is ultimately just something that we believe
and we have to somehow figure out how to get the ai to align with that because if we just say go off and do things that are novel in some completely undefined way a generic way then um it'll go deviate off into a
space that we don't find interesting how do we align that notion with the things that we actually find interesting so that it can plunder the space that we want to exist within and so what does that mean i guess it means that there are things
that we have the capacity to do as individuals and together that that we we need help to do but we can't do without some help for example i know great art when i see
it but i can't produce great art i know great music when i hear it and somehow these kind of latent capabilities that we have maybe could be made more explicit if we had the right kind of assistance
and so in some way i think that what ai can do and would be really exciting if it would do is just amplify us in our capabilities and give us the ability to express ourselves so kenneth has pioneered this
field of open-endedness in artificial intelligence what exactly do we mean when we talk about open-endedness open-endedness is not how to learn something like how to learn how to do
some particular task so rather it's really you can think of it as how to learn everything there are properties of evolution evolution in nature that are just so profoundly powerful and
are not explained algorithmically yet because we cannot create phenomena like have been created in nature if you think about something like the history of all of human invention
now that is beyond interesting and that is in some sense an algorithmic process a process that continually increases in diversity and complexity without bound virtually forever
in fact there are a number of different kinds of open-ended processes that we can observe in nature there's the process of human innovation which i'm alluding to but also like natural evolution for example which is
not human driven that's also an open-ended process it's the ongoing creation of all the diversity of life on earth so if you think about open-ended processes they're not about like a
single positive result it's really not about any particular result at all it's really about surprise like an ongoing cacophony of surprises as we tried to argue there's a parallel between our own
history of innovation and evolution i think we're gonna continue to pursue these kind of algorithms and also hybridize them with others including learning algorithms because after all that is the world in which we live where there's
an outer loop of evolution and inner loops of learning this is the tree of life and this is a single run so to speak of an algorithm called evolution which generated
all of life on earth this is quite incredible so the process itself is is a process of endless surprises we can't really predict where it's going to be going in the future and we couldn't have predicted where it
has been and look at the kinds of things that it produces along the way like it invented all these things like it invented photosynthesis and invented the flight of birds it invented
of course intelligence itself all in one single run to get to these really really far away branches of the tree of life you cannot have an optimization process
you have to have an open-ended process but those branches are worth getting to and what's interesting is we don't have an algorithm like that including in evolutionary computation it's not something that invents one thing it's something that invents
everything this is not only an explanation for our actual existence because it preceded us and created us but it also seems deeply entrenched in our nature we don't actually know what
is the property of our universe that has facilitated something that could go on for more than a billion years of creativity you could come back a billion years later and still be surprised at what it's creating
we don't create systems like this like none of the domains or artificial systems we create have this property so you're just putting algorithms aside for a second this is a fertile area for research to
understand open-ended environments where the surface is still barely scratched to understand human level intelligence we are going to need to understand creativity that's a big part
of what being intelligent means from a human level is our creative aspect we may need to implement and run some kind of creative process or open-ended process
that produces this general intelligence we may not be able to get to this really high level of intelligence without some kind of process that's creative generating the intelligence
itself what kind of process explains open-endedness and complexity explosions like things where things just keep on getting more and more complex up to really an astronomical level like
the human brain with its 100 trillion connections one of the most important ingredients is divergence it's really important because a lot of the time we're worrying about convergence all the time
converging towards the optimum converging towards the objective minimizing this loss but but if it isn't but for open-endedness if it's important to get divergence then a lot of our intuitions that we've been building up
are not relevant to that kind of a process because divergence is like the opposite of convergence the system has to generate the problems and the solutions and they have to be self-generating since no one can anticipate a curriculum
all the way up through a billion years it's the system's job itself to generate that curriculum and this is the key to earth's open-ended creativity it's generating both the problems and the solutions
and that's an interesting thought because where are the problems coming from that we're solving well they're coming from each other the thing is we are both the problems and the solutions like we as the
organisms that are being evolved because the existence of anything creates the opportunity for something else like the existence of trees now makes it possible for us to have giraffes and giraffes are solving a problem
but the problem was actually created by the trees which is how do you get to the leaves of the trees when they're high up and so all of us are creating or maybe i shouldn't even call it problem opportunities for each other the fact that i'm standing up here gives you an
opportunity to learn something but also the fact that you're in the audience gives me the opportunity to talk to you and so forth and so we're all creating opportunities for each other all the time and that is the source of why this process can go forever
and we really would love to produce a truly open-ended algorithm and the space of problems is a lot different from the space of solutions isn't it like the space of solutions is we've been grappling with that for a long time and like deep learning is all about a space
of solutions they happen to be in neural network space it's a neural network space but the space of problems is much harder to understand then how do you represent problems as opposed to solutions and what keeps them interesting and
meaningful forever perhaps to achieve human level ai we must travel the road with no destination there are many destinations that are reachable through no other process
than not trying to reach them and therefore we have to confront the grand challenge of open-endedness and i want to highlight that grand challenge of open-endedness i believe this is a challenge just as worthy as ai of huge amounts of
resources if we could conquer open-endedness and create algorithms that create forever we will have opened up a massive sphere of possibilities that's currently not available to us the power of creation is the power of
open-endedness so technically this all started with yannic lightspeed kilcher at icml 2019 yannic went to their tutorial you know we've got kenneth stanley and joel
lehman talking about open-endedness and poet etc and it inspired yannic to make a video on it on his youtube channel which was quite nascent back in summer of 2019 we're not actually talking about an
optimization problem because there's not a particular problem that we're trying to solve rather we're inventing problems and solving them at all times and there's not anything specifically that we're trying to solve the less you have an objective so you
can kind of think about that like if you have millions and millions of objectives you imagine millions of people with different objectives the whole system in aggregate starts to be less objective i was kicking myself actually because
annoyingly i didn't meet yannic at icml 2019 even though we were both staying in the western hotel but yeah i would have found that absolutely fascinating and fortunately i found out about it vicariously later through connor shorten
and obviously meeting yannic this is what yannic had to say at the time so i was pleasantly surprised by this tutorial because i knew almost nothing about these techniques and they seem really cool seems to be a
really cool line of research so started out with what is population based search and basically in population based search you don't want to just reach
one solution of a problem but you want to kind of maintain a population of solutions that you develop over time so this is jeff cloone
presenting at icml 2019. now jeff cloon is incredibly well known in the open-endedness literature and if you haven't checked out his work make sure you do he's a great guy so our most recent algorithm here is
this poet algorithm which stands for the paired open-ended trailblazer and there is a pair of things happening there's going to be an algorithm that's generating problems or challenges or training environments
an observation they make is if you look at nature and natural evolution it is very successful even without a goal so there's no goal in mind to
natural evolution except reproduction creates other reproduction but is not a goal that's um that's simply a kind of underlying mechanism
what if we build a search algorithm that only wants to create novel things so where kind of novelty is the only goal
uh what happens then this is a clip from our first ever episode of machine learning street talk we were so excited about this connor shorten recommended that we check out this poet paper the paired open-ended trailblazer
we're going to talk about uber's recent paper the enhanced poet open-ended reinforcement learning through unbounded invention of learning challenges and their solutions
by wang et al now this paper introduces a new paradigm of machine learning which is called ai generating algorithms or open-endedness
it's all one single run of the same algorithm and it doesn't really have a goal in mind so open-ended algorithms are like that they kind of define an
interesting notion is it still interesting if we were to just let it run for a billion years like would it still be interesting if yes consider it an open-ended algorithm
you might even be asking yourself why should we be looking at population-based methods to learn let's say uh policies instead of reinforcement learning algorithms the reason they're not as well known is
because they are population-based methods and traditional machine learning community does not deal with populations of agents but a single agent trying to solve a task but we think that the ideas and the algorithms and the insights behind them
are very broadly applicable what is the difference between the problems that evolution is better on and the problems that deep rl is better on in terms of surprising results probably the most
shocking result that we found was that there were a few games where random search so not a ga just completely random search in this huge high dimensional network space
was actually better than some very good well-respected deep reinforcement learning algorithms okay like we already know all about evolutionary algorithms you would have learned them in your
computer science degree it seems like we've already got this problem nailed right wrong and we also tried to propose some enhancements to like modern kinds of
evolutionary algorithms that like sort of protect them from some of their criticisms about evolution like for example that well mutation is completely random so it's unprincipled compared to sgd which is actually
looking at the gradient and following that in some kind of principled way so we introduced like new kinds of mutation operators that actually are gradient informed mutation operators so
it's kind of mixing paradigms in a way to make the point that you know when we criticize evolutionary algorithms because like say it's just random or something like that it's kind of a myopic view
the main thing that current genetic algorithms do not have is divergence what they do have is diversity they maintain a separate population of individuals and then they do crossover
and random mutation and all that good stuff but the fascinating thing about real evolution is divergence and one characteristic of that is that evolution is continuously
providing new problems and solutions to help agents if you like maximally exploit the environment that that we're in so divergence is super important and then you might think well does that
imply that genetic algorithms now are convergent yes they are and one of the main reasons they converge is because they have one fixed objective which means if you run it for a significant amount of or not
maybe not even a significant amount of time if you run it it will converge on a kind of local optimum and then that's it it will no longer continue to improve there's something wrong here because
evolutionary algorithms what we're just talking about here they also converge they converge and then they basically give you your solution or they get stuck like any other machine learning algorithm and then that's the end
and it basically by the way is the same story with deep learning or anything basically they find what you're looking for maybe they find a few things you're looking for then it's the end or if it doesn't work out well they
don't but either way it's the end natural evolution is still diverging and doing something interesting interesting is the key word right because divergence just for the sake of divergence that could be the worst thing
in the world we don't want to be doing completely random things that aren't interesting so what does interesting mean well interestingness simply means that you're accumulating information right the amount of information that you
have is growing and what does that information mean or do well it could be anything it could be helping the agents in your world exploit the environment better novelty is kind of a proxy for it
novelty is not equivalent to interestingness interestingness is richer and also harder to formalize but novelty is kind of close to what interestingness is and so the idea is
what if you only followed gradients of novelty and novelty is just really fascinating as a gradient you know because it's actually very well informed because it's basically a comparison of where you are to where you were in the
past and so it's actually just really the opposite of an objective because an objective is comparing where you are to where you want to be in the future but neither has more information like people have this intuition when they hear about this that novelty search is
kind of random but objective search is like not random because you're comparing to the objective but that couldn't be further from the case i mean there's actually more information about the past than the future because i actually was there so i know what i actually
had so i can compute novelty with respect to some record of where i've been in the past and have a very defined gradient but the really interesting thing about information accumulation as opposed to any other objectives
is that there is one direction right it does monotonically increase information accumulation is the only objective which does increase monotonically but we shouldn't think of it as an objective the whole point of stanley's
conception is that there is no one objective it's more like meta-learning where there are an infinite sea of objectives which are learned through some kind of
search process right and the fascinating thing about these systems is that when you have the diffuse diversity of objectives then in a sense there is no objectiveness to the system at all
because it becomes completely fluid and for that reason it can accumulate information and it can diverge continuously what it illustrates when you sometimes win with novelty is just
how profoundly weak objectives are in terms of like really being a good guiding light you know they're just so much less than you think they are it's an absolute embarrassment for objectives that something that doesn't even know what problem it's trying to solve
is doing better at solving the problem than the objectively driven version of the same exact algorithm the other fascinating thing that we explore in the show today is that our existence is not entirely explained
by selection so we've been thinking about this entirely the wrong way it's actually the operating search structure which is exploited by evolution that's what makes it so interesting if you really think you understand
something from an ai perspective then you should be able to formalize it and write it down and then it should give me what i'm asking for so if that's all evolution is then let's just write it down and then we'll have open-endedness and you've
just revolutionized the field and which is why i think it probably is not just that it's probably not just anything that anybody here thinks it just is you may have read your biology textbooks and you may think that you know what
evolution is and it may make sense but actually i can almost assure you that there's going to be a few counter-intuitive kind of curveballs that are thrown into that story
that we learned in high school which aren't in the textbook because if there weren't we would just take the textbook and formalize it and it's not that simple and people been trying to do this for 50 60 years so evolution is a very
complicated and subtle process and what we mean by it is different than just optimization if we're talking about the open-endedness aspect of it which is probably the more interesting aspect of it if you're just saying from an optimization you know perspective like
how can i explain why hyenas are as fast as they are okay there maybe you can give me an explanation but that's not the same as sort of formalizing the whole thing as a process that's going to produce this kind of
grandiosity of nature on earth open-ended search is definitely a cool research direction and i encourage you to check it out all of these different challenges that
you're seeing here were invented by the algorithm as it's running and also the agent that is solving that problem was also invented at the same time and over time these environments get more and more complicated
we did the counterfactual we went back to the 298 agent and we just ran it forever we gave it tons and tons of computation and it was stuck on a local optimum it never stood up if you just let that thing run forever
it only gets a score of 309 and it keeps that knee on the ground for whatever reason because the landscape is deceptive so counter-intuitively in this particular problem you had to go from a simple problem
to a harder problem and then come back to the simple problem in order to get the right solution to the simple problem which is just a curriculum that a human would never design but these algorithms figure out on their own which is the beauty of
these things just dropping an agent in that environment and trying to solve it because there's no curricula to get there or by taking kind of like what you would see is the direct path here are some solutions that traditional optimization
cannot solve but poet does find a solution to some of these really really crazy environments any of these really challenging problems when you're going to go back and look at it you cannot directly optimize to solve
that problem so very quickly poet is kind of quality diversity plus plus because as ken said it breaks the mold it now can generate its own niches in addition to trying to have all the other properties
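The loop described in these clips (environments invented during the run, each paired with an agent, plus agents moving between environments) can be sketched in a few lines. This is a toy illustration under heavy assumptions, not the POET implementation: an environment is reduced to a single hypothetical `difficulty` number, an agent to a single `skill` number, and `score`, `optimize`, `poet_sketch` and every threshold are made up for the example.

```python
def score(skill, difficulty):
    """Toy reward in [0, 1]: fraction of the challenge the agent can handle."""
    return min(1.0, skill / difficulty)


def optimize(skill, difficulty, rate=0.5):
    """One local improvement step inside a single environment."""
    return skill + rate * max(0.0, difficulty - skill)


def poet_sketch(iterations=20, max_pairs=5):
    pairs = [(1.0, 0.5)]  # each pair is (environment difficulty, agent skill)
    for _ in range(iterations):
        # 1. Each agent improves inside its own paired environment.
        pairs = [(d, optimize(s, d)) for d, s in pairs]

        # 2. Well-solved environments propose a harder child environment.
        #    The child is admitted only if it is neither too easy nor too hard
        #    for the current agent (an assumed minimal criterion).
        known = {d for d, _ in pairs}
        for d, s in list(pairs):
            child = d * 1.5
            not_too_easy_or_hard = 0.3 < score(s, child) < 0.9
            if score(s, d) > 0.9 and child not in known and not_too_easy_or_hard:
                pairs.append((child, s))
                known.add(child)

        # 3. Transfer: every environment adopts whichever agent scores best in it.
        skills = [s for _, s in pairs]
        pairs = [(d, max(skills, key=lambda s: score(s, d))) for d, _ in pairs]

        # Keep the population bounded by retiring the oldest pairs.
        pairs = pairs[-max_pairs:]
    return pairs


final_pairs = poet_sketch()
print(sorted(d for d, _ in final_pairs))
```

In this sketch the transfer step is where an agent trained on a harder environment can be carried back to an easier one, which is the kind of curriculum the clip says a human would be unlikely to design by hand.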
is it truly open-ended well no it's not because well it's definitely a step closer because it's generating its own niches it's currently limited by the fact that we picked a specific physics simulator and a specific way to parameterize
environments in that physics simulator the stepping stones that lead to where you may want to go are only accessible to you if you open your mind and are willing to actually accept things that don't look better objectively so i like to give the example walking
like on the road to learning to walk you have to learn about oscillation but if you just start oscillating your legs while you're standing or something like that you'll just fall on your face but if i look at it from like is that interesting it's very interesting it's a
completely new idea i've never seen that before so in the novelty version of the world i would be like okay let's go down that path i don't know where this leads but it's something i haven't seen before but in the objective version i'm like nope try something else so what will i
try like i'll try like lunging as far as i can or something that's in the short term like looks like i'm actually making progress but in the long run it's going nowhere because you're not going to be able to walk based on just like throwing your body forward as fast as you can
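The contrast between following an objective and following novelty can be made concrete with a small sketch. It assumes a made-up one-dimensional landscape with a deceptive local peak; `fitness`, `neighbours`, `objective_search` and `novelty_search` are hypothetical names, and real novelty search works on populations and richer behavior descriptions than a single integer. Novelty here is the mean distance to the k nearest points already visited, so it only looks at the past.

```python
def novelty(behavior, archive, k=3):
    """Mean distance from `behavior` to its k nearest neighbours in the archive."""
    nearest = sorted(abs(behavior - past) for past in archive)[:k]
    return sum(nearest) / len(nearest)


def fitness(x):
    """Deceptive toy objective: local peak at x=3, global peak at x=20."""
    return 10 - abs(x - 3) if x <= 10 else x


def neighbours(x, lo=0, hi=20):
    return [n for n in (x - 1, x + 1) if lo <= n <= hi]


def objective_search(start, steps=50):
    """Greedy hill climbing on the objective; stops at the first local peak."""
    x = start
    for _ in range(steps):
        best = max(neighbours(x), key=fitness)
        if fitness(best) <= fitness(x):
            break
        x = best
    return x


def novelty_search(start, steps=50):
    """Always adds the most novel reachable point; never consults fitness."""
    archive = [start]
    for _ in range(steps):
        reachable = {n for x in archive for n in neighbours(x)}
        candidates = sorted(reachable - set(archive))
        if not candidates:
            break
        archive.append(max(candidates, key=lambda c: novelty(c, archive)))
    # Fitness is used only afterwards, to report the best point that was visited.
    return max(archive, key=fitness)


print(objective_search(5))  # stuck on the local peak at 3
print(novelty_search(5))    # visits the global peak at 20
```

On this toy landscape the objective-driven climber stops on the nearby peak, while the novelty-driven search passes through the low-scoring valley simply because it has not been there before.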
but the larger lesson is just like let's take seriously other gradients like the gradient of novelty so this serves as an interesting lesson for gofai advocates good old-fashioned ai there
are many good old-fashioned ai folks challenging machine learning and the statistical approaches to ai saying that knowledge is universal and should be explicitly constructed
but what about knowledge which is not yet in our possession kenneth and joel's poet paper demonstrates explicitly that there's plenty of knowledge about existing earthly systems
which we don't fully understand we understand the what but not the how the how will only become known possibly through some open-ended discovery process
because the stepping stones to that final knowledge you know the how will not in any way resemble the current what that we have in our minds so in a sense
knowledge engineering is only possible if you understood it in the first place if we understood how to build a compiler for language then natural language understanding would be solved
we don't understand it that's the whole problem right it's the same thing convolutional neural networks they started dominating the computer vision world after 2012 and it was precisely because the
representations learned as part of the training process were better than any handcrafted features it just it was a complete game changer and even now we don't fully understand how they work we have to
resort to all sorts of esoteric methods to do interpretability on these models but that's just the way it is but have you noticed now that gofai people they recognize neural networks as being part of the
architecture for any agi system they use the term perception right so perception means a module which cannot be explicitly crafted with
knowledge because it's not understandable yet by humans even bayesian aficionados they point out the lack of causal inferencing in these neural network models they just say look
they are statistical correlation machines that there's no causal factors and yeah in a sense they're true but i'm not really sure their alternative is better
what they are proposing is that you impute the causal factors into the model architecture explicitly right so you already know what the causal factors are
you know that male testosterone causes car accidents so you you build a probabilistic graphical model that has that causal factor in well yeah that's great if you know what the causal factor is right
what about all of the causal factors out there that we don't know you know bayesian approaches are not going to help you then are they what we need to have is is a new way of
discovering that knowledge and potentially kenneth stanley's approach might be leading us in that direction right because stanley instead explores the seemingly
infinite space of knowledge which is not yet known um you know interestingly he focuses on information accumulation as being one of the core kind of tenets of his paradigm
and in recent years also discovering new problems and new solutions in tandem rather than you know solutions to one specific task okay so an amazing blog dropped this morning by
this guy mark saroufim actually we're getting him on the podcast at the end of february so you can look forward to that but he just wrote this wonderful article called machine learning
the great stagnation right and i'm not going to spoil the surprise too much because of course we'll be talking to mark but he leads by saying academics think of themselves
as trailblazers explorers seekers of the truth any fundamental discovery involves a significant degree of risk if an idea is guaranteed to work
then it moves from the realm of research to engineering unfortunately this also means that most research careers will invariably be failures at least if
failures are measured by objective metrics like citations so the construction of academia was predicated on providing a downside hedge
or safety net for researchers where they can pursue ambitious ideas so mark points out that academics sacrifice material opportunity costs in exchange for intellectual freedom
society admires risk takers for it's only through their heroic self-sacrifice that society moves forward he points out that most of the admiration and prestige that we have
towards academics are from a bygone time uh economists were the first to figure out how to maintain the prestige of academia while not taking any monetary or intellectual risk they'd show up on
cnbc finance talking about corrections or irrational fear and exuberance so mark says that it's hard to point the blame towards any individual researcher after all while risk is good for the
collective it's almost necessarily bad for the individual however this risk-free approach is growing in popularity and has specifically permeated machine learning
right the faang salary with an academic appointment is the best job available in the world today and with all of the sota chasing we're rewarding and lauding incremental researchers as innovators
increasing their budget so that they can do even more incremental research parallelized over as many employees or graduate students that report to them
what exactly are you doing yeah yeah i've just been really busy chasing sota so he's fundamentally saying that you know machine learning researchers now can engage in risk-free high-income high
prestige work and they are today's medieval catholic priests this is an absolutely hilarious article i mean i'm not going to spoil all of it for you but i'll do a bit more so machine learning
phd students are a new kind of investment banking analyst in mark's opinion both seek optionality in their career choices but in different superficial ways like preferring meditation over parties and
marijuana and adderall over alcohol and cocaine you know so he points out that a machine learning phd is now kind of like an extended interview for the tech industry whether it's facebook or microsoft
and google the entire data science interview process at the larger labs has now just become a mix of trivia and prestige you know checking out your portfolio would take far too long it's far easier just to say well you know have you graduated from
stanford or do you have a paper co-authored with google brain that's a good filter so yeah he says that you know matrix multiplication is all you need he's got some absolutely hilarious memes
here so you're telling me it's all matrix multiplications always has been yeah some of these things are brilliant rise of the transformers attention is all you need
transformer on proteins transformers on molecules transformers on images fast transformer transformers are graph neural networks learning to transform with transformers oh this is hilarious as well so he's got this concept called graduate student
descent and the death of first principles this chart below describes how graduate student descent works so essentially you initialize you find the sota on arxiv
you find the code on github if worse you make random changes and then you publish so this is uh it's a bit of a piss take but to be honest it's not too far from the truth statistical learning gentlemen our
learner overgeneralizes because the vc the vapnik-chervonenkis dimension of our kernel is too high get some experts and minimize the structural risk in a new one rework our loss function make the next
kernel stable unbiased and consider using a soft margin language models in particular like gpt-3 are starting to feel a lot like the large hadron collider
but no language has ontologies which you're not capturing with self-supervised learning your approach isn't research because it's not sample efficient like humans you can't even interpret the weights and
the other was like as you go hilariously he points out that bert engineer is now a full-time job qualifications include and i'm guilty of this by the way some
bash scripting check deep knowledge of pip waiting for a new hugging face model to be released check watching yannic kilcher's new transformer paper the day it comes out
guilty as charged repeating what yannic said at your team reading group guilty as charged it's kind of like devops i'm all over that by the way but you get paid more
yeah so i mean you know obviously we've got to be able to take the piss out of ourselves a little bit in the machine learning community that you know you you could say that this is just posting but um it you know there's there's
obviously a large kernel of truth to what mark is saying yeah um so in parting at the end he says that he's sad to see that the most exciting work in machine learning is coming from outside of machine
learning he says he spent 10 years in this field and he learns more from the crackpot outsiders on twitter today than he does from peer-reviewed papers and i think that that says quite a lot really anyway
we're talking with mark at the end of february so i hope you look forward to that it's quite an interesting piece here as well about the biontech vaccine which was
recently approved and this lady katalin karikó she pioneered this work into mrna therapeutics and vaccines going back many decades but the problem was that this article tells
a really fascinating story about when she was at her lowest ebb she was a biochemist at the university of pennsylvania and she dedicated like the previous two decades to finding a way to turn
one of the most fundamental building blocks of life mrna into a whole new category of therapeutics and she found herself hitting lots of dead ends to cut a long story short there was a
problem with using mrna because there was an immune response from humans and that needed to be attenuated so um her bosses just ran out of patience she got demoted and she
basically got told that you know she couldn't work on this anymore and this is a wonderful example of group thinking objectives in a way because we can't trust innovation centers or
universities or companies to innovate because to innovate you have to basically explore into the unknown and this was a wonderful example of you know luckily this woman had the
initiative and the persistence and the perseverance to continue with her dream she really believed deep down that she could make this work and she placed herself at considerable personal
hardship and risk to make it happen and then because of this incredible work from this woman we now have another vaccine for coronavirus which we wouldn't have had
otherwise so it goes to show that we really need folks out there like katalin who pursue their passions and interests
and innovate for the benefit of all of us so picbreeder was influential in the genesis of a lot of the ideas that we've been discussing today is for the audience like where does all this come from
this was like a website where you can just breed pictures but the kind of clever twist of that was that if you bred a picture basically you would be evolving the pictures like you choose a picture and it would have
babies and then you could choose the ones that you like and they would have children and so forth and then you could like publish your picture on our website that you found by evolving it but this clever twist is that you could then
someone else could come in see your picture and then branch and start evolving from there so we had people evolving from where other people left off and so it was kind of experimenting at some level
so it's sort of like it's like what you might think of if you're breeding horses or dogs or something like that but in this case it's just pictures and so you could go online and see some blobs and you could choose the blob you like and it would have children that look kind of like it but are a little
bit different just like if you're breeding a horse or a dog or something like that but the key thing that makes it sort of special is that and sometimes this is called genetic art by the way some people may have heard of that or evolutionary art but the thing that made picbreeder
special is that it was a website so if you did discover something or breed something that was cool you could publish it press a button that says publish and it would go back to the site and then somebody else could come in and then
breed from there so they start out with very simple pattern and you just have the opportunity to kind of you pick one and it gives you a bunch of random perturbations of the procedurally
generated image and you pick the ones that you like and then you continue exploring from there and if you're happy you can just save that to the database and someone else can look through the database and then
pick yours for example to continue so people were building on things that people built on that people built on that people built on it's basically standing on the shoulders of your predecessors
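The breed, publish and branch cycle described here can be sketched as a small data structure. This is a rough illustration under stated assumptions, not how Picbreeder is built: the real images come from small evolved neural networks, whereas the `render` function below is a fixed formula driven by a three-number genome, and `Gallery`, `offspring`, `breed` and the two "taste" functions standing in for human choices are all hypothetical.

```python
import math
import random


def render(genome, size=8):
    """Turn a genome into a small grid of brightness values (stand-in for an image)."""
    a, b, c = genome
    return [[math.sin(a * x + b * y + c * x * y) for x in range(size)]
            for y in range(size)]


def offspring(genome, rng, n=6, sigma=0.3):
    """Children look like the parent but are each a little bit different."""
    return [[g + rng.gauss(0, sigma) for g in genome] for _ in range(n)]


def breed(genome, pick, rng, generations=5):
    """Repeatedly show children and keep whichever one `pick` prefers."""
    for _ in range(generations):
        genome = pick(offspring(genome, rng))
    return genome


class Gallery:
    """Shared site: every published picture remembers which picture it branched from."""

    def __init__(self):
        self.entries = []  # (genome, parent index or None, author)

    def publish(self, genome, parent, author):
        self.entries.append((genome, parent, author))
        return len(self.entries) - 1

    def lineage(self, index):
        chain = []
        while index is not None:
            chain.append(index)
            index = self.entries[index][1]
        return chain[::-1]


rng = random.Random(0)
gallery = Gallery()
seed_id = gallery.publish([0.0, 0.0, 0.0], None, "site")

# Two users with different (made-up) tastes; the second branches from the first.
alice = breed(gallery.entries[seed_id][0], lambda kids: max(kids, key=lambda g: g[0]), rng)
alice_id = gallery.publish(alice, seed_id, "alice")
bob = breed(alice, lambda kids: min(kids, key=lambda g: abs(g[0] - g[1])), rng)
bob_id = gallery.publish(bob, alice_id, "bob")

print(gallery.lineage(bob_id))  # [0, 1, 2]
```

Storing a parent index with every published picture is what lets the gallery reconstruct the branching history of who built on whose work.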
and you end up getting this giant branching phylogeny of pictures and it was ultimately for me and i don't know it's up it's obviously subjective it was ultimately for me astonishing what people discovered
like these were generated by little tiny neural networks and we had skulls and butterflies and cars and stuff like that and so it turned out that yeah the notion of what's interesting which every person has a different
notion of was instrumental in allowing that process to happen in what i think is a completely phenomenal process of discovery because like the number of iterations involved in say finding something like the skull
was in the dozens like think about that compared to deep learning thousands or millions or billions of iterations this is like dozens of iterations to find things that are absolutely
extremely rare needles in a haystack what is the explanation for that indeed part of it is that we have a very very strong nose for the interesting as human beings and that it's diverse we're not all
looking for the same stuff and the things that the humans came up with or that the result of that was extremely interesting now something like
picbreeder is much more kind of like close to like actually following gradients of interestingness because it's human beings that are making the decision and humans arguably make decisions on what's interesting what is it that makes systems like picbreeder
work so effectively open-ended systems in general and what it is is basically it's the connection between one stepping stone and the next you're not just doing things that are random like people in picbreeder are not doing
things that are random it's not like just clicking randomly on images just hoping something good happens they're obviously doing things for some reason why are they doing them it's like if i click on this and then it leads to like some pattern like that's
symmetric or something and i like it it's not because i'm like oh well now i'm gonna get a butterfly it's because the pattern in its own right is just interesting to me and this is not like a random fact this is based on like the the full force of
all of my intelligence is still being put on to making this decision that i like this pattern so it's hardly a trivial gradient that i'm following that connection like if there's a very very salient stepping stone for you just for
you this could change your life like if you look at this picture of this butterfly or something like this it will not be true of the majority of people and so we will never get to follow down that
path if it's a majority rules type of voting situation we need you to be able to follow your special special instinct because that was the thing that could change only your life the system
is designed basically to try to raise things up that people find interesting the images that were discovered by pick breeder users were discovered by people who were not trying to discover them and that is because of deception i have
to make a strong argument that pick breeder actually is like real life and i do think it is and the reason that i think it is is because pick breeder is actually much much simpler than real life
like in pick breeder if you take any arbitrary image like the skull and you say okay well how would i get there from scratch like we don't really know how to get there from scratch that metric of improvement of like how close are we to a skull is
not the right one for following that trajectory through the search space that's basically the definition of a hard problem like if you knew what the stepping stones were it wouldn't be hard we would just follow the stepping stones of
course sometimes you have to be following the other by other i mean not the objective gradient not the thing that leads to where you're hoping to go but something that leads somewhere along a different line but which has value
anyway in every case where something really cool was discovered like say a car or a butterfly or something like that there is some predecessor image in other words an
ancestor where the person who discovered that ancestor was not trying to get the thing that was ultimately discovered and this is like 99.99 percent of cases it is an experiment in evolutionary computation
but it's also an experiment in mass level collaboration like huge massive collaboration but where there's no explicit collaboration so nobody's talking to somebody else like how am i going to work with you like who should i work with
it's just here's this artifact i discovered hand it off to the world and then someone else might take that baton and then move it forward again and we got really really large branching phylogenies is what you would call it
like trees of evolution because people would branch off of people off of people and we could actually show it as a graph like these huge phylogenies of pictures basically and just amazing pictures were discovered
um you know things that you wouldn't think that from just random blobs you could evolve things like cars and butterflies and skulls and all kinds of very recognizable objects were evolved
and certainly it's some kind of a democratization you know it's like humans in the loop so it's kind of different from neuroevolution since it's like a mass many people experiment but
it has this extra element which is that like there is a lot of potential for the combination of learning in humans um where we still are better at certain things
than a.i and this is what we call divergence in other words the system rather than converging to the uber image whatever that might mean is actually diverging across the space
of all that's possible and collecting stepping stones extremely valuable as an example of how an open-ended process can develop because we know every single choice that was made
you can only get to these things by not trying so pick breeder was a website that was launched a very long time ago now and i can't really demonstrate it properly because it's not working
uh without java applets and i don't think anyone uses java applets anymore so you can see the parents and the children so you could this particular one had this parent and
then that parent and then that parent and the fascinating thing with pick breeder is that quite often the parents do not even resemble the children
although in this case it kind of does quite interesting so when we go up the tree high enough we see that the parents were just basically blank images so there was a rating system on there you could see
which the best images uh were rated so there are foxes and ghosts and a lot of um kind of anthropomorphic images as well which shouldn't surprise you given that humans are
you know essentially supervising the process you can see the highest rated images here some of them are fascinating that looks a bit like a monkey that looks a bit like a baby's face some fruit
how are these things generated well it's it's easier than you think actually so this was the original java application that people used to use for pick breeder essentially it's a human augmented distributed evolution algorithm
and the population included individuals which were compositional pattern producing networks or cppns now it's not as complicated as it sounds they're right here actually so a cppn
is a neural network that takes in a spatial input and has an output in three color channels so we have an rgb now there are four inputs here i can't remember exactly what the four inputs are but basically think of it as an x
and a y maybe there's a bias in there or something but um what happens is the user clicks on the result so so this is the underlying pattern producing network and then there'll be
a result which is shown so for example those ones i just showed you there they all produce slightly different results and they produce different results because they're variations of each other they have a slightly different
topology or they might have different weights or activation functions the activation functions that are used are just an optional set as well so you can just check all of those off so the only thing
we're missing here really is how do we make that evolution algorithm produce mutations and crossovers in the neural network space because we're actually searching through the space of
neural network topologies if you think about it and that is using something called neat okay so a very long time ago now kenneth stanley pioneered this algorithm called
neat which is evolving neural networks through augmenting topologies and basically it was an evolutionary algorithm for neural networks a population-based method and what do you need to do to
implement an evolutionary algorithm well you need to know how to represent neural networks and cross them over and mutate them and there was also a challenge as well because as you'll remember from our
conversation with max welling you have these symmetries especially on topology right so you can have two graphs which are basically the same thing they are symmetry transformations of each other but the problem is when you
cross over those two networks they can kind of cancel each other out and this is what kenneth stanley called the competing conventions problem so we need to have a system that can solve that
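A minimal sketch of the competing conventions problem just described, under stated assumptions: the tiny one-input network, the `forward` and `naive_crossover` helpers, and all weights are hypothetical and only illustrate the idea. Two parents encode the same function with their hidden units listed in a different order, and a purely positional crossover produces a child that has lost one of the units.

```python
import math


def forward(hidden, x):
    """Hypothetical one-input network: each hidden unit is (w_in, w_out) with tanh."""
    return sum(w_out * math.tanh(w_in * x) for w_in, w_out in hidden)


# Same two hidden units, listed in a different order (a symmetry of the network).
parent_a = [(1.0, 2.0), (-3.0, 0.5)]
parent_b = [(-3.0, 0.5), (1.0, 2.0)]


def naive_crossover(a, b, cut):
    """Positional crossover: the first `cut` units from a, the rest from b."""
    return a[:cut] + b[cut:]


child = naive_crossover(parent_a, parent_b, 1)

print(forward(parent_a, 0.7) == forward(parent_b, 0.7))  # parents compute the same function
print(child)  # the unit (1.0, 2.0) appears twice, (-3.0, 0.5) is gone
```

The point of the sketch is only that position in the list says nothing about where a unit came from, which is why some historical marker is needed before two genomes can be lined up.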
the core concept with neat is how do you represent neural networks in such a way that you can mutate them later so we have the concept of a genome and this basically just you know is a list of all of the things
that's ever happened to a particular neural network so you can see the nodes and you can see the connections and what's interesting about this neural network is that um number five came along later so
originally two is connected to four so you can see here that number five got added and it's just like a stack you just add all of the innovations on forever it's a bit like a transaction log in a database or something like that you know you you kind of keep a record of
everything that happened before so number five came along and you can actually see here that we've got all of the innovation numbers in a monotonically increasing sequence and the interesting thing is that one of them has been disabled
well i wonder why that is well this particular one is going from two to four so from there to there and it needed to be disabled because now five is in the way so five came along and then you can see here
the sequence of things that happened so now we've got one going from two to five which is there so that was added and of course this was disabled straight away and then we've got from five to four
right and then the last one is from one to five so presumably five got added the original connection from two to four was disabled and then we had two new connections
from two to five and five to four so how do we do mutations well let's have a look so if we wanted to add a connection to this network let's say from three to five all we do is we add a new innovation
number on the end so innovation seven and we just say three to five okay easy what if we wanted to do something a little bit more complicated so we wanted to add in
a new node here six so this is a a node that didn't exist before well in order to do that we'd have to disable the original connection right from three to four because now we've got six in the way so
this one's a little bit more complicated so here we put an innovation eight in going from three to six and then nine going from six to four and we disable the old innovation from three to four so yeah and this is a
population-based method so of course we have many many different types of topology going on but they will all be sharing the same kind of genome if you like but of course they'll be specialized in the sense that
some of them will have additions in terms of innovations or some of them will have previous innovations disabled so you can you can imagine how we can make an algorithm to do this automatically because it's basically just
a random mutation so you just need to essentially come up with a place to put your new node which activation function do you want which weight do you want and so on so that's reasonably straightforward
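The genome bookkeeping and the two mutations walked through above can be sketched roughly as follows. This is a sketch under assumptions, not the reference NEAT implementation: the class and function names (`ConnectionGene`, `Genome`, `InnovationCounter`, `add_connection`, `add_node`), the starting genome and all weights are made up for illustration. Giving the new incoming connection a weight of 1.0 and the outgoing one the old weight follows the original NEAT paper as I understand it.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ConnectionGene:
    src: int
    dst: int
    weight: float
    innovation: int       # historical marker, never reused within this sketch
    enabled: bool = True


@dataclass
class Genome:
    nodes: set = field(default_factory=set)
    connections: list = field(default_factory=list)


class InnovationCounter:
    """Hands out monotonically increasing innovation numbers."""

    def __init__(self, start=0):
        self.current = start

    def allocate(self):
        self.current += 1
        return self.current


def add_connection(genome, src, dst, counter, rng):
    """Append a new connection gene with a fresh innovation number."""
    gene = ConnectionGene(src, dst, rng.uniform(-1.0, 1.0), counter.allocate())
    genome.connections.append(gene)
    return gene


def add_node(genome, old_gene, new_node, counter):
    """Split a connection: disable it, then add src->new and new->dst."""
    old_gene.enabled = False
    genome.nodes.add(new_node)
    incoming = ConnectionGene(old_gene.src, new_node, 1.0, counter.allocate())
    outgoing = ConnectionGene(new_node, old_gene.dst, old_gene.weight, counter.allocate())
    genome.connections.extend([incoming, outgoing])
    return incoming, outgoing


# Assumed starting genome: six innovations, with 2->4 already disabled by node 5.
genome = Genome(
    nodes={1, 2, 3, 4, 5},
    connections=[
        ConnectionGene(1, 4, 0.7, 1),
        ConnectionGene(2, 4, -0.5, 2, enabled=False),
        ConnectionGene(3, 4, 0.5, 3),
        ConnectionGene(2, 5, 0.2, 4),
        ConnectionGene(5, 4, 0.4, 5),
        ConnectionGene(1, 5, 0.6, 6),
    ],
)
counter = InnovationCounter(start=6)
rng = random.Random(0)

gene7 = add_connection(genome, 3, 5, counter, rng)           # innovation 7: 3 -> 5
split = next(g for g in genome.connections if (g.src, g.dst) == (3, 4))
gene8, gene9 = add_node(genome, split, 6, counter)            # innovations 8 and 9
```

Nothing is ever removed from the list; the old 3 to 4 gene stays in the genome with `enabled=False`, which is what makes it behave like the transaction log mentioned earlier.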
the next thing of course is in the world of pick breeder it's a human augmented process so so the humans basically pick the children and then the crossover is performed so how do you do the crossover well the beauty of having this
innovation system is you just kind of like line up all of the innovations and if there are any disabled so it's like an or on the disabled then the child is disabled so here for example
um if we cross over these two parents because one of these is disabled then that gene in the child must be disabled as well um if there are any gaps then it just gets filled in
and that's basically how you produce the offspring so this offspring gets the eight from parent one um everything else um comes from parent two and then if there are any disabled genes then
the offspring is disabled and then it produces this child topology so it's actually surprisingly simple and this is the you know the foundation of how pick breeder works but yeah this neat algorithm is
absolutely fascinating because probably most of you are reinforcement learning aficionados and uh you haven't really thought about using any other approaches for let's say building policies for controlling agents
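The crossover rule described a moment ago (line the genes up by innovation number, fill gaps from whichever parent has the gene, and treat "disabled" as an or across the parents) might be sketched like this. It follows the simplified description in this talk; the original NEAT paper is, as far as I know, more nuanced on both points (fitness decides which parent's unmatched genes are kept, and disabling is probabilistic). The `crossover` function, the gene class and the example parents are hypothetical.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ConnectionGene:
    src: int
    dst: int
    weight: float
    innovation: int
    enabled: bool = True


def crossover(parent1, parent2, rng):
    """Align two gene lists by innovation number and build a child gene list."""
    by_innov1 = {g.innovation: g for g in parent1}
    by_innov2 = {g.innovation: g for g in parent2}
    child = []
    for innov in sorted(by_innov1.keys() | by_innov2.keys()):
        g1, g2 = by_innov1.get(innov), by_innov2.get(innov)
        if g1 is not None and g2 is not None:
            chosen = rng.choice([g1, g2])          # matching gene: pick either parent
            enabled = g1.enabled and g2.enabled    # disabled in either -> disabled in child
        else:
            chosen = g1 if g1 is not None else g2  # gap: fill from the parent that has it
            enabled = chosen.enabled
        child.append(replace(chosen, enabled=enabled))
    return child


# Assumed example parents; innovation 2 is disabled only in parent 1.
parent1 = [
    ConnectionGene(1, 4, 0.5, 1),
    ConnectionGene(2, 4, 0.5, 2, enabled=False),
    ConnectionGene(3, 4, 0.5, 3),
    ConnectionGene(1, 5, 0.5, 8),
]
parent2 = [
    ConnectionGene(1, 4, -0.5, 1),
    ConnectionGene(2, 4, -0.5, 2),
    ConnectionGene(3, 4, -0.5, 3),
    ConnectionGene(3, 5, -0.5, 6),
]

child = crossover(parent1, parent2, random.Random(0))
for gene in child:
    print(gene)
```

Because the genes are matched by innovation number rather than by position, two parents with different topologies can still be recombined without guessing which connection corresponds to which.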
and i think this is a fairly cool example on github i found where someone had used neat to learn a policy to control the game of snake and this was the topology which was
found you know through this population search and um it works incredibly well you know you don't necessarily have to use reinforcement learning which should be uh interesting for some of you to hear so how do i apply these
ideas in my life well there's a trade-off isn't there between things that you know will pay off investments of your time versus investing in things or creating a
fertile ground for discoveries so there's an interesting dichotomy between the manufacture and discovery of opportunities in our lives i got a message on steam the other day from someone who wanted to
play counter-strike with me now i've known this person for years presumably he plays counter-strike 20 or so hours a day um do i want to play counter-strike
no it's not interesting to me i i just i have a sense that it's not going to lead anywhere interesting i mean of course we don't really know where things lead but we have an innate sense for interestingness and it's different for
all of us perhaps if i wanted to get a career in esports or something like that then i would find that investment would pay off for me but i invest in other things you might ask
yourself why am i running this youtube channel i don't monetize it i don't get any money from it i don't get anything from it well that's not true at all right first of all it's incredibly serendipitous that i met
yannic it was icml 2019 i was there and yannic was there we were both staying in the same hotel but we didn't meet each other then in fact i didn't even go to the tutorial
on open-endedness and he did he made a video about it connor shorten then reached out to yannic because of that video and then i um happened on connor shorten's videos and he introduced me to yannic and we
decided to make this youtube channel together that was pure luck it was serendipity and i do this youtube channel because yeah it leads to so many interesting opportunities later i could build an
audience of people that trust me maybe i could monetize it later maybe it will lead to career opportunities it's a forcing function for me to get really good at all the latest and greatest in ai
it means that i can facilitate lots of conversations with people that i wouldn't otherwise have been able to do it's just it's an incredible um all-round fertile
kind of cultivation process for opportunities so this is open-endedness as manifested in my life i have a strong intuition that it will
lead to something very interesting later whether i'm learning more about machine learning and artificial intelligence i'm learning about creative pursuits whether it's making 3d renders or video editing or
graphic design my communication skills my written skills if you want to have an all-round kind of skill acceleration process i thoroughly recommend you all to start your own
youtube channel because within a couple of years any weak links are accentuated greatly it's very easy to do an audio podcast it's very easy to do a blog
try doing a youtube channel it's an order of magnitude more difficult one of the key messages that kenneth is talking about is that we should all be treasure hunters we should identify some domains of
what's interesting to us and we should create a fertile environment to maximize the number of opportunities we have because it's a numbers game at the end of the day
opportunities do come and arise and we should have the same idea in science and in engineering and just in what we do at work we should be building platforms and building systems
that create and manufacture opportunities and cross-pollination so by being a treasure hunter it means that all the things that we hoped might happen in the future will become more likely
because you know the more stepping stones we have the more places that we can get to the stepping stones you end up on might not be where you had planned to go
but kenneth would argue you shouldn't be planning anyway anyway i really hope you've enjoyed the show today and will enjoy the interview with kenneth it's been so much fun making it
i've been planning to make this particular show for what seems like an eternity now and there have been so many events that have conspired against me so i'm so glad
to finally get this out there remember to like comment and subscribe we love reading your comments and we'll see you back next week i gotta say i'm kind of i'm both disappointed and intrigued that
you're no longer at ucf i was so looking forward to visiting you down on campus one day when i visit family in florida but now you're at this cool place so it's an
equally cool place welcome back to the machine learning street talk youtube channel and podcast with me tim scarfe and my two compadres mit phd
dr keith duggar and yannic lightspeed kilcher and today we have an incredibly special guest professor kenneth stanley now i've been dreaming about getting kenneth on the show since the very
beginning some of you might recall that our first ever show was on the enhanced poet paper you know the uh paired open-ended trailblazer and of course kenneth had his hands all over that uh kenneth held a charles
millican professorship and was a field professor at the university of central florida he's an associate editor at the frontiers in robotics and ai and also the ieee transactions on
computational intelligence and ai in games he's on the advisory board for the o'reilly artificial intelligence conference and the springer natural computing book
series he's been cited over 16 000 times his most popular paper with over 3000 citations was the neat algorithm which is evolving neural networks through augmenting
topologies and that's a genetic algorithm for the generation of evolving artificial neural networks he's currently a research science manager at open ai in san francisco
he was a senior research manager at uber for three and a half years he completed his phd in computer science from the university of texas in austin his interests are in neuroevolution open-endedness neural networks
artificial life and artificial intelligence he invented the concept of novelty search with no clearly defined objective his idea is that there's a tyranny of
objectives prevailing in every aspect of our lives society and indeed our algorithms crucially these objectives produce convergent behavior and thinking and distract us from
discovering stepping stones which might lead to greatness he thinks that this monotonic objective obsession this idea that we need to continue to improve benchmarks every single year
is dangerous sarah hooker recently spoke about the hardware lottery how the previous software and hardware decisions enslave us but in my opinion the objective lottery is so much worse
he wrote about this in detail in his recent book greatness can't be planned which is the main topic of discussion in the show this evening he introduces several very important
concepts the false compass or deception treasure hunting we should all be treasure hunters my friends this book has really touched me in particular modifying many of the key uh you know aspects of my personal
philosophy in life so i'm very indebted to kenneth for writing that book kenneth stanley welcome to the show thank you tim uh it is an honor to be here i'm really happy to be on this show i've enjoyed this show i watched
this show before and uh really happy to actually be here finally myself so thanks amazing you have a very interesting philosophy which is quite contrarian actually because
as you said objectives seem to dominate our lives and and society and from the very beginning you've been pushing against that that grain if you like so where did it come from well before
before he answers that is is it a fair assessment that tim just gave because for example when i read the you know the abandoning objectives white paper
it's not remotely as extreme as like a tyranny of objectives etc etc dominating our life you know it's much more nuanced like it says hey look maybe the best approach is to take you
know the some of the uh things we find in novelty search and then apply objective based optimization to them and if we kind of look at the history of humanity it has been this dual-sided coin you
know i mean out-of-the-box thinking is nothing new serendipity is nothing new exploration's always been a big part of human endeavors and yet we can even the scientific method right begins the first step is
observe hypothesize right test repeat so i mean we've always had this dual sided coin of explorer combined with optimizing those results how many great scientists formulated
their hypothesis before they're discovering i'd say all of them observed before they hypothesized and observation is exploration okay interesting i i so i i guess so
this is i think the question that tim is asking is is more i'm interpreting more as a personal question like how i came to these views and keith is is is rightly pointing out
that we can't simply dismiss all objectives in in the world i mean obviously there's no there is an important role for objective pursuits in the world but nevertheless i think it's true that like even though i recognize that so i'm
not on this exact extremum of the spectrum i mean that's just not where i am even though i and i recognize that i think it's it's true that i'm just tend to be more of a radical on this
particular axis it is in my personality in nature to want to overthrow this i guess we could say tyranny of objectives even though i do recognize that
certainly sometimes they they they have merit and and have to be used and followed but it's true that sort of like throughout my life i i just seem to be have have an intuitive tendency to want
to reject objective driven type of endeavors and go in the opposite direction and so it probably started out as a personality issue maybe it's a maybe it's a
personality problem but so maybe some a degree of rebelliousness although it doesn't exhibit in all aspects of my life but there's some degree of rebelliousness but i think that it crystallized over time
especially at that moment when we started discovering this principle like the moment sort of where we saw this principle in pick breeder was probably transformative for me because before that it would have been
really implicit maybe in my personality that i just kind of didn't like being told what to do maybe you could say i sort of felt intuitively that like sometimes i don't understand why i have to justify where
i'm going when it's clearly very interesting uh in some objective manner but when we started discovering this principle in a more formal sense in pick breeder you know it's like the the discovery
somehow like intersected with my personality and then really launched me to the point where i might use the word tyranny although i don't think we really went as far as calling it tyranny but yeah it's that then that
really triggered eventually you know the impetus to write a whole book about this because i felt that it's not a topic of discussion right now in in our culture
and i wanted to start that discussion at least or try to start that discussion because this isn't you're not seeing demonstrations on the street about this like you know getting rid of objectives this is this is a very very entrenched
status quo it's not even controversial it's that objectives basically rule our culture in our society and so i guess it aligns with my personality but i started to feel that there's a
preponderance of scientific evidence to go along with it and this leads to i would also say like just because you're asking me sort of about why like why did this happen i just want to point out that
it wasn't you're right keith that the the original paper didn't have that kind of radical bent to it like the white paper on abandoning objectives and it wasn't i didn't start feeling this kind of radical view until
i was talking about this for for years like i kind of got on a almost on tour because because there was such a counter-intuitive point like you should abandon your objectives in order to achieve them
like the discussions that i was getting at computer science conferences after giving talks about this became more and more personal until i started getting invited to places that weren't computer science conferences like one of the first one
was the rhode island school of design where i spoke to a whole group of artists and that was a really for me transformative experience because they were it was like a cathartic therapeutic
session i didn't expect that but all these artists were thanking me and saying finally like we understand why we're doing what we're doing we haven't been able to justify to our parents to our teachers we haven't been able to
explain why are we doing these things that seem you know pointless maybe you could say and i started to realize that like this is a way bigger issue than just like an algorithmic concept
that we have to think about in terms of you know should we balance objectives with some other kind of diversification and algorithms but this is actually something that's pervading our lives and actually inhibiting a lot of people from reaching
their potential and so i was like i gradually came to the more radical view and wanted to express it that way after speaking a lot with a lot of people and realizing that it's becoming personal was a really strange thing i
mean it was for me it was entirely an algorithmic point that we were making it's very very dry and very scientific and it just sort of dawned on me over time that this is not
just about science this is about our institutions in our personal lives and so i came to this point through a transformation that happened over years and and that's why i'm now you know and
sitting here acting like a radical about this it's not like i just immediately started out that way at what point does the the science no longer support your radicalism
because we we kind of have another pervasive problem in society today which is that people take science and extend it far beyond like what the science actually tells us it's kind of like oh if science
hints that you know five percent of the time we have a certain problem some agenda-driven people out there are gonna say it's actually 95 percent of the the problem that we have so
do you run the risk of kind of getting off the firm you know foundation that you had saying the white paper and moving kind of into areas where in fact you don't have
evidence to support that radical view it is a risk but we'd have to we'd have to really analyze you know whether this is this is actually happening like in the arguments that we're making
i mean clearly it's you know before you read the book it's a risk i i think if you read the book maybe not you because i think if you when you read the book you didn't feel that we got around that risk but but hopefully someone who read the book
would see that we were measured in our approach and actually tried to be realistic about this like in that we're not you know in the first chapter we're saying okay let's let's concede
that in many many cases objectives are effective and we should use them and this is really about blue sky types of discovery and innovation which is an important aspect of our
culture i mean we thrive on that in western culture and also in other cultures as well and so it's very important that we run those kinds of things carefully
and people's lives are also affected by that and i think so we try to to circumscribe where we're making this argument clearly and maybe that that failed in your case but i think that i think that it is
actually realistically applied in this in our case but we could argue about the details there so i think that if it's extreme like the one of this 5 to 95 kind of like you
know just wishy-washy kind of talk new age kind of talk you could you could go in that direction but one of the reasons that we wrote the book was because we have all this empirical evidence
you know this is not like a new age feel good self-help book they don't start out with like a bunch of scientific experiments with empirical evidence it's true though that like the extension
of the empirical evidence into normal life we could argue about i guess we probably will but i but i think that those extensions are reasonable and what's happening here is just that the status quo is resistant
to a challenge because it's not used to it this is not the kind of thing most people challenge like just that hey uh if you're going to do something measure progress towards it that's just like obvious to 99
percent of people out there and so to strenuously argue that like that actually is not a good idea that's tough to swallow i think so i think what's actually happening is just
the status quo is naturally resisting the fact that it turns out not to be the correct approach in many many cases when we're striving towards this kind of blue sky innovation and that's tough to swallow it's it's
interesting the there is a there's a subtle distinction that you you make sometimes and one hand you say something like sometimes we should drop our objectives
in order to reach them right so sometimes we should let go of them in order to reach them and there are you know certain discoveries we want to reach the moon we should drop the objective of building a better plane or
something like this but then in in other cases for example in pick breeder there is no objective right so you drop the objective but
what you reach isn't necessarily what you want it to reach right do you see like uh is is this for you two different paths could i just have a quick go at that because the the moon one is an
interesting example i think kenneth would argue that all of the intermediate stepping stones for landing on the moon might not resemble the end goal pick breeder we will introduce that in
in a little while but that's a little bit different that that is a kind of human augmented evolution process where at every stage of the evolution cycle humans can select things that that
they like the look of well rocketry is an interesting one because in fact the intermediate steps looked exactly like the the end product of the saturn v you know i mean you can trace a very clear
evolution all the way back to like chinese you know sort of rocket fireworks and see how that was an optimization effort like to build that rocket that got to the moon so what i'm what i'm getting at sort of
is that if you run something like pick breeder don't you run the danger of sure you'll reach something but is that thing that you'll reach
any useful yeah yeah yeah yeah i get it and actually there's two parallel threads in what we just discussed so that makes it a little tricky to answer because there's this one question
that yannic is asking about like the difference between becoming less objective in service of getting to in effect an objective versus just being less objective with the knowledge that we don't even know
where we're going but it might be somewhere interesting that's one interesting dichotomy and then there's this other issue of like okay if if we did do something really really impressive like get to the moon
which we did we could argue about whether we were being objective or not objective like that's another issue actually and and i agree actually with what tim was was saying about that
latter issue that i think a strong argument could be made that the stepping stones that lead there largely do not resemble there of course though that's going to be an argument because it already was we just started
arguing about it but i think that's a little bit orthogonal to what yannic was pointing out so i'll try to first address that which is basically yes that it is true that the
ultimate principle that the book is advocating does lead you to a point where you don't know where you're going and therefore it's not designed to produce a particular
outcome that you had in mind before you went off on this journey and is basically saying that you can maximize the potential for producing something really interesting
if you're willing to drop knowing what that interesting thing is going to be now in the long run what that is is basically you're generating stepping stones so it means that all the things that we
hope might happen in the future become more likely but not any particular thing because the more stepping stones we have the more places we can get and so it's true that the stepping stone you end up on may not be where you
planned to go but i would argue that like you shouldn't be planning to go anywhere anyway if you're going to take this approach because what you're actually trying to do is just uncover stepping stones anyway but
in aggregate because we're all uncovering all of these stepping stones that otherwise wouldn't even exist we have now made possible a whole new generation of stepping stones some of which will be things that we
hoped would happen in the future yeah this is what's opening up and making those things possible ken made it clear in in his book that even though the world has become saturated with objectives they're not necessarily a bad thing modest
objectives you know for modest goals are okay but really ambitious objectives that require discovery creativity innovation
and even happiness those are the things that we're talking about that require a slightly different approach now there are lots of examples of inventions that if you look at the intermediate stepping stones they don't resemble the final
product at all so kenneth gave an example of vacuum tubes leading to computers or microwaves being developed from radar technology or even the internal combustion engine
leading to aeroplane engines and the really fascinating concept here is that creativity could be seen as a search problem and that implies that we're searching through a space of things
and this is what fascinates me because if we are exploiting something if we're building a bridge and we already know how to build it we can kind of visualize all of the intermediate steps from now to completion
but when we require creativity we're actually searching through this space of possibilities yeah so i want to respond to something though so the first computers were not built with vacuum tubes they were built with gears and machines
right and then so the story of computing has always been one where people wanted to do machine based computing and they surveyed the componentry that they had available to
them and then they put those together at a large scale to do to do computing right so i view this it's two sides of a coin you know you've got a bunch of people out there tinkering in garages thinking
outside the box trying to violate the second law painting weird things great right and then when somebody needs to achieve a particular thing they look at the toolbox of stepping stones and they put those together in
some planned organized efficient large-scale way to achieve that goal these are not like and actually the point's made in the white paper that these are complementary
kinds of avenues right you really have to be doing both at the same time so it seems like what we're really arguing about here is is saying that you know in western society today we're just funneling too much of a proportion of our
resources onto objective pursuits and suppressing creativity i argue we're not really suppressing creativity to start with but we're arguing about really a
balance of resources here aren't we well this might be a good time to introduce this concept of deception so i i think if we have an objective approach it blinds us
to the stepping stones we should actually be taking yeah so i think that you know keith is really arguing for the status quo like this that's not a very controversial point of
view like basically you're saying that okay we we uh things are going fine you know we we set objectives and this often works well and we should have some exploration nobody's
going to find that controversial everything's going fine and i think that the status quo is very easy to argue for because we we already have a consensus on the status quo and we like it
we think we're very smart and that's working well and so you have to give some room here for the possibility that the status quo actually is not the best way to do things and it's true that deception is is a
real enemy of the status quo and it's not recognized how pathological and serious it is like this idea that it looks like you're moving in the right direction but actually you're heading towards a brick
wall and we could be sidestepping this kind of brick wall much more often if we reject the status quo which is this confidence that we know when the switch should happen which is i
think what really is implicit in what keith is saying is that okay you're recognizing and acknowledging that yes to some extent creativity innovation or exploration are things that should be part of what we do
but i'm arguing that look we just give it lip service right now because we don't we may we may all think those sound really good on paper but the gatekeepers do not actually reward you for doing that like in all
kinds of institutions that's not the way the world actually works this is true in publishing this is true in grant writing they want to know what your objective is they want to see metrics that show that
you're moving towards that objective it works just like some kind of naive local optimization problem in almost all these cases which any expert on search would know is just completely naive and yet we run
society as if this actually makes any sense at all and so somebody has to push against that status quo just start bringing it down so that we can actually move in directions sometimes not because they're
causing some metric to go up but because they're simply interesting and they're opening up possibilities to move in directions that will sidestep the brick wall of deception and actually get us to stepping stones that are going to get us to places that
otherwise we are not going to get and so all these gatekeepers need to really really change their behavior and we're not going to get there by simply saying yeah yeah you know things are going pretty well here like we can
anybody can go explore if they really want to you know but like mostly things are good because like we have people making innovations using objectives anyway yeah that's the status quo things are just going to stay the same and
and i'm arguing that look that's actually not the best way to be doing things that we're far the pendulum has swung far far in the other direction to the point where it's at this point it's absurd like i said we're running
this like as if it's like the most naive optimization problem for like the most complex problems that we have in the universe and this just doesn't make sense at all first of all i'm on record actually in
this show multiple times advocating against certain status quo for example i don't like the one-size-fits-all education system you know it's it's absurd to me that life is filled with
all these niches and instead of figuring out what somebody would be good at and what they're passionate about and giving them the tool set to get to that niche we try to funnel everybody down some uniform
you know path to college four years whatever right so i think i mean what i'm hearing you say is that and i i agree that that the pendulum has swung too far towards objectives
uh and i and i think i said that you know maybe it's a balance of resources question so i think what you're saying is the only way to rebalance that is to have this extreme
viewpoint you know very agenda-driven kind of shock shock the gatekeepers and to kind of moving moving away moving the pendulum towards that that's fine i i worry that the science doesn't
necessarily back that up and there's this arguable point about you know what is novelty what isn't novelty what's interesting what's not how much resources should we be putting down there so i get it uh but
i think at the end of the day we are arguing about a balance it's like okay we have 95 percent of our resources heading towards objectives that's a problem maybe it needs to be
80 20 or 50 50 or who knows whatever it is so and that i would agree with it's not really that when you write a grant
oh 95 percent of the score of the grant readers is going towards you know can you reach your objective but right now it's a hundred percent right it's much more of a this is the way we do things right you want to reach
this goal i give you money for reaching this particular goal and that's just the way we write grants i mean yes and no but some of those grants are to exploration objectives
like people do get grants to do exploratory research what's on everyone's mind is wait a minute isn't novelty isn't exploration just yet another objective right
that you're somehow maximizing you could argue that kenneth is advocating for an objective but it's the only objective which does increase monotonically right because all i think the argument is is
that all of the objectives we have at the moment create convergent behavior not divergent behavior and when you have a novelty style objective you can monotonically you know optimize on it and it works but
i wanted to comment that i i agree that there is a preponderance of objectives in our society and i don't think people realize how bad it is whether it's in education you know we want to monotonically increase our grades
whether it's in science we have to have this convergent groupthink because all of the peer reviewers have to approve something whether it's wanting to obsessively improve gdp metrics in our society it's absolutely pervasive
i don't think anyone in this conversation is binary and wants it to be one or the other i think we just want to have a slightly more open-ended approach yeah yeah i mean it's true that we are
we have common ground on the idea that this is a question of degree i mean that that's that's clearly true so it's not like you know keith and i are in complete opposite ends of the spectrum like there's a matter of degree issue going
on here and so to to to some extent this is actually a political discussion and to some extent it's not actually a scientific discussion this is about
how to effect social change in some sense political discussions here i mean so so i and i and i come to this as a scientist not qualified probably to make
a good political point here but my anyone who's qualified to make political points yeah that's fair that's probably true but so my my intuition about this political situation
is that actually we need a really big push we need a big we need a big push to budge us out of our rut and and to get us to that more reasonable balance whatever that may be 50 50 whatever you
want to call like i don't really think it's actually about the percentages as much as the types of problems like i think it's more that like if you're a stepping stone away if i mean this sort of the way we put it in the book then by all means use
objectives like there's a lot of problems like that let's recognize that let's not make that 50 50 let's make
that 90 10 you know like that should be mostly just objective driven but if you're many stepping stones away which really means i don't have any idea what the intervening stepping stones are
like the three the four the five the 500 that are going to lead to that thing that i really want to see happen then forget it then it should be the exact opposite ratio and so and so we should be principled now
and recognize that this is actually the way that search works and we just totally are not that way and so as a political point i do think that it's important to push in a radical direction here and say like you know we really need
institutional change because like like the others are saying that yeah it does it is true that like in a lot of realms like in grant writing which i'm very familiar with it's near 100 percent i mean you're right there
there's there's special programs and things there might be some exploratory grant program or something like that but like you know in a conventional program at the national science foundation or something great if you're just like this is really
interesting i don't know i can't tell you what we're gonna accomplish but like it's just really cool we should just look at this it's funny because like this is that actually was the case with picbreeder itself
like i i did try to get funding for picbreeder so maybe i'm here like you know bitter about this now this is why i'm sitting here arguing this but like the picbreeder was just shot down for this very reason like i have
exact wonderful quotes from from our reviewers about it's not clear what this is going to accomplish like why are you doing this we don't really know and and what's funny is that like i
wouldn't be sitting here having this conversation with you guys today we wouldn't be having this talk at all which i which i at least think that we would agree is a valuable discussion to have if we had let those gatekeepers be the
arbiters of whether this thing should get out there or not all right so i i kind of get what we were saying now which is that you know the sort of guy in the garage the starving artist
the uh out of the box thinker there always has been kind of a stigma of like yeah yeah they're just kind of over there little small things doing their thing we can largely ignore them right and what you're kind
of saying is that society needs to put much more value on that process and understand you know the fundamental contributions it makes towards other progress that we may objectively
care about let's say but so let's suppose we get to that world like everybody goes yeah great we we love novelty search we still have to allocate funding right so when somebody comes along and says
you know what i want to research how cheese whiz can be used like in an agricultural setting or just explore the properties of cheese whiz like for that matter like do we decide
to fund that and and how much do we allocate like what how do we do societal novel research but in a way that
at least gives us some you know some hope to progress along goals that society cares about you said oh we want to encourage the artist in the garage and so on and i think what we really want is for this
no the starving artist and the tech guy in the garage okay what we really want is for this kind of treasure hunting just to be the modus operandi in all institutions i understand i think that's what i said which is that
we need to understand as a society that that type of effort needs to be institutionalized but okay suppose we get there how do we allocate funds and time let me just quickly talk about
how it works at the moment so i've worked in the tech industry for many years and what i see is a tragedy of convergent thinking so for example that we might be building a big data platform in some corporation
and when it's designed everyone is saying well i i want to make money this is my financial year if you don't if you don't give me some metrics by next year then you're not allowed to do it
and what that means is that all of the really innovative things that we could be building we can't even conceptualize because we're constrained in the way that we've designed this system wouldn't it be great if we could advocate for a treasure hunting approach
where we could allocate some budget now how we allocate that budget to your point keith i don't know maybe kenneth can tell us but if we just allocated some money just for an open-ended search just to create a
fertile environment for search and innovation even though we don't know where they will lead but i guarantee you let me say that another interesting thing by the way kenneth is we've been playing with gpt-3
and it's this uh it's this language model and it's it's deterministic by default and if you if you sample it deterministically you get stuck in you know like a kind of greedy sampling
you get stuck in loops and you don't really go anywhere and turns out if you sample it randomly or do a beam search it produces better language who would have thought it yeah so
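A minimal sketch of the greedy-versus-sampled contrast just described, using a hypothetical five-token toy model rather than gpt-3 itself. The transition table `TOY_MODEL`, the function names, and the `max_len` cutoff are all assumptions made for illustration, and beam search is not shown. Always taking the most likely next token bounces between two words, while sampling in proportion to the probabilities can escape the loop.

```python
import random

# Hypothetical toy "language model": next-token probabilities per current token.
# These numbers are invented for illustration only.
TOY_MODEL = {
    "the": {"cat": 0.5, "dog": 0.3, "end": 0.2},
    "cat": {"the": 0.6, "sat": 0.4},
    "dog": {"ran": 0.7, "the": 0.3},
    "sat": {"end": 1.0},
    "ran": {"end": 1.0},
}


def greedy_decode(start, max_len=8):
    """Always pick the single most likely next token (deterministic)."""
    tokens = [start]
    while tokens[-1] != "end" and len(tokens) < max_len:
        probs = TOY_MODEL[tokens[-1]]
        tokens.append(max(probs, key=probs.get))
    return tokens


def sample_decode(start, rng, max_len=8):
    """Draw the next token at random, weighted by its probability."""
    tokens = [start]
    while tokens[-1] != "end" and len(tokens) < max_len:
        probs = TOY_MODEL[tokens[-1]]
        choices, weights = zip(*probs.items())
        tokens.append(rng.choices(choices, weights=weights, k=1)[0])
    return tokens


if __name__ == "__main__":
    # Greedy alternates "the" / "cat" until the length cutoff.
    print("greedy :", " ".join(greedy_decode("the")))
    # Sampled runs vary with the seed; some reach "end".
    for seed in range(3):
        print("sampled:", " ".join(sample_decode("the", random.Random(seed))))
```

In this toy the greedy path never reaches the `end` token because the locally best choice at each step leads straight back to where it started, which is the same shape as the deception argument made elsewhere in the conversation.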
and actually i i've had a lot of fun with gpt-3 as well after having joined openai so i have the the privilege of being able to access it so that is that is a lot of fun so actually there's yeah there's several threads going here and i also want to
acknowledge that we lost the thread of whether novelty is an objective that yannick raised so maybe we'll get back to that one but i just want to address this because i want to address this this point now which which does seem really
important about how do we allocate funds like what is the principle here and you know the thing that i think that that we're missing is that we have invested enormous amounts
in people through our through our social investments like educating people we're talking for decades like 30 years like to get to a phd or something like that to get them to a point where they're supposed to have intuitions right
scientific intuitions like they're not just supposed to be able to look at a chart and see that oh algorithm x is actually performing better than algorithm y by a significant amount like people love
these significance computations right because it's like well i can really hammer somebody on that like you didn't use the correct method like to calculate significance or something like that doesn't take a phd to do that
the phd allows you to have intuitions about what are the directions that actually have promise and what's amazing about us using these like relying so much on these metrics is that it is
basically saying that we are not allowing people to use that education that they have because we don't trust them after 30 years of putting money into them of investing them in
in our social investments thousands and thousands of dollars maybe millions we don't trust them to have any intuitions at all the only thing we trust is a significance test and so at some point we do need to say
because this person thinks that this path looks interesting that we should listen to them now how do we know of course we don't just say well you have a phd so we listen to you no problem like you can do whatever you want here's a million
dollars that doesn't make any sense you've got to get your head out of the clouds ken you can't be talking like that i think what we do need to do is start to understand that we can have discussions about why things are interesting
that's the thing that i think we can change like we are at a level together if we have the right background i'm not just talking about phds either i'm talking about experts in any field
not just phds you know whatever it may be it could be athletes it could be great chefs like whatever you're talking about like these are people who have huge amounts of experience which gives them the credibility to have these
discussions and of course they should have these discussions with their peers not with a random person on the street but with their peers who similarly have that kind of a background and have been recognized for having that background about and
this is a very very it's a very unsettling and uncomfortable topic for a lot of people but what is actually interesting and the and it's true what happens that enters into those
discussions is called subjectivity and scientists just run away when they hear about subjectivity uh oh it's not objective like i can't talk about it like this is getting me scared
let's go back to some bar charts and significance tests but come on like ultimately everything that's ever been done that's interesting ultimately boiled down to some level of subjectivity it's just that we covered it up because
we're not allowed to talk about it maybe we could try out like a a a little committee an interesting voting system interestingness voting system it's like you propose an idea and then anonymously
a bunch of your peers vote on whether or not that's interesting no that that's still convergent that's the last thing you want well but ken's saying that we need to trust we need to trust intuition on what's interesting
right it's moving in a direction that i think is interesting i think that like the you know what you might get i mean maybe this is true there's a convergent aspect to that kind of voting type of thing voting can be dangerous we've seen that in picbreeder too
there's there's experiments that show you get much worse results if you sort of vote about what's the best picture but the idea that we can actually ask experts to discuss what's interesting i
think is actually interesting like it could lead to things i've heard proposals even recently things like before we go to the results section like papers get through an initial filter just based on whether the idea is
interesting and that could be that could lead to something different in terms of outcomes so yeah we're able to have these discussions about what's interesting we can do it informally we're just afraid to actually air
these types of things in any formal setting but aren't you scared that any any filtering for interestingness and you admit that interestingness is subjective
how can you even set up a filter for interestingness because that author might think that paper is really interesting yeah so that relates to this idea of the committee and the convergence and things like that like what is actually the
right way to maximize our gains from elevating interestingness in this way and i and i agree that this should not
be the only way we evaluate things ever of course like again we're back to balance and degrees here so i don't think all of society should work this way but to the extent that we put some resources into doing things this way
i think that we have to think about it as why is what is it that makes systems like picbreeder work so effectively open-ended systems in general and what
it is is basically it's the connection between one stepping stone and the next it's like some things can lead to very very interesting things down the road but somebody has to make that connection
and so i think what the issue is is that it's not a it's not a voting issue because like the majority is not going to make that connection like if there's a very very salient stepping stone for you just for
you this could change your life like if you look at this picture of this butterfly or something like this it will not be true of the majority of people and so we will never get to follow down that
path if it's a majority rules type of voting situation we need you to be able to follow your special special instinct because that was the thing that could change only your life that's the thing for you
and and this is about what's interesting this is totally subjective i mean that's just you like butterflies and so what we need to do i think is create systems where people put in are willing to put
in put a stake into things like they're willing to put something up to say look i'm willing to give up some of the bonuses that i have in my life if there's a grant system maybe you're even willing to give up getting a grant
to say i'm going to stick up for this thing because this if this gets done it's going to mean a lot to me because it's extremely interesting and i want to see it happen and so we could set up a system like that and then we know because the stakes are high
that people are not just sort of gratuitously or casually making these decisions they're making these decisions because they matter to them and because these stepping stones resonate with them so we need to carefully set up the system
but i think then we can encourage people to have these kinds of elevated discussions and bring out the fact that actually this is the thing that humans are best at it's not like just producing a bunch of objective material
objective formal results that's not what we're great at that computers are great at that they're better than us at that we have a nose for the interesting that's how we got this far that's how civilization
came out that's why the history of innovation is so amazing for the last few thousand years and so let's elevate that let's talk about it seriously and give each other credit for the fact that we
can do this both on both ends both the person making the proposal and the person critiquing the proposal we can talk about what's interesting but we just don't need it to be consensus driven because that's not how stepping stones work
i think that's the key point we need to have autonomy we spoke to max welling on friday and he said that when his students come up with an idea no matter how bad it is he will zip his lips and he won't say anything he'll let them explore it for a
month because you just never know what's interesting to you might not be interesting to me and when you have this kind of um groupthink when you all get together and decide if something's interesting or not there'll always be a clash because it
won't be interesting to someone else and then that will cut the idea off and one of the things with picbreeder and also you mention natural evolution and with some of your approaches like the poet paper it it's all about creating a
divergent search and in evolution and in poet it's about actually creating problems and solutions at the same time allowing you to create this beautiful space that you can just continue to
explore yeah exactly yeah autonomy is is a really big part of this and it's it's destroyed by these committees committees and autonomy just don't go together
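A toy sketch of why consensus selection and autonomous selection behave so differently, under loudly stated assumptions: every person's sense of what is interesting is modelled as an independent random score per candidate, and the names `simulate_selection`, `appeal`, `num_people`, and `num_candidates` are hypothetical. This is not a model of picbreeder or of any real committee; it only shows that a plurality vote keeps one path open while autonomy keeps many.

```python
import random


def simulate_selection(num_people=50, num_candidates=20, seed=0):
    """Compare one committee pick with everyone following their own favourite.

    Returns (committee_choice, autonomous_choices), where committee_choice is
    the index of the plurality winner and autonomous_choices is the set of
    candidate indices that at least one person would pursue on their own.
    """
    rng = random.Random(seed)

    # appeal[p][c]: how interesting person p finds candidate c.
    # Assumption: tastes are purely idiosyncratic (independent uniform noise).
    appeal = [
        [rng.random() for _ in range(num_candidates)]
        for _ in range(num_people)
    ]

    # Each person's single favourite candidate.
    favourites = [
        max(range(num_candidates), key=lambda c: row[c]) for row in appeal
    ]

    # Committee: one vote per person, only the plurality winner is pursued.
    votes = [favourites.count(c) for c in range(num_candidates)]
    committee_choice = max(range(num_candidates), key=lambda c: votes[c])

    # Autonomy: every favourite is pursued by whoever cares about it.
    autonomous_choices = set(favourites)
    return committee_choice, autonomous_choices


if __name__ == "__main__":
    winner, explored = simulate_selection()
    print("committee pursues candidate:", winner)
    print("autonomy pursues", len(explored), "distinct candidates")
```

The committee's pick is always among the candidates that autonomy would have explored anyway; what the vote removes is every other stepping stone that mattered to only a few people.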
you know and grants are basically run by committee and yeah it's it's exactly true and something we saw very strikingly in picbreeder that like the reason that everything is being discovered there in that
particular search space which is actually a very relatively simple search space compared to say the space of invention which is like thousands of times higher dimensional but even in this small lower dimensional space
where those images are you know between dozens and hundreds of dimensions in terms of what defines them underlying so it's a relatively low dimensional space even in that space like we needed to give people room to follow their interest to their bitter
ends without some committee intervening and saying actually we don't think it's interesting to go down this path sorry about that like you should actually look at this image instead and we know that because there was another project and i hate ragging on it
because i thought that was a really cool project too it was called the living images project which got a lot less attention because it was the exact opposite it was committee driven so it makes it really a really great contrast like it's
a really nice metaphor for the way we run society because the living image project basically works like the nsf you know they they'd show the palette of current leading images and say okay let's have a vote now by committee we're
going to figure out which the most promising ones are and it just led to a bunch of wallpaper patterns or probably even worse than wallpaper patterns because they didn't even have like symmetry it's just like kind of
pleasant looking blobs after this like thousands and thousands of votes were cast and it's you know it's because like the thing is like everything washes out when we start ruling by committee like we have to
allow people to follow their passions to their extremes and of course most of those bets won't pay off i mean you know venture capitalists know this that's the nature of investing we're talking about risk ultimately here
like risk has to be tolerated in order to make great discoveries so it's not like oh yeah every time anybody's allowed to follow their interests like great things are going to happen most of the time no but it's worth it to
maintain this portfolio so that we can get them to go to their extremes and see these amazing results so yeah we've got to start getting rid of these committee driven types of systems and look at different types of systems
which is why things like having say 5000 grant reviewers instead of like five but you get your grant funded if like two people are willing to put up something really big to say like yeah this is
gonna change my life and that's enough and so that's a much more like picbreeder-like world than the living image type of world crowdsource funding you said before that your personality
naturally matches with this kind of idea and the things you've discovered do you besides your personality just matching do you make deliberate steps deliberate
attempts in your like life to uh to implement these ideas of less objectives more exploration
and so on uh i think so i i mean obviously discovering these principles helped me to justify being that way more so maybe i'm more like that now
since i've improved my ability to make this kind of argument maybe i've still not optimized it completely because keith makes some good points here that show that we still have to defend this but but still i've obviously spent a lot
of time thinking about this so now it's just easier for me to say well the reason we should do this is because we've thought about this for several years and here's why we're going to do it this way but yeah like something like picbreeder is an example you know
picbreeder i had no idea what the heck we're going to see from that like it was like i don't think so i don't think most objectively comfortable researchers would want to make a picbreeder
because it's like completely unclear what the point is it's like the the most salient point is probably that it looks like a fun toy but that's not usually why we make things as researchers it's just because they're a toy but that wasn't my
motivation at all my motivation was entirely that i was pretty sure that something interesting was going to happen i was like if you just release like thousands of people
online to just explore picture space like we're gonna see a phenomenon that we've never seen anything like i don't know what phenomenon it is i had no idea but it's just gonna be so cool and there's no question that something
like will fall out of it that'll be valuable because it's just so interesting and and lo and behold like you know my entire career it was diverted because of doing this
and indeed like the nsf committee was was totally turned off as you would expect because like it's not clear what the point is so yeah i think i i make decisions because i think things are interesting and i'm not and i'm not afraid to
like i will tell people who i work with like i think we should do this because it's interesting i don't really know what we're gonna find but we need to find out and it's pretty clear but i won't just say it like that like that's the thing is that's the naive the naive view
and the argument against doing what i'm saying is to believe that that's the whole argument somebody like me would make like hey this is interesting like you should follow me just do this because it's interesting like obviously i need to justify it to
that person you know so like if i have somebody that that i'm managing or advising and i'm saying here's something i'm proposing that we should do together i will really try to get into the details of why it's interesting i'm
not just gonna say it's just interesting let's just do it but it's like look these are all these opportunities that this opens up that basically create a whole new playground of opportunity which is arguably going
to reveal things that nobody has ever thought about before and i'd get into the details of why and if i can't persuade them then i'm satisfied that i don't have a good argument you know because i fully believe that
that we at our education level or experience level in our fields because like i said you don't have to have a phd you can be in some other field that that we can have an argument like this with each other and and be completely reasonable
about it even though it's not ultimately objective and so i think people should have a right to hear my case why things are interesting and reject it and that's totally reasonable but if i can get it to resonate with you and you want to follow that stepping
stone with me then i think we've made a great connection and it's much more exciting than that i have a bar chart with a significant result on it just my two cents on communicating this because i think
part of pushback you may get from some people is that if you don't if you're not careful to make a distinction between divergence and interestingness you know
because a lot of divergence is not interesting like oh look i have a a pink dot on my shoe congratulations i don't care right and so in the paper you kind of
make this point too which is that yeah look not all divergence not all novelty if you will is actually interesting and it's an open area of research to define
what exactly is interesting or coming up with this objective that helps to find novelty that's actually interesting i think that that may be some of the nature of the pushback because we've all been around
you know new age people that are like look i've decorated my house in crystals you know and i'm like yeah like i couldn't care less right i think novelty is the wrong word i think kenneth said in much of his work
that a better description is being an information accumulator so for example you can keep crashing into the wall in the maze but eventually you know what what if i could actually do something more interesting in my policy and in order to do that i would
need to accumulate useful information and it works really well because even in evolution there seems to be an arrow of complexity that goes through the evolution of life on this planet
and being able to breathe air and being able to see photons of light being able to climb trees all of these things help us exploit our environment more effectively yeah but i want to point out there's way
too much context shifting both in the paper and in the book jumping back and forth between information versus interestingness versus novelty versus complexity you know that these are not the same
thing and there actually are examples in both the book and the paper of context shifting i mean the sort of logical fallacy of starting off in one context and then sort of
magically alighting over to like a different one so these are not the same concepts and and i think the one that actually will resonate with more people is interestingness because tons of
information is flat out boring okay like the exact precise trajectory of a missile really not excited with that right or what you ate today for lunch
you may be highlighting a useful strategic angle here i'll acknowledge that how do you effectively communicate these points and i do think that it's worth thinking about that like the
book is an attempt to do that and it's true that some people are are turned off by it so maybe it could be more effective i have to agree with that and but i don't think i think what
you're calling context shifting for me was an attempt to uh provide multiple different angles of argument to support this so like when tim mentions
information accumulation which i'm really uh flattered that i mean you remember all these things i've said so that's certainly i'm happy to hear you making those arguments because these are arguments that i put out like these are just different angles of
attack for making the argument for divergent non-objective type of search and the information accumulation i think is really interesting and it's complementary in my view to arguments about divergence and
interestingness and so forth and not like some kind of you know sort of dishonest uh subtle shifting uh trying to move the ball around but just trying to give lots of different arguments and and the reason
we wrote the book that way was the book was really meant to be a weapon it's because people who uh are the gatekeepers who are in charge of the flows of funds
that control whether certain things get done or don't get done are generally very objectively driven and i wanted to give people who are trying to satisfy those
gatekeepers as many possible arguments as they could get and they could choose and pick and choose from those arguments and that's why the book is not written in necessarily the most crowd-pleasing way it's more like
look i'm just going to put out here all of the arguments that we can think of and you're going to be able to use them and so i understand that for some people it's just a turn off from the start
so it's really this is sort of an honest outpouring of my thoughts about this without selling out and just trying to make something for public appeal which is something that we considered i
mean we went through literary agents and things that had all kinds of suggestions for watering this down so it'll appeal more but then i just started to feel like it was turning into a sellout and so that's why it's this weird
mishmash of like self-help like combined with like empirical scientific arguments you're like what kind of book is this and who is this for and it's not really for anything in particular it's just meant to be a weapon
i i get it but by the way i'm not suggesting that you dull the weapon i'm suggesting that it be sharpened because for at least some people out there like myself i'm not i'm totally not opposed to the idea of the importance of you
know interestingness but i really can't abide sort of illogical arguments and so when when i see things that don't logically follow non-sequitur context shift you know
appeal to emotion you know begging the question et cetera et cetera to me it becomes like less effective and so i understand it's a balance you know you're trying to trying to create a weapon and you know there's a bell curve out there along
every single personality attribute and and whatnot i get it i'm suggesting that a focus on interestingness alone is a very sharp kind of tool to that
just to that point because that that is a good point actually i think that interestingness really is central here it's a very important part of this and it comes up a lot in these kind of discussions i mean
it's related to like how you what should guide divergent type of search processes i mean you're following a gradient but it's not the gradient of objective improvement so what gradient
is it and it's true that i think the gradient of interestingness is probably the best expression of like the ideal divergent search but then you know you get to this problem that like i don't know how to
formalize interestingness and so it leaves it a little bit less concrete than what we have in like normal machine learning obviously we know exactly what gradient we're
following and so what you get to then are proxies for interestingness and so things like novelty you know the good thing about novelty is that it's true that not everything that's novel is interesting
but just about everything that's interesting is novel and so it's like a pretty good like rough heuristic for getting to see some of what in a true interestingness driven search would
reveal which would be even more powerful now something like picbreeder is much more kind of like close to like actually following gradients of interestingness because it's human beings that are making the decision and humans arguably make decisions on what's interesting
when they look at pictures but we all have different definitions implicitly about what is interesting which makes it even more interesting i mean that's why the stepping stones for you are different than the stepping stones that i would follow which is why it ends
up being divergent but it's very hard to formalize this right it could be even ai complete like to really get interestingness in that sense like to have it just run completely autonomously in a computer and so we need to look for
proxies and also we also need to recognize that interestingness is a subjective concept and therefore we really never will formalize it in an objective way so we have to accept some degree of
subjectivity as we try to formalize these sort of gradients of interestingness and that sort of from the algorithmic perspective what we should argue about is like what level of formalization is
acceptable here that is likely to lead to an unfolding process that's really cool in the long run and it's okay that it's partly subjective could we formalize what it might look like because
being very naive the helicopter view is that we want to have we definitely don't want convergent systems we want to have divergent systems that are rich in information and also we don't want to have
shortcuts you mentioned some perverse incentives like for example in india they paid a bounty didn't they if you handed in dead snakes but that just led to snake farms and a similar thing in hanoi with the
rats people started breeding rats and it you know that the actual opposite of of what you wanted to happen happened so we don't want to have those shortcuts we don't want to have those perverse incentives
so if i did design such a system and i had to design some measure of interestingness or information accumulation and i analyzed this system what things would i be looking for as a measure of success
yeah so i think we have to distinguish between systems with humans in the loop or not because so which one are we because i think with humans in the loop we might trust to some extent
that like there's a human brain involved so that helps a lot this is fascinating we have quite a technical audience so we can we can take the level up about one step so it was using these cppns these
compositional pattern producing networks and with neat that was a genetic algorithm for evolving the weights and topologies of a neural network but in picbreeder it was an entirely human
supervised process so presumably you did know how to uh cross over two networks but it was supervised by the humans every single step of the way yeah yeah i mean part of neat is in
picbreeder neat precedes picbreeder so one of the things that allows picbreeder to do what it does is the fact that the networks are becoming more complex as the breeding process
goes forward because there's a neat like algorithm under the hood except as you say there's a big exception which is yeah the selection decisions are not being made by neat they're being made by people
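For readers who have not met cppns, here is a minimal sketch of the idea, assuming a tiny fixed topology and hand-picked weights (both are illustrative choices, not picbreeder's actual networks). The network maps each pixel coordinate to an intensity by composing simple functions. In picbreeder, neat would grow the topology and mutate the weights, and a person would choose which rendered images get to reproduce; neither of those steps is shown here.

```python
import math

def cppn(x, y, w):
    """Hypothetical fixed-topology CPPN: pixel coordinates in, one intensity out."""
    d = math.sqrt(x * x + y * y)        # distance from the center as an extra input
    h1 = math.sin(w[0] * x + w[1] * d)  # periodic node, tends to produce repetition
    h2 = math.exp(-(w[2] * y) ** 2)     # gaussian node, symmetric about y = 0
    return math.tanh(w[3] * h1 + w[4] * h2)

def render(w, size=8):
    """Query the network once per pixel on a size x size grid over [-1, 1]^2."""
    coords = [2 * i / (size - 1) - 1 for i in range(size)]
    return [[cppn(x, y, w) for x in coords] for y in coords]

# Illustrative weights only; an evolved genome would supply these.
image = render([3.0, 2.0, 2.0, 1.0, 1.0])
```

Because the image is a function of coordinates rather than a pixel array, regularities such as symmetry come from the choice of node functions, which is why small changes to the network can change the whole picture coherently.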
so yeah it's like a hybrid of people and neat together it brings out an interesting no pun intended interesting thing in that i don't know what your opinion on
trolls and trolling is but i just think of trolls as people that do things primarily because they're interesting to them and they almost shun no effort to
just do something interesting and i'm just wondering what if a group of trolls decide to troll picbreeder would they deliberately do non-interesting things
is like or or because that would defeat their purpose or would they do interesting things and then that would you know defeat their purpose as well it it's like you've made the perfect system against trolls
that's i had not thought of the connection with trolls that's true trolls have a sort of instinct for interestingness of their own and it can actually lead to things that we wouldn't otherwise
see or think so trolls have a value trolls and picbreeder is an interesting question you know if somebody was adversarial towards picbreeder like a really bad troll
like really trying to wreck it or something i guess they could really try to just publish all kinds of garbage i mean i guess that's what you would do but it's not going to be that effective you know because the system
is designed basically to try to raise things up that people find interesting so people just aren't going to respond to that kind of stuff if you had a troll army i do think you could probably wreck picbreeder
it's thousands of people and then they sort of work together to try to vote up things elevate things that are just horrible yeah it would wash out the stuff that we're trying to elevate so people could find
like they could mislabel things intentionally like people put keywords so if you found a face you could write face well if they called it like a dog i mean it's gonna mess things up so yeah i think with enough
effort we could screw the whole thing up but most likely like one troll is totally harmless like this stuff would just never rise up and we wouldn't see it and then there's the other kind of troll which is somebody who just has different
interests he just says like all these images you know there's skulls and these cars this is not the kind of thing that i think is cool i think like these weird line patterns or something there should be a lot more of those
that is healthy i think that's great you know have that have that divergent branch of the tree of life and let's reveal it there were people like that there was this guy robert that was his screen name because
he was so memorable because he created thousands of images an unbelievable amount of work involved but he was doing these line studies like he was trying to get as straight a line as possible you know like with the activation functions in
these cppns it's kind of hard to get things that are just straight so this guy was really obsessed with like squares that have real corners you know 90 degree angles like just did these whole giant studies
and it's like just totally different from what everybody else is doing but it's great because like branching off of those things leads to a completely different world and changed my view of what a cppn or even a neural network is
you know because once you get into those spaces with these sharp angles and straight lines you know people were saying well there's this like picbreeder style it's kind of this like curvy like kind of blobby looking kind
of style that like everything has it's actually not true thanks to this troll we know that you just need to break into this part of the space and you've got a very sharp well-defined kind of style
so yes trolls if you call that a troll i think it's it can be useful and interesting that's super awesome so i i do want to get back to though to the kind of it seems to all hinge on
very much kind of a definition of interestingness and what proxy you choose for it tim mentioned for example this if you crash into a wall repeatedly you might
once go around it but then again you know if you crash into the wall the 22nd time that's novel to crashing into the wall the 21st time and maybe the 23rd time the world is suddenly
going to open up you don't know right as long as you don't define some notion of novelty everything is novel right yeah that's true yeah this is uh
this is an issue uh that ultimately these algorithms depend on some definition of interestingness and even beyond algorithms then like if i'm if if we're advocating doing things like this in the real world
gatekeepers like eventually confront this issue of like yeah the open maze like just running off into oblivion what's the point of that this has to do with notions of open-endedness and like what makes them work
and it's true that you do not you do not have a successful open-ended system just by virtue of the fact that it's open there has to be some constraint there's still some constraint because we have finite
resources in terms of space and time like if we had infinite resources which we don't but hypothetically i don't think this would be a problem we would literally just collect every branch of everything
forever and we have infinite space and time and so we'll just do it and everything great would be revealed of course most things that we see would be junk but who cares we get everything great too but we have finite resources so we have
to have constraints so and then that leads to well what are what are the constraints like a lot of the time when we talk about these things like people think of it more in terms of like expressing what i like like what do we
actually want i think of it more like well we need to express what we don't want because everything else is fine let's see all the stuff that if it's something if it's not something i don't want it could be really useful and in fact i
have a problem even with things i don't want because they might be stepping stones to things i do want so being too too careful about this in some sense could actually wreck the process but
ultimately there are some things we have to exclude we cannot afford to look at everything that's just going to be random search and so what should we exclude and it goes back to notions of interestingness
and absolutely we should really dig into this and i mean i don't know if we can do it on the show but just in general it's very important for the field to dig into what does it mean to be interesting and you know i think evolution is somewhat
instructive because evolution does have some constraints and evolution is an incredible open-ended process like the process that produced all of living nature which is as i like to say literally biblical in
terms of what it's accomplished and so you can look at it and and see that clearly it does have some constraints like there are some lineages
that are not followed they fail and and and yet like if you look at the the reasons for that mapping that to like the word interesting is a pretty large distance you know like this idea of
survive and reproduce and then interestingness it's not completely clear that those are like directly connected to each other and yet it seems perfectly good at finding things that
are interesting like i'd argue everything's interesting in nature that could just be me but that sort of illustrates to me one one thing here is that you don't have to be perfect about this
like if you can have a reasonable constraint and we can talk about what that means like what is reasonable there's a lot of thought that we can have behind that but you can do pretty well here like it doesn't have to be perfect
we don't have to really really formalize the notion of interesting down to like the last iota of information because we don't really know but as long as it's reasonable like something like survive and reproduce can be incredible in terms of the power
because of the power of divergence and the fact that it's pruning out enough junk that it's going to produce mostly cool stuff so i'm glad you made this point because i remember from the book it
brought up this example of imagine a you know a world in which there was no selection pressure like uh and all organisms survived like how amazing that would be and my thought was it wouldn't be interesting at all because it'd just be
a massive gray goo consisting of like nearly identical bacteria that reproduce at the fastest conceivable rate like if you don't have that minimal selection you actually don't end up with an
interesting world it's kind of like white noise is not interesting and neither is a constant number like something in between those two is interesting yeah yeah i mean that was meant to be a
thought experiment in the book which we called it the gentle earth because everybody gets to have a baby no matter what so it's like yeah you don't need to have sexual organs because god will make a baby for you
in this hypothetical thought experiment that's right and it's true that like yeah the vast majority of the world would be grey goo i'm totally with you on that but uh what what i thought was interesting about the
thought experiment is that we'd still have all the cool stuff we have today there would be humans as long as we had the same mutation distribution so it's like you know that's part of the distribution of
mutations that happen because it did happen and so those paths would be followed in addition to the fact that 99.99999 percent of those paths are just goo inert blobs lying on the ground
and that is just weird to think about you know because it shows that like our existence is not entirely explained by selection like part of it is actually about the structure the a priori structure of the
search space that this universe has defined for us that we have exploited through evolution and i find that very intriguing i was going to ask about that in the relation of cppns because presumably there is a
kind of prior in the activation functions and so on but coming back to evolution you said a really interesting thing one of the things that concerns me is that in a way it's depressing
that you can't shortcut innovation and discovery there's this huge array of stepping stones that you need to find and in evolution if we wanted to turn the clock back you made this
thought experiment in your book let's say you wanted to pick all of the single cell organisms and you wanted to breed an einstein so you're just going to start from the very beginning and you're just going to breed all of these things together and
none of the intermediate stepping stones look anything like intelligence actually we're trying to create artificial general intelligence at the moment and probably none of the intermediate stepping stones that we see now will in any way resemble
artificial general intelligence and the depressing reality is that it is a search problem and is it is it safe to say that when we talk about evolution what we
actually want is maximal exploitation of our environment because there's a trade-off between exploration and exploitation and we talk about wanting interestingness or novelty or whatever it is but
isn't that just a proxy for exploitation of our environment that is an interpretation of what makes evolution powerful that might be even just thought of as like a restatement of different ways of saying like what's powerful about
evolution like exploitation of the environment or the satisfaction of a minimal criterion but but the thing that's missing from that that i think you have to remember is that we are part of the environment
so like there's more to it than just that like that maybe is one of the critical elements that is to get to einstein there's that objective paradox and so you can't get to einstein by trying to exploit the environment but
it's like not the only thing that you need to get there because there's these other principles like the fact that we are part of the environment ourselves the environment is not a static thing the environment is us and so we are creating the opportunities
that can lead to einstein it's like the trees and the giraffes you can't have giraffes if you don't have trees you can't have einstein if you don't have other people to uh appreciate and interact with einstein
and so this there's a whole set of principles that need to be in place to be able to get to something like einstein through an open-ended process and i would argue that we still we still don't know
uh what all of them are like this is an open area of research do you find it depressing though that someone from the future can come into his or her time machine and come and talk to us today and there are all these incredible
inventions that have been invented in his or her time and this person wants to imbue us with all of these cool ideas and it's kind of depressing that we can't take any shortcuts
ah yeah that's yeah there's plenty of shortcuts many yeah if you knew where where you would you know where you'd be going i think this brings us back to this to this notion of
of what if what if we have an objective but maybe it's far away like how do we reach it and that that actually brings me to two algorithms of yours or where you've been involved that
fascinate me it's so the big one is go explore and that's as i see it part built on on map elites so
and just briefly for people who don't know and correct me if i'm wrong basically what map elites does is it sort of subdivides
a search space into cells and then what you want to do is you for each cell you kind of want to keep the most promising candidate around so you want to keep the most promising
neural network for the small learning rate and then one for the medium learning rate and one for the large and then you can kind of cross over
you can take some change them and if they happen to end up in another cell and get better than the thing that's there you replace it and so on so
your goal is to build up this almost kind of population but that's separated by force and then go explore it's it's kind of it was so amazing right everyone was trying
to solve montezuma's revenge and everyone's like yeah we almost can do level one and then you know you come along and it's like boom we solve the game
done bye and but so my my question there there would be that it seems like you have an algorithm for for general exploration
but there seems so much domain knowledge in that right so the way you subdivide this parameter space in map elites or the way that you assign these
different cells to explore in go explore and so on isn't that almost the same as tim saying there's someone from the future that sort of
knows what to do and you're almost kind of building that knowledge into the algorithm but you're sort of hiding that you do that right it's the question
is could you do the same thing for for when you don't know the stepping stones because in in go explore you do yeah yeah that's so that's that's a great question this
is getting into into the weeds of the research really where like it is sort of at the cutting edge with these types of algorithms including like map elites which which just to acknowledge
is not something that that i was involved in creating although it's it's in this area sorry of these novelty i mean it's it's one of these quality diversity algorithms that was originally inspired by novelty search so definitely it's definitely in the right category
and a beautiful algorithm also by the way but yeah when you start to talk about how to subdivide the space or in novelty search how do you measure the distance between two things for for the purposes
of maintaining divergence or diversity you end up or or yeah and go explore how you divide up the cells you end up making a decision as a human being which is based on sort of domain
knowledge that's true and from the like sort of high aspirations of machine learning we would rather not have to make decisions like this we'd rather have fewer hyperparameters
and it seems it seems sort of sad that we have to do that as humans and so i think that in this in this field i mean if you want to call this a field and like map elites would be in sort of like the field of quality diversity that's sort of what they're calling
these kind of algorithms or novelty search with local competition there's like a bunch like this that do take into account some objective notions as well sort of as keith alluded to early on that like yeah we can combine these notions
of course there's a degree of degree and quality diversity sort of about that and but then we get to this issue even in the most pure form of like just novelty search we don't even have a notion of quality of
yeah what is what does it mean for two things to be interestingly different and where do we draw those dividing lines and how should the distances be measured and it is currently an active area of exploration
even in just reinforcement learning you know it's like curiosity driven search you have to make these kind of determinations of like what is different from what to a sufficient degree to be
deemed novel in some way and there are people trying to make completely generic measures of that it's interesting to do that uh to try to do it in a completely generic way the enhanced
poet has a little bit of a flavor of that because it has this special measure which is based on like how like a new environment re-ranks the quality ratings of agents so you might say well that could actually be applied in many different
domains but then you could argue about like that there's some domains where that might not make sense and so forth i mean we're getting into the weeds but but the thing is that i think it's a fruitful area to look at how generic can we get
and get away from like having to have this kind of human insight into the domain but i also think it's not terrible if we do have to have some human insight into the domain it's not the end of the world like if
if having human insight into the domain did lead us to create an artificial einstein is anybody gonna complain i mean it doesn't really matter so at some level it's okay as long as the results are really awesome
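The map elites loop described a little earlier in this exchange can be sketched as follows, mainly to show where the human decision about subdividing the space enters. This is a minimal sketch under stated assumptions, not the published algorithm's exact form: the toy problem, the one dimensional descriptor, the 90 percent mutation rate and every function name below are hypothetical.

```python
import random

def map_elites(evaluate, descriptor, mutate, random_solution,
               n_cells, iterations, seed=0):
    """Keep the best solution found so far in each cell of a descriptor grid."""
    rng = random.Random(seed)
    grid = {}  # cell index -> (fitness, solution)
    for _ in range(iterations):
        if grid and rng.random() < 0.9:
            # Usually: pick an existing elite from any cell and perturb it.
            _, parent = grid[rng.choice(list(grid))]
            child = mutate(parent, rng)
        else:
            child = random_solution(rng)
        # The human-chosen descriptor (assumed to lie in [0, 1]) decides the cell.
        cell = min(n_cells - 1, int(descriptor(child) * n_cells))
        fitness = evaluate(child)
        # Replace the occupant only if the cell is empty or the child is better.
        if cell not in grid or fitness > grid[cell][0]:
            grid[cell] = (fitness, child)
    return grid

# Toy problem (all assumptions): a solution is two numbers in [0, 1].
def clip(v):
    return max(0.0, min(1.0, v))

def random_solution(rng):
    return [rng.random(), rng.random()]

def mutate(sol, rng):
    return [clip(v + rng.gauss(0.0, 0.1)) for v in sol]

def descriptor(sol):
    return sol[0]                   # the domain knowledge lives in this choice

def evaluate(sol):
    return -(sol[1] - 0.5) ** 2     # quality, independent of the descriptor here

elites = map_elites(evaluate, descriptor, mutate, random_solution,
                    n_cells=5, iterations=2000)
```

Swapping `descriptor` for something unrelated to what you care about still runs, which is the point being debated: the loop is generic, but what counts as a different cell is decided by a person.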
yeah since we got onto this topic of kind of distance metrics i wanted to ask one more technical question then i'll give it back to you in it because this is something that kind of annoyed me in the paper that i wanted to ask about for those who haven't read the
paper just for the benefit of the recording keith is talking about abandoning objectives evolution through the search for novelty alone by joel lehman and kenneth stanley in 2011
so in the maze search you know it starts off by we're going to run a network and it gets to some location in the maze and the way in which we're going to measure novelty is compare its final location
to the kind of archive of uh locations that we stored right and then you talk about refinements to that maybe if we sampled the path you know some k number of times and then we looked at i think it
was the l2 norm you know between those sort of that trajectory and all other trajectories okay good then we get to the the uh the bipedal kind of walker which was the second case
in there and it immediately starts off by saying yeah we've got to do the the sampling because you know the final location is just not really relevant to the way it was walking which was kind of
hand wavy to me and then it makes this comment that that the metric has no knowledge of the distance in which it it walked but if you look right up above it where the formula is defined
it's xk minus x0 i.e the origin squared so in fact it automatically has built in the distance from the origin like in the actual you know thing which is being compared
right so in other words points that are further away from the origin are naturally going to have a larger weight in the l2 metric when you start comparing you know paths i guess my comments are one did i
misunderstand anything there and two like the hand waviness i think is really more just a function that this is a new area of research and so tying back into what you and yannick were talking about is
we have a long way to go as as a scientific body of knowledge to really think about these novelty you know measures and and interestingness measures as we've kind of been talking about
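The two behavior measures in this question, final location versus a path sampled k times, both compared to an archive with an l2 distance, can be sketched like this. It is a minimal sketch, not the paper's code: the function names are hypothetical, and averaging over the 3 nearest archive entries is an illustrative setting of the nearest-neighbor sparseness that novelty search typically uses.

```python
import math

def behavior_final_position(path):
    """Characterize a run only by where the agent ended up."""
    return list(path[-1])

def behavior_sampled_path(path, k):
    """Characterize a run by k evenly spaced points along its path, flattened."""
    n = len(path)
    if k > 1:
        idx = [round(i * (n - 1) / (k - 1)) for i in range(k)]
    else:
        idx = [n - 1]
    return [coord for i in idx for coord in path[i]]

def novelty(behavior, archive, k_nearest=3):
    """Mean L2 distance from this behavior to its nearest neighbors in the archive."""
    if not archive:
        return float("inf")  # nothing to compare against yet
    dists = sorted(math.dist(behavior, other) for other in archive)
    nearest = dists[:k_nearest]
    return sum(nearest) / len(nearest)

# Endpoints of earlier runs; all of them got stuck near the start.
archive = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
stuck = novelty([0.5, 0.5], archive)
far = novelty([10.0, 10.0], archive)
```

Here `far` comes out larger than `stuck`, so a run that ends somewhere new is rewarded whether or not it is closer to the goal. Nothing in the score distinguishes crashing in a slightly new spot from a genuinely new route except the chosen characterization and distance, which is the concern raised above.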
yeah so i think that the there is an issue about the the diversity measure or what we we call the behavior characterization and with novelty search or the bc like how we characterize
behavior there's some vector that characterizes what this thing did and it could be a trajectory of points it it followed through or something like that and and the degree to which it aligns with what
you're interested in which actually could be something that we call an objective even and it's true that if it's completely orthogonal to your interests you probably won't
like what you get so like i was thinking of like say in the maze like if there's a robot in a maze and the robot has a light on its head that can flash at sort of any arbitrary time
like if we just like measured novelty based on the flashing pattern of the light and just ignored where the robot went like and we actually care about it getting through the maze like clearly this is a really bad idea and completely
stupid and so indeed there is like a hand waviness to that i mean if that's how you want to kind of characterize it or intuition or human
intelligence or human insight to to understand how our notions of novelty align with the interests that we actually have in the problem we've noticed that like you can
sometimes have an orthogonal sort of non-aligned uh form of measure at the same time as an aligned form of measure and that actually might even work better because of this problem that like
sometimes the stepping stones to things that you like are actually things that you don't like and so there's actually a pretty complicated problem and there's not an easy answer to it but i think that the ultimate thing is that you just have
to acknowledge that that we have not eliminated the need for us to define something like in the objective world you basically have to define the metric that sort of like measures how close you're
getting to the objective in the novelty world you have to define this novelty metric and then you still have to define something i would argue though it's not a super terrible obstacle because
it doesn't have to be perfect that's one of the points the paper was trying to make like that like you can there's a whole variety of these behavior characterizations that all still work basically the same these have to be reasonably aligned
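[editor's sketch] The novelty metric discussed above can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions, not the implementation from the paper: the behavior characterization is assumed to be the robot's final (x, y) position, and novelty is assumed to be the mean distance to the k nearest previously seen behaviors. The names `novelty`, `archive` and the value of `k` are hypothetical.

```python
import math

def novelty(bc, others, k=3):
    """Mean Euclidean distance from one behavior characterization (bc)
    to its k nearest neighbours among previously seen behaviors."""
    if not others:
        # Nothing to compare against yet: treat the behavior as maximally novel.
        return float("inf")
    dists = sorted(math.dist(bc, other) for other in others)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)

# Assumed archive of end positions reached by earlier individuals.
archive = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]

near = novelty((0.1, 0.1), archive, k=2)  # ends up where others already ended up
far = novelty((5.0, 5.0), archive, k=2)   # ends up somewhere new
```

Under these assumptions `far` scores higher than `near`, so selection would favor the individual that ended up somewhere new. Swapping the end position for something unrelated to the task, like the flashing-light pattern in the example above, would keep the code identical while making the search useless for the maze, which is the alignment point being made.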
yeah you made that point like in my opinion very clearly with the uh showing how if you divide up the maze into a grid that you just need like a 4x4 grid or something to
to work well you know in other words you didn't need a high resolution grid i think that kind of makes the point that the that the novelty metric itself can be quite loose in a sense it just has to be
minimally good this paper is the the abandoning objectives paper where we figure out the policy for this robot using neat and a maze is a wonderful example of something that has
deception so here for example if the objective was monotonically improving on the distance to the final objective then this would be deception it would get stuck in a basin of attraction here
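[editor's sketch] The basin of attraction described here can be reproduced on a toy grid. This is a hedged illustration, not the experiment from the paper, which evolves neural network controllers with neat: the maze layout, the Manhattan distance measure, and the function names are all assumptions, and the second search is plain breadth-first exploration of unvisited cells, used only as a crude stand-in for a search that ignores the objective.

```python
from collections import deque

# Assumed toy maze: S = start, G = goal, # = wall. The start sits in a cup
# whose only exit points away from the goal.
MAZE = [
    "..G..",
    ".###.",
    ".#.#.",
    ".#S#.",
    ".....",
]

def find(ch):
    """Return the (row, col) of the first cell containing ch."""
    for r, row in enumerate(MAZE):
        c = row.find(ch)
        if c != -1:
            return (r, c)
    raise ValueError(ch)

def neighbours(cell):
    """Yield the open cells directly above, below, left and right of cell."""
    r, c = cell
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0]) and MAZE[nr][nc] != "#":
            yield (nr, nc)

def distance(a, b):
    """Manhattan distance, standing in for the distance-to-goal objective."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def greedy(start, goal):
    """Move only while some neighbour is strictly closer to the goal."""
    cell = start
    while cell != goal:
        best = min(neighbours(cell), key=lambda n: distance(n, goal))
        if distance(best, goal) >= distance(cell, goal):
            break  # local optimum: every move makes the objective worse
        cell = best
    return cell

def explore_unvisited(start, goal):
    """Ignore the objective and keep expanding cells not seen before."""
    seen = {start}
    frontier = deque([start])
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            return True
        for n in neighbours(cell):
            if n not in seen:
                seen.add(n)
                frontier.append(n)
    return False

start, goal = find("S"), find("G")
stuck_at = greedy(start, goal)
reached = explore_unvisited(start, goal)
```

In this layout the greedy search takes one step toward the goal and then stops inside the cup, because the only way out first increases the distance. The objective-free exploration does reach the goal, but only by passing through cells that look worse by the distance measure.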
but the thing i found a little bit strange is it almost seems like cheating knowing the distance to the final objective and optimizing on that but i suppose there's not really much else you could use
because otherwise you'd have to do some kind of reward shaping to find out what an intermediate a good position might look like yeah it's one thing that's really important about it is that it's
really meant to be metaphorical and so it's like like the whole maze problem doesn't doesn't make any sense if you if you're thinking of it as something other than a metaphor like if you take it literally it's not like a really great example of
anything but what it's supposed to be metaphorical in just the sense that search problems generally have this property that we really don't know like the right kind of shaping measure so we are in this kind of room with
walls and like there are these dead ends we don't know where they are in general the maze is different that you can see them literally but but the real truth is that we don't even know like you we could we actually have tried this
you know what the path is through the maze so you could just define the reward function such that it rewards you higher as you follow the actual correct path through the maze of course this breaks the metaphor because the whole point is like in
general you wouldn't know the right path through the search space if you did you wouldn't have any problem but what's interesting is that even if you knew the path through the maze it still will work worse because there's more to the deception than just
the structure of the maze that's what's interesting here there's other subtleties that are below the level at which we can visually perceive them because this is about the search space of neural networks the search space is not actually mazes the search space is the
underlying neural representations and we really don't understand that you often make the assertion that all ambitious objectives have deception
could you quantify why that is yeah that's an assertion that's trying to make the argument that pick breeder connects to real life because i think that the inclination if i make this argument about this objective paradox which is basically
that following an objective can be an impediment to achieving it and if i base that argument on observations of pick breeder which is that in almost every case the images that were discovered by pick breeder users
were discovered by people who were not trying to discover them and that is because of deception if i make that argument then for many people the first inclination is to reject it
on the basis that hey pick breeder's not real life you're just exaggerating the implications here there's something special about this pick breeder system so for me to extend this argument outside of pick breeder i have to make a
strong argument that pick breeder actually is like real life and and i do think it is and the reason that i think it is is because pick breeder is actually much much simpler than real life like in pick breeder if you take any
arbitrary image like the skull and you say okay well how would i get there from scratch like we don't really know how to get there from scratch we don't know what the stepping stones are like at all i mean one set of stepping stones
are now known because someone did discover the skull and they happen to be things like a doughnut there's a doughnut that you have to go through to get to the skull there may be other trajectories also there's infinite trajectories through search space so that may not be the only one
but the point is it's a counterintuitive one and so this means that like anybody who's just saying like an expert on skulls who's saying like how close are we to a skull is not going to accept those stepping
stones like those metric that metric of improvement of like how close are we to a skull is not the right one for following that trajectory through the search space and my argument is that's the way of course all hard
problems are that's basically the definition of a hard problem like if you knew what the stepping stones were it wouldn't be hard we would just follow the stepping stones of course and not only that but like pick breeder
is a joke in terms of its complexity compared to the real world if this situation is coming up in pick breeder where we're talking about dozens or hundreds of dimensions maximum
can you imagine real world search spaces which are like millions of dimensions like extreme complexity they're going to be like orders of magnitude more complex in other words there are
orders of magnitude more of these deceptive stepping stones inside of that trajectory towards that thing that we hope to finally find i would push back on a couple things there one is you know pick breeder is a particular tool set
you know this is generative networks right whereas if i was a person and i wanted to create an image of the skull i would use photoshop and i would get it done in an hour and it would be like much more
resemblance of a skull right if i was an egyptian pharaoh and i wanted to build the great pyramids which hopefully we all agree are great achievements i'd do it with a simple process of stacking you know
stones one on top of the other with a simple set of like generalizable tools so the fact is that like in real life in the real world humans do very well by applying
simple sets of tools in large scale planned projects to achieve great things but but the difference is if we already conceptualize if we already know what it is we know what the pyramid is that's
exploitation it's not exploration my point is those are still great things so look but those are still great things getting to the moon building a pyramid building the hoover dam all these things
were great accomplishments and they were planned and so that's like the title i understand which is i've been trying to make this point from the beginning that science and engineering are two
sides of a coin right and we progress by by utilizing both it's not a binary either or thing you know okay look yeah let me take that on that this this that's a good that's a good
thing to try to argue about so i think that what what's happening there is you're not really letting me use pick breeder as a metaphor it's just a metaphor like the skull is is not the point is not that like
the skull is something that artists would or would not would not find hard or like artists don't know how to draw a skull i mean it's nothing to do with that at all the skull is just a metaphor for something that we don't know how to get to
we don't know what the steps are to get to the skull in pick breeder space indeed i'm not talking about human space with a canvas and a pencil now one interesting thing though is that most people can't draw a skull most people are not going to draw a
skull even close to as good as that skull so indeed pick breeder is actually allowing people to get to points in picture space that they couldn't get to normally other than maybe some expert artists that actually do have that skill
but that's not that's a side point though the main point is this is just a metaphor for complex search spaces in general where we don't know what the stepping stones are that's all it's about it has nothing to do with like whether somebody might be able to draw a skull
or not and so we are really in a in a situation like with something like curing cancer it is like the skull like you can't just sit down with your canvas and just say oh well let's just put this let's just
put the pieces together and just do this like we don't know what the heck the pieces are at all so there's a whole bunch of discoveries that need to be made that haven't been made yet we don't even know who's going to make those discoveries they may not even be
biologists we have no idea and so like we are in this situation where like if you just follow the path of things that seem like they're on the path to curing cancer
you're probably not going to cure cancer i think i can confidently say that this is why we need exploratory research funded by the nih and the nih knows that we can't just only fund things which look like
hey we've improved this outcome by two percent on like the death rate for this particular kind of cancer that is not going to lead to the ultimate cure for cancer any more than hey this looks two percent more like a
skull so let's just go in this direction because that's totally not the right path for getting to a skull you're going to go to a deceptive dead end for sure if you follow that kind of path so this is a completely metaphorical
argument you need to take it as that there's it's going with two sides which is we need both discovery and exploitation kind of like in our multi-armed bandit talk like for example there's a quote made in the uh white
paper that said yet its appeal is that it rejects the misleading intuition that objectives are an essential means to discovery i don't know people who think
objectives are a means to discovery they think exploration is a means to discovery they think objectives are a means to engineering building accomplishment refinement
optimization right which is which is perhaps even the bulk of what needs to be done like if i have 10 000 people that need to be housed i don't need to discover a new type of
house i need to go build houses yeah but but that that is engineering right so i don't think anyone but kenneth has made it clear that objectives work really
well for exploiting knowledge we already have whereas they are incredibly bad for discovering new knowledge as kenneth just said so eloquently that there's a universe of possibilities and knowledge out there
waiting to be discovered and we can only discover them by searching through this huge space of possibilities yeah and the tools to do that searching
requires optimization and exploitation to perfect them it's it's it's a dual there's a dual process here well i mean of course our enlightened friends in the scientific community will agree
with you and give lip service to exploration there's no question about that i just think that the the behavior of people does not align with the things that you're saying that they say like that they would
acknowledge that exploration is important when we accept scientific papers it's almost always because something improved objectively you can talk about exploration all day but that's the way the community actually operates
so there's some disconnect here between sort of like these sort of high-minded types of discussion about how exploration is great and the actual way people operate in almost all fields except maybe in art art is interesting because i think it can actually be
instructive for us in our field but the other thing is that if we think about objectives or exploration in terms of how like say machine learning researchers talk about it they don't really mean it in the way
that i'm talking about like when they talk about exploration they're just talking about when you take random steps when you're not actually following the objective gradient like there's nothing random about following a gradient of novelty or
interestingness this is a totally different thing and this is not the kind of thing they're talking about they're like talking about is sort of like the the exploration part is kind of like a throwaway part of the algorithm
it's just like you do something stupid the smart parts are the parts you're actually going towards the objective so i'm saying we got to flip this completely like the smart part is the exploration the dumb part is the objective part
because it's freaking easy there's nothing really insightful or interesting about just doing objective optimization yeah we've got plenty of good algorithms for doing that and it's not counter-intuitive at all it totally
makes sense but it's not going to get us hardly anywhere interesting whatsoever so like giving it a lot of lip service and credit and saying look how important objective optimization is we should all be like really congratulating ourselves i think it's just setting us in the
wrong direction because it's not the thing we need to concentrate on to change the way that the community is moving if we want to have disruptive change we've got to start talking about what does exploration really mean how do
we actually implement it in real life and not just talk about it i wouldn't disagree with that i think it's just important to maintain the connection that was made in the white paper which is you know uh maybe it's most efficient to
take the most promising results from novelty search and further optimize them based on an objective function i think that's a totally you know sort of great starting point for me personally i mean i want to give credit to that like i mean a lot of people have that
intuition just before you i don't want to shoot down everything you're saying like it's true that at some point when we meet in the middle there will be a role for objective optimization still i mean that's true especially when
you're just a stepping stone away it's not as simple as you just switch when it's time because the whole problem is we don't know when it's time and that that's the problem like that's why i think only visionaries really
understand when it's time to switch and that's sort of like the you know visionaries are often characterized as people who think about several stepping stones away like a genius it's like i can see off on the horizon how do i get to this thing that nobody else knows
how to get to i don't think anybody can do that nobody knows that what a visionary is is somebody who realizes that suddenly something has snapped into view and the rest of us didn't and then it's time to make the switch
but that takes real insight like that's like steve jobs seeing you know that we can get from like the ipod to the iphone like that's not everybody just okay now we're in the valence let's start doing optimization
like that's really tricky i wanted to touch on just in in closing there are different views so there are schools of thought in the machine learning world right there's compute driven versus knowledge based and model driven versus data
driven and symbolic or statistical and white box or black box and generative or discriminative and you're coming at this from a really interesting angle which is you're an advocate of neural networks and and you think that we need to have a
search-based paradigm to find things so do you think a search approach could lead to intelligence um yeah i think that it is an important
element uh it may not be an either or situation where it's just like there's just one paradigm that's the paradigm but i do believe that search as you're putting it which really means
like like searching for the ai as opposed to just constructing the ai by hand i guess you could say which is like you know my colleague and friend jeff clune calls these ai
generating algorithms or ai-gas like algorithms that generate ais instead of actually making the ai we make the algorithm to generate the ai i totally think that that's important and
that you know we are that we ourselves as humans with human intelligence are the product of a process like this like there's there's an outer loop that explains our existence which is not itself a human level
intelligence you know it's this evolutionary algorithm which is still not fully understood if it was we would just program it to the computer and then we'd get human brains to come out and we don't know how to do that but it's this this outer loop that it's
an open-ended process because it also produces things like photosynthesis or the flight of birds like it's not just designed to produce intelligence it's an open-ended algorithm and it's because of the fact that it gets around
deception that it can get so so far down the tree of search to something like us which is just an extreme level of complexity astronomical that this is probably an important
element but not the only element like these other kinds of progress that are happening in the field should enter into it and dovetail i think with stuff like that as opposed to like we're
at odds with each other in partisan camps or something like that and so i see these different stepping stones being brought together just as they often are in the history of invention you take an airplane and you take an
engine and eventually you get a spaceship and i think you know the history of engines and the history of the aerodynamics which goes back to people who are making bicycles are different trajectories but they eventually are brought together by
people like the wright brothers and this can happen here too and so that's what i expect to happen if we ever do get that far to like human level stuff well professor kenneth stanley
thank you very much for joining us this evening it's been an absolute pleasure thank you that was a great discussion yeah thank you so much really appreciate it it was an honor
amazing i've been excited about this for literally months so it's fine i can i can attest to that i forced keith to read your book in june uh poor keith but but thank you i really
appreciate everybody's i appreciate everybody's input here this was this was a great time it's one of my favorite experiences of any show so thanks for having me on it's really an honor to be here
remember to like comment and subscribe we love reading your comments and we'll see you back next week