00:00I think generative agents and tools like
00:02large language models could be used to
00:05advance social science and social
00:08science to a large extent has been the
00:11quest to understand who we are and
00:14there's a lot of really interesting
00:15applications that can come out of that
00:17that will Empower different communities
00:19and societies a few weeks ago the a16z
00:22infrastructure team ran an event in the
00:24San Francisco office the topic
00:27generative agents these are AI characters
00:30designed to simulate human behavior
00:33derived from a recent but game-changing
00:35paper called Generative Agents:
00:38Interactive Simulacra of Human Behavior
00:41developers from all around the city came
00:43to hear the lead author Joon Park speak
00:46alongside a16z general partner Martin
00:48Casado and in this panel they discuss
00:51how this paper and the advancements in
00:53large language models have opened a new
00:56window expanding the dynamism of
00:58simulation which instead of
01:00binary logic we're using probabilistic
01:02thinking and the ability to incorporate
01:05new information so what does that really
01:07mean well instead of your character in
01:09The Sims following very specific rote rules
01:12with generative agents a father may go
01:14outside because he notices his son
01:16another may take their breakfast off the
01:18stove because they notice it's burning
01:20and another may even opt into a
01:22Valentine's Day party invite and then
01:24elect not to show up all very human
01:28behaviors now the architecture described
01:31in the paper is of course intentionally
01:33designed by Joon and team and it's a
01:35combination of a seed identity for every
01:37agent and then functions that cause each
01:39one to do three discrete things to
01:42observe to plan and to reflect and these
01:45architecture decisions ultimately
01:47generate unexpectedly spirited
01:49conversations just like this hey lucky
01:52it's so great to see you how have you
01:53been I've been dying to hear about your
01:56adventure hey Kira I've been fantastic
02:00my space adventure was out of this world
02:02I can't wait to share all the details
02:04with you or even this I've been trying
02:08to find my way it's been a chaotic
02:11journey to say the least embrace the
02:13chaos dear Kurt for within its
02:16turbulence lies hidden truth seek the
02:20depths of the unknown and unravel the
02:23Mysteries that burden your
02:25soul and here's the thing they don't
02:27just interact with each other again they
02:30wake up they cook some paint While
02:32others write they hold opinions of one
02:34another and most importantly they
02:36remember and they have higher level
02:37Reflections based on the past it's
02:39pretty amazing don't you think so as
02:41these generative agents become a lot
02:43closer to nuanced human behavior what
02:46can we learn about being human from
02:48these surprisingly realistic simulations
02:50and what is the calculus of that
02:52believability are there real world
02:55applications on the horizon and what is
02:57truly net new here listen in as we
03:00discuss all that and more including the
03:02origin of the very paper that Joon wrote
03:05I hope you enjoy as a reminder the
03:07content here is for informational
03:08purposes only should not be taken as
03:10legal business tax or investment advice
03:13or be used to evaluate any investment or
03:15security and is not directed at any
03:17investors or potential investors in any
03:19a16z fund for more details please see
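Before the panel begins, the agent loop summarized in the intro — a seed identity plus functions to observe, plan, and reflect over a memory stream — can be sketched in miniature. This is a loose illustration of the idea, not the paper's actual code: the class and method names, the equal score weights, the hour-scale recency decay, and the stubbed plan/reflect bodies are all assumptions; the 1-to-10 importance scale and the reflection threshold of roughly 150 follow figures mentioned later in the panel.

```python
# A loose sketch of a generative agent's loop: a seed identity, a memory
# stream, score-based retrieval, and a reflection trigger. All names,
# weights, and stubs here are illustrative, not the paper's code.
import math
import time

class Agent:
    # Reflection fires once summed importance passes a threshold
    # (the panel recalls roughly 150 in the paper).
    REFLECTION_THRESHOLD = 150

    def __init__(self, seed_identity):
        self.seed_identity = seed_identity  # natural-language description
        self.memories = []                  # stream of (timestamp, text, importance)
        self.importance_since_reflection = 0

    def observe(self, text, importance):
        """Record an observation; importance is on a 1-10 scale."""
        self.memories.append((time.time(), text, importance))
        self.importance_since_reflection += importance
        if self.importance_since_reflection >= self.REFLECTION_THRESHOLD:
            self.reflect()

    def retrieve(self, relevance_fn, top_k=3):
        """Rank memories by recency + importance + relevance; return the top few.

        relevance_fn maps memory text to [0, 1]; a real system would use
        embedding similarity to the current query instead."""
        now = time.time()
        def score(memory):
            timestamp, text, importance = memory
            recency = math.exp(-(now - timestamp) / 3600)  # decays over ~an hour
            return recency + importance / 10 + relevance_fn(text)
        return sorted(self.memories, key=score, reverse=True)[:top_k]

    def plan(self, situation):
        """Stub: a real agent would prompt an LLM with identity + retrieved memories."""
        relevant = self.retrieve(lambda text: 1.0 if situation in text else 0.0)
        return f"{self.seed_identity} weighs {situation!r} against {len(relevant)} memories"

    def reflect(self):
        """Stub: a real agent would ask an LLM to synthesize higher-level insights."""
        self.memories.append((time.time(), "reflection: synthesized recent events", 1))
        self.importance_since_reflection = 0
```

Under this sketch, an agent who notices breakfast burning (a moderately important observation) would rank that memory above routine ones when planning, and a long run of important events would eventually trip a reflection.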
03:34welcome everyone we actually simulated
03:36this before you joined and everyone's
03:39exactly where we thought how many people
03:42in this room have actually read the
03:44generative agents paper that Joon wrote
03:46it's a lot of people pretty much
03:48everyone um so Joon even though so many
03:51people have read it why don't you just
03:53give a quick overview of what it is but
03:55also maybe the backstory that people
03:57haven't maybe heard of so generative
03:59agents are these uh general computational
04:02agents that can simulate believable
04:04human behavior uh fundamentally it
04:06leverages something like a large language model under
04:08the assumption that a language model has
04:11encoded or has seen so much about human
04:14behavior from its training data from the
04:16Wikipedia Social Web and so forth so if
04:19you are able to poke it at the right angle
04:21you can actually extract a lot of those
04:23human behaviors in a very context
04:24specific manner the opportunity here is
04:28that in the past we had to manually
04:29author a lot of these behaviors but now
04:31we can simply generate them with a
04:33language model so generative agents
04:35leverages that to create these
04:37computational systems ultimately one
04:39sort of technical break uh sort of
04:42improvement that we're trying to make in
04:44addition to the large language model is basically
04:47giving it some form of memory and
04:49retrieval system so you may have all
04:52used obviously ChatGPT and so forth it
04:54is heavily context limited and even if
04:57that limitation were to go away in the
04:58future processing a lot of really
05:01long context windows is really
05:03inefficient and also ineffective when
05:05you're trying to prompt these models for a
05:06really narrowly defined behavioral aspect
05:09so the main philosophy here is we're going
05:12to give long-term memory for these
05:14agents that's external to the language
05:17model and then retrieve the contextually
05:19relevant information from the long-term
05:21memory whether it's planning action
05:24sequences or Reflections to create these
05:27computational agents philosophically to
05:29some extent I think this is akin to
05:31creating the operating system around a large
05:35language model in the way sort of we're
05:37prompting large language models to me feels a
05:39lot like how we used to use computers
05:41back in the day when we had to wire up
05:42the back end every time you run a new
05:46program um and what has really made
05:49complex behavior with these
05:51computational tools possible was the
05:54introduction of this larger architecture
05:56that surrounds the core fundamental
05:58techniques so that's what generative
06:00agents is about um and you mentioned sort
06:02of the background why we got into all
06:05this uh so I started my PhD at the start
06:08of or sort of midway through 2020
06:11that was just around when GPT-3 was about
06:13to come out and that year a bunch of us
06:17basically authors at Stanford were
06:20working on this paper called On the
06:21Opportunities and Risks of
06:23Foundation Models what we were seeing
06:26was this new form of machine learning
06:28models that seem fundamentally different
06:31than the things that we had experienced
06:32in the past uh in that we didn't have to
06:35fine-tune or specifically train models
06:37for very narrow purposes but we can
06:39train a general model almost like a stem
06:41cell in biology and leverage that to create
06:46behaviors um so after writing
06:48that paper sort of my team especially
06:51myself and my advisers what we really
06:53wanted to answer is there seems to be a
06:55new opportunity but exactly what is it I
06:59think in the early days of GPT-3 a lot of
07:01the tests that we were doing were things
07:03like classification and generation which
07:05was really cool to see that these models
07:08can conduct these uh tasks but also
07:11something that we already knew how to do
07:12for many decades and our general
07:14philosophy there was if these models are
07:17truly new and they give us fundamentally
07:19different opportunity than what we had
07:20in the past then they should be able to
07:22do something that's fundamentally
07:24different so that's how we got into this
07:28our answer to that basically was
07:29I think we might be able to create human
07:31like agents uh that can populate these
07:35worlds maybe you can just elaborate
07:37you said it's perhaps one of the most
07:39exciting times in in recent history and
07:42maybe you can just speak to exactly what
07:43you mean there and how it relates to
07:45simulation and some of this new
07:47technology that we're seeing with llms
07:49so first very quick credit where credit's
07:51due so um as far as AI Town clearly Joon
07:55is like the grandfather of AI Town
07:57and like we wouldn't be here without
07:58your work so really appreciate you
07:59coming here AI Town itself actually came
08:02from a personal project from Yoko do you
08:03want to so so that's Yoko um
08:10so um the true story is uh it was
08:15actually a personal project and I was
08:16like hey maybe more people would be
08:18interested in it and I I kind of coerced
08:20her into like you know bringing it um
08:22forward to everybody else and so now
08:24when it actually comes to the code the
08:27vast majority of the work on the code
08:29was actually done by Ian do you mind
like on the back end and so Yoko
08:33had done a prototype and
08:35then um you know it's it's kind of funny
08:39like you know you see this this funny
08:40little tile set up here and it kind of
08:43belies the fact that it's actually
08:45really hard to build a scalable shared
08:47State distributed system that you need
08:49in a multiplayer game it's just a hard
08:51technical problem right and anybody
08:52that's kind of built these systems knows
08:54that and so it's funny cuz people go and
08:55they think oh here's this cute little
08:57tile engine with like these characters
08:58running around but like actually the
08:59back end is built to be something that
09:01can scale um and that requires you know
09:03people that have focused on this and
09:04like Ian has done a tremendous job and
09:06the Convex team continues to work on that
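To make the "shared state is hard" point concrete, here is a tiny self-contained illustration — not AI Town's or Convex's actual code, and all names (including the agents Lucky and Kira from the demo dialogue) are used purely for flavor. Two clients that naively read-modify-write the same game state silently lose an update; versioned compare-and-swap writes catch the conflict instead.

```python
# Illustrative sketch (not AI Town's actual code): why naive shared
# game state breaks under concurrency, and a compare-and-swap fix.

class GameState:
    def __init__(self):
        self.version = 0
        self.positions = {}  # agent name -> (x, y)

    def read(self):
        # A client reads a snapshot plus the version it was read at.
        return self.version, dict(self.positions)

    def naive_write(self, positions):
        # Last writer wins: concurrent updates silently overwrite each other.
        self.positions = positions

    def cas_write(self, expected_version, positions):
        # Compare-and-swap: reject writes based on a stale snapshot.
        if expected_version != self.version:
            return False  # caller must re-read and retry
        self.positions = positions
        self.version += 1
        return True

# Two clients read the same snapshot, then each moves a different agent.
state = GameState()
state.positions = {"lucky": (0, 0), "kira": (5, 5)}

_, snap1 = state.read()
_, snap2 = state.read()
snap1["lucky"] = (1, 0)
state.naive_write(snap1)
snap2["kira"] = (5, 6)
state.naive_write(snap2)  # overwrites lucky's move: a lost update
assert state.positions["lucky"] == (0, 0)

# With versioned writes, the second stale write is rejected and retried.
state.positions = {"lucky": (0, 0), "kira": (5, 5)}
v1, snap1 = state.read()
v2, snap2 = state.read()
snap1["lucky"] = (1, 0)
assert state.cas_write(v1, snap1)
snap2["kira"] = (5, 6)
assert not state.cas_write(v2, snap2)  # stale: must re-read
v3, snap3 = state.read()
snap3["kira"] = (5, 6)
assert state.cas_write(v3, snap3)
assert state.positions == {"lucky": (1, 0), "kira": (5, 6)}
```

A real multiplayer backend layers durability, fan-out to connected clients, and retry logic on top of this basic conflict-detection idea, which is what makes the problem genuinely hard at scale.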
09:09okay so why is this so exciting so okay
09:10so because I'm old I actually saw like
09:13the Advent of like the web and this
09:15feels very similar to that in the
09:18following ways which is when you have a
09:20very disruptive technology like this
09:22like whatever touches it becomes magic
09:25like uh you know I was actually having a
09:27conversation just before this like does
09:28anyone here know what like the first video
09:35was yes it was a coffee pot I was like
09:38this dude I think it was in Cambridge
09:40was a grad student and he was like oh
09:42listen I want to know when my coffee is
09:43empty he put a camera and because it was
09:45very new everybody was like oh my God
09:46there's a coffee pot on the internet and
09:47so everybody wanted to look at the
09:49coffee pot right and do people remember
09:52button one of the first apps was this
09:55big web page which was a red button on
09:56it and you know what it did nothing
10:00like you press it and it did nothing but
10:01people thought it was amazing because it
10:02was on the internet and everybody would
10:04go press the button and they leave great
10:05comments about this button and there's
10:06many examples of like you know it was
10:09this crazy disruptive technology and the
10:11apps seemed really stupid and like
10:14there's a bunch of enthusiasts and you
10:16know what the Enterprise thought about
10:18this like the actual business folks like
10:21I remember when Eric Schmidt
10:23banned the browser like he was like you
09:25know this is Eric Schmidt the CTO of Sun
10:26is like you can't have a browser cuz
10:28people aren't going to work right so the
10:29same thing always happens is like the
10:31enthusiasts are like this is really cool
10:32and they use it for fringe stuff and
10:34then like the enterprise doesn't
10:36understand it and like literally like they
10:37ban it or they don't use it but the
10:39set of companies that come out of it
10:42like are always part of this Enthusiast
10:47era right like you couldn't have
10:49predicted Yahoo you couldn't have
10:51predicted Amazon like you knew something
10:52was going to happen and so what happens
10:55at this time is there's a bunch of stuff
10:57that like is silly like the coffee pot
10:59was silly the red button was silly but
11:01you never know like that spark of life
11:03where it's going to come from and it's
11:04always kind of like this nonobvious use
11:06case you know and it kind of seems like
11:09a toy and then it takes off right and so
11:10you're always looking for those
11:11non-obvious use cases and it almost
11:13never looks like the old one like those
11:15of us that are old enough do you
11:16remember like desktop as a service like
11:19I'm going to go to the cloud I'm gonna
11:20have my Windows desktop like who wants
11:22that nobody wants that right instead
11:24clearly we're going to rewrite the
11:25application as SaaS right so we're in this phase
11:30where everybody's experimenting and then
11:33I'm personally literally from just a
11:35personal interest standpoint but all of
11:36us are interested like what are the use
11:38cases that will take advantage of this
11:40new medium that are native and like the
11:43work that you've done is one of those
11:44100% right like there's like a spark of
11:46Genius which is like when you work with
11:48these things you know like this is a new
11:49way to think about it it's a new use
11:51case it's can create entirely new apps
11:53and that's what the future is built from
11:55and so that's why I think this is so interesting
11:57broadly cuz it's like the early internet
11:59era but very specifically in this use
12:00case because I think the work that
12:01you've done really is a great example of
12:04something totally new I couldn't agree
12:07more and I think one interesting aspect
12:09is that if you explore this project you
12:11just start to question what it means to
12:13be human like if we're trying to create
12:15these agents that are quote believable
12:18like what what is believable in terms of
12:20you know being a human and as part of
12:22the project you kind of you have this
12:25coded technically right you made
12:27architecture decisions you made
12:29decisions in terms of your retrieval
12:31function quick Interruption just to give
12:33you some color on what some of these
12:34decisions were the retrieval function
12:37for example is based on scores across
12:39recency importance and relevance so for
12:42example on a scale of 1 to 10 brushing
12:44your teeth might get an importance score
12:46of one versus a breakup might get a 10
12:49meanwhile reflection is only triggered
12:52after a certain number of important
12:53events quantified by summing the
12:55importance scores until a certain
12:57threshold is met in this case I believe
12:59it was 150 this clever architecture
13:01results in emergent Behavior like agents
13:03sharing invites with one another or even
13:06having that information Circle all the
13:08way back to the original planner and I'm
13:10sharing these details to Showcase how
13:11thoughtful you really need to be if
13:13you're designing architecture that
13:15reasonably approximates humans maybe you
13:18could just speak to what you've learned
13:20through those decisions technically
13:22about what it means to be yeah like a
13:25believable human right so this is an
13:29interesting one so we actually had made
13:31the generative agents and there was
13:32about a month period when we knew we had
13:34to evaluate these agents somehow and we
13:37didn't know how and basically the
13:41concept we stumbled upon is this idea of
13:41believability it basically is sort of
13:43like a Turing test right that when you
13:45look at them do they look believable do
13:50they behave in ways that we can sort of
13:54imagine ourselves behaving and that ended up becoming our
13:58method it is an interesting question though
13:58in terms of like what does it mean to be
14:00believably human and we often look to
14:04Prior literature in research to get
14:06inspiration for how to define this and
14:08what we found was there's no prior
14:11literature on this we used the concept of
14:13believability to talk about this concept
14:15but we were never in a position where we
14:17can meaningfully evaluate something like
14:19believability because we didn't have
14:20agents like this so to some extent we
14:22were building up the definition from the
14:25ground up um and I think what came out to
14:28be the case is for us can these agents
14:31plan react act in a believable manner do
14:33they create believable reflections the
14:36way we would evaluate a Turing test and I
14:39think what we've learned over the past
14:40few months one of sort of the more fun
14:41and interesting findings is that even that I
14:44don't think is quite a perfect
14:46definition in that a lot of sort of
14:48audience came back to us to basically
14:50say well one of the error cases that we
14:53noted was some of these agents would go
14:55to a bar at noon or something like that
14:58uh and many of our audience came back to
15:00us and said and we said that was not
15:01believable like who would do that and
15:03people would come back to us and say I
15:08do that and if you can sort of expand from
15:11that story you know I think there's a
15:13lot of cases where even my parents look
15:15at me and go like I cannot believe what
15:17you've done like why would you do that
15:19and vice versa so I think there's a lot
15:21of even amongst the people who know each
15:23other well having this sense of
15:26believability is really difficult and I
15:28think that's sort of fundamentally
15:30underlines what it means to be human
15:32like it's not exactly predictable and in
15:35social science we call that complexity
15:37that human behaviors are complex so to
15:40some extent we can build intuition for
15:42how people might behave but to really
15:44predict it is a very difficult
15:47task now I do think this actually does
15:50lead to sort of future work in this
15:51space though this idea of believability
15:54so in this paper we use this incomplete
15:57definition of what it means to be
15:59believable not perfect but at least on
16:01that evaluation we've done well I think
16:04if you were to build on that idea a
16:06little bit further then you could
16:07actually start to ask Beyond
16:09believability can you create agents that
16:13are accurate to humans and I think given how difficult it
16:16was to actually evaluate what it means
16:17to be believable I think this accuracy
16:20actually has a lot of interesting
16:21questions around it what does it mean to
16:23accurately reflect human behavior it
16:25could be that if we can match
16:27the distribution of human behavior let's say
16:30in this context they have this kind of
16:31probability of Behaving this way right
16:35let's say it's 10 p.m what are
16:37the chances that I'd be asleep or
16:40awake uh what are the chances that I'd be
16:42working or that I might not be working I
16:45think ultimately getting to that degree
16:46of accuracy in the simulation might be
16:48sort of the next step to these kind of
16:50simulation based work if we can do that
16:52I think the application spaces that it
16:55will unlock will be interesting and I
16:59think it would also be different and we
17:00can likely go beyond uh even I think
17:03there's a lot of application that we can
17:05build right now but I think the future
17:07work that's I think where we're headed
17:09in this direction so I want to talk
17:11about those future applications but
17:13maybe you could just speak super quickly
17:15to in the paper you have observation
17:18planning and reflection and that that
17:21mostly encapsulates the way that these
17:23llms or that the agents rather are
17:26engaging with each other when they take
17:27an action and they go through those
17:29three steps I assume that wasn't your
17:31first crack at the solution at coming up
17:34with this human believable agent and so
17:37how did you get there and did you learn
17:39anything about the importance of any
17:40of those three steps or all three of
17:43them entirely right so that's a fantastic
17:46question uh really the first way we
17:47actually went about doing this was
17:50simply by prompting a language model uh
17:52so this line of work generative agents
17:55is actually the second in this line of
17:56work that we published uh the first work
17:58in this line was called social simulacra
18:01and the idea there was to populate a
18:03social Computing system imagine you're a
18:05social designer you need to know what
18:07might happen when there's tens of
18:08thousands of people in your system can
18:10we assimilate those people in their
18:12behavior so for that project too we
18:15did it simply by prompting a language
18:18model that worked but what we found was
18:21if we want to populate the spaces over a
18:24longer period of time so we can do for
18:26instance longitudinal study or game play
18:29that's going to last forever then for
18:32those kind of instances simply prompting
18:34these models wouldn't work right and
18:36that's when we realized we likely need
18:38and this actually this Insight actually
18:40first came when we realized that we
18:42needed to have multi- agent interaction
18:44because agents actually would need to
18:45remember that I saw let's say I saw some
18:48audience here before I should remember
18:50them I met Martin Steph Yoko and so forth
18:54in the past few weeks or few months I
18:56when I talk to them I need to remember
18:58those interactions so that's when we
19:00realized that we actually cannot simply
19:02prompt these models but we actually need
19:05an architecture so when we went about doing
19:09that I think really the main inspiration
19:10that we got actually was from prior work
19:13so people like Allen Newell and Herbert
19:15Simon you might recognize some of these
19:17names those are sort of quote unquote
19:18the founders of AI uh in the 60s and 70s
19:22and they are the people who used
19:25to build what we call cognitive
19:27architectures and those architectures were
19:29very reminiscent of sort of the generative
19:31agents architecture in that it has
19:33some perception module some action
19:35module and there's some long-term and
19:37short-term memory and really the goal
19:41back then was ambitious right they
19:43actually wanted to build general
19:44computational agents sort of the way
19:46generative agents are supposed to be but
19:48they didn't have the techniques to do it
19:50they basically didn't have large language
19:51models and the way we saw it was now is
19:54the time to sort of merge those two
19:56worlds where we now have large language models
19:59they can do a lot of sort of the micro
20:01processing of these cognitive modules and we
20:04can actually now bring back these micro
20:06modules or architectures like cognitive
20:09architectures so we took inspiration from
20:11that that particular architecture had
20:13planning uh in place and it had
20:15long-term memory in place so we were
20:18inspired by that one thing that I think
20:20was a little bit new though I think is
20:22this idea of reflection that we humans
20:26for instance if you eat an omelet three
20:27times in a row uh or if you see somebody
20:29else eat an omelet three times in a row
20:31you likely create an opinion about the
20:34person maybe that person likes to eat
20:35omelet in the morning and that's very
20:37human thing to do and there's a good
20:39reason why we do that we do that because
20:41it's efficient it allows us to have
20:44higher-level inferences about the world and
20:46form opinions about those around us and
20:48about ourselves and that's something
20:51that in the past we couldn't really
20:53imagine formulating with a computational
20:55system but with large language models because
20:57everything is is in natural language we
20:59had that opportunity so we added that
21:01one last component called reflection and
21:04that's sort of how we landed on the
21:06architecture that you see in the paper
21:08right now let's move on to how this can
21:11all be used and we'll get to the
21:12specific applications but Martine I feel
21:15like you'll have a great answer to this
21:17why even do this like I feel like it's
21:19very obvious for a lot of people to
21:22understand why we would have human-to-
21:24human interaction we're doing that right
21:26now um there's increasing capacity to
21:29understand human to AI or human to
21:31computer interaction um character AI is
21:34a company where people you know there's
21:35still a lot of judgment um there and I
21:37think there's even more judgment when it
21:39comes to AI to AI like why should we use
21:42our resources to have these computers
21:44hang out and talk and burn toast and you
21:46know go to the bar at 2 p.m. so yeah
21:49Martin what do you think what what's the
21:51case for us advancing in this field no
21:54judgment for me by the way you can use
21:55these for whatever you want um
22:00so I mean I want to go back to what I
22:01said before which is like anytime you
22:03have a new modality it's just not
22:05obvious what's the right way to think
22:06about it and for me the big aha in the
22:09last few months is just programming
22:11using models if you've spent a long time
22:14programming I mean I've been programming
22:15for 30 plus years right you know I've
22:17never been a good programmer but I've
22:18programmed and when you start
22:20programming with these models you're
22:21like oh I've got an API and I'm just
22:24going to use the API and then I'm going
22:26to treat it like it's like the end point
22:28and you say some stuff and then you know
22:30you get some response back and you kind
22:32of treat it like a you know kind of like
22:34this function that you call right it's
22:36just like any programmer would
22:38do but then when you're working with it
22:40more you're like oh these kind of are
22:42like these life forms and like my first
22:45aha was like because I'm bad at
22:47JavaScript I like missed some quote
22:49somewhere and rather than sending it the
22:51text string I wanted to send it I sent
22:53it some code and instead of like borking
22:56like you would normally have and
22:58breaking like you know C++ you core dump
23:00or whatever it commented on my code it
23:03was like oh my goodness right you know
23:05and so like all of a sudden like whoa
23:07this is totally different like I'm not
23:09dealing with like this finite State
23:12machine formal language thing at the
23:14other end of an API like there's this
23:15thing and like it'll comment and more
23:18that I program with these things the
23:21more I'm like you know it's kind of like
23:23wrapping an abacus around a
23:25supercomputer right it's like it's
23:27smarter than the code it could probably
23:28write the code better than I can write
23:30anyways like why am I doing this weird
23:33you know bloodletting ritual of writing
23:35JavaScript over this kind of
23:37superhuman thing right I mean this is
23:38kind of what you end up with and so it's
23:41very clear we're going to interact with
23:42these things in a different way and in
23:44fact I had this kind of I was talking
23:45with a professor in Michigan recently
23:48and we were talking about this subject
23:49like you know how I think
23:50about LLMs he's like I think about them
23:53like grad students like he's like you
23:55know they speak English they're pretty
23:57smart you know I don't use a formal
24:00language you know they solve like these
24:02really complex problems Etc and like
24:06having worked with a lot of grad
24:06students having been a grad student
24:08myself like you don't
24:09treat these things with code
24:11right and so the reason to do this is I
24:14actually think AI town is kind of what
24:17this is going to end up being it's like
24:19you need to give them the the resources
24:23that they need to be pretty autonomous
24:24and to grow and we're going to treat
24:26them more like peers and they're going
24:28to talk to each other too and it's more
24:30like grad students and so for me this is
24:32just an example of like we got to change
24:34the way we think and listen clearly like
24:36I'm up here and I'm telling these great
24:37stories because they're kind of funny
24:38like I don't I don't believe this stuff
24:40in the limit but I think they're really
24:41interesting like ways to change how you
24:43think about it in all of this stuff
24:44right like I'm like I'm not trying to be
24:46categorical here so like there is a new
24:48way that we're going to interact with
24:49these models it is much more natural
24:51language they are much more powerful um
24:54and so I I do think this is why we
24:56should all be doing this type of stuff
24:57because if you don't engage in these
24:59kind of things that look like toys like
25:01this wave will pass you by that I'm 100%
25:03convinced totally and as both of you
25:06have spoken to this is fundamentally new
25:08technology and so Joon something you
25:10said to me when we first spoke is just
25:13when you have fundamentally new
25:14technology you must do something
25:16fundamentally new with it and so maybe
25:18you can speak to that in terms of what
25:19you're seeing that can be done today but
25:22also where you look ahead and you think
25:24oh wow like that that's a really
25:27excellent use case that we couldn't do
25:29without this new technology I think
25:31there are certainly things that we can
25:32do because there are large language models and
25:35that fundamentally different thing for
25:36me was this idea of simulating human
25:39behavior and I think there's a lot that
25:41we can sort of gain from it in terms of
25:44spaces um I think I mentioned briefly
25:47about this idea of well what if we can
25:49go beyond believability to create agents
25:51that are even accurate and I think this
25:54is sort of application space in general
25:56is something that I'm also learning
25:57uh from actually in fact this
25:59audience my advisor and my team are big
26:02fans of games but we are not from
26:05the community and one thing that we are
26:07seeing is that there's a lot of really
26:09interesting potential even if they look
26:11like toys sort of a lot of really
26:12interesting technical advances that look
26:15like toys at the beginning right so I
26:17think there's a lot that we can gain
26:19from there I think going forward or the
26:21application spaces that I'm sort of
26:23interested in is also in things like can
26:26we run simulations so we can learn
26:29more about ourselves for instance if
26:32you're in fact some of the places that
26:34I'm visiting now are more places like uh
26:38like banks like the Bank of England and
26:40so forth where these places they need to
26:43test their policies before they roll
26:46out new economic policies or many of my
26:49colleagues in the department to focus
26:51more on social science they need to test
26:54out their theories now if you can run
26:57simulations with realistic human
27:00behavior and find out at least to some
27:03extent the answers to these really
27:05complex social phenomena and
27:08challenges then I think that actually
27:10would be a new tool that the community
27:13in the past especially those communities
27:15in economics and social science they
27:17didn't have that will allow us to do
27:20interesting stuff and I'm genuinely
27:22intrigued by that possibility to some
27:24extent it does sound fairly academic but I
27:26do think it should be actually fairly
27:28broadly applicable and interesting to
27:31audiences Beyond Academia because
27:33ultimately to some extent what I'm
27:35saying is I think generative agents and
27:38tools like large language models could be used
27:40to advance social science and social
27:44science to a large extent has been the
27:47quest to understand who we are and
27:50there's a lot of really interesting
27:51applications that can come out of that
27:53that will Empower different communities
27:55and societies um um and that to me for
27:59new that something that we didn't have
28:00in the past yeah and so it sounds like
28:02today we're mostly in the creative realm,
28:05where we can watch these agents and we
28:07can have fun with them and it feels more
28:09like a game, but the delineation, it
28:11sounds like, is accuracy. What will it
28:14take to get that accuracy? What work
28:16still needs to be done in terms of
28:19getting there? So I think some of you may
28:21have actually noticed this already: there
28:22are studies that basically try to
28:25replicate existing social science studies,
28:28so basically using a large language
28:30model as a participant in
28:32social science studies, right, to
28:34replicate known results in the field, and
28:37what we're finding is that they sort of
28:39work, and that's nice,
28:42and that's one surprise that we did have.
28:44There's been a limitation to this approach,
28:46in the sense that: is the large language
28:48model replicating human
28:50participants because it's replicating
28:53human behavior, which is what we want, or
28:55is it doing that because it's seen the
28:56paper? For instance, there's a very famous
28:59social science theory called prospect
29:01theory. Is it replicating the findings
29:03from prospect theory by Kahneman because
29:06of its ability to replicate human
29:07behavior, or did it just read Kahneman's book
29:11Thinking, Fast and Slow? Right, and I think
29:15that's a fundamental issue that we have
29:17as a field, and I think that's one of
29:19the reasons why there's a lot of work
29:20that needs to be done to crack this.
29:23Some of the ways I think you could
29:25actually go about doing this is creating
29:27new contexts, or creating a new set of
29:30studies that haven't been shown in the
29:32past, and trying to replicate those
29:34results. So one of the things that
29:37we've done is called Social Simulacra, which
29:39is the first paper that I mentioned, the one that
29:41predates generative agents. The idea was
29:44to replicate existing human
29:46communities, and what we actually did
29:48was recreate subreddits that were
29:50created after the release of GPT-3, so GPT-3
29:53wouldn't know anything about these
29:56communities. One example: it
29:58was actually before the pandemic
30:00became the main topic of discussion,
30:02when GPT-3 basically didn't know about the
30:04pandemic, and we asked GPT-3 to
30:07create a community that has to talk
30:10about COVID and vaccination
30:13policy. And you would wonder, it shouldn't
30:15be able to do that in theory, because it
30:16doesn't know anything about COVID, it
30:18doesn't know anything about these
30:19policies, but it can simulate those
30:21because it can infer what COVID is, what
30:23vaccination is, from its prior knowledge.
30:26So to some extent these tools can be
30:29used as a predictive tool, looking into
30:32the future of what might happen
30:34in our own community, and I think those
30:37are the ways I think we see this
30:39field unfold maybe in the next few years.
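As an editorial aside, the subreddit-recreation idea described here can be sketched in a few lines. This is a hypothetical illustration, not the Social Simulacra code: `generate` is a placeholder stand-in for any language-model completion call, and the prompts and community description are invented for the example.

```python
# Hypothetical sketch of the community-simulation prompting pattern:
# seed the model with a community description (standing in for a subreddit
# it has never seen), then generate a post and replies conditioned on the
# thread so far. `generate` is a placeholder for a real LLM call.

def generate(prompt: str) -> str:
    # Stand-in for a language-model completion API.
    return f"[model output for: {prompt[:40]}...]"

def build_post_prompt(community: str, topic: str) -> str:
    # The seed description plays the role of the subreddit's sidebar/rules:
    # it grounds the model in the norms of the community being simulated.
    return (
        f"The following is an online community: {community}\n"
        f"Write a post a member might make about {topic}.\n"
        "Post:"
    )

def simulate_thread(community: str, topic: str, n_replies: int = 2) -> list[str]:
    """Generate a seed post, then replies conditioned on the thread so far."""
    thread = [generate(build_post_prompt(community, topic))]
    for _ in range(n_replies):
        context = "\n".join(thread)
        thread.append(generate(
            f"Community: {community}\nThread so far:\n{context}\n"
            "Write the next reply.\nReply:"
        ))
    return thread

thread = simulate_thread(
    "a forum where members debate local public-health policy",
    "a new vaccination policy",
)
print(len(thread))  # 1 seed post + 2 replies = 3
```

The key design point is that the model never needs to have seen the real community: the seed description plus its prior knowledge of the topic is enough to produce plausible behavior.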
30:43At the end of the paper there was,
30:45perhaps unsurprisingly, a question around
30:47ethics, and I'd love to hear both of
30:49your takes on where this goes and what
30:51ethical framework, if any, we should apply
30:54to something like this. So I think
30:57there are societal decisions that we'll
30:59have to make, and I think there are
31:01techniques that can be used to implement
31:04those decisions. Certainly, to
31:06some extent, I think it would be useful
31:08for the users to be aware that they are
31:10talking to agents, and that's
31:12one rule that we try to set for
31:14ourselves: when we release the code,
31:16when we release our paper, we make it
31:18very clear that these are computational
31:21agents. I think ultimately the
31:24framework that I like to use in human-
31:25computer interaction, certainly, is that these
31:28tools are ultimately there to augment
31:30what we can do and what we have,
31:34right? So to the extent that these agents
31:36can do that, and I think there are many
31:38interesting ways we can do that, I think
31:40that's where I see the opportunity, versus
31:42where it becomes more of a force for
31:45replacement. I think there are genuinely
31:47cases where this is a really interesting
31:49setup, where we can augment what
31:52humans can do by helping
31:55them do things that they couldn't do in
31:56the past, but when the replacement does
31:59come in, it's worth asking: is this worth
32:02the cost of doing the replacement, and if
32:05it is, what are the good ways of
32:07implementing that idea? Technique-
32:09wise, I think there are techniques that are
32:11going to be introduced more from the
32:13model's perspective, making sure the
32:16model doesn't behave in certain ways
32:18that go against our social alignment
32:20or societal agreements. So I think those
32:23are some of the things that we do have
32:27going. Without going too much in depth, I
32:30think we can get this right, and
32:33my personal take is it's worth getting
32:35it right, because ultimately an industry
32:38or academic field will grow, and I think
32:41we can progress a lot, we can go
32:44forward for 5, 10 years without actually
32:46getting this right, but in the end it's
32:48going to come back to us at some point.
32:50To some extent I do think I'm seeing
32:52this a little bit with the social media
32:54environment, where I think there were a
32:55lot of things that we could have gotten
32:57right on day one, and I think we would
32:59have had a much easier time navigating
33:01today had we gotten those right. I
33:03think that's the opportunity that we
33:05have, given that we are pretty early at
33:07this stage, so I think it's worth a
33:09discussion, but again, I'm fairly
33:11optimistic that we will get this right.
33:14Listen, so I actually think that
33:16there's a very important discussion to
33:18have around ethics and morality
33:19around this, and it's a very important
33:21time, I do, and here's that discussion,
33:24which is: over the last 20 years we've
33:27built this machinery of regulation
33:30that's afraid of
33:33everything, and it's so mature, and it got
33:37crafted during the time of social media,
33:41and it's looking for something to kill,
33:43and for whatever reason it thinks
33:45that AI is the next bad thing, which
33:47makes absolutely no sense to me. And so I
33:50think it's all of our moral and ethical
33:52obligation to protect and free the
33:54AIs to be what they want to be, and
33:57that really is it, so don't focus in,
34:00focus out. Because, listen, I've
34:03worked in tech for quite a while, I've
34:05actually worked for the DoD and weapons
34:08programs, and I've never seen so much
34:13sensitivity to a new technology that's
34:15potentially beneficial as I've seen
34:16now, which I think could end it before it
34:19even begins. And so I know the question,
34:23the heart of the question, is that we
34:24should regulate, you know, AI and this and
34:26that, and I think it's the actual
34:27opposite: I think we should regulate the
34:28regulators and let it be what it wants
34:30to be. And I actually have to leave.
34:34So, all right, here is where we switch to
34:37a short Q&A with the audience. Martin
34:39unfortunately had to leave, but here are
34:41a few highlights with Joon. How can
34:43participants in AI Town collaborate on
34:47tasks? There are two strands of work that
34:51I'm seeing in the agent space. I mean,
34:53you can crosscut it different
34:55ways, but one way I'm seeing this is: one
34:58set of agents are trying to tackle what
35:00I call hard-edge problem spaces. Those are
35:04the problem spaces where there's a
35:05concrete answer, there are yes-or-no right
35:08answers. One good example here is
35:10classification: if you're trying to do
35:12text classification, obviously there's a
35:14right or wrong answer, depending on who
35:15you ask. Another instance here literally
35:18is just asking your agent to buy pizza,
35:21right? There's like, did you buy pizza, did
35:24it come to you or not? There's a
35:26very clear way to answer this. The other is
35:29the problem space
35:31that has soft edges, where it's kind of like
35:34drawing a portrait. I mean, to some extent,
35:36what AI Town, Smallville, or these kinds
35:39of projects are trying to do is to
35:42create a simulation that feels human, but
35:45as I mentioned, this idea of
35:47believability is really hard to define,
35:49right? So to me it feels a lot more like
35:53we're trying to draw a portrait or
35:54character of ourselves, and the promise
35:57is not to be perfect, but the promise is
35:59to be useful enough, clean enough, that
36:03it's beneficial to the stakeholders.
36:06Right, my bet, and it's a bit of a hot take,
36:09my bet is that in the early days of agent
36:12development, I think a lot of the
36:14progress is going to be made first
36:17in the soft-edge problem spaces,
36:20because for hard-edge problem spaces I
36:22think the intuition is a little bit
36:23flipped: it actually feels easier to us,
36:25for humans, right? Creating
36:28the Matrix sounds hard, but ordering
36:31pizza sounds really easy. But for
36:34agents, and from the user's sort of cost-
36:37benefit analysis, I think that intuition
36:39is the other way, where users will accept an
36:41imperfect simulation if it's for fun or
36:44if it's to gain insight, in the case of
36:47soft-edge problems, but users would not accept,
36:50I would not accept, my agent ordering me
36:52pineapple pizza, no matter how much I like
36:54pizza. And similarly, in many of these
36:57contexts there's going to be genuine
36:59disagreement about what is the right
37:00option, too, and oftentimes agents making
37:03mistakes in these contexts are fairly high-
37:06stakes, and even if it doesn't seem like
37:07high stakes, it's going to be painful
37:09enough for the users to fix that it's
37:12going to fail the cost-benefit analysis.
37:14I think down the line we get this right,
37:17but on day one, like in the next few
37:19years, to me it feels more
37:21natural that we go into the soft spaces
37:24first. So, going back to, I guess that was
37:26a long way of saying, I think
37:29AutoGPT and the like, if you look at
37:32their architecture, they sort of all
37:33share a similar insight or philosophy,
37:36and I think those are really interesting
37:37projects that could pan out in
37:39the future. They might need a little
37:42bit more work, especially with the
37:45users, to see where the value might be.
37:49projects. How big of an impact do you
37:51feel that a much larger context size
37:53will have on the agent
37:55model? Actually, the largest context that
37:58I've seen in research is 1
38:00million tokens, so 1 million tokens, that's
38:03going to be about 4 million
38:05characters, that's well over a book,
38:07right? Here's my perspective on this: I
38:10think increasing the
38:12context limitation is
38:14interesting, and it's going to have its
38:16own set of really unique applications if
38:18we can basically make the context limitation
38:21disappear, right? So I think there's
38:22really a lot of interesting
38:24things that you can do with that. Now, for the
38:26agent space, I'm not entirely sure that the
38:31problem or the bottleneck that we have
38:32today is actually the context
38:34limitation, and I think we can sort of
38:37look back to how humans behave and what
38:40makes us effective general
38:42agents to answer this. For instance, for
38:45me to make decisions, even something like
38:48what I'm going to eat for breakfast, I
38:50don't need to bring up my entire 29
38:53years or so of life experience to make
38:55that one decision. I just need to
38:57selectively choose certain sets of
38:59information that seem the most relevant,
39:01like what did I eat the day before,
39:03what do I generally eat, and those kinds
39:05of things. And I think the reason
39:07why we do that, in part, is actually
39:09because it's much more
39:10efficient, computationally too, so
39:13that we don't have to. You can increase
39:15the context limitation, but it's
39:16expensive to run, and especially if
39:19you're familiar with
39:21prompt engineering and so forth, a larger
39:24context window does confuse models more,
39:27right? So some of my colleagues are
39:30actually doing more rigorous studies
39:33where you can have a really long
39:35prompt, but the model really focuses on the
39:37first few lines and the last few lines,
39:40and for whatever comes in between, its
39:42attention drops significantly, right? So
39:44we can increase the context limitation,
39:46but it's not going to fix that problem,
39:48the problem of the effectiveness of the
39:50prompt and the efficiency of it. And we
39:53humans have to make a lot of decisions
39:55at every single moment, so if you have to
39:57reason about your entire life every
39:58time you do that, it doesn't seem like the
40:00right way to go about it. So I think
40:03the better approach, my bet, therefore, is
40:07going to be based on
40:09retrieval: have some external memory,
40:12retrieve certain information that seems
40:14the most relevant, and just use that. And
40:16that retrieved memory should be
40:18explicitly very concise, something
40:21that you can easily fit into even the
40:22models that we have today. That's my bet.
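As an editorial aside, the retrieval idea described here can be sketched minimally. This is a hypothetical illustration, not the generative agents paper's retrieval function (which weighs recency, importance, and relevance): here relevance is crude word overlap, where a real system would use embedding similarity.

```python
# Minimal sketch of retrieval-based memory: instead of stuffing an agent's
# whole history into the prompt, score each stored memory for relevance to
# the current query and keep only the top few.

def relevance(memory: str, query: str) -> float:
    # Crude relevance score: fraction of query words found in the memory.
    mem_words = set(memory.lower().split())
    query_words = set(query.lower().split())
    return len(mem_words & query_words) / (len(query_words) or 1)

def retrieve(memories: list[str], query: str, k: int = 2) -> list[str]:
    """Return the k memories most relevant to the query."""
    return sorted(memories, key=lambda m: relevance(m, query), reverse=True)[:k]

memories = [
    "I ate oatmeal for breakfast yesterday",
    "I moved to a new apartment five years ago",
    "I usually eat eggs or oatmeal for breakfast",
    "I went hiking last summer",
]
context = retrieve(memories, "what should I eat for breakfast today")
# Only the breakfast-related memories make it into the prompt, keeping the
# context concise no matter how large the memory store grows.
print(context)
```

The point of the design is exactly what's described above: the retrieved context stays small and fixed in size, so it fits comfortably into today's context windows regardless of how long the agent's history gets.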
40:25Thank you so much for listening to
40:27the a16z podcast. What we're trying to do
40:30here is provide an informed, clear-eyed, but
40:33also optimistic take on technology and
40:36its future, and we're trying to do that
40:38by featuring some of the most inspiring
40:40people and the things that they're
40:42building. So if that is interesting to
40:44you and you'd like to join us on this
40:46journey, go ahead and click subscribe, and
40:48make sure to let us know in the comments
40:50below what you'd like to see us cover
40:52next. Thank you so much for listening, and
40:54we'll see you next time.