Thanks, Percy. Great to have you.

To start, can you tell us a little bit about how you got into the machine learning research field, and about your personal background?

Yeah, so I've been in the field of machine learning and natural language processing for over 20 years. I started getting into it as an undergrad at MIT. I liked theory, and I had a fascination with languages. I was fascinated by how humans could be exposed to just strings of text, or speech, and somehow acquire a very sophisticated understanding of the world, and also of syntax, and learn all of that in a fairly unsupervised way. My dream was to get computers to do the same. So I went to grad school, and after that started at Stanford, and ever since I've been in pursuit of developing systems that could really, truly understand natural language.
And of course, in the last four years, this once-upon-a-time dream has really taken off, maybe not in a way I would necessarily have expected, with the coming of large language models such as GPT-3. It's truly astonishing how much of the structure of language and the world these models can capture. In some ways it harkens back to when I first started in NLP: I was training language models then too, but of a very different type, based on hidden Markov models. There the goal was to discover hidden structure in text, and I was very excited that the model could tease apart which words were, say, city names versus days of the week. But now it's on a completely different level.

Since
you've worked on multiple generations of NLP at this point, pushing the forefront of semantic parsing: was there a moment at which you decided you were going to focus on foundation models and large language models?

Yeah, there was a very decisive moment, and that moment was when GPT-3 came out, in the middle of the pandemic.
It wasn't so much the capabilities of the model that shocked me; it was the way the model was trained, which was basically taking a massive amount of text and asking the model to predict the next word, over and over again, billions of times. Just that simple objective, a very simple principle.
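That objective can be sketched in a few lines: the average negative log-likelihood of each next token under the model's predicted distribution. This is a toy illustration only; the hand-written bigram table standing in for the "model" here is hypothetical, not how GPT-3 works internally.

```python
import math

def next_token_loss(model_probs, tokens):
    """Average negative log-likelihood of each next token,
    given the model's predicted distribution at every step."""
    total = 0.0
    for t in range(len(tokens) - 1):
        context = tuple(tokens[: t + 1])
        # Probability the model assigns to the actual next token.
        p = model_probs(context).get(tokens[t + 1], 1e-12)  # floor avoids log(0)
        total += -math.log(p)
    return total / (len(tokens) - 1)

# Toy stand-in for a real network: a hand-written bigram table (hypothetical).
BIGRAMS = {"the": {"cat": 0.5, "dog": 0.5}, "cat": {"sat": 1.0}}

def toy_model(context):
    return BIGRAMS.get(context[-1], {})

loss = next_token_loss(toy_model, ["the", "cat", "sat"])
```

Training a real model means adjusting its parameters to push this loss down over billions of tokens.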
What arose from it was not only a model that could generate fluent text, but also a model that could do in-context learning, which means you can prompt the language model with instructions, for example "summarize this document," give it some examples, and have the model, on the fly, in context, figure out what the task is. This was a paradigm shift, in my opinion, because it changed the way we conceptualize machine learning and NLP systems: from bespoke systems, where one is trained to do question answering and another is trained to do something else, to a single substrate where you can ask the model to do various things. The idea of a task, which is so central to AI, begins to dissolve.
And that's the reason that later, in 2021, we founded the Center for Research on Foundation Models. We coined the term "foundation models" because we thought something was happening in the world whose significance "large language models" didn't really capture: it was not just about language but about images and multimodality, a more general phenomenon. So we coined the term, the center started, and it's been a kind of roller-coaster ride ever since.

We're going to be talking today about both your experiences in research in academia, and then, separately, about Together, a company you're involved with now. Can you tell us a little more about what the center does and what you're focused on?
Yes. The Center for Research on Foundation Models started two years ago under the Human-Centered AI Institute at Stanford, and its main mission, I would say, is to increase the transparency and accessibility of foundation models. Foundation models are becoming more and more ubiquitous, but at the same time, one thing we have noticed is a lack of transparency and accessibility. If you think about the last decade of deep learning, it has profited a lot from a culture of openness: tools like PyTorch and TensorFlow, datasets that are open, people publishing their research openly. That has led to a lot of community and progress, not just in academia but also in industry, with startups and hobbyists and whoever else getting involved. What we're seeing now is a retreat from that open culture: models are accessible only via APIs, we don't really know the secret sauce behind them, and access is limited.

What's your diagnosis of why that's happening?
I think it's very natural, because these models take a lot of capital to train, they can generate a lot of value, and they're a competitive advantage, so the incentives are to keep them under control. There's also another factor, which is safety. These models are extremely powerful. Maybe the models right now would be okay out in the open, but in the future they could be extremely capable, and making them available on an anything-goes basis is something we might have to think about a little more carefully.
How do you think all this evolves? If you look at the history of ML or NLP or AI, we've had waves of innovation in academia and waves of innovation and implementation in industry. In some cases both happened simultaneously, but it feels a bit like it has ping-ponged over time. Now that industry is becoming more closed about some of these models, publishing less and being less open, how do you view the roles of academia and industry diverging, if at all? Do you think each type of institution will tackle different types of research, or that there will be overlap? I'm curious how you view all that evolving.
I mean, I think industry and academia have very distinctive and important functions; otherwise, as I tell my students, we should be working on things that lean on academia's competitive advantage. Historically this has meant different things. Before ML was that big, a lot of academic research was really about developing the tools to make these models work at all. I remember working on systems and building ML models back in grad school, and basically nothing was working: computer vision wasn't working, question answering wasn't working. The goal of academia then was to make things work, and a lot of advances born out of academia influenced other ideas, which influenced other ideas, before it all started clicking. Now we're seeing the fruits of both academic and industry research fueling the industry drive you see today.
And today, I think the dynamic is quite different, because academia's job is no longer just to get things to work; that can be done in other ways, with a lot of resources going into tech companies, where if you have data and compute you can scale and blast through a lot of barriers. I think a lot of the role of academia now is understanding, because for all their impressive feats, we just don't understand how these models work or what the principles are. How do the training data and the model architecture affect different behaviors? What is the best way to weight data? What is the right training objective? Many of these questions benefit from more rigorous analysis. The other piece, which is a different type of understanding, is understanding social impact, and this goes back to the question about what the center does.
It's a center with over 30 faculty across 10 different departments at Stanford, so it's quite interdisciplinary. We're looking at foundation models not just from the technical perspective of how you get these models to work, but also thinking about their economic impact, and the challenges when it comes to copyright and legality; we're working on a paper that explores some of those questions. We're looking at questions of social bias, and thinking carefully about the impact these models have on issues like homogenization, where a single model might be making decisions for a user across all the different aspects of their life. There are also people at the center looking at the risks of disinformation, monitoring the extent to which these tools are persuasive, which they are becoming increasingly, and what the actual risks are when it comes to, say, foreign state actors leveraging this technology. And there are people at the center in medicine who are exploring ways of deploying foundation models in actual clinical practice. That's very exciting, because it's something where we benefit from having a hospital attached to Stanford.
Do you think some of those deployments are close? If you go back to the 1970s, there was the MYCIN project here at Stanford, an expert system that outperformed Stanford Medical School staff at predicting what infectious disease somebody had. That was almost 50 years ago, and it never really got implemented in the real world. One of my concerns about the impact of some of these things is whether there are industries that are resistant to adoption or resistant to change, so it's exciting to hear that Stanford is actually starting to look at how to integrate these things into real clinical care. Do you view those things as very far out on the healthcare side, or as nearer? I know that isn't the main topic we're going to cover, but I'm a little curious given how close you are to all this.

I think there are a bunch of different issues that need to be resolved. For example, foundation models are trained on a lot of data: how do you deal with privacy? How do you deal with robustness? Once you're talking about the healthcare space especially, there are cases where we know these models can still hallucinate facts, and sound very confident in doing so.
But you've also taken the point of view that we should expect superhuman performance from these models, that holding them to the standard of a human doctor is actually insufficient as well, right?

Yeah, I think that's a great point. For ages, human-level performance has been the target for AI, a north star that has fueled many dreams and efforts over the decades. But I think we're getting to a point where, along many axes, it's superhuman, or should be superhuman, and we should define a more objective measure of what we actually want. We want something that's very reliable and grounded. I often want more statistical evidence when I speak to doctors, and sometimes fail to get it; I'd want something a lot more principled and rational. This is more of a general statement about how we should think about technology: not just chasing after mimicking a human, because we already have a lot of humans.
That's an interesting point, because if you're pushing a lot of metrics around what actually works from an adoption perspective, that's something certain aspects of healthcare do extremely well and certain areas are still deficient in. It will be interesting to see how you have to change certain aspects of culture in order to measure, when you adopt a new technology, its impact in that specific area. It's really fascinating to watch all this evolve right now. Now, you've done extensive research on natural language processing and computational semantics. Can you explain what those terms mean and how they're relevant?
So computational semantics is the process where you take language, text, and compute, quote-unquote, "meaning" from it. "Meaning" is something I'm not going to attempt to define here; there's a huge literature in linguistics and philosophy about what meaning is. I would say that a lot of my research in the past, maybe five to ten years ago, adopted the view that language is like a programming language: you can give orders, you can instruct, you can do things with language. It was therefore natural to model natural language as a formal language, so a lot of semantic parsing is about mapping natural language into a formal space that machines can execute. One concrete application I worked on for a while is mapping natural language questions into, essentially, SQL queries, which obviously has many different applications as well.
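As a toy illustration of that kind of mapping, here is a hypothetical pattern-based parser. Real semantic parsers learn the mapping from data rather than hard-coding templates, and the table and column names below are made up:

```python
import re

# Hypothetical templates; a learned semantic parser induces these from data.
PATTERNS = [
    (re.compile(r"how many (\w+) are in (\w+)", re.I),
     lambda m: f"SELECT COUNT(*) FROM {m.group(1)} WHERE city = '{m.group(2)}'"),
    (re.compile(r"list all (\w+)", re.I),
     lambda m: f"SELECT * FROM {m.group(1)}"),
]

def parse_to_sql(question):
    """Return a SQL query for the question, or None if no template matches."""
    for pattern, template in PATTERNS:
        m = pattern.search(question)
        if m:
            return template(m)
    return None

sql = parse_to_sql("How many users are in Boston?")
```

The payoff of mapping to a formal language is that the resulting query can be executed against a database, so the answer is computed rather than guessed.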
What was nice about this framework is that to really do this, you have to understand how the words contribute to the different parts of the SQL query, and then you get a program that you can execute and whose results you can deliver. That's as opposed to many question answering systems, where you ask a question, maybe retrieve some documents, and either retrieve the answer or make something up, rather than computing it rigorously. So that was the paradigm I was working in maybe five or ten years ago.
But the main problem is that the world isn't a database. A small part of the world is a database, but most of the world is unstructured. So I started thinking about question answering in general, and we developed the SQuAD question answering benchmark to fuel progress in open-domain question answering. That, along with many other datasets developed both at Stanford and elsewhere, I think led to the development of powerful language models like BERT, RoBERTa, and ELMo back around 2018 — many years ago, ancient history now — and then to the 2020-era generation of large foundation models.
There are cases where you want to map natural language into what people call tool use: if you ask a question that requires calculation, the model should just use a calculator rather than trying to, quote, do it in the Transformer's head. But there are also a lot of aspects of reasoning that are not quite formal — we do this all the time — and a lot of that happens natively in the language model.
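A minimal sketch of that kind of routing: if the query parses as pure arithmetic, hand it to an exact calculator tool; otherwise fall back to the language model. The `llm` stub here is a hypothetical placeholder, not a real model call.

```python
import ast
import operator

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def calculator(expr):
    """Exactly evaluate a basic arithmetic expression via the AST."""
    def ev(node):
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expr, mode="eval").body)

def answer(query, llm=lambda q: "<model answer>"):
    # Route arithmetic to the exact tool; everything else to the model (stub).
    try:
        return calculator(query)
    except (ValueError, SyntaxError):
        return llm(query)

result = answer("37 * 412 + 5")
```

The calculator is exact where the model would only approximate; the open question the discussion raises is how to decide, in general, what belongs in the tool and what belongs "in the head."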
And I think it's still an interesting question how to marry the two. I feel like they are still jammed together, and maybe that's natural, because there are certain things you can do in your head and certain things for which you invoke a tool. This has also been one of the classic debates in AI: neural versus symbolic. For a while symbolic AI was dominant; now neural AI has really taken off and become dominant. But some of those central problems — how do you do planning, how do you do reasoning — which were the focus of study in symbolic AI, are again really relevant, because we've moved past simple classification and entity extraction to more ambitious tasks.

What do you think are some of the more interesting research programs right now in that area?
It's interesting to remark on what's happening, because to a first approximation, larger models trained on the relevant data seem to do well on various benchmarks. But I think maybe there isn't enough attention to data efficiency, and to how quickly and how robustly you can get to these points, because it has been well documented that benchmarks can be gamed: doing well on a benchmark doesn't mean you've necessarily solved the problem. So I think one has to be a little bit careful.
So obviously scale and more data is one clear direction, but in terms of orthogonal directions, what are the methods?

Several things have to happen. One is that we have to be able to handle greater context lengths: if you think about a long reasoning chain, Transformers have a fixed context, and there are ways to extend it, but fundamentally it's a fixed model. Another is, let's say, advanced problem solving. If you want to solve a math problem or prove something, the language model thinks out loud in a chain of thought, generating token by token, and then it produces an answer. But we know that when humans solve a problem, they try different things and backtrack; it's much more flexible and iterative, and it can last a lot longer than a few passes. What architecture can handle that level of complexity is, I think, still an outstanding question.
Are there any aspects of foundation models or large language models that are emergent, that you didn't anticipate?

Going back to GPT-3, I think in-context learning is something that surprised many people, including me. You prompt a language model with an instruction and input-output examples — here's a sentence, it's classified positive; here's another sentence, negative — and the model is somehow able to latch on to these examples, figure out what you're trying to do, and solve the task. This is really intriguing because it's emergent: it wasn't hand-coded by the designers, who never said "I want to do in-context learning this way." Of course, you could have built it in, but the real magic is that you didn't have to, and yet it still does something. It's not completely reliable, but it can get better with better models and better data.
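Mechanically, in-context learning is driven entirely by how the prompt is laid out. A minimal sketch of assembling such a few-shot prompt (the `Input:`/`Label:` format here is illustrative, not a requirement of any particular model):

```python
def build_prompt(instruction, examples, query):
    """Assemble a few-shot prompt: an instruction, labeled examples,
    and finally the new input for the model to complete."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    # The model is asked to continue from the dangling "Label:".
    lines.append(f"Input: {query}")
    lines.append("Label:")
    return "\n".join(lines)

prompt = build_prompt(
    "Classify the sentiment of each sentence as positive or negative.",
    [("I loved this movie.", "positive"),
     ("The food was awful.", "negative")],
    "The concert was fantastic.",
)
```

Nothing in the model's weights changes; the task is specified entirely by the text it is asked to continue.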
Then there's chain of thought, which emerges at a certain scale.

Do you want to explain what that is?

The idea is that if I present a question to a language model, the model could just answer, and it will maybe get it right or wrong. But if you ask the language model to generate an explanation of how it would solve the problem — kind of thinking out loud — then it's much more likely to get the answer right. It's very natural that this would be the case for humans as well, but again, the chain-of-thought capability is something that emerges. The other thing I think is really wild, and maybe a general principle, is the ability to mix and match. You can ask the model to explain the quicksort algorithm in the style of Shakespeare, and it will actually construct something that is semantically pretty on point but also stylistically much better than what many people could come up with. That means it has learned distinct concepts of what Shakespeare is and what quicksort is, and is able to fuse them. If you think about creativity, this is an example of creative use. People sometimes say that language models just memorize, because they're so big and trained on so much text, but examples like this really indicate that these models are not just memorizing: that text simply doesn't exist anywhere, so the model has to have some creative juice and invent something.
And to riff on that a little: I think the creative aspects of these language models, with the potential for scientific discovery, or doing research, or pushing the boundaries beyond what humans can do, are really fascinating, because up until now, remember, the AI dream topped out at humans. Now we can actually go beyond, in many ways, and I think that unlocks a lot of possibilities.

Yeah, there are a lot of really interesting examples. You could argue that connecting concepts in any novel way is creativity, but I love the example of discovering new tactics in Go that humans hadn't found after thousands of years of play. Actually, I'll ask if you'll risk making a prediction: what emergent behaviors might appear at the next level of scale? What capabilities might emerge that we wouldn't have predicted, the way we wouldn't have predicted chain of thought or in-context learning?
I can give you an example of something I think is emerging, and an example of a hope, though I don't know that I would call either a prediction. What we're seeing today is the ability to instruct a model, using natural language, to do certain things. You see a lot of this online with ChatGPT and Bing Chat, and in some of Anthropic's work as well: you can instruct a model to be succinct, to generate three paragraphs in the style of so-and-so; you can lay out these guidelines and have the model actually follow them. This instruction-following ability is getting extremely good. Now, how much of it is emergent is hard to tell, because many of these models are not just language models trained to predict the next word; there's a lot of secret sauce under the hood. So if you define emergence as "not intended by the designers," I don't know how much of it is emergent, but at least it's a capability that is clearly improving.

The hope concerns hallucination. Language models currently make stuff up, and this is clearly a problem, in some ways a very difficult one to crack. The hope is that as models get better, some of this will actually go away. I don't know if that will happen, but it would be extremely nice. The way I think about these models is that predicting the next word seems very simple, but to do it well you have to really internalize a lot of what is going on: the context, the previous words, the syntax, who's saying them. All of that information and context has to get compressed, and that allows you to predict the next word.
And if you're able to do this extremely well, then you have a model of what's happening in the world — at least the world as captured in text. While the notion of truth might be ambiguous in many cases, I think the model can get an idea of which parts of the internet are reliable and which are not, and of entities, dates, locations, and activities, and that will maybe become more salient in the model. If you think of a language model that's only trained to predict the next word, and you ask it about, say, Elad, of course it's going to mix something up without further context. But if it has a better understanding of what's happening, and of course more context, then maybe it can use that context to recognize: okay, I don't know this; maybe I should ask rather than make up words.

So scale is basically increasing the statistical accuracy of the prediction of the next word, because you have more context and more data by which to infer what's coming, and therefore it will reduce hallucinations, because you're increasing accuracy?
Yeah, I think there's pre-training, which is predicting the next word and developing a world model, so to speak. With those capabilities you still have to tell the model not to hallucinate, but it will be much easier to control the model if it has a notion of what hallucination even is.
I was talking to somebody who was close to the development of the Transformer model, and his claim was that one of the reasons it has done so well is, to your point, scale: eventually you hit enough scale that you see it clearly has these really interesting emergent properties, so you keep scaling it up and growing it, and it becomes a self-reinforcing loop to keep using these types of models. His claim was that this sort of scale is expensive, so there may be other architectures or approaches that we've just never scaled up sufficiently to see whether they have the same emergent properties, or characteristics that might be superior. How do you think about that — going down the path of the Transformer versus other architectures that may be really interesting but neglected because we just haven't thrown enough compute at them?

Yeah, I really hope that in ten years we won't still be using the Transformer, because — I mean, it's a very good architecture, people have tried to improve it, and it's sort of good enough for people to press ahead with, but scientifically there's no reason to believe this is the one.
And there have been some efforts. One of my colleagues, Chris Ré, and his students have developed other architectures that are, at smaller scales, actually competitive with Transformers, and that don't require the central operation of attention. I would love to see much more research exploring alternatives to Transformers. This is again something academia is very well suited to do, because it involves challenging the status quo: you're not just trying to get something to work and get it out there, you're trying to reflect on the principles — what can we learn from Transformers, what is the architecture trying to do, and how can we incorporate those lessons in a much more principled way.
At some level it's still going to be about compute, right? Scaling laws for LSTMs show that if you were able to scale them, maybe they would work pretty well too, but the amount of compute required is many times more, and given a fixed compute budget — we're always in a compute-constrained environment — the question is whether an architecture is efficient enough to keep trying.

Yeah, you would not use an LSTM: the Transformer strictly dominates the LSTM given a fixed compute budget, so the question of "what if I could scale the LSTM" becomes a little bit irrelevant.

For the architectures where you do see Transformer-like performance, what sort of compute budget would you need to test them out? Is it the scale of a million dollars of compute, ten million, a hundred million? I know it changes with compute pricing; I'm just trying to get a rough sense of how expensive it is to try, and whether, if we extrapolate down a compute-cost curve three years from now, maybe it becomes tractable.
It really depends on the gaps. Right now in academia you can train one-billion-parameter models; it's not cheap by academia standards, but you can do it. Here at CRFM we're training models at six or seven billion parameters, which lets us try out some ideas. But ultimately, because of emergent properties and the importance of scale, you do need to go farther out along the curve. At smaller scales you can only form a hypothesis — "this seems promising" — and you still have to go out and test whether it really pans out.
30:59and maybe this is a good segue to talk
31:02about to compute and the the uh together
31:07we found it together on the the premise
31:10that compute was is a central bottleneck
31:13in Foundation models
31:15on the other hand there's a lot of
31:18compute that's decentralized that's
31:20maybe underutilized or idle and if we
31:24could harness that compute and bring it to bear
31:29for both you know research and also
31:31commercial purposes then we could
31:34actually do a lot more there are
31:37some you know pretty hefty technical
31:40challenges around doing that because
31:42Foundation models are typically trained
31:45in very high-end data center environments
31:48where the interconnect between devices
31:50is extremely good whereas if you just
31:54grab your average desktop or home
31:57interconnect it's you know a
31:59hundred times or more you know slower
32:02but you know with uh you know Chris Ré
32:06and Ce Zhang and others they deserve
32:08sort of most of the credit for this
32:11um we've developed some techniques that
32:14allow you to leverage this weakly
32:16connected compute and actually get
32:20um you know pretty interesting training
32:22going so so hopefully with that type of
32:26infrastructure we can begin to unlock a
32:30bit more of compute both for academic
32:33research but also for you know other you
32:37know startups and so it's really cool so
32:38it sounds a little bit like earlier
32:40predecessors of this maybe things like
32:42Folding@home where people did protein
32:43folding collectively on their computers
32:46or SETI@home where there was a search
32:48through different astronomical data and
32:50now you can actually do this for
32:51training an AI system on your desktop
32:55or you know access compute that exists in
32:57data centers or in other places
32:59yeah so Folding@home is I think a
33:03great uh inspiration for a lot of this
33:05work at some point during the middle of
33:07the pandemic they actually had the
33:09world's largest supercomputer in terms
33:10of flop count because it was used to
33:15do molecular dynamics simulations for COVID-19
33:20um the main challenge with Foundation
33:22models is that there's a lot of big
33:24models and big data that needs to be
33:26shuffled around so the task
33:28decomposition is much much harder so
33:31that's why uh many of the technical
33:34things that we're doing around
33:37scheduling and compression enable
33:41us to overcome these hurdles
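The interview doesn't detail Together's actual techniques, but one standard trick for training over slow links, top-k gradient compression, can be sketched as follows (the function names and the 1%-of-entries ratio here are illustrative assumptions, not Together's implementation):

```python
import numpy as np

def topk_sparsify(grad, k):
    """Keep only the k largest-magnitude entries of a gradient tensor,
    returning (indices, values): far fewer bytes to ship over a slow
    home-network link than the full dense tensor."""
    flat = grad.ravel()
    idx = np.argpartition(np.abs(flat), -k)[-k:]  # indices of the k largest |g|
    return idx, flat[idx]

def topk_densify(idx, vals, shape):
    """Rebuild a (lossy) dense gradient on the receiving side."""
    flat = np.zeros(int(np.prod(shape)))
    flat[idx] = vals
    return flat.reshape(shape)

grad = np.random.randn(1000, 100)        # stand-in for a real gradient
idx, vals = topk_sparsify(grad, k=1000)  # transmit only 1% of the entries
approx = topk_densify(idx, vals, grad.shape)
```

Schemes like this are typically paired with error feedback (accumulating the dropped entries locally and adding them back on the next step), which is omitted here for brevity.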
33:46um and then there's a question of
33:47incentives so I think there's two
33:50aspects of what together is building one
33:52is so sort of what I would call a
33:54research computer which is for academic
33:57you know research purposes where
34:00um people can contribute compute
34:03um and in the process of
34:05contributing compute they're able to use
34:09um the sort of the decentralized
34:12cloud for doing training when they're
34:16not using it and when they are using it
34:19they can use much more of it so the
34:21hope is that it provides a much more
34:23efficient use of the compute because
34:26you're spreading it across a larger set
34:30and then you know on the commercial side
34:33the hope is that for the open models that
34:36are developed in the
34:41open source ecosystem the Together
34:45platform can allow people to fine-tune
34:49and adapt these models to various
34:51different uh use cases
34:54um one thing I think is noteworthy is
34:56that you know we think of foundation
34:59models today as you know maybe there's a
35:01few Foundation models that are you know
35:04very good and exist but I think in the
35:07future there's going to be many
35:09different ones for different kind of use
35:12cases as the space takes off many of
35:15them will be derived from maybe existing
35:18Foundation models but many of them will
35:20also be perhaps trained on from from
35:23scratch as well I think this is actually
35:25a pretty uncommon Viewpoint right now
35:27can you talk a little bit about like
35:28where you um or you know research
35:31efforts you're associated with choose to
35:33train models like and maybe via PubMed
35:37or whatever else you think is relevant
35:39here okay so Foundation models are a broad
35:43category and the sort
35:47of the core center is you know large
35:49language models that are trained on lots
35:51of you know internet data
35:54um we've trained a model here at CRFM
35:58um in collaboration with MosaicML
36:02called BioMedLM and it's not a huge
36:06model but it's trained on PubMed
36:07articles and it exhibits
36:10um you know pretty good you know
36:12performance on various benchmarks for a
36:16while uh you know we were able to be
36:19state of the art on the U.S. medical
36:21licensing exam you know Google did come
36:24up with a model that was I think 200
36:26times larger and they beat that
36:28model so you know scale does matter
36:30but I think there are many cases
36:32where for efficiency reasons maybe
36:35you do want a smaller model since cost
36:43matters could you detect scientific fraud using this I'm just
36:45wondering if effectively you could
36:46screen all the papers and see which ones
36:48appear to be off relative to the
36:50literature or reuse of images it's just
36:53some interesting things that you could
36:54potentially surface through the use
36:57of this corpus of information
36:58so I think uh you know stepping back
37:01I alluded to how these models can be
37:04misused you know for fraud spam
37:07disinformation but also plagiarism you
37:09know a lot of students are using
37:12ChatGPT to basically do their homework
37:14um and you know I think there are
37:17you know several things that one can do
37:20so I was excited I was actually
37:22thinking about it the other way can you use
37:23the model to detect fraud
37:25um given that you understand the Corpus
37:26of biomedical information you should be
37:29able to say well this is inconsistent or
37:31this is a result that is somehow
37:33duplicative or plagiarized or
37:36yeah so definitely I think you can
37:42um you can well I'm gonna try this
37:43tonight yeah I'm actually thinking about
37:45that it sounds really interesting well you
37:46can uh you can review papers I mean
37:49I think that one has to be a little
37:53careful um when uh you know doing these things
37:56um especially for more consequential
38:00decisions um but in principle you know if we think
38:03about these models as truly capturing
38:05enough knowledge about a field
38:09um at least they can flag certain things
38:12yeah I don't know if you know finding
38:14plagiarism is necessarily the
38:16ultimate
38:18application maybe there's some pseudo
38:19form of peer review that it helps with
38:21before you do open publication or I'm
38:23just brainstorming right but it just
38:24seems like a really interesting area
38:25that yeah I haven't heard a lot of
38:27discussion around so I was just curious
38:29um about it yeah I think there is a
38:31problem right now where there's just so
38:34many papers that are generated I mean
38:41um and for a researcher it's actually
38:45becoming hard to you know really
38:49distinguish the signal from the noise so
38:50having tools that could do literature
38:55review and really summarize and allow
38:58you to ask questions and search for
39:00things I think would be a really
39:02important part of um you know the
39:07research process you know I know Elicit is a company
39:09that builds these tools based on
39:11language models that can Aid in some of
39:15these processes yeah it seems like
39:17there's also work being done on the
39:18embedding side in similar ways to
39:21just you know have a mini corpus that
39:22you're synthesizing or looking over or
39:24interacting with so it seems like a
39:26really exciting area yeah I think you
39:29know one of my you know dreams now is if
39:31you could really have
39:33um a system that could
39:36really do research in the sense of
39:39reading it has already read the
39:42literature can it generate hypotheses
39:44can it generate interesting questions
39:46can it propose experiments can it write
39:48code can it actually run the
39:51experiments and use the results to
39:54revise its understanding of the
39:57world it's sort of like a scientist and
40:01I think you know obviously having a
40:04human in the loop um to you know
40:07guide it to say okay I think these are
40:09the right questions I think that would
40:11really accelerate the the pace of
40:13scientific progress how far away do you
40:15think we are from that is that something
40:16we can do today is that two years away
40:18is that five years away
40:20you know these um projection questions
40:22are extremely hard these days
40:27it can do some limited things
40:30um already in terms of
40:33doing literature research
40:38and I think we're at the level where it
40:41could probably generate things and you
40:43know I think it would still be a lot of
40:45you know human in the loop but it could
40:46probably do uh let's say you know
40:49a class project uh type of project
40:53um could it really do something
40:54completely like a breakthrough that
40:57seems maybe harder but on the other hand
41:00AlphaGo was able to discover
41:02completely kind of alien different
41:05strategies and with the right setup you have
41:09to set it up correctly I don't think you
41:11can just generate from a language model
41:12but if you set it up properly maybe
41:13these models can actually discover new things
41:17I remember reading a paper maybe it
41:21was like five years ago where a bunch of
41:23materials scientists used
41:25um just word2vec which is just word
41:27vectors from over 10 years ago and they
41:31were able to discover new
41:33um you know thermodynamic properties of
41:35you know materials and I imagine that
41:38today with much more powerful models
41:41you should be able to do you know a lot more
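The study being recalled here is likely Tshitoyan et al. (Nature, 2019), which ranked materials by the similarity of their word2vec vectors to the vector for "thermoelectric" and surfaced promising, previously untested candidates. The core ranking step is just cosine similarity; the 4-dimensional vectors below are made-up toy values (the real study used ~200-dimensional embeddings trained on millions of abstracts):

```python
import numpy as np

# Toy embeddings for illustration only; not real learned vectors
emb = {
    "thermoelectric": np.array([0.9, 0.1, 0.3, 0.0]),
    "Bi2Te3":         np.array([0.8, 0.2, 0.4, 0.1]),  # well-known thermoelectric
    "CuGaTe2":        np.array([0.7, 0.1, 0.5, 0.2]),  # candidate to be ranked
    "NaCl":           np.array([0.0, 0.9, 0.1, 0.8]),  # unrelated material
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank candidate materials by similarity to the property word
candidates = ["Bi2Te3", "CuGaTe2", "NaCl"]
ranked = sorted(candidates,
                key=lambda m: cosine(emb["thermoelectric"], emb[m]),
                reverse=True)
```

The striking part of the original result was that this ranking, computed from text alone, correlated with properties later confirmed experimentally.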
41:45yeah I mean we just did talk to um
41:47Daphne Koller who I'm sure you know very
41:49well about what insitro is doing and so
41:51you know some heavily AI-assisted
41:54version of data generation and you know
41:57better search and optimization like I
41:58think that's one example of that sort of
42:01effort yeah I think that aspect is
42:03really exciting yeah I want to talk about
42:06um some of the I think like most
42:08important or hopefully most important
42:10work that the center's done so far can
42:12you explain what HELM is and what the
42:14goal has been yeah so HELM stands for
42:17Holistic Evaluation of Language Models
42:20which is this project that happened over
42:23the last year and the goal is to
42:26evaluate language models so the trouble
42:30is that a language model is a very general
42:37thing it's like saying evaluate the
42:40um what does that even mean the
42:42language model takes text in and text
42:43out and one of the features of a
42:47language model is that it can be used
42:49for a myriad of different applications
42:55um and so what we did in that paper is
42:59to be as systematic and as rigorous
43:01as we could in laying out the different
43:03scenarios in which language models could
43:05be used and also measure aspects of
43:10these uses which include not just
43:13accuracy which a lot of benchmarks focus
43:16on but also issues of how robust it is
43:18how well it's calibrated meaning
43:21whether the model knows what it
43:26doesn't know whether the models are
43:29um you know fair according to you
43:32know some definition of fairness
43:34whether they're biased whether
43:36they spew out toxic content how
43:39efficient they are and then we go and we
43:42basically grab every language model
43:44that's prominent that we could access
43:46which includes open source models like
43:49OPT and BLOOM but also getting access to
43:53APIs from Cohere AI21 OpenAI and also
43:58Anthropic and you know Microsoft so
44:01overall there were 30 different models
44:0342 scenarios and seven metrics and we
44:07ran the same evaluations on all of
44:11them we've put all the results on
44:15the HELM website so that you could see
44:18the top level statistics and accuracies
44:21but also you can drill down into on this
44:25particular Benchmark what are the
44:26instances what are the predictions that
44:27these models are making all the way down
44:30to what prompts are you using for the
44:32language models so the idea here is that
44:36we're trying to provide transparency to
44:38the space right we know that these
44:40models are powerful they have some
44:45um and we're trying to lay that all out
44:49in a kind of a scientific uh manner so
44:53I'm pretty excited about this project
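The structure described above, 30 models by 42 scenarios by 7 metrics, is essentially a dense evaluation grid, which is what makes the results comparable cell by cell. A minimal sketch of that shape follows (the model, scenario, and metric names and the placeholder scorer are illustrative, not HELM's actual code):

```python
# Illustrative stand-ins; HELM's real run covered 30 models,
# 42 scenarios, and 7 metrics (where each metric applies)
MODELS    = ["model-a", "model-b"]
SCENARIOS = ["question-answering", "summarization", "toxicity"]
METRICS   = ["accuracy", "calibration", "robustness"]

def evaluate(model, scenario, metric):
    """Placeholder for prompting the model on a scenario's instances
    and scoring one metric; returns a dummy score here."""
    return 0.5

# One score per (model, scenario, metric) cell of the grid
results = {
    (m, s, k): evaluate(m, s, k)
    for m in MODELS
    for s in SCENARIOS
    for k in METRICS
}
```

The transparency point in the interview corresponds to keeping every cell inspectable, down to the individual prompts and predictions behind each score.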
44:54the challenging thing about this project
44:56is since we put out the paper maybe
44:58three months ago a bunch of different
45:01models have come out including ChatGPT
45:03LLaMA you know Cohere and AI21 have
45:07updated their models
45:09um GPT-4 might come out at some point
45:12um so what this project has evolved
45:15into is this dynamically updating benchmark where
45:19every two weeks we refresh it with new
45:24um models that are coming out as well as
45:27um scenarios because one thing we also
45:30realize which was made clear by
45:32ChatGPT is that the type of things
45:35that we ask of a language model is
45:37changing we don't ask it just to do
45:38question answering as they
45:40increase in
45:42capability they can now do a lot more
45:44they can you know write an email or
45:46um give you you know life advice on
45:50XYZ if you put in a scenario
45:53or write a you know an essay about something
45:58and I think what we need to do with the
46:01Benchmark is also add the scenarios that
46:04capture these capabilities as well as
46:07kind of new uh risks so we're definitely
46:11um benchmarking how persuasive these
46:14language models are which governs you
46:16know what are the risks that someone is
46:18going to be using them to manipulate people
46:21um and also how secure they are
46:23one thing I'm actually also worried
46:26about is given all the jailbreaking
46:28that is extremely common with these
46:30models where you can get the models to
46:32basically bypass safety controls
46:37if these models start interacting with
46:40the world and accepting external inputs
46:42now you can not only just sort of
46:46jailbreak your own model but you can
46:48jailbreak other people's models and get
46:49them to do various things and then so
46:52that could lead to sort of a cascade of
46:56failures um some of these are the concerns that
46:58we hope to also capture with the benchmark I
47:01should also mention we're also trying to
47:03look at multimodal models which I think
47:05is going to be pretty pertinent so lots
47:08to do a bunch of the things that you've
47:11described as uh sort of the role you see
47:14for the center or even like Academia in
47:17the age of foundation models broadly
47:19like they have more of an intersection
47:21with policy than traditionally like
47:23machine learning research like how do
47:25you think about that yeah actually
47:27I'm glad you asked that because we've
47:30been thinking about the social implications of these
47:33models and sort of
47:35not the models themselves which we focus
47:38a lot on talking about but the
47:40environment in which these models are developed
47:46I think it's interesting to think about
47:49there are a few players in the space
47:52um with different opinions about how the
47:56models should be built some are more
47:58closed some are more open
48:01um and there's also again this sort of
48:05lack of transparency where we have a
48:09model that's produced and it's aligned
48:14um apparently to human values but then
48:16once you start kind of questioning you
48:18can ask okay well you know
48:21which values which humans are we
48:23talking about who determines these
48:26values what legitimacy does that have
48:29um and what's the sort of accountability
48:31then you start noticing that well a lot
48:34of this is just kind of completely a
48:37black box so one thing that we've been
48:39working on at the center is developing
48:44um starting with transparency I think
48:46transparency is necessary but not
48:48sufficient you need some level of
48:50transparency to even have a conversation
48:52about any of the policy issues
48:56um so making sure that uh the public can
49:00understand how these models are built
49:05um what's at least some notion of like
49:08what the data is what are the
49:11instructions that are given to
49:14um to align the models
49:16um we're trying to advocate for
49:19greater you know transparency there
49:23um and I think this will be really
49:26important as these models really get
49:29deployed at scale and start impacting
49:33um you know our lives
49:34um you know the kind of analogy I
49:37like to think about is you know
49:38nutrition labels or any sort of
49:40specifications you see on electronic
49:42devices there's some sort of uh
49:45obligation I think that um you know
49:48producers of some products should have
49:50to make sure that their product is used
49:56safely you know with some bounds on it
49:59I I guess I'll ask two questions
50:01um one is if people wanted to
50:03participate in Together is there a
50:03client they can download and install or
50:05use or how can people help support the
50:07together efforts yeah so we are
50:11working on one that will be made available both from
50:15the perspective of joining the Together
50:17cloud so that you can contribute your
50:19compute but also we have an API
50:22that we're developing so that people can
50:26use the Together infrastructure to do
50:29inference and fine-tuning of models we are
50:33also training some open models so we
50:35have this um something called OpenChat
50:40that uh we're releasing soon and this is
50:43built on top of EleutherAI's GPT-NeoX
50:47model but um you know improved to include
50:51various different types of capabilities
50:54um it's still you should think about
50:56it as really a work in progress what
50:59we're trying to do is open it up so that
51:03people can play with it give feedback and have
51:05the community improve this
51:10um rather than us trying to produce some
51:13finished product and putting it out
51:14there this goes back to the point about
51:17you know the spirit of Open
51:20Source and involving the community to
51:22build these Foundation models together
51:25as opposed to someone doing it unilaterally
51:29while we're talking uh timelines and
51:31predictions that you don't uh quite feel
51:33comfortable making how do you think as a
51:36rigorous scientist about AGI
51:39I must say that my opinions about AGI
51:41have changed over time I think that for
51:49a long time it was you know perceived by most of
51:53the field as laughable yeah I will say
51:55that uh in the last 10 years I have been
51:59aware of you know there's a kind of a
52:02certain community of uh
52:04people who think about AGI and also
52:06existential risk and things like that and I've
52:10been in touch with people who think
52:13about these issues I think I see the world
52:15maybe differently I think perhaps um
52:19certainly these are powerful
52:21Technologies that could have extreme
52:23social consequences but there's a
52:25lot of more near-term issues I focused a
52:28lot on kind of robustness of ML systems
52:31um in the last you know five years but
52:34you know one thing I've learned about
52:37Foundation models because of their
52:39emergent qualities I've learned to be
52:43um uh open-minded I would say I was
52:46asking earlier about No Priors and
52:47where that comes from and I think
52:49it's a fitting way to think about
52:52um you know the world because I think
52:54even you know everyone including
52:57scientists often get sort of drawn into
53:00a particular world view and Paradigm
53:03and I think that you know the world
53:06is changing both on the technical side
53:10how we conceive of AI and you know maybe
53:15even humans at some level and I think we
53:18have to be open-minded to you know how
53:21that's going to evolve over the next few years
53:25awesome thanks for doing this
53:26conversation yeah thank you very much