00:00hello everyone here um it's a great
00:03honor for me to coordinate this event
00:06and host jennifer as our guest
00:09it's really exciting to have you after
00:11reading so many of your insightful papers
00:14so first please allow me to briefly
00:17introduce jennifer logg
00:19so jennifer is an assistant professor
00:23at georgetown university's mcdonough school of business
00:28her work on overconfidence received the
00:312019 early career award
00:33by the journal of experimental
00:36and counter to the well-established idea of
00:40algorithm aversion her paper on
00:43algorithm appreciation
00:45suggests new insights on when people are
00:48willing to embrace algorithms
00:50and improve the quality of their decisions
00:53so today she will give us a talk about
00:58so let's welcome jennifer and i will
01:01give the floor to you now
01:03thanks so much for having me here i'm
01:04really looking forward to
01:06um discussing this work with you
01:09especially this group and i'm really
01:13curious to get your thoughts i'll share my work
01:16on algorithm appreciation and then focus
01:19on algorithmic hiring
01:21all with an eye toward developing
01:24a more overarching theoretical framework
01:27which i call theory of machine
01:28and i think this is an especially great
01:32group to get feedback on that theory that i'm
01:33starting to build so thank you so much
01:36i'm interested in how managers can
01:38assess themselves in the world more accurately
01:41and as ming chen mentioned the research
01:43i'll share today examines if people are
01:45willing to listen to algorithmic advice
01:47which is important because that can
01:48actually help them improve the accuracy
01:51in many decision contexts the second
01:54paper that i'll focus on today
01:56is how people want their own performance
01:58to actually be assessed when they're
01:59applying for a role on a team
02:01either by a person or an algorithm and
02:04over the course of conducting this
02:05research i've developed a
02:07theoretical framework which i'm really
02:09looking forward to getting your thoughts
02:10on theory of machine
02:12i probably don't have to tell it to this
02:14crowd but although historically
02:16managers and organizations have received
02:20really with the rise of big data more
02:21and more organizations are trying to
02:23leverage the accuracy of algorithmic
02:26to inform their managerial decisions
02:29so some use algorithms to hire promising
02:31applicants already this is rising
02:34and some use algorithms to predict
02:36performance for current employees
02:38and some to predict who's at risk for
02:40leaving in order to improve
02:42their retention so the issue here
02:45is that while many organizations are
02:48and investing in algorithms to sort
02:51through that data and produce
02:53this new source of advice many are
02:56really trying to understand how they can
02:57fully realize or maximize the benefits
03:00of algorithmic advice
03:01so specifically it's unclear what
03:04happens when this algorithmic advice
03:06actually gets in the hands of managers
03:09and other decision makers and the second
03:11paper i'll share today looks at
03:12what happens when algorithmic judgment
03:15is being assessed by stakeholders so not
03:17the decision maker but people who are being assessed
03:21so first we'll look at how do managers respond
03:25to algorithmic advice or how do they use it
03:30so we'll go through if people are
03:31willing to listen to algorithmic advice
03:33in my algorithm appreciation paper this
03:35is when people were making predictions
03:38so for uncertain events like brexit and
03:40other geopolitical events
03:42um and then additionally how do people
03:46um actually respond to being assessed
03:49themselves either by a person or an
03:52algorithm and this will all lead up to
03:54a theory of machine so i'll just plant
03:56the seed for you the idea of theory of
03:58this is um inspired by a long line of
04:02research in both philosophy and
04:05psychology called theory of mind so theory of mind
04:08looks at how we infer intentions and
04:11in the minds of other people and i'm
04:14looking to develop a theory of machine
04:16which describes lay people's theories of
04:18algorithmic judgment and human judgment
04:21and how those two compare at their core
04:26so the importance of understanding how
04:27people respond to algorithmic advice
04:30first it has potential to greatly
04:32improve decision making
04:34algorithms generally outperform the
04:36accuracy of human experts when the two
04:38are actually directly compared and
04:40there's a long line of literature on this
04:42and i saw that um some of your past
04:44speakers had touched on this as well
04:46which was exciting to see second
04:49algorithms can only improve human
04:50judgment if people are actually willing to listen
04:53um and so while the field of data
04:55analytics or the systematic computation
04:58of data most commonly using algorithms
05:00continues to evolve at a rapid rate
05:03um the important connection between
05:06producing insights and actually applying them
05:08is often overlooked especially when you
05:11start to talk to folks in industry
05:13the people on data analytics teams are
05:15just assuming that whatever output
05:17they're producing people are going to use
05:19100 percent of it but that's not always the case
05:22and deserves empirical testing i think
05:25as i mentioned the first paper i'll
05:27share tests if people are willing to even
05:28listen to algorithmic advice in the first place
05:33um to give a little bit of background
05:36i'm sure most of you know this work
05:37pretty well um but just so we're all on
05:40really the enormous strength of
05:41algorithms in algorithmic accuracy and
05:44judgment accuracy has prompted questions
05:47as to how comfortable people are relying
05:49on algorithmic advice
05:51so in his classic book on the accuracy
05:54of clinical versus statistical prediction which sat on my bookshelf in grad school
05:58um meehl looked at the accuracy of
06:02statistical prediction relative to human judgment and meehl
06:05may have made the first academic mention
06:08of psychological distrust of algorithms
06:10he described how when he was presenting
06:15um empirical results comparing the accuracy of algorithms
06:20to human judgment when he shared this
06:24with expert clinicians in the 50s
06:26these expert clinicians were actually
06:28really reluctant to believe that a
06:29simple mathematical calculation
06:31could outperform their own precious judgment
06:35this sentiment is echoed in other
06:37research on the accuracy of algorithms
06:40that didn't necessarily look at
06:42people's perceptions of algorithms and what he
06:46did is actually really important because
06:48it led to conventional wisdom that
06:49people just distrust algorithms
06:52and that idea survives to this day
06:55with limited empirical testing so i know
06:57that berkeley dietvorst
06:59has visited you folks he has some great work
07:02and nate fast as well and there's a lot
07:04more work bubbling up on this topic
07:07which is really thrilling but i think it's useful
07:10to remember that the idea of distrusting algorithms
07:13really came from anecdotes
07:15and so much so that even in kahneman's
07:18thinking fast and slow this idea of
07:21algorithm aversion is very strong
07:25um one thing that i want to mention
07:27about the first paper i'll talk about
07:29is that through our experimental design we addressed
07:32issues that have made some prior
07:35results a little bit more difficult to interpret
07:39some prior research that has looked at
07:41actual perceptions rather than just the
07:43accuracy of algorithmic
07:44versus human judgment um had looked at choice
07:48so we use a paradigm where we present
07:51people with identical advice
07:53that helps us control for a lot of
07:54factors including the accuracy of the advice
07:57and we merely manipulate the label of
07:59the source so rather than measuring
08:01choice which a lot of past work has done
08:03we measure how much people update toward the
08:05advice that they receive
08:07based on whether they think it comes
08:09from an algorithm or a person
08:12and so a little spoiler alert instead of
08:14finding aversion we find algorithm appreciation
08:18hence the title of our paper um
08:21so in this paper i'm just going to give
08:23a brief overview because most of the
08:25studies in this paper
08:26use this paradigm we use a methodology
08:29called the judge advisor system
08:31it's frequently used to study how much
08:33people incorporate the judgments
08:35normally of other human beings
08:37into their own judgment this paradigm
08:40enabled us to measure the percentage
08:42that people actually adjust
08:43towards the advice from their initial
08:47and i just want to pause here in case
08:48anyone has any questions about that so
08:51people make an initial numeric estimate then they
08:54receive advice that's also numeric
08:56and then they have the opportunity to
08:58make a final incentivized estimate
09:00so if they updated fully to the advice
09:02that would be a weight on advice of one
09:04if they completely discounted the advice
09:06from either the algorithm or other
09:08people that would be a weight on advice of zero
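the weight on advice measure described here reduces to a one line formula; a minimal sketch, with hypothetical estimates purely for illustration:

```python
# Sketch of the judge-advisor-system "weight on advice" (WOA) measure.
# The estimates below are hypothetical, not from the paper.

def weight_on_advice(initial, advice, final):
    """Fraction of the distance from the initial estimate to the advice
    that the final estimate covers: 1 = fully adopted the advice,
    0 = ignored the advice entirely."""
    if advice == initial:   # advice identical to one's own estimate:
        return None         # WOA is undefined in this case
    return (final - initial) / (advice - initial)

# A participant first guesses 100, receives advice of 140, and settles
# on 130: they moved 75% of the way toward the advice.
print(weight_on_advice(100, 140, 130))  # -> 0.75
```

a final estimate equal to the advice gives a weight of one, and sticking with the initial estimate gives a weight of zero, matching the description above.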
09:11so any questions about that
09:16i'll i'll keep chugging along unless i i
09:20so in our first studies we benchmarked
09:22utilization of algorithmic advice
09:24relative to utilization of human advice
09:27and it's useful to have this benchmark
09:28of how people respond
09:30to human advice um and that's because
09:33uh there's past work in the advice taking
09:35literature showing that people tend to really discount
09:38advice from other people on average
09:42um they discount advice so heavily that they actually
09:46lose accuracy by underreacting to advice when it comes from other
09:47people so we wanted to know
09:49well how controlling for the fact that
09:51we know that people tend to just
09:52discount advice in general
09:54how do they then respond to advice if it
09:56comes from a new source
09:58across our experiments we find a really robust effect
10:02um that people consistently give more
10:05weight to identical advice
10:07when it's labeled as coming from an
10:09algorithm than a person
10:10so we call this effect algorithm
10:12appreciation and we find it across a lot
10:14of different domains both
10:16objective and subjective domains so for
10:20as well as the most subjective domain we
10:23could think of which is people
10:25so will two people um
10:28described in the study get along
10:30romantically and that's the
10:32the judgment that participants were
10:33making there and regardless of the
10:36subjectivity of the domain here we
10:38consistently find that people
10:40um rely more on the same advice when
10:43they think it comes from the algorithm
10:44which is the blue bars here so
10:47it's a pretty robust finding across domains
10:50um and what we wanted to know was after this
10:55well algorithm aversion seems alive and well
10:58in people's thinking from the side of
11:00researchers so we actually in study two
11:03asked researchers to predict the results of
11:06our matchmakers study where people were
11:08predicting romantic attraction between
11:10two people that they'd read about
11:13some of you may even have taken our
11:14survey which we shared with the judgment
11:18and decision making conference email list so although our
11:20results from studies 1a through 1d
11:23may sound intuitive now that you know
11:25the results interestingly
11:27when we asked researchers they predicted
11:29the opposite results to what we found
11:31empirically with our participants
11:33they did predict aversion when we
11:36actually found appreciation
11:41so far our experiments intentionally
11:43controlled for excessive certainty
11:45in one's own knowledge in the studies
11:48that i've shown you so far
11:50we provided advice from external
11:52advisors regardless of the source
11:54in both the human and algorithm conditions
11:58why did we do this well it ensures that
12:00participants compare
12:01their own judgment with advice in both conditions
12:04so basically we're not confounding human advice
12:08with someone's own judgment because in
12:11both the human and algorithmic
12:13conditions in our past studies
12:15everyone was comparing their own judgment
12:18with an external advisor so in this study
12:23we basically wanted to know if thinking
12:25of yourself as a special snowflake moderates algorithm
12:27appreciation so let me explain a little bit
12:30we examine whether subjective confidence
12:32in your own judgment
12:34plays a role in the use of algorithmic advice
12:38here people in one condition were
12:42choosing an advisor before they ever saw the advice so this is a little
12:45different from the standard judge advisor system paradigm people are
12:47making a choice here
12:48before they saw the advice this allowed us
12:51to have one condition where people were
12:54choosing between advice that they might receive from a person
12:58or from an algorithm here we replicate
13:01algorithm appreciation where people are
13:03making a choice of their advisor
13:06prior to getting any information about
13:08what their advice might be
13:09so that's consistent with the results
13:11that i've shared with you so far
13:14and that 88 percent chose the algorithm which is
13:17statistically significantly different
13:19from the 50 percent you'd expect if they had merely averaged
13:24so our new condition here is people
13:26choosing between an algorithm and their
13:28own estimate and we thought
13:31that we would actually
13:34completely do away with algorithm appreciation but
13:37it was so strong in the study that it um
13:40moderates algorithm appreciation the
13:42role of the self and your own judgment
13:44to make a direct comparison
13:46to the algorithm um 66 percent
13:50is different from 50 percent but
13:53the key here is that they're still
13:55choosing the algorithm
13:57it's important to note though that 88 percent
14:00is statistically significantly greater than 66 percent so
14:04we did moderate algorithm appreciation
14:07but we couldn't turn it off fully
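a comparison like this, an observed choice share against the 50 percent indifference point, can be sketched as an exact one sided binomial test; the sample size below is hypothetical, not the study's actual n:

```python
import math

# Exact one-sided binomial test: probability of seeing a choice share
# at least this extreme if people were really choosing 50/50.
# The n = 100 sample size here is hypothetical, not from the talk.

def binom_p_at_least(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

# If 88 of 100 hypothetical participants chose the algorithm, a 50/50
# chance process is extremely unlikely to produce a count this high.
p_value = binom_p_at_least(88, 100)
print(p_value < 0.001)  # -> True
```

the same test applies to the 66 percent figure in the self versus algorithm condition, which is why both shares can be called significantly different from 50 percent.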
14:10and indeed when we asked participants how confident
14:13they were in these estimates before they
14:15received them people were more confident
14:17in their own estimate
14:18being correct than in that of another
14:22which is consistent with work on overconfidence
14:25here these results suggest that
14:27confidence really drove the propensity
14:29to choose the human estimate
14:31more when the human was the self rather
14:34than when the human was an external advisor
14:38this study i think is really important
14:41because it partly helps us reconcile our work with
14:44empirical work that had been coming out
14:48and finally i think my favorite study um
14:52in study four we collected data from a
14:54really unique example
14:56national security professionals who are
14:58arguably experts at forecasting
15:02so we compared this expert sample to a lay sample
15:05that made identical judgments this
15:08allowed us to see how objective expertise
15:10influences responses to algorithmic advice
15:14and keep in mind uh full disclosure
15:16obviously we're comparing two samples
15:18that probably differ on more things than just expertise
15:22but we thought it would be useful if
15:24we're going to be able to get data
15:25it took me two years to actually be
15:28able to get our survey
15:29circulated in national security circles so we knew
15:32that if we had the chance to get that data we'd want
15:36to then be able to have a benchmark so
15:37the lay sample serves as a nice comparison
15:40if you're interested in subjective
15:43expertise and how that influences
15:45responses to algorithmic advice i'm
15:47happy to talk more offline about that
15:49i ran about 12 studies where we manipulated
15:52subjective expertise without changing
15:56objective expertise or knowledge which
15:59ideally would be key
16:02just a brief summary of that we were
16:05able to manipulate felt expertise in
16:08but people still responded the same
16:11where they were relying on algorithmic advice
16:13that basically told me that the felt
16:17expertise that was so strong in our
16:20national security professionals
16:22which had been developed over the course
16:24of their careers some of them had been in their jobs
16:27it's just that it's difficult to
16:29replicate that strong
16:31sense of expertise online in an
16:36experiment that takes about five minutes but it's still a
16:38topic i'm really interested in
16:40um so we compared this expert sample
16:44and here we tested for algorithm
16:46appreciate appreciation
16:47in visual estimates business forecasts
16:50of how much tesla would sell
16:52and two geopolitical forecasts about
16:54cyber sanctions and brexit
16:57this allowed us to test for algorithm
16:58appreciation in domains of even
17:00extreme uncertainty remember how
17:02uncertain it was whether or not brexit
17:06would happen by a certain time
17:09and here although lay people showed
17:11algorithm appreciation
17:12as our past samples did experts actually
17:15discounted algorithmic advice
17:17more than lay people so when experts
17:20were receiving advice
17:21they just didn't listen to anyone and
17:25this ended up hurting their accuracy
17:29um so experts discount algorithmic
17:32advice more than lay people
17:34and this comes at a detriment to their
17:36accuracy so people who are
17:38um paid for a living to make geopolitical forecasts
17:41were actually making less accurate forecasts
17:45than our lay participants which
17:49i always think is very depressing for
17:50the world but fascinating for research
17:55in summary we did find some interesting
17:57moderators so algorithm appreciation is
17:59moderated by two key factors
18:02first when a decision maker is directly
18:04comparing his or her own knowledge
18:07algorithm appreciation weakens
18:10and when people have expertise in a domain
18:14our work suggests that they're just
18:16going to discount advice regardless of the source
18:19which importantly ends up decreasing their accuracy
18:23um and one other kind of tidbit that i
18:26always found interesting is we
18:28did find um a mechanism where
18:31we tested for numeracy in our
18:33participants in earlier studies and
18:35the more numerate people were the more
18:37they were willing to rely on algorithmic advice
18:40maybe a little bit less surprising than
18:42the other moderators but i think still
18:44useful to keep in mind so numeracy
18:48was an 11 item scale that basically
18:50measures kind of comfort
18:51with numbers on simple um
18:55math questions i saw in the chat that
18:57there might be a question but
18:59hopefully i answered it it looked like
19:00the last message that it was answered
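the numeracy result mentioned above, more numerate participants relying more on algorithmic advice, is the kind of pattern a simple pearson correlation captures; the scores below are made up purely for illustration:

```python
import math

# Illustrative check of a numeracy/reliance relationship: correlate a
# numeracy score (0-11 scale) with weight on advice. Every data point
# here is hypothetical, invented for the example.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

numeracy = [2, 4, 5, 7, 8, 10, 11]           # hypothetical scale scores
woa      = [0.3, 0.35, 0.5, 0.55, 0.6, 0.7, 0.8]  # hypothetical weights
print(round(pearson_r(numeracy, woa), 2))    # strongly positive
```

a positive coefficient like this would correspond to the pattern described in the talk, where more numerate people update more toward algorithmic advice.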
19:14i can't see the chat so if there are any
19:17questions please feel free to unmute
19:20i can monitor the chat if there's anything and
19:24bring it out to you great thanks
19:28um so a lot of you might be thinking of
19:31a lot of other moderators that might
19:34kind of flip this effect of algorithm appreciation
19:36over to aversion and i spent
19:40a lot of years of my dissertation trying
19:43to find a strong moderator but
19:44i was just kind of met with really
19:46robust effects for algorithm appreciation
19:49when people are making predictions about
19:50the world so so keep that in mind
19:52um the decision context is always predictions
19:56about what's going on in the world
19:59not necessarily related to themselves um
20:02one thing i thought might actually um
20:06be a moderator of algorithm appreciation is
20:09familiarity with algorithms themselves
20:12um so if people are just
20:14uh aware that they use algorithms all
20:16the time maybe they're just more likely
20:17to listen to advice from it
20:19uh relative to people who never really use them
20:22maybe an older generation who doesn't
20:24even know what netflix is right
20:26but we actually found so one proxy for
20:28familiarity with algorithms might be age
20:30we found that older people rely just as
20:32much on algorithms as younger people do
20:34which was quite surprising to me um and
20:36we did have a wide range of ages in our samples
20:40um and another thing i thought might matter
20:44is a difference between choosing between
20:48um in a within subjects design
20:50algorithmic versus human advice because
20:52most of the studies that i ran
20:54were between subjects so people were
20:56only responding to one source of advice
20:58um and there's a lot of great work from
21:01max bazerman and others
21:02where they show that there's a
21:04difference psychologically in how people evaluate
21:08uh choices when they're presented jointly
21:11versus separately and separate evaluation
21:15would map onto the studies i showed
21:17you with the between subjects design
21:19even when we looked at choice 75 percent
21:22of people still chose the algorithm over
21:24the person so again robust to that
21:27and finally i thought what if people feel free to discount
21:34advice in our studies because they're
21:37making the final estimate right
21:38um and i thought well what if they have
21:40to commit when they're choosing
21:42um the advice that they're receiving
21:46before they see the advice that they'll get
21:50the estimate provided by the advisor
21:53which they don't really have any information about
21:56is going to determine their final
21:58participant payment without them adjusting it
22:01and even when people were not only
22:08ceding the final estimate to the algorithm or the person
22:12uh when that was actually determining
22:13their final payment and they couldn't adjust away from
22:15or towards the advice 61 percent still chose the algorithm
22:20when the advisor would be in full
22:22control and determine the final
22:23incentivized outcome
22:24so robust to that as well one moderator
22:27that i thought was kind of interesting
22:29i ran a study where we changed uh the
22:31labels on the algorithm
22:33and the person across a number of cells
22:36and it seems that in scenario studies
22:40uh people prefer an expert person to an
22:44algorithm and i think that this
22:46jibes with some work that's been coming out
22:50and i think this is pretty interesting
22:52because algorithmic advice is often
22:55less expensive than expert human advice
22:58you can think of doctors and things like
23:00um and it's also just more readily
23:03accessible so you don't
23:04uh i think with like video chats with
23:08doctors this past year maybe that's kind
23:09of changing but generally algorithmic
23:12has the potential to completely displace
23:15advice that we normally would
23:17pay experts to give us um so i think
23:20that this is kind of a useful
23:22piece of evidence there
23:26so overall these results suggest that algorithm aversion
23:29is really not such a straightforward
23:31story as received wisdom would have us
23:34and it partly overturns what a lot of
23:36researchers have assumed we've known for a long time
23:40but importantly i think it opens the
23:43many questions about how expectations of
23:47algorithmic and human judgment at their core
23:50differ from each other
23:53so one aspect of this paper that's
23:55useful to keep in mind which i kind of flagged
23:57people are making predictions about the
24:00romantic attraction between other people
24:04but i started to wonder this was my
24:07dissertation work and and through that i
24:09was starting to think i
24:11really want to um collect new data
24:14on what happens when people are in
24:15domains where judgments are being made about them
24:18where an algorithm or a person is
24:21producing judgments about their own
24:23performance so something that's really personal
24:25rather than making judgments about the world
24:29so that's why i turn to the domain of
24:30algorithmic hiring it's being adopted by
24:33many organizations so amazon uses
24:36um algorithmic hiring in a very big way
24:41every time i talk to prasad setty at
24:44google he tells me that they really don't use
24:47algorithms that much
24:49in their hiring processes because the
24:51engineers just don't want to have that
24:53um part of me still can't believe that
24:56like the engineers don't want algorithms
24:58um to help with promotion decisions but
25:02um google has kind of not changed with
25:07so i wanted to test empirically how do
25:09stakeholders or the job applicants
25:12view algorithmic hiring compared to human hiring
25:20um so you can imagine here that if
25:23people don't want to be hired by an
25:26when the labor market becomes tighter
25:29um in the future they may even forego
25:31applying to that job so i think that
25:33this question has some interesting
25:36um applications for the real world that
25:40so in study one i'll show you this is
25:43new work that i'm really excited about
25:44so really looking forward to feedback on
25:47um in study one i'll show you applicants
25:49preference for how they want
25:50their application packet reviewed when
25:52they're applying for a role on the team
25:54and we created this pretty intricate paradigm
25:58where mturkers were um
26:02i won't go into all the details but mturkers had
26:05the opportunity to take a few tasks and
26:08then they knew that based on their performance
26:10on those initial tasks they could be selected
26:13um to be part of a team to um solve
26:18kind of a puzzle that everyone responded
26:22um it was basically a murder mystery and
26:25if there's one thing that keeps prolific
26:27uh participants attention they are very
26:30into true crime and they were very
26:31excited at the opportunity to become
26:33part of the team to potentially solve a
26:37um so spoiler here a whopping 70 percent of
26:40applicants in study one
26:41chose a person over an algorithm so this
26:43was really the first time that i
26:45finally found this algorithm aversion
26:47that everyone's kind of been talking about
26:49and that was exciting and i think part
26:51of that is because of the domain
26:54um of the judgment itself and that it's about
26:57the participants themselves rather than
26:59the world so this effect within hiring
27:03appears to be robust but we do find
27:05important factors that weaken
27:07and even reverse it so in study two
27:10we find that aspects of the application process and
27:13the applicant pool itself influence our effect
27:16so preference for the person weakens
27:18when competition is higher within the pool
27:22but when competition is lower applicants prefer the person
27:27as more competitors are vying for the
27:29role it seems that this preference weakens
27:32and in study three we shift to examine
27:35how characteristics of the hiring
27:37manager themselves actually influence
27:39applicants preferences
27:41again applicants do prefer a person over an algorithm
27:44when the hiring manager is a member of their in group
27:48but when the hiring manager is a member
27:50of the applicant's out group this preference flips
27:53and here when the hiring manager is an out
27:55group member people strongly prefer the algorithm
28:00and one thing the reason why i'm
28:02really excited about this project is that
28:04studies three and four dive into the mechanism
28:09is it that people think the out group
28:11member is going to be biased against them
28:14or as an out group member does the applicant think
28:18oh well that person is just not
28:19competent enough to see how good of an applicant i
28:22am so that's kind of the difference
28:25between systematic error and random error
28:30something that daniel kahneman has
28:32talked about in some of his writing
28:38oh so what were the in group and out
28:39group in the study i see in the chat i'll
28:41walk you through the studies in more detail in this paper
28:45than i did in the first paper the first
28:46paper i kind of wanted to give you an overview
28:49to see where i was starting from before diving in
28:52so we'll go over that in a few
28:54slides thanks for the good question
28:56in study four we manipulate the
28:58algorithms past performance
29:00basically i wanted to know like how good
29:03does an algorithm need to be
29:05before people want it to assess their application
29:09how accurate does an algorithm need to
29:10be before people actually prefer
29:13it to a hiring manager and so another
29:15spoiler here is that it took
29:1775 percent accuracy before the preference
29:21actually flips that's
29:22a pretty high bar for people to prefer the algorithm
29:26or at least i thought so i was pretty surprised
29:30um so in study one participants read an overview
29:35of the study they knew what they were
29:42doing the whole time they were in the study so
29:44they read that there were two tasks
29:45and that depending on their performance
29:47on this anagram quiz and this trivia quiz
29:50there was a possibility to work as part
29:52of a team with other um
29:54participants on task three then they
29:58and they read it was a murder mystery
30:00and they all kind of lost their minds
30:01they got pretty excited about this
30:03so it's just always nice to know when
30:06participants are really involved
30:08and we know we're getting good data here
30:11participants had the opportunity to win
30:13bonus pay from the murder mystery while
30:15coordinating with others
30:16under time pressure so you could
30:19imagine that most people were pretty
30:21incentivized to do as well as they could
30:23on the quizzes to create the most
30:27competitive application packet that they
30:28could and so they took it a little bit
30:32so then they read that in order to
30:35active participants would be assigned to
30:37roles with 75 percent of participants
30:39being assigned to the role of applicant
30:41to the team and 25 percent the role of
30:45uh normally i try to avoid deception at all costs
30:48i didn't use deception in any of the
30:51studies for algorithm appreciation but for this paper
30:54we did use deception because all
30:57participants were applicants
30:59and so the survey also stopped before the murder mystery
31:03started ideally in some of the next studies
31:06we'll actually have people take the
31:08murder mystery and we'll be able to
31:09measure people's performance as an extra
31:12dependent variable there
31:16so after people took the quiz everyone learned
31:20that they actually were an applicant
31:24and they read a little bit about what
31:26their application packet would look like
31:29so they read a page that said your application packet
31:33will include your quiz scores including
31:34your time spent so you can imagine
31:36if you answered a lot of the trivia
31:39questions correctly in a short amount of time
31:41you're feeling pretty good about your
31:42performance it also included the difficulty of correctly
31:45answered questions so you also know a
31:47little bit more about how competitive you are
31:50as well as a short essay so we wanted to
31:52have a mix of both objective and pretty subjective information
31:55so you can imagine if you have um an
31:58opportunity to write a short
32:00essay here people could use it
32:02potentially to persuade the reviewer
32:05um and then they made a choice
32:08of who they wanted to review their application
32:14so here they wrote their essay and then
32:16they made their choice but in other
32:18studies i'll show you
32:19we actually find the same results if
32:20they make their choice and then they write their essay
32:25and they chose how do you want your
32:27application reviewed
32:29from a person or an algorithm we wanted to
32:32um counterbalance that order so we changed
32:34the order in the other studies
32:36in case here maybe people thought well i just wrote this
32:39essay and that would maybe lead them to
32:42want to choose the person because they
32:44think the person would be more persuaded
32:45by the essay but we still find
32:47the same results even if the order's
32:50so here 70 percent of people chose to have the person
32:54over the algorithm assess their application
32:58next we wanted to know well is this preference
33:02affected by the competition of the
33:05applicant pool itself
33:07and you can imagine that um
33:11especially kind of with covid and the labor market
33:15changing in terms of job loss and things
33:18that could easily influence people's
33:20decision of whether or not to apply
33:22to one role or another depending on how
33:24they think they might be assessed to make
33:27a more efficient use of their time at a
33:29job they think they might be more likely
33:31to get so here we operationalize low
33:36as four spots available but there's five applicants
33:40um and in high competition 21 other applicants
33:44here we found a moderation by the level
33:48of competition so when competition was
33:51low people chose the person um i should note that the
33:54y-axis is the percentage of people choosing the person
33:59relative to the algorithm so more people
34:02chose the person in the low competition
34:04than the high competition condition
34:11and those are both significantly different from
34:1450 percent the indifference point
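the low versus high competition comparison can be sketched as a two proportion z test; the counts below are hypothetical, not the study's data:

```python
import math

# Two-proportion z-test comparing the share of applicants choosing the
# person under low vs high competition. The counts (75/100 vs 58/100)
# are hypothetical, invented only to illustrate the test.

def two_prop_z(k1, n1, k2, n2):
    """z statistic for H0: p1 == p2, using the pooled proportion."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# e.g. 75 of 100 choose the person under low competition versus
# 58 of 100 under high competition
z = two_prop_z(75, 100, 58, 100)
print(z > 1.96)  # -> True, significant at the conventional 5% level
```

a z above 1.96 corresponds to the kind of significant difference between the two competition conditions described here.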
34:18and then finally in study three um this
34:22will hopefully answer the question
34:24someone had before um
34:26we wanted to know if people might prefer
34:28an algorithm more when the hiring manager
34:30is a member of the out group
34:33so you can imagine um a few reasons why
34:36people might switch to the algorithm
34:38which i alluded to before
34:40so people report their beliefs on hot
34:43button political issues which included
34:45minimum wage gun ownership and abortion
34:49and then when
34:52they found out that they were in the
34:54role of the applicant
34:56they were told that the hiring manager either agreed
34:59or disagreed with them so if they agreed
35:02they were a member of the in group
35:03and if they disagreed they were a member of the out group
35:08um and here we found that when the
35:12hiring manager was a member of the in
35:14group again people really prefer to have
35:17that person assess their application
35:19but it flips and people start to prefer
35:21the algorithm when they find out that the person
35:24really disagrees with them on hot button issues
35:27so we ran a follow-up study to this to see
35:31if this is driven by systematic or random error
35:35expectations of systematic error
35:39meaning this person will just be biased
35:41against me and random error
35:43meaning that i expect their judgments
35:45to kind of be all over the place because
35:48they're not good at actually making this assessment
35:51the way that we did that was we either
35:55told people explicitly that the hiring manager
35:58a member of your out group will know
36:01that you are a member
36:03of the out group or we told them
36:06the hiring manager won't have
36:08information on whether or not
36:10you are an in group or out group member
36:14and so here we found that expecting the out group manager
36:17is incompetent drives this moderation
36:21rather than expecting them to be biased against you so
36:24basically we find that people
36:26want the algorithm even when
36:29the out group hiring manager doesn't know
36:32or have any idea about the group
36:33identification being different
36:36people just think that they're not good
36:37at making these judgments
36:39and then finally
36:42we wanted to know just how good or
36:44accurate really does the algorithm need to be
36:47before people prefer it
36:50so we randomly assigned people to be in
36:53a condition of no information
36:55about past success at putting together
36:58a successful team that actually did
37:00solve the murder mystery
37:01or conditions where the algorithm
37:06was successful at putting together such a team 60 percent
37:13of the time or higher and here we find that
37:17we replicate our effect of a preference for the person
37:22and that this weakens
37:25as the algorithm becomes more accurate
37:27but the flip point here
37:29is 75 percent the accuracy an algorithm needs to reach
37:33before people actually prefer it
37:42so in summary we find that 70 percent
37:45of applicants prefer to have a person
37:47review their application packet instead of an algorithm
37:50the preference for a person weakens when
37:52competition in the pool itself
37:54is higher so not even related to beliefs
37:57about the decision maker or
38:01potential decision makers from the
38:02organization and people prefer an algorithm
38:05when the hiring manager is a member of the out group
38:08and it's driven more by expectations of random error
38:11rather than systematic error which
38:14was pretty surprising to us
38:15and i'm hoping that we run
38:19a number of follow-up studies to kind of
38:21dive into that and tease that out
38:24and then finally an algorithm
38:25requires a pretty high benchmark of accuracy
38:29before applicants prefer it to a person
38:37it seems like when the domain is
38:38relevant to the self people prefer human judgment
38:41we also ran
38:43three studies where we asked people how do
38:45you want your teammates to be hired
38:48once we told them you're hired for this role
38:51now the choice is not how you're
38:54hired but how the rest of your teammates are hired
38:57and they still said the person which surprised me
38:59i thought we might flip the effect there
39:00so there might be a self other difference
39:03but there wasn't and then i have
39:05another project that's a lot more preliminary
39:08where we have i think maybe five
39:11studies where people
39:12are responding to feedback on their writing
39:16rather than an assessment that leads them to
39:19achieve a role on a team or not just
39:22feedback on their writing
39:23and there people say they want feedback
39:26on their writing from a person
39:28but when they actually get feedback it
39:30doesn't matter if it comes from an algorithm or a person
39:33everyone kind of updates to
39:35the feedback that they received so
39:38i would say that that evidence is preliminary
39:41it deserves a little bit more time to
39:43kind of sort through and understand
39:45but my takeaway just from our data so far
39:48on this judgment type of feedback
39:51is that people kind of say one thing but
39:53then when push comes to shove and
39:54they're actually updating their beliefs
39:56they do something different
39:57so i've started a new project to dive
40:00into that comparing how people respond
40:02to judgments of algorithmic and human sources
40:06based on if it's in the judge advisor system
40:09where they actually see the advice and can update
40:13as much or as little as they want to it
40:15versus if they're in a scenario domain
40:17and they're choosing between advisors
40:19and you can imagine in a scenario domain
40:22there's a psychologically
40:25rich mechanism there
40:29in a scenario you can imagine all the
40:31different types of ways that advice
40:34might basically differ between an
40:38algorithm and a person but in the judge advisor system
40:42you are seeing the source with numeric
40:44information in front of you so you're just responding
40:47to that advice but in the scenario
40:49condition your mind could
40:51come up with all these different ways
40:52that the advice might differ between the two sources
40:55and so there we found algorithm
40:57appreciation in judge advisor
40:59conditions but in the scenario
41:00conditions the data is noisy and people seem to shift
41:06a little bit we don't find
41:08a full inversion in the scenario which
41:10is what i thought we'd find
41:11we find kind of a little bit of
41:12a difference there
41:14so i think there's some interesting
41:16moderators to look at here and i'm curious to hear
41:19your thoughts as well if there's any
41:22questions here i'd love to take them
41:24and then i can go on to kind of another
41:27project i've been working on recently
41:29developing this theory of machine
41:33uh we actually have one question from the audience
41:36he asked about whether the
41:39accuracy levels were actually measured or
41:42made up information so i guess that's a
41:44question about deception
41:45yeah yeah so we did use deception there
41:54in terms of efficiency it just made more sense to
41:57change the label of how accurate it was
42:01i think it could be really fun to
42:02potentially run a field study where
42:05companies are testing algorithmic hiring
42:08and giving people
42:11the accuracy feedback in real time
42:16but this was just a way where we could
42:19test more conditions
42:25great question any others
42:29um another question from taha
42:32he or she is wondering how the wording
42:36of the introduction of the algorithm to participants mattered
42:39so how did you introduce algorithms
42:42great question i'm wondering if i can
42:46go to this slide without it messing up
42:52so we operationalized algorithm in
42:55algorithm appreciation in a few ways
43:02i might need to stop sharing my screen
43:18can you see the slide great
43:21um so in algorithm appreciation
43:25we tested a few different operationalizations
43:28of the term algorithm itself because we had the
43:32question does it matter how we're
43:34describing the algorithm
43:36i was most intrigued by testing how
43:39people responded to advice they thought
43:41was coming from a black box algorithm
43:43mostly that's just because that's how
43:46our algorithmic advice is normally presented
43:48we don't know the actual mathematics behind the
43:52netflix algorithm or you know pandora's
43:54algorithm or dating apps' algorithms
43:56i don't know if anyone's seen recently on netflix
44:00i think it
44:02happened in the last week when you scroll through
44:04movies that you want to view trailers for
44:08they actually have one big screen that
44:10comes up that's like a random choice
44:13and it says something like do
44:14you want our algorithm to choose for you
44:17which i thought was kind of funny
44:19but even there it's still a black box
44:22we don't really know the
44:25data that's being input into the
44:27algorithm we don't know how it's being processed
44:29um so our results in algorithm appreciation
44:32hold regardless of the type of algorithm
44:34that we presented so we had
44:36started off describing a simple algorithm
44:39we had used an average in algorithm
44:42appreciation we didn't use any deception
44:45we used between 300 and 400
44:48separate participants to create the
44:50advice so that allowed us to present advice
44:54that came from other people but we could
44:55also frame it as coming from an
44:57algorithm because an average is really
44:59one of the simplest algorithms
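To make that averaging operationalization concrete, here is a minimal sketch; the function name and the sample numbers are illustrative, not from the studies, which pooled estimates from 300 to 400 separate participants. The point is that the same mean can be labeled either as advice from other people or as the output of a (very simple) algorithm:

```python
from statistics import mean

def crowd_advice(estimates):
    """Average independent participant estimates into a single piece of advice.

    The same number can be framed as coming 'from other people' or
    'from an algorithm', since an average is one of the simplest algorithms.
    """
    if not estimates:
        raise ValueError("need at least one estimate")
    return mean(estimates)

# Illustrative estimates only (not data from the actual studies).
print(crowd_advice([52, 61, 48, 55, 59]))  # 55
```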
45:02um so there we found algorithm
45:04appreciation and then we went to a black
45:05box where we just changed the label
45:08in study four the national security study
45:12that's partly why it's one of my
45:13favorite studies because we just didn't
45:15give any information
45:17which allowed people to rely
45:20on their lay perceptions like what
45:23whatever definition they were bringing
45:24to the table and one question that
45:27normally comes up if you're asking this great question
45:31is well
45:34what do people think an algorithm is
45:36like do our participants know what an algorithm
45:38means and we asked in a number of our studies
45:42if people could define the term
45:44algorithm and then we had our research assistants
45:46code those responses to create categories
45:50and normally the responses fall into categories like
45:53it's some sort of math or formula it's
45:57some sort of rule based on logic or
46:00there was a kind of miscellaneous
46:04category which was people kind of
46:06mentioning computers
46:08and my takeaway from that is
46:09mathematicians and computer scientists
46:12wouldn't be upset to read those
46:15definitions people have a pretty good sense
46:18of what an algorithm means so if we give
46:19them a kind of black box
46:22operationalization it's not that they
46:24don't know what we're talking about
46:26great question thanks so much for asking
46:29taha are you satisfied with the answer or do you have
46:33follow-up questions absolutely thank you
46:36okay so i will move on to the next question
46:41so she wonders do you have any
46:43intuition or findings concerning
46:46why higher competition induces people
46:48to choose the algorithm more often
46:51yeah so we asked a lot of open-ended questions
46:56in these studies because
46:59it was kind of new territory for me to make
47:02predictions about the world and we just
47:05wanted to hear directly from participants
47:07either how they would rationalize it or
47:11explain their decision and
47:14they said some interesting things which
47:18i think could even potentially be a mechanism
47:23um so some people said
47:26things like there were
47:29time constraints and people knew that a
47:31decision would be made my fear was that
47:33if it's high competition people want the algorithm because
47:36they don't want to wait 21
47:39extra minutes for a person to go through the applications
47:42um but they said things like oh i don't
47:43want to do that to joe
47:45so that was a little bit
47:48surprising because it was kind of considerate
47:50towards the hiring manager they
47:52didn't want to give them that many
47:55application packets to review
47:58um and so i think that there
48:03people do have efficiency top of mind
48:06but there wasn't a straightforward
48:08explanation that made me
48:10think of an experiment that i could
48:12directly follow up with
48:14but if you have ideas i would love to
48:16hear them what kind of mechanism do you
48:17think is going on there
48:21so i didn't have something special in
48:23mind but i think the efficiency part is
48:25already very interesting that you tell us
48:27okay people might have an efficiency
48:29argument in there and they think okay it
48:31just takes time from the hiring manager
48:32and it's better for both if the
48:34algorithm does it so i think this is interesting
48:37i was just wondering whether you have
48:38any complementary findings but thanks a lot
48:41thanks yeah and if you think of any
48:45new mechanisms to test i would definitely
48:48be interested please send an email i'd be open to that
48:57alicia do you want to continue
49:00okay um i guess uh do you have any other
49:03slides you want to show
49:08when is are we over at 10 30 or 11
49:13uh i think we ended 11 but
49:16after that do you have like 20 or 30
49:19have a like a brief talk with definitely
49:23was trying to save some time so we could
49:25have kind of a more open-ended
49:28um yeah so maybe you first finish your
49:31slides and then we open the official q a
49:34okay great great great can you um see
49:37these slides the chapter
49:43throughout the course of
49:46both of these research projects
49:50i was kind of really just
49:53trying to figure out any moderators i could
49:57especially because in algorithm
49:58appreciation we never really flipped the effect
50:01to algorithm aversion in algorithmic
50:03hiring we were able to flip the effect
50:05and i think in that project
50:09we're definitely going to spend more
50:11time kind of digging into mechanisms but i
50:13think we're focusing on the in-group out-group moderation
50:16and the systematic versus random error distinction
50:23but throughout all of that i really wanted to
50:27create a framework to kind of fit not just
50:30the kind of evidence i
50:33have but evidence that berkeley dietvorst has
50:35from his work
50:38and adam waytz has some
50:42really lovely work that's definitely related
50:45and so i started thinking well we're
50:48really just looking at
50:50people's lay
50:53perceptions and expectations
50:55of what algorithmic judgment
50:59can produce in terms of accuracy
51:01compared to human judgment
51:03and as i was thinking through that i
51:05thought i had a different graphic here
51:07oh um so from the dissertation i kind of developed a
51:10theoretical framework to
51:14help kind of make my research more
51:16systematic but hopefully it could also be
51:18useful for other scholars who are
51:21interested in this area
51:22um so the idea for this framework is that
51:26the research related to this would
51:28document how people expect
51:30human and algorithmic judgment to differ
51:34in their input their process and their output and
51:37algorithm appreciation really focused on
51:39how are people responding to the output
51:41and i really just think that's the tip of the iceberg
51:43um with algorithmic hiring and people
51:45thinking about systematic versus random error
51:49i think that that does dive in a little deeper
51:52and um getting back to alicia's comment
51:56about what might be the mechanism
51:58between the high and low competition
52:00i actually think there if we turn to
52:02looking at people's expectations of
52:05the input to algorithmic and human judgment
52:08when there's a lot of data versus a
52:11small amount of data
52:12and how that influences the judgments
52:16that's where we'll really start to
52:17leverage people's expectations for what
52:21the algorithm and human judgment can do with the
52:24data that each can use as input
52:31one thing that i was kind of interested
52:35in was just kind of creating some
52:37predictions of what i thought
52:38could be going on here just from what i
52:41was learning from my own evidence
52:44in my research and so you can imagine
52:47these are basically predictions
52:49that i have for people's lay
52:51perceptions and expectations of
52:53algorithmic and human judgment like at
52:55their finest what they can get us
52:56so you can imagine that people expect that an
53:00algorithm utilizes data that is less
53:07abstract and more categorical and that
53:10algorithms can't utilize data that's abstract
53:14or intangible and i think nate in his
53:16paper had talked about a similar idea
53:19which i think very much relates to
53:21this but whenever i ask
53:22my undergrads well how would you want to
53:24be hired when you're on the job market
53:26they say i don't want an algorithm
53:28hiring me because it doesn't understand me
53:29i will translate this
53:32from what they say but basically they
53:33make an argument like
53:35the algorithm won't understand how
53:37special a snowflake i
53:38really am because i have a
53:42really wonderful personality and humor
53:43and all these things and
53:45maybe they list things that aren't even
53:46related to job performance
53:48right but i think people have this belief that
53:52there's certain input that an algorithm
53:56can't attend to as well as a human and i
53:58think that that's definitely a worthwhile
54:01question to test and so in terms of quantity
54:04i could imagine i alluded to it earlier
54:06that people expect algorithms
54:08to utilize larger
54:09amounts of data as input so that's a
54:11little bit separate from the efficiency argument and
54:14it would be kind of interesting to disentangle
54:17and so in terms of process for
54:19people's expectations of quality
54:23i'm predicting that people might
54:25think that algorithms process cues
54:26less holistically without taking broader information
54:30or even context into account and then in
54:32terms of quantity algorithms process
54:34fewer categories of cues so
54:37they might be able to
54:41take into account your scores on an
54:44anagram or trivia test
54:49a numeric score on an objective outcome
54:52and i would consider that maybe one category
54:55and when you talk to students who are in
54:57the midst of their recruiting process
54:59they talk about all these other types of
55:01categories of cues like being able to
55:03get along on a team that's a totally different
55:06category and so people might see algorithms as
55:09processing fewer categories of cues so
55:12they can focus on the objective criteria
55:14but there's other categories out there
55:16that they might not be able to consider
55:21with output people might expect that
55:23algorithms can't provide
55:24an explanation behind their judgments
55:27which ends up counting against them
55:30from an algorithm they might expect
55:34less relevant data for an individual so
55:36this idea of the special snowflake
55:39kind of a recommendation for the average
55:43people might have the idea that algorithms make
55:47recommendations for things that people like on average
55:50where i might think i have quirky taste
55:53in music but the algorithm can
55:57predict that everyone
55:58on average kind of likes taylor swift
56:02even if i think i have a unique taste and then
56:04finally people might also expect that
56:06algorithms produce less output
56:08at a time so a person can provide
56:10information and an explanation
56:12and kind of separate from what's actually
56:16out there in the real world right
56:17people might not expect an algorithm
56:20to have a conversation with them about
56:21say a medical diagnosis
56:23where maybe an algorithm could diagnose them
56:26but if they want to ask follow-up
56:27questions they might feel more
56:28comfortable with the person
56:30and so i've kind of taken this framework
56:34of theory of machine and tried to expand
56:36it even more actually so not just
56:37thinking about input process and output
56:39in this book chapter i wrote
56:42it's currently under review but
56:43i'm happy to share it if anyone is
56:45interested and if there might be
56:47at least one person who reads a book
56:49chapter that would be
56:50great i'd love to have a discussion
56:55i'm interested in kind of overlaying the input process
56:58output framework with whether
57:01the decision or judgment is in the
57:04context of making a prediction
57:06or is it an assessment so there you have assessment
57:10related to algorithmic hiring and
57:12feedback i think even draws in more with feedback
57:15you have different goals than you would
57:17have making a prediction right so with
57:19feedback you might also want motivation
57:22you might want to know that the feedback provides
57:25useful information that is actually
57:27actionable so that people can make improvements
57:30um and i think that if you can kind of build
57:33a matrix of input process output and
57:35people's expectations for these
57:36relative to whether they're in these different contexts
57:40um well i kind of came up with
57:43predictions that i thought might be
57:44interesting and wrote them up in this
57:46chapter and then the last thing i'll say is
57:50i'm hoping that this framework
57:52especially in an area where
57:55there's been more and more attention
57:56over i've seen the last
57:58five six years or so hopefully this can
58:03systematically bring different
58:06individual researchers' evidence into
58:10a more formal kind of matrix of how their findings
58:12might reconcile and fit together
58:15so that we do have an overarching theory that
58:19people are really kind of currently
58:21building so just lay people out in the
58:23world the more that we experience
58:25output from data analytics and
58:28recommendations from algorithms
58:33for any recommendation that can be made
58:37there's an algorithm being built to try to
58:39make that recommendation and i think
58:41the more in the real world we're coming
58:44into contact with
58:45this new source of advice we're actually
58:49in a really exciting time in human history where
58:53lay people are developing their theory of machine
58:57um and my kind of last plug for this
59:01i've been thinking a lot about some
59:03fun discussions i've had with computer scientists
59:07in the before times i would talk
59:10to c-suite individuals
59:13in exec ed and they would say oh
59:16could you tell me the five questions i
59:18should ask my data analytics team and i'd say
59:21i'm not at coca-cola i don't know
59:25the context to be able to tell you
59:27five general questions you should ask
59:30context does matter um and they
59:34wanted to know how they could
59:35communicate with their data analytics
59:37team and computer scientists
59:39more effectively
59:42then a month later i would go over to
59:45the computer science department
59:46and give a talk on algorithm
59:49appreciation with computer scientists
59:52and they were a little bit more hesitant
59:54to come out and say this but they
59:55would basically tell me that they
59:58wanted to know how they could more effectively
01:00:01share the results that they were getting
01:00:03from their analyses
01:00:05and they wanted to make sure that people
01:00:07would actually listen to this so
01:00:09we have silos in organizations which
01:00:11end up becoming reflected in silos on
01:00:14university campuses where there's the decision makers
01:00:18who want to know the questions to ask
01:00:20the analytics teams and there's the computer scientists
01:00:24who want to know what verbiage is most effective
01:00:28so that people will understand their
01:00:30output and then the decision makers will
01:00:32actually act on that
01:00:33so i've seen kind of a disconnect between
01:00:36producing analytics
01:00:38versus acting on them and oftentimes
01:00:41not to generalize too much but
01:00:43oftentimes people who are producing the analytics
01:00:45just kind of end there right they're not asking
01:00:50well is anyone acting on the analytics
01:00:53i've shared with them like is this
01:00:54changing decision making and that's where
01:00:57behavioral scientists and psychologists
01:01:00especially in the business school i'm
01:01:02in a management department
01:01:04are in a really great position we're poised
01:01:08to kind of pull together computer science
01:01:12human computer interaction and psychology
01:01:15and think about well how do people
01:01:17actually respond to algorithmic versus human judgment
01:01:20how do we solve this last mile problem
01:01:24of the data being there but how to make it
01:01:26actionable and so i started thinking
01:01:28about this kind of last part of the pipeline
01:01:31every step of actually creating an algorithm
01:01:36requires a decision and wherever there's
01:01:38a decision psychology can say something
01:01:40about that which is actually pretty fun
01:01:42so in the stage of preparing to build
01:01:45computer scientists should be
01:01:48asking like is this data relevant to the
01:01:51prediction i'm making or the decision or
01:01:53judgment i'm trying to make
01:01:54is it biased in any way so amazon
01:01:58was using 500 models in their hiring process to
01:02:04make their hiring more efficient and
01:02:06a few years ago they basically walked away
01:02:09because they were finding that they were
01:02:11only hiring males and not hiring females
01:02:14but i wrote an hbr piece arguing that
01:02:17they kind of threw the baby out with the bathwater
01:02:20because when you just walk away from the
01:02:22models you've developed
01:02:24to predict who will be a good performer
01:02:26well when you throw that out what's left
01:02:28you're reverting back to human judgment
01:02:30which we know is riddled with biases
01:02:32and i would say the thing that we
01:02:34learned from amazon's
01:02:36failure a few years back
01:02:37is that we actually uncovered the language
01:02:41in resumes that led an algorithm to hire mostly men
01:02:45so yes the output was biased because
01:02:49the input was biased
01:02:51the historic data was biased their
01:02:54past hiring managers
01:02:55were hiring more men than women one
01:02:57that's useful to flag
01:02:59and two you shouldn't just throw the baby out
01:03:01with the bathwater
01:03:02in an attempt to kind of distance
01:03:04yourself from that bias
01:03:06what i kind of make the
01:03:08argument for in that paper
01:03:10it's kind of a thought piece in hbr is
01:03:13using algorithms as magnifying glasses and
01:03:15when you're at the step of preparing to
01:03:17build the algorithm
01:03:19what amazon ended up learning from this was that
01:03:22the resumes that were more likely to be selected
01:03:27were resumes that used certain words
01:03:31and other kinds of confidence-laden language
01:03:35very close to kind of warfare terms
01:03:38and that confidence-laden
01:03:40language was
01:03:42strongly correlated with
01:03:44gender so i tell my students
01:03:47use these words make the playing
01:03:50field a little more
01:03:51even but amazon i mean they
01:03:54haven't given up on algorithmic hiring
01:03:56they're back to it especially
01:03:58now with covid but what amazon could do
01:04:02if they wanted to use those same models is remove the
01:04:07adjectives that have really little
01:04:10to no predictive value in terms of
01:04:12people's performance so
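To make that debiasing step concrete, here is a hedged sketch rather than Amazon's actual pipeline: the flagged word list is hypothetical, since the specific terms are not named in the talk. The idea is simply to filter low-predictive-value, confidence-laden words out of resume text before a model scores it:

```python
import string

# Hypothetical confidence-laden words with little to no predictive value
# for job performance; the actual terms Amazon identified are not listed here.
LOW_VALUE_WORDS = {"aggressively", "dominant", "fearless"}

def strip_low_value_words(resume_text):
    """Drop flagged words from resume text before it is fed to a scoring model."""
    kept = []
    for word in resume_text.split():
        # Compare ignoring case and surrounding punctuation.
        if word.strip(string.punctuation).lower() not in LOW_VALUE_WORDS:
            kept.append(word)
    return " ".join(kept)

print(strip_low_value_words("Aggressively grew a dominant, fearless sales team"))
# grew a sales team
```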
01:04:14anytime you are making decisions about the
01:04:19input that you want to use for your
01:04:20algorithm i think
01:04:22what we already know from different
01:04:27psychological literatures we can apply there
01:04:30and even ask new questions too and then in
01:04:32building the algorithm
01:04:34i think it's important to think about
01:04:36who is actually building an algorithm
01:04:38if it was mostly males building hiring
01:04:40algorithms it shouldn't be surprising
01:04:42that mostly males are getting hired
01:04:43once you have more diversity in the
01:04:46team that's building the algorithm more
01:04:49women a more culturally diverse set of people
01:04:52you are also going to help think of
01:04:56questions that maybe
01:04:57someone from a certain perspective might
01:04:59not have considered before right
01:05:00um and then interpreting output from the
01:05:03algorithm i think one of the hot topics
01:05:05that's just going to keep growing is
01:05:07auditing algorithms and that's not to
01:05:10say that auditing should only happen
01:05:12once the algorithm is built i think the
01:05:14whole point of auditing
01:05:15is you're going through a process before
01:05:18you actually launch this algorithm
01:05:20and start to use its judgments but all of these steps
01:05:23are i think really ripe for asking new
01:05:26empirical research questions so as
01:05:29industry is kind of grappling with that
01:05:31i think that's an opportunity for
01:05:33researchers and academics to also
01:05:35ask those questions and test them
01:05:38so thank you so much i'm looking forward
01:05:40to hearing your thoughts