00:00hello everyone here um it's a great
00:03honor for me to coordinate this event
00:06and host jennifer as our guest
00:09it's really exciting to have you after
00:11reading so many of your insightful papers
00:14so first please allow me to briefly
00:17introduce jennifer logg
00:19so jennifer is an assistant professor
00:23at georgetown university's mcdonough school of business
00:28her work on overconfidence received the
00:312019 early career award
00:33by the journal of experimental
00:36and counter to the well-established idea of
00:40algorithm aversion her paper on
00:43algorithm appreciation
00:45suggests new insights on when people are
00:48willing to embrace algorithms
00:50and improve the quality of their decisions
00:53so today she will give us a talk about
00:58so let's welcome jennifer and i will
01:01give the floor to you now
01:03thanks so much for having me here i'm
01:04really looking forward to
01:06um discussing this work with you
01:09especially this group and i'm really
01:13curious to get your thoughts i'll share my work
01:16on algorithm appreciation and then focus
01:19on algorithmic hiring
01:21all with an eye toward developing
01:24a more overarching theoretical framework
01:27which i call theory of machine
01:28and i think this is an especially great
01:32group to get feedback on that theory that i'm
01:33starting to build so thank you so much
01:36i'm interested in how managers can
01:38assess themselves in the world more accurately
01:41and as ming chen mentioned the research
01:43i'll share today examines if people are
01:45willing to listen to algorithmic advice
01:47which is important because that can
01:48actually help them improve the accuracy
01:51in many decision contexts the second
01:54paper that i'll focus on today
01:56is how people want their own performance
01:58to actually be assessed when they're
01:59applying for a role on a team
02:01either by a person or an algorithm and
02:04over the course of conducting this
02:05research i've developed a
02:07theoretical framework which i'm really
02:09looking forward to getting your thoughts
02:10on theory of machine
02:12i probably don't have to tell it to this
02:14crowd but although historically
02:16managers and organizations have received
02:20really with the rise of big data more
02:21and more organizations are trying to
02:23leverage the accuracy of algorithmic
02:26to inform their managerial decisions
02:29so some use algorithms to hire promising
02:31applicants already this is rising
02:34and some use algorithms to predict
02:36performance for current employees
02:38and some to predict who's at risk for
02:40leaving in order to improve
02:42their retention so the issue here
02:45is that while many organizations are
02:48and investing in algorithms to sort
02:51through that data and produce
02:53this new source of advice many are
02:56really trying to understand how they can
02:57fully realize or maximize the benefits
03:00of algorithmic advice
03:01so specifically it's unclear what
03:04happens when this algorithmic advice
03:06actually gets in the hands of managers
03:09and other decision makers and the second
03:11paper i'll share today looks at
03:12what happens when algorithmic judgment
03:15is being assessed by stakeholders so not
03:17the decision maker but people who are being assessed
03:21so first we'll look at how do managers respond
03:25to algorithmic advice or how do they use it
03:30so we'll go through if people are
03:31willing to listen to algorithmic advice
03:33in my algorithm appreciation paper this
03:35is when people were making predictions
03:38so for uncertain events like brexit and
03:40other geopolitical events
03:42um and then additionally how do people
03:46um actually respond to being assessed
03:49themselves either by a person or an
03:52algorithm and this will all lead up to
03:54a theory of machine so i'll just plant
03:56the seed for you the idea of theory of
03:58this is um inspired by a long line of
04:02research in both philosophy and
04:05psychology called theory of mind so theory of mind
04:08looks at how we infer intentions and
04:11in the minds of other people and i'm
04:14looking to develop a theory of machine
04:16which describes lay people's theories of
04:18algorithmic judgment and human judgment
04:21and how those two compare at their core
04:26so the importance of understanding how
04:27people respond to algorithmic advice
04:30first it has potential to greatly
04:32improve decision making
04:34algorithms generally outperform the
04:36accuracy of human experts when the two
04:38are actually directly compared and
04:40there's a long line of literature on this
04:42and i saw that um some of your past
04:44speakers had touched on this as well
04:46which was exciting to see second
04:49algorithms can only improve human
04:50judgment if people are actually willing to listen
04:53um and so while the field of data
04:55analytics or the systematic computation
04:58of data most commonly using algorithms
05:00continues to evolve at a rapid rate
05:03um the important connection between
05:06producing insights and actually applying them
05:08is often overlooked especially when you
05:11start to talk to folks in industry
05:13the people on data analytics teams are
05:15just assuming that whatever output
05:17they're producing people are going to use
05:19100 percent of it but that's not always the case
05:22and deserves empirical testing i think
05:25as i mentioned the first paper i'll
05:27share tests if people are willing to even
05:28listen to algorithmic advice in the first place
05:33um to give a little bit of background
05:36i'm sure most of you know this work
05:37pretty well um but just so we're all on
05:40really the enormous strength of
05:41algorithms in algorithmic accuracy and
05:44judgment accuracy has prompted questions
05:47as to how comfortable people are relying
05:49on algorithmic advice
05:51so in his classic book on the accuracy
05:54of clinical versus statistical prediction which sat on my bookshelf in grad school
05:58um meehl looked at the accuracy of
06:02statistical prediction relative to human judgment and meehl
06:05may have made the first academic mention
06:08of psychological distrust of algorithms
06:10he described how when he was presenting
06:15um empirical results comparing the accuracy of algorithms
06:20to human judgment when he shared this
06:24with expert clinicians in the 50s
06:26these expert clinicians were actually
06:28really reluctant to believe that a
06:29simple mathematical calculation
06:31could outperform their own precious judgment
06:35this sentiment is echoed in other
06:37research on the accuracy of algorithms
06:40that didn't necessarily look at
06:42people's perceptions of algorithms and what he
06:46did is actually really important because
06:48it led to conventional wisdom that
06:49people just distrust algorithms
06:52and that idea survives to this day
06:55with limited empirical testing so i know
06:57that berkeley dietvorst
06:59has visited you folks he has some great work
07:02and nate fast as well and there's a lot
07:04more work bubbling up on this topic
07:07which is really thrilling but i think it's useful
07:10to remember that the idea of distrusting algorithms
07:13really came from anecdotes
07:15and so much so that even in kahneman's
07:18thinking fast and slow this idea of
07:21algorithm aversion is very strong
07:25um one thing that i want to mention
07:27about the first paper i'll talk about
07:29is that through our experimental design we addressed
07:32issues that have made some prior
07:35results a little bit more difficult to interpret
07:39some prior research that has looked at
07:41actual perceptions rather than just the
07:43accuracy of algorithmic
07:44versus human judgment um had looked at choice
07:48so we use a paradigm where we present
07:51people with identical advice
07:53that helps us control for a lot of
07:54factors including the accuracy of the advice
07:57and we merely manipulate the label of
07:59the source so rather than measuring
08:01choice which a lot of past work has done
08:03we measure how much people update toward the
08:05advice that they receive
08:07based on whether they think it comes
08:09from an algorithm or a person
08:12and so a little spoiler alert instead of
08:14finding aversion we find algorithm appreciation
08:18hence the title of our paper um
08:21so in this paper i'm just going to give
08:23a brief overview because most of the
08:25studies in this paper
08:26use this paradigm we use a methodology
08:29called the judge advisor system
08:31it's frequently used to study how much
08:33people incorporate the judgments
08:35normally of other human beings
08:37into their own judgment this paradigm
08:40enabled us to measure the percentage
08:42that people actually adjust
08:43towards the advice from their initial
08:47and i just want to pause here in case
08:48anyone has any questions about that so
08:51people make an initial numeric estimate then they
08:54receive advice that's also numeric
08:56and then they have the opportunity to
08:58make a final incentivized estimate
09:00so if they updated fully to the advice
09:02that would be a weight on advice of one
09:04if they completely discounted the advice
09:06from either the algorithm or other
09:08people that would be a weight on advice of zero
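the weight on advice measure described here reduces to a one line formula; a minimal sketch, with hypothetical estimates purely for illustration:

```python
# Sketch of the judge-advisor-system "weight on advice" (WOA) measure.
# The estimates below are hypothetical, not from the paper.

def weight_on_advice(initial, advice, final):
    """Fraction of the distance from the initial estimate to the advice
    that the final estimate covers: 1 = fully adopted the advice,
    0 = ignored the advice entirely."""
    if advice == initial:   # advice identical to one's own estimate:
        return None         # WOA is undefined in this case
    return (final - initial) / (advice - initial)

# A participant first guesses 100, receives advice of 140, and settles
# on 130: they moved 75% of the way toward the advice.
print(weight_on_advice(100, 140, 130))  # -> 0.75
```

a final estimate equal to the advice gives a weight of one, and sticking with the initial estimate gives a weight of zero, matching the description above.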
09:11so any questions about that
09:16i'll i'll keep chugging along unless i i
09:20so in our first studies we benchmarked
09:22utilization of algorithmic advice
09:24relative to utilization of human advice
09:27and it's useful to have this benchmark
09:28of how people respond
09:30to human advice um and that's because
09:33uh there's past work in the advice taking
09:35literature showing that people tend to really discount
09:38advice from other people on average
09:42um they discount advice so heavily that they actually
09:46lose accuracy by underreacting to advice when it comes from other
09:47people so we wanted to know
09:49well how controlling for the fact that
09:51we know that people tend to just
09:52discount advice in general
09:54how do they then respond to advice if it
09:56comes from a new source
09:58across our experiments we find a really robust effect
10:02um that people consistently give more
10:05weight to identical advice
10:07when it's labeled as coming from an
10:09algorithm than a person
10:10so we call this effect algorithm
10:12appreciation and we find it across a lot
10:14of different domains both
10:16objective and subjective domains so for
10:20as well as the most subjective domain we
10:23could think of which is people
10:25so will two people um
10:28described in the study get along
10:30romantically and that's the
10:32the judgment that participants were
10:33making there and regardless of the
10:36subjectivity of the domain here we
10:38consistently find that people
10:40um rely more on the same advice when
10:43they think it comes from the algorithm
10:44which is the blue bars here so
10:47it's a pretty robust finding across domains
10:50um and what we wanted to know was after this
10:55well algorithm aversion seems alive and well
10:58in people's thinking from the side of
11:00researchers so we actually in study two
11:03asked researchers to predict the results of
11:06our matchmakers study where people were
11:08predicting romantic attraction between
11:10two people that they'd read about
11:13some of you may even have taken our
11:14survey which we shared with the judgment
11:18and decision making conference email list so although our
11:20results from studies 1a through 1d
11:23may sound intuitive now that you know
11:25the results interestingly
11:27when we asked researchers they predicted
11:29the opposite results to what we found
11:31empirically with our participants
11:33they did predict aversion when we
11:36actually found appreciation
11:41so far our experiments intentionally
11:43controlled for excessive certainty
11:45in one's own knowledge in the studies
11:48that i've shown you so far
11:50we provided advice from external
11:52advisors regardless of the source
11:54in both the human and algorithm conditions
11:58why did we do this well it ensures that
12:00participants compare
12:01their own judgment with advice in both conditions
12:04so basically we're not confounding human advice
12:08with someone's own judgment because in
12:11both the human and algorithmic
12:13conditions in our past studies
12:15everyone was comparing their own judgment
12:18with an external advisor so in this study
12:23we basically wanted to know if thinking
12:25of yourself as a special snowflake moderates algorithm
12:27appreciation so let me explain a little bit
12:30we examine whether subjective confidence
12:32in your own judgment
12:34plays a role in the use of algorithmic advice
12:38here people in one condition were
12:42choosing an advisor before they ever saw the advice so this is a little
12:45different from the standard judge advisor system paradigm people are
12:47making a choice here
12:48before they saw the advice this allowed us
12:51to have one condition where people were
12:54choosing between advice that they might receive from a person
12:58or from an algorithm here we replicate
13:01algorithm appreciation where people are
13:03making a choice of their advisor
13:06prior to getting any information about
13:08what their advice might be
13:09so that's consistent with the results
13:11that i've shared with you so far
13:14and that 88 percent chose the algorithm which is
13:17statistically significantly different
13:19from the 50 percent you'd expect if they had merely averaged
13:24so our new condition here is people
13:26choosing between an algorithm and their
13:28own estimate and we thought
13:31that we would actually
13:34completely do away with algorithm appreciation but
13:37it was so strong in the study that it um
13:40moderates algorithm appreciation the
13:42role of the self and your own judgment
13:44to make a direct comparison
13:46to the algorithm um 66 percent
13:50is different from 50 percent but
13:53the key here is that they're still
13:55choosing the algorithm
13:57it's important to note though that 88 percent
14:00is statistically significantly greater than 66 percent so
14:04we did moderate algorithm appreciation
14:07but we couldn't turn it off fully
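a comparison like this, an observed choice share against the 50 percent indifference point, can be sketched as an exact one sided binomial test; the sample size below is hypothetical, not the study's actual n:

```python
import math

# Exact one-sided binomial test: probability of seeing a choice share
# at least this extreme if people were really choosing 50/50.
# The n = 100 sample size here is hypothetical, not from the talk.

def binom_p_at_least(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

# If 88 of 100 hypothetical participants chose the algorithm, a 50/50
# chance process is extremely unlikely to produce a count this high.
p_value = binom_p_at_least(88, 100)
print(p_value < 0.001)  # -> True
```

the same test applies to the 66 percent figure in the self versus algorithm condition, which is why both shares can be called significantly different from 50 percent.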
14:10and indeed when we asked participants how confident
14:13they were in these estimates before they
14:15received them people were more confident
14:17in their own estimate
14:18being correct than in that of another
14:22which is consistent with work on overconfidence
14:25here these results suggest that
14:27confidence really drove the propensity
14:29to choose the human estimate
14:31more when the human was the self rather
14:34than when the human was an external advisor
14:38this study i think is really important
14:41because it partly helps us reconcile our work with
14:44empirical work that had been coming out
14:48and finally i think my favorite study um
14:52in study four we collected data from a
14:54really unique example
14:56national security professionals who are
14:58arguably experts at forecasting
15:02so we compared this expert sample to a lay sample
15:05that made identical judgments this
15:08allowed us to see how objective expertise
15:10influences responses to algorithmic advice
15:14and keep in mind uh full disclosure
15:16obviously we're comparing two samples
15:18that probably differ on more things than just expertise
15:22but we thought it would be useful if
15:24we're going to be able to get data
15:25it took me two years to actually be
15:28able to get our survey
15:29circulated in national security circles so we knew
15:32that if we had the chance to get that data we'd want
15:36to then be able to have a benchmark so
15:37the lay sample serves as a nice comparison
15:40if you're interested in subjective
15:43expertise and how that influences
15:45responses to algorithmic advice i'm
15:47happy to talk more offline about that
15:49i ran about 12 studies where we manipulated
15:52subjective expertise without changing
15:56objective expertise or knowledge which
15:59ideally would be key
16:02just a brief summary of that we were
16:05able to manipulate felt expertise in
16:08but people still responded the same
16:11where they were relying on algorithmic advice
16:13that basically told me that the felt
16:17expertise that was so strong in our
16:20national security professionals
16:22which had been developed over the course
16:24of their careers some of them had been in their jobs
16:27it's just that it's difficult to
16:29replicate that strong
16:31sense of expertise online in an
16:36experiment that takes about five minutes but it's still a
16:38topic i'm really interested in
16:40um so we compared this expert sample
16:44and here we tested for algorithm
16:46appreciate appreciation
16:47in visual estimates business forecasts
16:50of how much tesla would sell
16:52and two geopolitical forecasts about
16:54cyber sanctions and brexit
16:57this allowed us to test for algorithm
16:58appreciation in domains of even
17:00extreme uncertainty remember how
17:02uncertain it was whether or not brexit
17:06would happen by a certain time
17:09and here although lay people showed
17:11algorithm appreciation
17:12as our past samples did experts actually
17:15discounted algorithmic advice
17:17more than lay people so when experts
17:20were receiving advice
17:21they just didn't listen to anyone and
17:25this ended up hurting their accuracy
17:29um so experts discount algorithmic
17:32advice more than lay people
17:34and this comes at a detriment to their
17:36accuracy so people who are
17:38um paid for a living to make geopolitical forecasts
17:41were actually making less accurate forecasts
17:45than our lay participants which
17:49i always think is very depressing for
17:50the world but fascinating for research
17:55in summary we did find some interesting
17:57moderators so algorithm appreciation is
17:59moderated by two key factors
18:02first when a decision maker is directly
18:04comparing his or her own knowledge
18:07algorithm appreciation weakens
18:10and when people have expertise in a domain
18:14our work suggests that they're just
18:16going to discount advice regardless of the source
18:19which importantly ends up decreasing their accuracy
18:23um and one other kind of tidbit that i
18:26always found interesting is we
18:28did find um a mechanism where
18:31we tested for numeracy in our
18:33participants in earlier studies and
18:35the more numerate people were the more
18:37they were willing to rely on algorithmic advice
18:40maybe a little bit less surprising than
18:42the other moderators but i think still
18:44useful to keep in mind so numeracy
18:48was an 11 item scale that basically
18:50measures kind of comfort
18:51with numbers on simple um
18:55math questions i saw in the chat that
18:57there might be a question but
18:59hopefully i answered it it looked like
19:00the last message that it was answered
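the numeracy result mentioned above, more numerate participants relying more on algorithmic advice, is the kind of pattern a simple pearson correlation captures; the scores below are made up purely for illustration:

```python
import math

# Illustrative check of a numeracy/reliance relationship: correlate a
# numeracy score (0-11 scale) with weight on advice. Every data point
# here is hypothetical, invented for the example.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

numeracy = [2, 4, 5, 7, 8, 10, 11]           # hypothetical scale scores
woa      = [0.3, 0.35, 0.5, 0.55, 0.6, 0.7, 0.8]  # hypothetical weights
print(round(pearson_r(numeracy, woa), 2))    # strongly positive
```

a positive coefficient like this would correspond to the pattern described in the talk, where more numerate people update more toward algorithmic advice.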
19:14i can't see the chat so if there are any
19:17questions please feel free to unmute
19:20i can monitor the chat if there's anything and
19:24bring it out to you great thanks
19:28um so a lot of you might be thinking of
19:31a lot of other moderators that might
19:34kind of flip this effect of algorithm appreciation
19:36over to aversion and i spent
19:40a lot of years of my dissertation trying
19:43to find a strong moderator but
19:44i was just kind of met with really
19:46robust effects for algorithm appreciation
19:49when people are making predictions about
19:50the world so so keep that in mind
19:52um the decision context is always predictions
19:56about what's going on in the world
19:59not necessarily related to themselves um
20:02one thing i thought might actually um
20:06be a moderator of algorithm appreciation is
20:09familiarity with algorithms themselves
20:12um so if people are just
20:14uh aware that they use algorithms all
20:16the time maybe they're just more likely
20:17to listen to advice from it
20:19uh relative to people who never really use them
20:22maybe an older generation who doesn't
20:24even know what netflix is right
20:26but we actually found so one proxy for
20:28familiarity with algorithms might be age
20:30we found that older people rely just as
20:32much on algorithms as younger people do
20:34which was quite surprising to me um and
20:36we did have a wide range of ages in our samples
20:40um and another thing i thought might matter
20:44is a difference between choosing between
20:48um in a within subjects design
20:50algorithmic versus human advice because
20:52most of the studies that i ran
20:54were between subjects so people were
20:56only responding to one source of advice
20:58um and there's a lot of great work from
21:01max bazerman and others
21:02where they show that there's a
21:04difference psychologically in how people evaluate
21:08uh choices when they're presented jointly
21:11versus separately and separate evaluation
21:15would map onto the studies i showed
21:17you with the between subjects design
21:19even when we looked at choice 75 percent
21:22of people still chose the algorithm over
21:24the person so again robust to that
21:27and finally i thought what if people feel free to discount
21:34advice in our studies because they're
21:37making the final estimate right
21:38um and i thought well what if they have
21:40to commit when they're choosing
21:42um the advice that they're receiving
21:46before they see the advice that they'll get
21:50the estimate provided by the advisor
21:53which they don't really have any information about
21:56is going to determine their final
21:58participant payment without them adjusting it
22:01and even when people were not only
22:08ceding the final estimate to the algorithm or the person
22:12uh when that was actually determining
22:13their final payment and they couldn't adjust away from
22:15or towards the advice 61 percent still chose the algorithm
22:20when the advisor would be in full
22:22control and determine the final
22:23incentivized outcome
22:24so robust to that as well one moderator
22:27that i thought was kind of interesting
22:29i ran a study where we changed uh the
22:31labels on the algorithm
22:33and the person across a number of cells
22:36and it seems that in scenario studies
22:40uh people prefer an expert person to an
22:44algorithm and i think that this
22:46jibes with some work that's been coming out
22:50and i think this is pretty interesting
22:52because algorithmic advice is often
22:55less expensive than expert human advice
22:58you can think of doctors and things like
23:00um and it's also just more readily
23:03accessible so you don't
23:04uh i think with like video chats with
23:08doctors this past year maybe that's kind
23:09of changing but generally algorithmic
23:12has the potential to completely displace
23:15advice that we normally would
23:17pay experts to give us um so i think
23:20that this is kind of a useful
23:22piece of evidence there
23:26so overall these results suggest that algorithm aversion
23:29is really not such a straightforward
23:31story as received wisdom would have us
23:34and it partly overturns what a lot of
23:36researchers have assumed we've known for a long time
23:40but importantly i think it opens the
23:43many questions about how expectations of
23:47algorithmic and human judgment at their core
23:50differ from each other
23:53so one aspect of this paper that's
23:55useful to keep in mind which i kind of flagged
23:57people are making predictions about the
24:00romantic attraction between other people
24:04but i started to wonder this was my
24:07dissertation work and and through that i
24:09was starting to think i
24:11really want to um collect new data
24:14on what happens when people are in
24:15domains where judgments are being made about them
24:18where an algorithm or a person is
24:21producing judgments about their own
24:23performance so something that's really personal
24:25rather than making judgments about the world
24:29so that's why i turn to the domain of
24:30algorithmic hiring it's being adopted by
24:33many organizations so amazon uses
24:36um algorithmic hiring in a very big way
24:41every time i talk to prasad setty at
24:44google he tells me that they really don't use
24:47algorithms that much
24:49in their hiring processes because the
24:51engineers just don't want to have that
24:53um part of me still can't believe that
24:56like the engineers don't want algorithms
24:58um to help with promotion decisions but
25:02um google has kind of not changed with
25:07so i wanted to test empirically how do
25:09stakeholders or the job applicants
25:12view algorithmic hiring compared to human hiring
25:20um so you can imagine here that if
25:23people don't want to be hired by an
25:26when the labor market becomes tighter
25:29um in the future they may even forego
25:31applying to that job so i think that
25:33this question has some interesting
25:36um applications for the real world that
25:40so in study one i'll show you this is
25:43new work that i'm really excited about
25:44so really looking forward to feedback on
25:47um in study one i'll show you applicants
25:49preference for how they want
25:50their application packet reviewed when
25:52they're applying for a role on the team
25:54and we created this pretty intricate paradigm
25:58where mturkers were um
26:02i won't go into all the details but mturkers had
26:05the opportunity to take a few tasks and
26:08then they knew that based on their performance
26:10on those initial tasks they could be selected
26:13um to be part of a team to um solve
26:18kind of a puzzle that everyone responded
26:22um it was basically a murder mystery and
26:25if there's one thing that keeps prolific
26:27uh participants attention they are very
26:30into true crime and they were very
26:31excited at the opportunity to become
26:33part of the team to potentially solve a
26:37um so spoiler here a whopping 70 percent of
26:40applicants in study one
26:41chose a person over an algorithm so this
26:43was really the first time that i
26:45finally found this algorithm aversion
26:47that everyone's kind of been talking about
26:49and that was exciting and i think part
26:51of that is because of the domain
26:54um of the judgment itself and that it's about
26:57the participants themselves rather than
26:59the world so this effect within hiring
27:03appears to be robust but we do find
27:05important factors that weaken
27:07and even reverse it so in study two
27:10we find that aspects of the application process and
27:13the applicant pool itself influence our effect
27:16so preference for the person weakens
27:18when competition is higher within the pool
27:22but when competition is lower applicants prefer the person
27:27as more competitors are vying for the
27:29role it seems that this preference weakens
27:32and in study three we shift to examine
27:35how characteristics of the hiring
27:37manager themselves actually influence
27:39applicants preferences
27:41again applicants do prefer a person over an algorithm
27:44when the hiring manager is a member of their in group
27:48but when the hiring manager is a member
27:50of the applicant's out group this preference flips
27:53and here when the hiring manager is an out
27:55group member people strongly prefer the algorithm
28:00and one thing the reason why i'm
28:02really excited about this project is that
28:04studies three and four dive into the mechanism
28:09is it that people think the out group
28:11member is going to be biased against them
28:14or as an out group member does the applicant think
28:18oh well that person is just not
28:19competent enough to see how good of an applicant i
28:22am so that's kind of the difference
28:25between systematic error and random error
28:30something that daniel kahneman has
28:32talked about in some of his writing
28:38oh so what were the in group and out
28:39group in the study i see in the chat i'll
28:41walk you through the studies in more detail in this paper
28:45than i did in the first paper the first
28:46paper i kind of wanted to give you an overview
28:49to see where i was starting from before diving in
28:52so we'll go over that in a few
28:54slides thanks for the good question
28:56in study four we manipulate the
28:58algorithms past performance
29:00basically i wanted to know like how good
29:03does an algorithm need to be
29:05before people want it to assess their application
29:09how accurate does an algorithm need to
29:10be before people actually prefer
29:13it to a hiring manager and so another
29:15spoiler here is that it took
29:1775 percent accuracy before the preference
29:21actually flips that's
29:22a pretty high bar for people to prefer the algorithm
29:26or at least i thought so i was pretty surprised
29:30um so in study one participants read an overview
29:35of the study they knew what they were
29:42doing the whole time they were in the study so
29:44they read that there were two tasks
29:45and that depending on their performance
29:47on this anagram quiz and this trivia quiz
29:50there was a possibility to work as part
29:52of a team with other um
29:54participants on task three then they
29:58and they read it was a murder mystery
30:00and they all kind of lost their minds
30:01they got pretty excited about this
30:03so it's just always nice to know when
30:06participants are really involved
30:08and we know we're getting good data here
30:11participants had the opportunity to win
30:13bonus pay from the murder mystery while
30:15coordinating with others
30:16under time pressure so you could
30:19imagine that most people were pretty
30:21incentivized to do as well as they could
30:23on the quizzes to create the most
30:27competitive application packet that they
30:28could and so they took it a little bit
30:32so then they read that in order to
30:35active participants would be assigned to
30:37roles with 75 percent of participants
30:39being assigned to the role of applicant
30:41to the team and 25 percent the role of
30:45uh normally i try to avoid deception at all costs
30:48i didn't use deception in any of the
30:51studies for algorithm appreciation but for this paper
30:54we did use deception because all
30:57participants were applicants
30:59and so the survey also stopped before the murder mystery
31:03started ideally in some of the next studies
31:06we'll actually have people take the
31:08murder mystery and we'll be able to
31:09measure people's performance as an extra
31:12dependent variable there
31:16so after people took the quiz everyone learned
31:20that they actually were an applicant
31:24and they read a little bit about what
31:26their application packet would look like
31:29so they read a page that said your application packet
31:33will include your quiz scores including
31:34your time spent so you can imagine
31:36if you answered a lot of the trivia
31:39questions correctly in a short amount of time
31:41you're feeling pretty good about your
31:42performance it also included the difficulty of correctly
31:45answered questions so you also know a
31:47little bit more about how competitive you are
31:50as well as a short essay so we wanted to
31:52have a mix of both objective and pretty subjective information
31:55so you can imagine if you have um an
31:58opportunity to write a short
32:00essay here people could use it
32:02potentially to persuade the reviewer
32:05um and then they made a choice
32:08of who they wanted to review their application
32:14so here they wrote their essay and then
32:16they made their choice but in other
32:18studies i'll show you
32:19we actually find the same results if
32:20they make their choice and then they write their essay
32:25and they chose how do you want your
32:27application reviewed
32:29from a person or an algorithm we wanted to
32:32um counterbalance that order so we changed
32:34the order in the other studies
32:36in case here maybe people thought well i just wrote this
32:39essay and that would maybe lead them to
32:42want to choose the person because they
32:44think the person would be more persuaded
32:45by the essay but we still find
32:47the same results even if the order's
32:50so here 70 percent of people chose to have the person
32:54over the algorithm assess their application
32:58next we wanted to know well is this preference
33:02affected by the competition of the
33:05applicant pool itself
33:07and you can imagine that um
33:11especially kind of with covid and the labor market
33:15changing in terms of job loss and things
33:18that could easily influence people's
33:20decision of whether or not to apply
33:22to one role or another depending on how
33:24they think they might be assessed to make
33:27a more efficient use of their time at a
33:29job they think they might be more likely
33:31to get so here we operationalize low
33:36as four spots available but there's five applicants
33:40um and in high competition 21 other applicants
33:44here we found a moderation by the level
33:48of competition so when competition was
33:51low people chose the person um i should note that the
33:54y-axis is the percentage of people choosing the person
33:59relative to the algorithm so more people
34:02chose the person in the low competition
34:04than the high competition condition
34:11and those are both significantly different from
34:1450 percent the indifference point
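the low versus high competition comparison can be sketched as a two proportion z test; the counts below are hypothetical, not the study's data:

```python
import math

# Two-proportion z-test comparing the share of applicants choosing the
# person under low vs high competition. The counts (75/100 vs 58/100)
# are hypothetical, invented only to illustrate the test.

def two_prop_z(k1, n1, k2, n2):
    """z statistic for H0: p1 == p2, using the pooled proportion."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# e.g. 75 of 100 choose the person under low competition versus
# 58 of 100 under high competition
z = two_prop_z(75, 100, 58, 100)
print(z > 1.96)  # -> True, significant at the conventional 5% level
```

a z above 1.96 corresponds to the kind of significant difference between the two competition conditions described here.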
34:18and then finally in study three um this
34:22will hopefully answer the question
34:24someone had before um
34:26we wanted to know if people might prefer
34:28an algorithm more when the hiring manager
34:30is a member of the out group
34:33so you can imagine um a few reasons why
34:36people might switch to the algorithm
34:38which i alluded to before
34:40so people report their beliefs on hot
34:43button political issues which included
34:45minimum wage gun ownership and abortion
34:49and then when
34:52they found out that they were in the
34:54role of the applicant
34:56they were told that the hiring manager either agreed
34:59or disagreed with them so if they agreed
35:02they were a member of the in group
35:03and if they disagreed they were a member of the out group
35:08um and here we found that when the
35:12hiring manager was a member of the in
35:14group again people really prefer to have
35:17that person assess their application
35:19but it flips and people start to prefer
35:21the algorithm when they find out that the person
35:24really disagrees with them on hot button issues
35:27so we ran a follow-up study to this to see
35:31if this is driven by systematic or random error
35:35expectations of systematic error
35:39meaning this person will just be biased
35:41against me and random error
35:43meaning that i expect their judgments
35:45to kind of be all over the place because
35:48they're not good at actually making this assessment
35:51the way that we did that was we either
35:55told people explicitly that the hiring manager
35:58a member of your out group will know
36:01that you are a member
36:03of the out group or we told them
36:06the hiring manager won't have
36:08information on whether or not
36:10you are an in group or out group member
36:14and so here we found that expecting the out group manager
36:17is incompetent drives this moderation
36:21rather than expecting them to be biased against you so
36:24basically we find that people
36:26want the algorithm even when
36:29the out group hiring manager doesn't know
36:32or have any idea about the group
36:33identification being different
36:36people just think that they're not good
36:37at making these judgments
36:39and then finally
36:42we wanted to know just how good or
36:44accurate really does the algorithm need to be
36:47before people prefer it
36:50so we randomly assigned people to be in
36:53a condition of no information
36:55about past success at putting together
36:58a successful team that actually did
37:00solve the murder mystery
37:01or conditions where the algorithm
37:06was successful at putting together such a team 60 percent
37:13of the time or higher and here we find that
37:17we replicate our effect of a preference for the person
37:22and that this weakens
37:25as the algorithm becomes more accurate
37:27but the flip point here
37:29is 75 percent the accuracy an algorithm needs to reach
37:33before people actually prefer it
37:42so in summary we find that 70 percent
37:45of applicants prefer to have a person
37:47review their application packet instead of an algorithm
37:50the preference for a person weakens when
37:52competition in the pool itself
37:54is higher so not even related to beliefs
37:57about the decision maker or
38:01potential decision makers from the
38:02organization and people prefer an algorithm
38:05when the hiring manager is a member of the out group
38:08and it's driven more by expectations of random error
38:11rather than systematic error which
38:14was pretty surprising to us
38:15and i'm hoping that we run
38:19a number of follow-up studies to kind of
38:21dive into that and tease that out
38:24and then finally an algorithm
38:25requires a pretty high benchmark of accuracy
38:29before applicants prefer it to a person
38:37it seems like when the domain is
38:38relevant to the self people prefer human judgment
38:41we also ran
38:43three studies where we asked people how do
38:45you want your teammates to be hired
38:48once we told them you're hired for this role
38:51now the choice is not how you're
38:54hired but how the rest of your teammates are hired
38:57and they still said the person which surprised me
38:59i thought we might flip the effect there
39:00so there might be a self other difference
39:03but there wasn't and then i have
39:05another project that's a lot more preliminary
39:08where we have i think maybe five
39:11studies where people
39:12are responding to feedback on their writing
39:16rather than an assessment that leads them to
39:19achieve a role on a team or not just
39:22feedback on their writing
39:23and there people say they want feedback
39:26on their writing from a person
39:28but when they actually get feedback it
39:30doesn't matter if it comes from an algorithm or a person
39:33everyone kind of updates to
39:35the feedback that they received so
39:38i would say that that evidence is preliminary
39:41it deserves a little bit more time to
39:43kind of sort through and understand
39:45but my takeaway just from our data so far
39:48on this judgment type of feedback
39:51is that people kind of say one thing but
39:53then when push comes to shove and
39:54they're actually updating their beliefs
39:56they do something different
39:57so i've started a new project to dive
40:00into that comparing how people respond
40:02to judgments of algorithmic and human sources
40:06based on if it's in the judge advisor system
40:09where they actually see the advice and can update
40:13as much or as little as they want to it
40:15versus if they're in a scenario domain
40:17and they're choosing between advisors
40:19and you can imagine in a scenario domain
40:22there's a psychologically
40:25rich mechanism there
40:29in a scenario you can imagine all the
40:31different types of ways that advice
40:34might basically differ between an
40:38algorithm and a person but in the judge advisor system
40:42you are seeing the source with numeric
40:44information in front of you so you're just responding
40:47to that advice but in the scenario
40:49condition your mind could
40:51come up with all these different ways
40:52that the advice might differ between the two sources
40:55and so there we found algorithm
40:57appreciation in judge advisor
40:59conditions but in the scenario
41:00conditions the data is noisy and people seem to shift
41:06a little bit we don't find
41:08a full inversion in the scenario which
41:10is what i thought we'd find
41:11we find kind of a little bit of
41:12a difference there
41:14so i think there's some interesting
41:16moderators to look at here and i'm curious to hear
41:19your thoughts as well if there's any
41:22questions here i'd love to take them
41:24and then i can go on to kind of another
41:27project i've been working on recently
41:29developing this theory of machine
41:33uh we actually have one question from the audience
41:36he asked about whether the
41:39accuracy levels were actually measured or
41:42made up information so i guess that's a
41:44question about deception
41:45yeah yeah so we did use deception there
41:54in terms of efficiency it just made more sense to
41:57change the label of how accurate it was
42:01i think it could be really fun to
42:02potentially run a field study where
42:05companies are testing algorithmic hiring
42:08and giving people
42:11the accuracy feedback in real time
42:16but this was just a way where we could
42:19test more conditions
42:25great question any others
42:29um another question from taha
42:32he or she is wondering how the wording
42:36of the introduction of the algorithm to participants mattered
42:39so how did you introduce algorithms
42:42great question i'm wondering if i can
42:46go to this slide without it messing up
42:52so we operationalized algorithm in
42:55algorithm appreciation in a few ways
43:02i might need to stop sharing my screen
43:18can you see the slide great
43:21um so in algorithm appreciation
43:25we tested a few different operationalizations
43:28of the term algorithm itself because we had the
43:32question does it matter how we're
43:34describing the algorithm
43:36i was most intrigued by testing how
43:39people responded to advice they thought
43:41was coming from a black box algorithm
43:43mostly that's just because that's how
43:46our algorithmic advice is normally presented
43:48we don't know the actual mathematics behind the
43:52netflix algorithm or you know pandora's
43:54algorithm or dating apps' algorithms
43:56i don't know if anyone's seen recently on netflix
44:00i think it
44:02happened in the last week when you scroll through
44:04movies that you want to view trailers for
44:08they actually have one big screen that
44:10comes up that's like a random choice
44:13and it says something like do
44:14you want our algorithm to choose for you
44:17which i thought was kind of funny
44:19but even there it's still a black box
44:22we don't really know the
44:25data that's being input into the
44:27algorithm we don't know how it's being processed
44:29um so our results in algorithm appreciation
44:32hold regardless of the type of algorithm
44:34that we presented so we had
44:36started off describing a simple algorithm
44:39we had used an average in algorithm
44:42appreciation we didn't use any deception
44:45we used between 300 and 400
44:48separate participants to create the
44:50advice so that allowed us to present advice
44:54that came from other people but we could
44:55also frame it as coming from an
44:57algorithm because an average is really
44:59one of the simplest algorithms
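To make that averaging operationalization concrete, here is a minimal sketch; the function name and the sample numbers are illustrative, not from the studies, which pooled estimates from 300 to 400 separate participants. The point is that the same mean can be labeled either as advice from other people or as the output of a (very simple) algorithm:

```python
from statistics import mean

def crowd_advice(estimates):
    """Average independent participant estimates into a single piece of advice.

    The same number can be framed as coming 'from other people' or
    'from an algorithm', since an average is one of the simplest algorithms.
    """
    if not estimates:
        raise ValueError("need at least one estimate")
    return mean(estimates)

# Illustrative estimates only (not data from the actual studies).
print(crowd_advice([52, 61, 48, 55, 59]))  # 55
```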
45:02um so there we found algorithm
45:04appreciation and then we went to a black
45:05box where we just changed the label
45:08in study four the national security study
45:12that's partly why it's one of my
45:13favorite studies because we just didn't
45:15give any information
45:17which allowed people to rely
45:20on their lay perceptions like what
45:23whatever definition they were bringing
45:24to the table and one question that
45:27normally comes up if you're asking this great question
45:31is well
45:34what do people think an algorithm is
45:36like do our participants know what an algorithm
45:38means and we asked in a number of our studies
45:42if people could define the term
45:44algorithm and then we had our research assistants
45:46code those responses to create categories
45:50and normally the responses fall into categories like
45:53it's some sort of math or formula it's
45:57some sort of rule based on logic or
46:00there was a kind of miscellaneous
46:04category which was people kind of
46:06mentioning computers
46:08and my takeaway from that is
46:09mathematicians and computer scientists
46:12wouldn't be upset to read those
46:15definitions people have a pretty good sense
46:18of what an algorithm means so if we give
46:19them a kind of black box
46:22operationalization it's not that they
46:24don't know what we're talking about
46:26great question thanks so much for asking
46:29taha are you satisfied with the answer or do you have
46:33follow-up questions absolutely thank you
46:36okay so i will move on to the next question
46:41so she wonders do you have any
46:43intuition or findings concerning
46:46why higher competition induces people
46:48to choose the algorithm more often
46:51yeah so we asked a lot of open-ended questions
46:56in these studies because
46:59it was kind of new territory for me to make
47:02predictions about the world and we just
47:05wanted to hear directly from participants
47:07either how they would rationalize it or
47:11explain their decision and
47:14they said some interesting things which
47:18i think could even potentially be a mechanism
47:23um so some people said
47:26things like there were
47:29time constraints and people knew that a
47:31decision would be made my fear was that
47:33if it's high competition people want the algorithm because
47:36they don't want to wait 21
47:39extra minutes for a person to go through the applications
47:42um but they said things like oh i don't
47:43want to do that to joe
47:45so that was a little bit
47:48surprising because it was kind of considerate
47:50towards the hiring manager they
47:52didn't want to give them that many
47:55application packets to review
47:58um and so i think that there
48:03people do have efficiency top of mind
48:06but there wasn't a straightforward
48:08explanation that made me
48:10think of an experiment that i could
48:12directly follow up with
48:14but if you have ideas i would love to
48:16hear them what kind of mechanism do you
48:17think is going on there
48:21so i didn't have something special in
48:23mind but i think the efficiency part is
48:25already very interesting that you tell us
48:27okay people might have an efficiency
48:29argument in there and they think okay it
48:31just takes time from the hiring manager
48:32and it's better for both if the
48:34algorithm does it so i think this is interesting
48:37i was just wondering whether you have
48:38any complementary findings but thanks a lot
48:41thanks yeah and if you think of any
48:45new mechanisms to test i would definitely
48:48be interested please send an email i'd be open to that
48:57alicia do you want to continue
49:00okay um i guess uh do you have any other
49:03slides you want to show
49:08when is are we over at 10 30 or 11
49:13uh i think we ended 11 but
49:16after that do you have like 20 or 30
49:19have a like a brief talk with definitely
49:23was trying to save some time so we could
49:25have kind of a more open-ended
49:28um yeah so maybe you first finish your
49:31slides and then we open the official q a
49:34okay great great great can you um see
49:37these slides the chapter
49:43throughout the course of
49:46both of these research projects
49:50i was kind of really just
49:53trying to figure out any moderators i could
49:57especially because in algorithm
49:58appreciation we never really flipped the effect
50:01to algorithm aversion in algorithmic
50:03hiring we were able to flip the effect
50:05and i think in that project
50:09we're definitely going to spend more
50:11time kind of digging into mechanisms but i
50:13think we're focusing on the in-group out-group moderation
50:16and the systematic versus random error distinction
50:23but throughout all of that i really wanted to
50:27create a framework to kind of fit not just
50:30the kind of evidence i
50:33have but evidence that berkeley dietvorst has
50:35from his work
50:38and adam waytz has some
50:42really lovely work that's definitely related
50:45and so i started thinking well we're
50:48really just looking at
50:50people's lay
50:53perceptions and expectations
50:55of what algorithmic judgment
50:59can produce in terms of accuracy
51:01compared to human judgment
51:03and as i was thinking through that i
51:05thought i had a different graphic here
51:07oh um so from the dissertation i kind of developed a
51:10theoretical framework to
51:14help kind of make my research more
51:16systematic but hopefully it could also be
51:18useful for other scholars who are
51:21interested in this area
51:22um so the idea for this framework is that
51:26the research related to this would
51:28document how people expect
51:30human and algorithmic judgment to differ
51:34in their input their process and their output and
51:37algorithm appreciation really focused on
51:39how are people responding to the output
51:41and i really just think that's the tip of the iceberg
51:43um with algorithmic hiring and people
51:45thinking about systematic versus random error
51:49i think that that does dive in a little deeper
51:52and um getting back to alicia's comment
51:56about what might be the mechanism
51:58between the high and low competition
52:00i actually think there if we turn to
52:02looking at people's expectations of
52:05the input to algorithmic and human judgment
52:08when there's a lot of data versus a
52:11small amount of data
52:12and how that influences the judgments
52:16that's where we'll really start to
52:17leverage people's expectations for what
52:21the algorithm and human judgment can do with the
52:24data that each can use as input
52:31one thing that i was kind of interested
52:35in was just kind of creating some
52:37predictions of what i thought
52:38could be going on here just from what i
52:41was learning from my own evidence
52:44in my research and so you can imagine
52:47these are basically predictions
52:49that i have for people's lay
52:51perceptions and expectations of
52:53algorithmic and human judgment like at
52:55their finest what they can get us
52:56so you can imagine that people expect that an
53:00algorithm utilizes data that is less
53:07abstract and more categorical and that
53:10algorithms can't utilize data that's abstract
53:14or intangible and i think nate in his
53:16paper had talked about a similar idea
53:19which i think very much relates to
53:21this but whenever i ask
53:22my undergrads well how would you want to
53:24be hired when you're on the job market
53:26they say i don't want an algorithm
53:28hiring me because it doesn't understand me
53:29i will translate this
53:32from what they say but basically they
53:33make an argument like
53:35the algorithm won't understand how
53:37special a snowflake i
53:38really am because i have a
53:42really wonderful personality and humor
53:43and all these things and
53:45maybe they list things that aren't even
53:46related to job performance
53:48right but i think people have this belief that
53:52there's certain input that an algorithm
53:56can't attend to as well as a human and i
53:58think that that's definitely a worthwhile
54:01question to test and so in terms of quantity
54:04i could imagine i alluded to it earlier
54:06that people expect algorithms
54:08to utilize larger
54:09amounts of data as input so that's a
54:11little bit separate from the efficiency argument and
54:14it would be kind of interesting to disentangle
54:17and so in terms of process for
54:19people's expectations of quality
54:23i'm predicting that people might
54:25think that algorithms process cues
54:26less holistically without taking broader information
54:30or even context into account and then in
54:32terms of quantity algorithms process
54:34fewer categories of cues so
54:37they might be able to
54:41take into account your scores on an
54:44anagram or trivia test
54:49a numeric score on an objective outcome
54:52and i would consider that maybe one category
54:55and when you talk to students who are in
54:57the midst of their recruiting process
54:59they talk about all these other types of
55:01categories of cues like being able to
55:03get along on a team that's a totally different
55:06category and so people might see algorithms as
55:09processing fewer categories of cues so
55:12they can focus on the objective criteria
55:14but there's other categories out there
55:16that they might not be able to consider
55:21with output people might expect that
55:23algorithms can't provide
55:24an explanation behind their judgments
55:27which ends up counting against them
55:30from an algorithm they might expect
55:34less relevant data for an individual so
55:36this idea of the special snowflake
55:39kind of a recommendation for the average
55:43people might have the idea that algorithms make
55:47recommendations for things that people like on average
55:50where i might think i have quirky taste
55:53in music but the algorithm can
55:57predict that everyone
55:58on average kind of likes taylor swift
56:02even if i think i have a unique taste and then
56:04finally people might also expect that
56:06algorithms produce less output
56:08at a time so a person can provide
56:10information and an explanation
56:12and kind of separate from what's actually
56:16out there in the real world right
56:17people might not expect an algorithm
56:20to have a conversation with them about
56:21say a medical diagnosis
56:23where maybe an algorithm could diagnose them
56:26but if they want to ask follow-up
56:27questions they might feel more
56:28comfortable with the person
56:30and so i've kind of taken this framework
56:34of theory of machine and tried to expand
56:36it even more actually so not just
56:37thinking about input process and output
56:39in this book chapter i wrote
56:42it's currently under review but
56:43i'm happy to share it if anyone is
56:45interested and if there might be
56:47at least one person who reads a book
56:49chapter that would be
56:50great i'd love to have a discussion
56:55i'm interested in kind of overlaying the input process
56:58output framework with whether
57:01the decision or judgment is in the
57:04context of making a prediction
57:06or is it an assessment so there you have assessment
57:10related to algorithmic hiring and
57:12feedback i think even draws in more with feedback
57:15you have different goals than you would
57:17have making a prediction right so with
57:19feedback you might also want motivation
57:22you might want to know that the feedback provides
57:25useful information that is actually
57:27actionable so that people can make improvements
57:30um and i think that if you can kind of build
57:33a matrix of input process output and
57:35people's expectations for these
57:36relative to whether they're in these different contexts
57:40um well i kind of came up with
57:43predictions that i thought might be
57:44interesting and wrote them up in this
57:46chapter and then the last thing i'll say is
57:50i'm hoping that this framework
57:52especially in an area where
57:55there's been more and more attention
57:56over i've seen the last
57:58five six years or so hopefully this can
58:03systematically bring different
58:06individual researchers' evidence into
58:10a more formal kind of matrix of how their findings
58:12might reconcile and fit together
58:15so that we do have an overarching theory that
58:19people are really kind of currently
58:21building so just lay people out in the
58:23world the more that we experience
58:25output from data analytics and
58:28recommendations from algorithms
58:33for any recommendation that can be made
58:37there's an algorithm being built to try to
58:39make that recommendation and i think
58:41the more in the real world we're coming
58:44into contact with
58:45this new source of advice we're actually
58:49in a really exciting time in human history where
58:53lay people are developing their theory of machine
58:57um and my kind of last plug for this
59:01i've been thinking a lot about some
59:03fun discussions i've had with computer scientists
59:07in the before times i would talk
59:10to c-suite individuals
59:13in exec ed and they would say oh
59:16could you tell me the five questions i
59:18should ask my data analytics team and i'd say
59:21i'm not at coca-cola i don't know
59:25the context to be able to tell you
59:27five general questions you should ask
59:30context does matter um and they
59:34wanted to know how they could
59:35communicate with their data analytics
59:37team and computer scientists
59:39more effectively
59:42then a month later i would go over to
59:45the computer science department
59:46and give a talk on algorithm
59:49appreciation with computer scientists
59:52and they were a little bit more hesitant
59:54to come out and say this but they
59:55would basically tell me that they
59:58wanted to know how they could more effectively
01:00:01share the results that they were getting
01:00:03from their analyses
01:00:05and they wanted to make sure that people
01:00:07would actually listen to this so
01:00:09we have silos in organizations which
01:00:11end up becoming reflected in silos on
01:00:14university campuses where there's the decision makers
01:00:18who want to know the questions to ask
01:00:20the analytics teams and there's the computer scientists
01:00:24who want to know what verbiage is most effective
01:00:28so that people will understand their
01:00:30output and then the decision makers will
01:00:32actually act on that
01:00:33so i've seen kind of a disconnect between
01:00:36producing analytics
01:00:38versus acting on them and oftentimes
01:00:41not to generalize too much but
01:00:43oftentimes people who are producing the analytics
01:00:45just kind of end there right they're not asking
01:00:50well is anyone acting on the analytics
01:00:53i've shared with them like is this
01:00:54changing decision making and that's where
01:00:57behavioral scientists and psychologists
01:01:00especially in the business school i'm
01:01:02in a management department
01:01:04are in a really great position we're poised
01:01:08to kind of pull together computer science
01:01:12human computer interaction and psychology
01:01:15and think about well how do people
01:01:17actually respond to algorithmic versus human judgment
01:01:20how do we solve this last mile problem
01:01:24of the data being there but how to make it
01:01:26actionable and so i started thinking
01:01:28about this kind of last part of the pipeline
01:01:31every step of actually creating an algorithm
01:01:36requires a decision and wherever there's
01:01:38a decision psychology can say something
01:01:40about that which is actually pretty fun
01:01:42so in the stage of preparing to build
01:01:45computer scientists should be
01:01:48asking like is this data relevant to the
01:01:51prediction i'm making or the decision or
01:01:53judgment i'm trying to make
01:01:54is it biased in any way so amazon
01:01:58was using 500 models in their hiring process to
01:02:04make their hiring more efficient and
01:02:06a few years ago they basically walked away
01:02:09because they were finding that they were
01:02:11only hiring males and not hiring females
01:02:14but i wrote an hbr piece arguing that
01:02:17they kind of threw the baby out with the bathwater
01:02:20because when you just walk away from the
01:02:22models you've developed
01:02:24to predict who will be a good performer
01:02:26well when you throw that out what's left
01:02:28you're reverting back to human judgment
01:02:30which we know is riddled with biases
01:02:32and i would say the thing that we
01:02:34learned from amazon's
01:02:36failure a few years back
01:02:37is that we actually uncovered the language
01:02:41in resumes that led an algorithm to hire mostly men
01:02:45so yes the output was biased because
01:02:49the input was biased
01:02:51the historic data was biased their
01:02:54past hiring managers
01:02:55were hiring more men than women one
01:02:57that's useful to flag
01:02:59and two you shouldn't just throw the baby out
01:03:01with the bathwater
01:03:02in an attempt to kind of distance
01:03:04yourself from that bias
01:03:06what i kind of make the
01:03:08argument for in that paper
01:03:10it's kind of a thought piece in hbr is
01:03:13using algorithms as magnifying glasses and
01:03:15when you're at the step of preparing to
01:03:17build the algorithm
01:03:19what amazon ended up learning from this was that
01:03:22the resumes that were more likely to be selected
01:03:27were resumes that used certain words
01:03:31and other kinds of confidence-laden language
01:03:35very close to kind of warfare terms
01:03:38and that confidence-laden
01:03:40language was
01:03:42strongly correlated with
01:03:44gender so i tell my students
01:03:47use these words make the playing
01:03:50field a little more
01:03:51even but amazon i mean they
01:03:54haven't given up on algorithmic hiring
01:03:56they're back to it especially
01:03:58now with covid but what amazon could do
01:04:02if they wanted to use those same models is remove the
01:04:07adjectives that have really little
01:04:10to no predictive value in terms of
01:04:12people's performance so
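To make that debiasing step concrete, here is a hedged sketch rather than Amazon's actual pipeline: the flagged word list is hypothetical, since the specific terms are not named in the talk. The idea is simply to filter low-predictive-value, confidence-laden words out of resume text before a model scores it:

```python
import string

# Hypothetical confidence-laden words with little to no predictive value
# for job performance; the actual terms Amazon identified are not listed here.
LOW_VALUE_WORDS = {"aggressively", "dominant", "fearless"}

def strip_low_value_words(resume_text):
    """Drop flagged words from resume text before it is fed to a scoring model."""
    kept = []
    for word in resume_text.split():
        # Compare ignoring case and surrounding punctuation.
        if word.strip(string.punctuation).lower() not in LOW_VALUE_WORDS:
            kept.append(word)
    return " ".join(kept)

print(strip_low_value_words("Aggressively grew a dominant, fearless sales team"))
# grew a sales team
```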
01:04:14anytime you are making decisions about the
01:04:19input that you want to use for your
01:04:20algorithm i think
01:04:22what we already know from different
01:04:27psychological literatures we can apply there
01:04:30and even ask new questions too and then in
01:04:32building the algorithm
01:04:34i think it's important to think about
01:04:36who is actually building an algorithm
01:04:38if it was mostly males building hiring
01:04:40algorithms it shouldn't be surprising
01:04:42that mostly males are getting hired
01:04:43once you have more diversity in the
01:04:46team that's building the algorithm more
01:04:49women a more culturally diverse set of people
01:04:52you are also going to help think of
01:04:56questions that maybe
01:04:57someone from a certain perspective might
01:04:59not have considered before right
01:05:00um and then interpreting output from the
01:05:03algorithm i think one of the hot topics
01:05:05that's just going to keep growing is
01:05:07auditing algorithms and that's not to
01:05:10say that auditing should only happen
01:05:12once the algorithm is built i think the
01:05:14whole point of auditing
01:05:15is you're going through a process before
01:05:18you actually launch this algorithm
01:05:20and start to use its judgments but all of these steps
01:05:23are i think really ripe for asking new
01:05:26empirical research questions so as
01:05:29industry is kind of grappling with that
01:05:31i think that's an opportunity for
01:05:33researchers and academics to also
01:05:35ask those questions and test them
01:05:38so thank you so much i'm looking forward
01:05:40to hearing your thoughts