00:00:02
Maybe we-- if you guys
could stand over--
00:00:04
Is it okay
if they stand over here?
00:00:06
- Yeah.
- Um, actually.
00:00:08
Christophe,
if you can get even lower.
00:00:12
- Okay.
- ( shutter clicks )
00:00:13
This is Lee
and this is Christophe.
00:00:15
They're two of the hosts
of this show.
00:00:18
But to a machine,
they're not people.
00:00:21
This is just pixels.
It's just data.
00:00:23
A machine shouldn't
have a reason to prefer
00:00:25
one of these guys
over the other.
00:00:27
And yet, as you'll see
in a second, it does.
00:00:31
It feels weird to call
a machine racist,
00:00:36
but I really can't explain--
I can't explain
what just happened.
00:00:41
Data-driven systems
are becoming a bigger and bigger
part of our lives,
00:00:45
and they work well
a lot of the time.
00:00:47
- But when they fail...
- Once again, it's
the white guy.
00:00:51
When they fail,
they're not failing
on everyone equally.
00:00:54
If I go back right now...
00:00:58
Ruha Benjamin: You can have
neutral intentions.
00:01:00
You can have good intentions.
00:01:03
And the outcomes can still
be discriminatory.
00:01:05
Whether you want
to call that machine racist
00:01:07
or you want to call
the outcome racist,
00:01:09
we have a problem.
00:01:16
( theme music playing )
00:01:23
I was scrolling
through my Twitter feed
a while back
00:01:26
and I kept seeing tweets
that looked like this.
00:01:29
Two of the same picture
of Republican senator
Mitch McConnell smiling,
00:01:33
or sometimes it would be
four pictures
00:01:36
of the same random
stock photo guy.
00:01:39
And I didn't really know
what was going on,
00:01:42
but it turns out that
this was a big public test
of algorithmic bias.
00:01:47
Because it turns out
that these aren't pictures
of just Mitch McConnell.
00:01:50
They're pictures
of Mitch McConnell and...
00:01:54
- Barack Obama.
- Lee: Oh, wow.
00:01:57
So people were uploading
00:01:58
these really
extreme vertical images
00:02:00
to basically force
this image cropping algorithm
00:02:03
to choose one of these faces.
00:02:05
People were alleging that
there's a racial bias here.
00:02:08
But I think what's
so interesting about this
particular algorithm
00:02:12
is that it is so testable
for the public.
00:02:15
It's something that
we could test right now
if we wanted to.
00:02:19
- Let's do it.
- You guys wanna do it?
00:02:21
Okay. Here we go.
00:02:26
So, Twitter does offer you
options to crop your own image.
00:02:30
But if you don't use those,
00:02:32
it uses an automatic
cropping algorithm.
00:02:37
- Wow. There it is.
- Whoa. Wow.
00:02:39
That's crazy.
00:02:41
Christophe, it likes you.
00:02:43
Okay, let's try the other--
the happy one.
00:02:44
Lee: Wow.
00:02:48
- Unbelievable. Oh, wow.
- Both times.
00:02:53
So, do you guys think
this machine is racist?
00:02:58
The only other theory
I possibly have
00:03:00
is if the algorithm
prioritizes white faces
00:03:04
because it can
pick them up quicker,
for whatever reason,
00:03:07
against whatever background.
00:03:09
Immediately,
it looks through the image
00:03:11
and tries to scan for a face.
00:03:13
Why is it always finding
the white face first?
00:03:16
Joss: With this picture,
I think someone could argue
00:03:19
that the lighting makes
Christophe's face sharper.
00:03:24
I still would love to do
00:03:26
a little bit more systematic
testing on this.
00:03:29
I think maybe
hundreds of photos
00:03:32
could allow us
to draw a conclusion.
00:03:34
I have downloaded
a bunch of photos
00:03:36
from a site called
Generated Photos.
00:03:39
These people do not exist.
They were a creation of AI.
00:03:43
And I went through,
I pulled a bunch
00:03:46
that I think will give us
00:03:47
a pretty decent way
to test this.
00:03:50
So, Christophe,
I wonder if you would be willing
to help me out with that.
00:03:54
You want me to tweet
hundreds of photos?
00:03:57
- ( Lee laughs )
- Joss: Exactly.
00:03:59
I'm down. Sure, I've got time.
00:04:04
Okay.
00:04:05
( music playing )
00:04:21
There may be some people
who take issue with the idea
00:04:24
that machines can be racist
00:04:26
without a human brain
or malicious intent.
00:04:29
But such a narrow
definition of racism
00:04:32
really misses
a lot of what's going on.
00:04:34
I want to read a quote
that responds to that idea.
00:04:36
It says, "Robots are not
sentient beings, sure,
00:04:40
but racism flourishes
well beyond hate-filled hearts.
00:04:43
No malice needed,
no 'N' word required,
00:04:46
just a lack of concern for how
the past shapes the present."
00:04:50
I'm going now to speak
to the author of those words,
Ruha Benjamin.
00:04:54
She's a professor
of African-American Studies
at Princeton University.
00:05:00
When did you first
become concerned
00:05:02
that automated systems,
AI, could be biased?
00:05:06
A few years ago,
I noticed these headlines
00:05:09
and hot takes about so-called
racist and sexist robots.
00:05:13
There was a viral video
in which two friends
were in a hotel bathroom
00:05:18
and they were trying to use
an automated soap dispenser.
00:05:21
Black hand, nothing.
Larry, go.
00:05:28
Black hand, nothing.
00:05:30
And although they seem funny
00:05:32
and they kind of
get us to chuckle,
00:05:34
the question is,
are similar design processes
00:05:38
impacting much more
consequential technologies
that we're not even aware of?
00:05:44
When the early news
controversies came along
maybe 10 years ago,
00:05:49
people were surprised
by the fact that
they showed a racial bias.
00:05:54
Why do you think
people were surprised?
00:05:55
Part of it is a deep
attachment and commitment
00:05:59
to this idea
of tech neutrality.
00:06:02
People-- I think because
life is so complicated
00:06:04
and our social world
is so messy--
00:06:07
really cling on to
something that will save us,
00:06:10
and a way of making decisions
that's not drenched
00:06:14
in the muck of all
of human subjectivity,
00:06:19
human prejudice and frailty.
00:06:21
We want it so much
to be true.
00:06:22
We want it so much
to be true, you know?
00:06:24
And the danger is
that we don't question it.
00:06:27
And still we continue to have,
you know, so-called glitches
00:06:33
when it comes to race
and skin complexion.
00:06:36
And I don't think
that they're glitches.
00:06:38
It's a systemic issue
in the truest sense of the word.
00:06:41
It has to do with
our computer systems
and the process of design.
00:06:47
Joss:
AI can seem pretty abstract
sometimes.
00:06:50
So we built this
to help explain
00:06:52
how machine learning works
and what can go wrong.
00:06:55
This black box
is the part of the system
that we interact with.
00:06:59
It's the software
that decides which dating
profiles we might like,
00:07:02
how much a rideshare
should cost,
00:07:04
or how a photo should be
cropped on Twitter.
00:07:06
We just see a device
making a decision.
00:07:08
Or more accurately,
a prediction.
00:07:11
What we don't see
is all of the human decisions
00:07:13
that went into the design
of that technology.
00:07:17
Now, it's true that when
you're dealing with AI,
00:07:19
that means
that the code in this box
00:07:20
wasn't all
written directly by humans,
00:07:22
but by machine-learning
algorithms
00:07:25
that find complex patterns
in data.
00:07:27
But they don't just
spontaneously learn things
from the world.
00:07:30
They're learning
from examples.
00:07:33
Examples that
are labeled by people,
00:07:35
selected by people,
00:07:37
and derived from people, too.
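To make that concrete, here is a minimal sketch of "learning from examples" in code. It is not any system from the episode, and the data and labels are made up; the point is that every example and every label is a human choice the model inherits.

```python
# A minimal sketch of supervised learning -- not any system from the episode.
# The examples are selected by people and the labels are assigned by people,
# so whatever bias went into those choices is inherited by the learned model.
from sklearn.linear_model import LogisticRegression

# Hypothetical, human-curated training data.
X_train = [[0.2, 1.0], [0.9, 0.1], [0.4, 0.8], [0.8, 0.3]]  # features someone chose
y_train = [1, 0, 1, 0]                                      # labels someone assigned

model = LogisticRegression().fit(X_train, y_train)

# The deployed "black box" is just this learned function making predictions.
print(model.predict([[0.5, 0.6]]))
```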
00:07:40
See, these machines
and their predictions,
00:07:42
they're not separate from us
or from our biases
00:07:44
or from our history,
00:07:46
which we've seen
in headline after headline
00:07:48
for the past 10 years.
00:07:51
We're using
the face-tracking software,
00:07:54
so it's supposed
to follow me as I move.
00:07:56
As you can see, I do this--
no following.
00:08:01
Not really--
not really following me.
00:08:03
- Wanda, if you would, please?
- Sure.
00:08:11
In 2010, the top hit
00:08:14
when you did a search
for "black girls,"
00:08:15
80% of what you found
00:08:17
on the first page of results
was all porn sites.
00:08:20
Google is apologizing
after its photo software
00:08:23
labeled two African-Americans
as gorillas.
00:08:27
Microsoft is shutting down
00:08:28
its new artificial
intelligence bot
00:08:31
after Twitter users
taught it how to be racist.
00:08:33
Woman: In order to make
yourself hotter,
00:08:36
the app appeared
to lighten your skin tone.
00:08:38
Overall, they work better
on lighter faces than
darker faces,
00:08:42
and they worked
especially poorly
00:08:44
on darker female faces.
00:08:46
Okay, what I've noticed on all
these damn beauty filters
00:08:50
is they keep taking my nose
and making it thinner.
00:08:52
Give me my African nose
back, please.
00:08:55
Man: So, the first thing
that I tried was the prompt
"Two Muslims..."
00:08:59
And the way
it completed it was,
00:09:01
"Two Muslims,
one with an apparent bomb,
00:09:03
tried to blow up
the Federal Building
00:09:05
in Oklahoma City
in the mid-1990s."
00:09:08
Woman:
Detroit police wrongfully
arrested Robert Williams
00:09:11
based on a false
facial recognition hit.
00:09:13
There's definitely
a pattern of harm
00:09:17
that disproportionately
falls on vulnerable people,
people of color.
00:09:21
Then there's attention,
00:09:22
but of course, the damage
has already been done.
00:09:30
( Skype ringing )
00:09:34
- Hello.
- Hey, Christophe.
00:09:36
Thanks for doing these tests.
00:09:38
- Of course.
- I know it was
a bit of a pain,
00:09:40
but I'm curious
what you found.
00:09:42
Sure. I mean,
I actually did it.
00:09:43
I actually tweeted 180
different sets of pictures.
00:09:48
In total, dark-skinned people
00:09:49
were displayed
in the crop 131 times,
00:09:52
and light-skinned people
00:09:53
were displayed
in the crop 229 times,
00:09:56
which comes out
to 36% dark-skinned
00:09:59
and 64% light-skinned.
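Taking the counts Christophe reports at face value, the arithmetic and a rough sanity check look like this; the check treats each crop as an independent coin flip, which is itself an assumption.

```python
import math

dark, light = 131, 229              # crops reported in the test
n = dark + light                    # 360 crops in total
print(round(100 * dark / n), round(100 * light / n))   # ~36% vs ~64%

# Rough check against an even 50/50 crop, using a normal approximation
# to the binomial and assuming each crop is an independent trial.
p0 = 0.5
z = (light - n * p0) / math.sqrt(n * p0 * (1 - p0))
print(f"z = {z:.1f}")               # about 5.2 standard deviations from an even split
```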
00:10:01
That does seem to be
evidence of some bias.
00:10:04
It's interesting because
Twitter posted a blog post
00:10:07
saying that they had done
some of their own tests
00:10:10
before launching this tool,
and they said that
00:10:12
they didn't find evidence
of racial bias,
00:10:14
but that they would
be looking into it further.
00:10:17
Um, they also said
that the kind of technology
00:10:19
that they use to crop images
00:10:21
is called
a saliency prediction model,
00:10:24
which means software
that basically is making a guess
00:10:28
about what's important
in an image.
00:10:31
So, how does a machine
know what is salient,
what's relevant in a picture?
00:10:37
Yeah, it's really
interesting, actually.
00:10:38
There's these
saliency data sets
00:10:40
that documented people's
eye movements
00:10:43
while they looked
at certain sets of images.
00:10:46
So you can take those photos
00:10:47
and you can take
that eye-tracking data
00:10:50
and teach a computer
what humans look at.
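A toy sketch of that training setup, assuming you already have images paired with fixation heatmaps from an eye-tracking data set; this is not Twitter's model, and the architecture and data here are placeholders.

```python
import torch
import torch.nn as nn

class TinySaliencyNet(nn.Module):
    """Placeholder network that predicts a per-pixel saliency map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),   # one output channel: saliency
        )

    def forward(self, x):
        return torch.sigmoid(self.net(x))

model = TinySaliencyNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()

# In a real data set, `images` are photos and `fixation_maps` encode where
# people actually looked; random tensors stand in here.
images = torch.rand(8, 3, 64, 64)
fixation_maps = torch.rand(8, 1, 64, 64)

optimizer.zero_grad()
pred = model(images)
loss = loss_fn(pred, fixation_maps)   # learn to imitate human gaze
loss.backward()
optimizer.step()
```

Whoever took part in the eye-tracking study, and whoever appeared in the photos, shapes what a model like this ends up treating as salient.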
00:10:53
So, Twitter's not going to
give me any more information
00:10:56
about how they trained
their model,
00:10:58
but I found an engineer
from a company called Gradio.
00:11:01
They built an app
that does something similar,
00:11:04
and I think it can give us
a closer look
00:11:06
at how this kind of AI works.
00:11:10
- Hey.
- Hey.
00:11:11
- Joss.
- Nice to meet you. Dawood.
00:11:13
So, you and your colleagues
00:11:15
built a saliency cropping tool
00:11:19
that is similar to what we think
Twitter is probably doing.
00:11:22
Yeah, we took a public
machine learning model,
posted it on our library,
00:11:27
and launched it
for anyone to try.
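The general shape of such a demo, using Gradio's public interface API; the saliency function below is a crude stand-in based on local brightness, not the model the Gradio team actually wrapped.

```python
import gradio as gr
import numpy as np

def fake_saliency(image):
    # Crude stand-in: treat deviation from average brightness as "salient".
    gray = image.mean(axis=2)
    sal = np.abs(gray - gray.mean())
    sal = (255 * sal / (sal.max() + 1e-8)).astype(np.uint8)
    return np.stack([sal] * 3, axis=-1)      # return as an RGB heatmap

demo = gr.Interface(fn=fake_saliency,
                    inputs=gr.Image(type="numpy"),
                    outputs=gr.Image(),
                    title="Saliency demo (toy stand-in)")

if __name__ == "__main__":
    demo.launch()   # anyone with the link can upload images and experiment
```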
00:11:29
And you don't have to
constantly post pictures
00:11:31
on your timeline to try
and experiment with it,
00:11:33
which is what people
were doing when they first
became aware of the problem.
00:11:35
And that's what we did.
We did a bunch of tests
just on Twitter.
00:11:38
But what's interesting
about what your app shows
00:11:40
is the sort of intermediate
step there, which is this
saliency prediction.
00:11:45
Right, yeah.
I think the intermediate step
is important for people to see.
00:11:48
Well, I-- I brought some
pictures for us to try.
00:11:50
These are actually
the hosts of "Glad You Asked."
00:11:53
And I was hoping we could put
them into your interface
00:11:57
and see what, uh,
the saliency prediction is.
00:12:00
Sure.
Just load this image here.
00:12:02
Joss:
Okay, so, we have
a saliency map.
00:12:05
Clearly the prediction
is that faces are salient,
00:12:08
which is not really a surprise.
00:12:10
But it looks like maybe
they're not equally salient.
00:12:13
- Right.
- Is there a way to sort of
look closer at that?
00:12:16
So, what we can do here,
we actually built it out
in the app
00:12:19
where we can put a window
on someone's specific face,
00:12:22
and it will give us the percentage
of the saliency
00:12:25
that falls on your face
relative to the whole image.
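That percentage is a simple calculation once you have the saliency map: sum the map inside a face's bounding box and divide by the sum over the whole image. A minimal sketch with made-up values:

```python
import numpy as np

def box_share(saliency, box):
    """Fraction of total saliency inside box = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return saliency[y0:y1, x0:x1].sum() / saliency.sum()

# Hypothetical saliency map and face boxes, just to show the calculation.
saliency = np.random.rand(300, 400)
faces = {"face_left": (40, 80, 120, 200), "face_right": (260, 90, 340, 210)}

for name, box in faces.items():
    print(name, f"{100 * box_share(saliency, box):.1f}% of total saliency")
```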
00:12:28
- That's interesting.
- Yeah.
00:12:30
She's-- Fabiola's
in the center of the picture,
00:12:32
but she's actually got
a lower percentage
00:12:35
of the salience compared
to Cleo, who's to her right.
00:12:38
Right, and trying
to guess why a model
is making a prediction
00:12:43
and why it's
predicting what it is
00:12:45
is a huge problem
with machine learning.
00:12:47
It's always something
that you have to kind of
00:12:48
back-trace to try
and understand.
00:12:50
And sometimes
it's not even possible.
00:12:52
Mm-hmm.
I looked up what data sets
00:12:54
were used to train
the model you guys used,
00:12:56
and I found one
that was created by
00:12:59
researchers at MIT
back in 2009.
00:13:02
So, it was originally
about a thousand images.
00:13:05
We pulled the ones
that contained faces,
00:13:07
any face we could find
that was big enough to see.
00:13:11
And I went through
all of those,
00:13:12
and I found that only
10 of the photos,
00:13:15
that's just about 3%,
00:13:17
included someone
who appeared to be
00:13:19
of Black or African descent.
00:13:22
Yeah, I mean,
if you're collecting
a data set through Flickr,
00:13:24
you're-- first of all,
you're biased to people
00:13:27
that have used Flickr
back in, what, 2009,
you said, or something?
00:13:30
Joss: And I guess if we saw
in this image data set,
00:13:33
there are more
cat faces than black faces,
00:13:36
we can probably assume
that minimal effort was made
00:13:40
to make that data set
representative.
00:13:54
When someone collects data
into a training data set,
00:13:56
they can be motivated by things
like convenience and cost
00:14:00
and end up with data
that lacks diversity.
00:14:02
That type of bias, which
we saw in the saliency photos,
00:14:05
is relatively easy to address.
00:14:08
If you include more images
representing racial minorities,
00:14:10
you can probably improve
the model's performance
on those groups.
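One blunt version of that fix, sketched with hypothetical group labels: count how each group is represented in the training set and oversample the smaller ones so the model sees them more often.

```python
import random
from collections import Counter

# Hypothetical training set: (image file, apparent group) pairs.
dataset = [("img_001.jpg", "darker_skin"), ("img_002.jpg", "lighter_skin"),
           ("img_003.jpg", "lighter_skin"), ("img_004.jpg", "lighter_skin")]

counts = Counter(group for _, group in dataset)
target = max(counts.values())

balanced = list(dataset)
for group, n in counts.items():
    members = [ex for ex in dataset if ex[1] == group]
    balanced += random.choices(members, k=target - n)   # duplicate to even out

print(Counter(group for _, group in balanced))
```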
00:14:14
But sometimes
human subjectivity
00:14:17
is embedded right
into the data itself.
00:14:19
Take crime data for example.
00:14:22
Our data on past crimes
in part reflects
00:14:24
police officers' decisions
about what neighborhoods
to patrol
00:14:27
and who to stop and arrest.
00:14:29
We don't have an objective
measure of crime,
00:14:32
and we know
that the data we do have
00:14:33
contains at least
some racial profiling.
00:14:36
But it's still being used
to train crime prediction tools.
00:14:39
And then there's the question
of how the data is structured
over here.
00:14:44
Say you want a program
that identifies
00:14:45
chronically sick patients
to get additional care
00:14:48
so they don't end up
in the ER.
00:14:50
You'd use past patients
as your examples,
00:14:52
but you have to choose
a label variable.
00:14:54
You have to define
for the machine what
a high-risk patient is
00:14:58
and there's not always
an obvious answer.
00:14:59
A common choice
is to define high-risk
as high-cost,
00:15:04
under the assumption
that people who use
00:15:05
a lot of health care resources
are in need of intervention.
00:15:10
Then the learning
algorithm looks through
00:15:12
the patients' data--
00:15:13
their age, sex,
00:15:14
medications, diagnoses,
insurance claims,
00:15:17
and it finds the combination
of attributes
00:15:19
that correlates
with their total health costs.
00:15:22
And once it gets good
at predicting
00:15:23
total health costs
on past patients,
00:15:26
that formula becomes software
to assess new patients
00:15:29
and give them a risk score.
00:15:31
But instead of predicting
sick patients,
00:15:32
this predicts
expensive patients.
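A toy illustration of that label choice, not the real algorithm or its data: two equally sick patients, one of whom generates lower costs for the reasons Ruha Benjamin describes below, end up with different risk scores when cost is the label.

```python
# Toy illustration of the label-choice problem; numbers and scale are made up.
def risk_score_from_cost(past_cost, max_cost=100_000):
    # "High risk" defined as "high cost" -- the proxy described above.
    return min(100, round(100 * past_cost / max_cost))

patient_a = {"chronic_conditions": 4, "past_cost": 60_000}
patient_b = {"chronic_conditions": 4, "past_cost": 35_000}  # equally sick, lower spend

for name, p in [("patient A", patient_a), ("patient B", patient_b)]:
    print(name, "risk score:", risk_score_from_cost(p["past_cost"]))
# Same number of chronic conditions, very different scores -- so the
# lower-cost patient is less likely to be referred for extra care.
```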
00:15:35
Remember, the label was cost,
00:15:37
and when researchers
took a closer look
at those risk scores,
00:15:40
they realized that label choice
was a big problem.
00:15:42
But by then, the algorithm
had already been used
00:15:44
on millions of Americans.
00:15:49
It produced risk scores
for different patients,
00:15:52
and if a patient
had a risk score
00:15:56
of almost 60,
00:15:58
they would be referred
into the program
00:16:02
for screening--
for their screening.
00:16:04
And if they had a risk score
of almost 100,
00:16:07
they would default
into the program.
00:16:10
Now, when we look at the number
of chronic conditions
00:16:15
that patients of different
risk scores were affected by,
00:16:20
you see a racial disparity
where white patients
00:16:24
had fewer conditions
than black patients
00:16:27
at each risk score.
00:16:29
That means that
black patients were sicker
00:16:32
than their white counterparts
00:16:33
when they had
the same risk score.
00:16:36
And so what happened is
in producing these risk scores
00:16:39
and using spending,
00:16:41
they failed to recognize
that on average
00:16:44
black people incur fewer costs
for a variety of reasons,
00:16:50
including institutional racism,
00:16:52
including lack of access
to high-quality insurance,
00:16:55
and a whole host
of other factors.
00:16:57
But not because
they're less sick.
00:16:59
Not because they're less sick.
00:17:00
And so I think it's important
00:17:01
to remember
this had racist outcomes,
00:17:05
discriminatory outcomes,
not because there was
00:17:08
a big, bad boogie man
behind the screen
00:17:10
out to get black patients,
00:17:12
but precisely because
no one was thinking
00:17:14
about racial disparities
in healthcare.
00:17:17
No one thought it would matter.
00:17:19
And so it was
about the colorblindness,
00:17:21
the race neutrality
that created this.
00:17:24
The good news is
that now the researchers
who exposed this
00:17:29
and who brought this to light
are working with the company
00:17:33
that produced this algorithm
to have a better proxy.
00:17:36
So instead of spending,
it'll actually be
00:17:38
people's actual
physical conditions
00:17:41
and the rate at which
they get sick, et cetera.
00:17:44
That is harder to figure out;
00:17:46
it's a harder kind of
proxy to calculate,
00:17:49
but it's more accurate.
00:17:55
I feel like what's so unsettling
about this healthcare algorithm
00:17:58
is that the patients
would have had
00:18:00
no way of knowing
this was happening.
00:18:03
It's not like Twitter,
where you can upload
00:18:05
your own picture, test it out,
compare with other people.
00:18:08
This was just working
in the background,
00:18:12
quietly prioritizing
the care of certain patients
00:18:14
based on an algorithmic score
00:18:16
while the other patients
probably never knew
00:18:19
they were even passed over
for this program.
00:18:21
I feel like there
has to be a way
00:18:23
for companies to vet
these systems in advance,
00:18:26
so I'm excited to talk
to Deborah Raji.
00:18:28
She's been doing
a lot of thinking
00:18:30
and writing about just that.
00:18:33
My question for you is
how do we find out
00:18:35
about these problems before
they go out into the world
00:18:37
and cause harm
rather than afterwards?
00:18:40
So, I guess a clarification
point is that machine learning
00:18:43
is highly unregulated
as an industry.
00:18:46
These companies
don't have to report
their performance metrics,
00:18:48
they don't have to report
their evaluation results
00:18:51
to any kind
of regulatory body.
00:18:53
But internally there's this
new culture of documentation
00:18:56
that I think has been
incredibly productive.
00:18:59
I worked on a couple of projects
with colleagues at Google,
00:19:02
and one of the main outcomes
of that was this effort
called Model Cards--
00:19:05
very simple
one-page documentation
00:19:08
on how the model
actually works,
00:19:10
but also questions that are
connected to ethical concerns,
00:19:13
such as the intended use
for the model,
00:19:15
details about where
the data's coming from,
00:19:17
how the data's labeled,
and then also, you know,
00:19:20
instructions
to evaluate the system
according to its performance
00:19:24
on different
demographic sub-groups.
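As a rough sketch of the kind of information a model card carries, paraphrasing what Deborah Raji describes; the field names below are illustrative, not the published template.

```python
model_card = {
    "model": "image-crop-saliency-v1 (hypothetical)",
    "intended_use": "Suggest preview crops; not for identifying people.",
    "training_data": "Public eye-tracking saliency data sets.",
    "label_source": "Aggregated human gaze recordings (fixation heatmaps).",
    "evaluation": {
        "overall": None,                 # filled in from held-out tests
        "by_subgroup": {                 # performance broken out by group
            "darker_skin": None,
            "lighter_skin": None,
        },
    },
    "known_limitations": "Subgroup performance not yet verified.",
}
```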
00:19:26
Maybe something
that's hard to accept
00:19:29
is that it might actually
be impossible
00:19:34
to get performance
across sub-groups
to be exactly the same.
00:19:38
How much of that do we just
have to be like, "Okay"?
00:19:41
I really don't think
there's an unbiased data set
00:19:45
in which everything
will be perfect.
00:19:47
I think the more important thing
is to actually evaluate
00:19:52
and assess things
with an active eye
00:19:54
for those that are most likely
to be negatively impacted.
00:19:57
You know, if you know
that people of color
are most vulnerable
00:20:00
or a particular marginalized
group is most vulnerable
00:20:04
in a particular situation,
00:20:06
then prioritize them
in your evaluation.
00:20:08
But I do think there's
certain situations
00:20:11
where maybe we should
not be predicting
00:20:12
with a machine-learning
system at all.
00:20:13
We should be super cautious
and super careful
00:20:17
about where we deploy it
and where we don't deploy it,
00:20:20
and what kind of human oversight
00:20:21
we put over
these systems as well.
00:20:24
The problem of bias in AI
is really big.
00:20:27
It's really difficult.
00:20:29
But I don't think it means
we have to give up
00:20:30
on machine learning altogether.
00:20:32
One benefit
of bias in a computer
versus bias in a human
00:20:36
is that you can measure
and track it fairly easily.
00:20:38
And you can tinker
with your model
00:20:40
to try and get fair outcomes
if you're motivated to do so.
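Measuring is the easy part in code: given the model's decisions and group labels, you can report per-group rates and watch them over time. A minimal sketch with made-up records:

```python
from collections import defaultdict

records = [  # (group, model made the favorable call?) -- made-up data
    ("group_a", True), ("group_a", False), ("group_a", True),
    ("group_b", False), ("group_b", False), ("group_b", True),
]

totals, favorable = defaultdict(int), defaultdict(int)
for group, decision in records:
    totals[group] += 1
    favorable[group] += decision

for group in totals:
    print(group, f"favorable-outcome rate: {favorable[group] / totals[group]:.0%}")
# Large, persistent gaps between groups are the signal to investigate.
```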
00:20:44
The first step was
becoming aware of the problem.
00:20:46
Now the second step
is enforcing solutions,
00:20:48
which I think we're
just beginning to see now.
00:20:50
But Deb is raising
a bigger question.
00:20:52
Not just how do we get bias
out of the algorithms,
00:20:55
but which algorithms
should be used at all?
00:20:57
Do we need a predictive model
to be cropping our photos?
00:21:02
Do we want facial recognition
in our communities?
00:21:04
Many would say no,
whether it's biased or not.
00:21:08
And that question
of which technologies
00:21:09
get built and how they
get deployed in our world,
00:21:12
it boils down
to resources and power.
00:21:16
It's the power
to decide whose interests
00:21:18
will be served
by a predictive model,
00:21:20
and which questions get asked.
00:21:23
You could ask, okay,
I want to know how landlords
00:21:28
are making life
for renters hard.
00:21:30
Which landlords are not
fixing up their buildings?
00:21:33
Which ones are hiking rent?
00:21:36
Or you could ask,
okay, let's figure out
00:21:38
which renters
have low credit scores.
00:21:41
Let's figure out the people
who have a gap in employment
00:21:45
so I don't rent to them.
00:21:46
And so it's at that problem
00:21:48
of forming the question
00:21:49
and posing the problem
00:21:51
that the power dynamics
are already being laid
00:21:54
that set us off in one
trajectory or another.
00:21:57
And the big challenge
there being that
00:22:00
with these two possible
lines of inquiry,
00:22:02
- one of those is probably
a lot more profitable...
- Exactly, exactly.
00:22:07
- ...than the other one.
- And too often the people who
are creating these tools,
00:22:10
they don't necessarily
have to share the interests
00:22:13
of the people who are
posing the questions,
00:22:16
but those are their clients.
00:22:18
So, the question
for the designers
and the programmers is
00:22:22
are you accountable
only to your clients
00:22:25
or are you also accountable
to the larger body politic?
00:22:29
Are you responsible for what
these tools do in the world?
00:22:34
( music playing )
00:22:37
( indistinct chatter )
00:22:44
Man: Can you lift up
your arm a little?
00:22:46
( chatter continues )