00:00:01
[Music]
00:00:10
i'm going to talk today about um
00:00:12
a book that i recently wrote with
00:00:14
catherine d'ignazio who couldn't be here
00:00:16
today
00:00:16
but this is what she looks like
00:00:18
um she's an assistant professor of urban
00:00:21
science and planning at mit
00:00:23
and um i should just say if you're
00:00:25
interested in the book either now or
00:00:27
after the end of the talk you can
00:00:28
actually read it online it's available
00:00:30
open access
00:00:31
through this url data feminism dot io
00:00:36
so um i thought what i would do today is
00:00:39
talk a little bit about sort of our
00:00:41
motivation for writing the book which
00:00:42
hopefully will resonate with many of the
00:00:44
listeners
00:00:46
and then towards the end of my time
00:00:48
i'll summarize a little bit about what's
00:00:50
actually in the book
00:00:50
you know as you can tell by the
00:00:51
title data feminism is a book about what
00:00:54
feminism can contribute to data science
00:00:57
and i think
00:00:58
some people when they hear the title
00:00:59
they think i'm not really sure how
00:01:01
what huh um but hopefully by the end of
00:01:04
the talk uh you'll see sort of what we
00:01:06
were thinking and
00:01:07
uh better yet you'll believe it so we
00:01:09
see our book as contributing to
00:01:11
a growing body of work that is together
00:01:14
and collectively
00:01:15
holding corporate and government actors
00:01:17
accountable for
00:01:18
sexist racist classist data products so
00:01:22
you could think of things like
00:01:24
face detection systems that can't see
00:01:26
women of color
00:01:27
um hiring algorithms that demote
00:01:30
applicants that went to
00:01:32
all women's schools um search algorithms
00:01:34
that circulate negative stereotypes
00:01:36
about
00:01:37
black girls you could think about the
00:01:39
recent um
00:01:40
a-levels fiasco in the uk you know all
00:01:43
of these things and more
00:01:46
um but what we bring to this
00:01:47
conversation is this focus on feminism
00:01:50
and intersectional feminism in
00:01:51
particular
00:01:53
and before i sort of get to the main
00:01:55
argument of the book about why data
00:01:57
science needs feminism
00:01:58
i wanted to do just a very quick bit of
00:02:01
level setting
00:02:09
so everyone you know when they hear the
00:02:11
term feminism
00:02:12
uh sort of brings their own definition
00:02:14
to the table so we thought that we would
00:02:15
tell you about ours
00:02:18
so one definition comes actually from
00:02:20
beyonce
00:02:22
feminist a person who believes in
00:02:24
equal rights for men and women
00:02:25
and trans people and here's a second
00:02:28
definition feminism
00:02:29
organized activity on behalf of women
00:02:32
and trans people's
00:02:33
rights and interests so feminism in this
00:02:35
definition is also a political action
00:02:38
but feminism is also a set of theories
00:02:40
and ideas
00:02:41
um these theories began by thinking
00:02:43
through issues of inequality with
00:02:45
respect
00:02:46
to sex and gender but over the past 40
00:02:49
years both sort of in the academy and
00:02:50
then just reality
00:02:52
um have made people realize that there
00:02:54
need to be many many more dimensions of
00:02:56
inequality
00:02:57
in conversation with each other so these
00:02:59
include
00:03:00
sex and gender but also race class
00:03:04
sexual orientation ability and more
00:03:09
and this leads to the most important
00:03:11
take away just from the sort of brief
00:03:12
intro
00:03:13
on feminism which is that when you're
00:03:15
talking about feminism in the year 2020
00:03:18
it must be understood as intersectional
00:03:20
um
00:03:21
and this is a term coined by the legal
00:03:22
scholar kimberlé crenshaw
00:03:24
which she uses to
00:03:26
explain
00:03:27
how social inequality cannot be defined
00:03:30
by only one dimension of difference like
00:03:32
gender
00:03:32
so when we're talking about inequality
00:03:35
or oppression
00:03:36
we must be talking about the
00:03:37
intersection of the many factors and
00:03:40
forces that produce it
00:03:42
um so racism classism imperialism and so
00:03:44
on um
00:03:46
and the really key thing to understand
00:03:47
uh about intersectionality and this is
00:03:49
actually something that's often
00:03:50
overlooked is that
00:03:52
intersectionality doesn't just describe
00:03:54
markers of individual identity and their
00:03:56
effects
00:03:57
um it describes the structural forces of
00:04:00
power sort of the root cause
00:04:02
of um these inequalities and their
00:04:05
intersection
00:04:06
that create the effects that we
00:04:07
experience and it's really
00:04:09
the work of women of color feminists and
00:04:11
black feminists in particular
00:04:13
that has foregrounded this conversation
00:04:15
about structural forces
00:04:18
so just to sort of summarize this idea
00:04:20
of intersectional feminism which
00:04:21
provides us
00:04:22
with the underlying framework for our
00:04:24
book it's not just about women
00:04:27
it's not just about gender it's at its
00:04:30
core about
00:04:30
power it's about who has power and who
00:04:32
doesn't
00:04:33
and in today's world data is power
00:04:38
and so intersectional feminism when
00:04:40
applied to data science
00:04:42
can help that power be challenged and
00:04:44
changed
00:04:45
and our argument in the book is really
00:04:47
that data science needs feminism
00:04:49
and intersectional feminism in
00:04:51
particular
00:04:52
if we ever hope to overturn these power
00:04:55
imbalances
00:04:56
that we experience in our data sets and
00:04:58
our data systems
00:05:01
so um that's a little bit of the
00:05:03
background about the book
00:05:05
and our rationale for writing it but
00:05:08
what the book actually contains is these
00:05:10
seven principles of data feminism
00:05:12
and what catherine and i did is we you
00:05:15
know sat down and sort of asked
00:05:16
ourselves you know what have we learned
00:05:18
from all of our schooling in feminism
00:05:21
all of our experience in various
00:05:23
activist communities and other sort of
00:05:25
community groups that we've been a part
00:05:27
of
00:05:27
and we came up with these seven
00:05:29
principles that
00:05:33
encapsulate the most important aspects
00:05:35
of intersectional feminism as they
00:05:37
relate to data
00:05:38
and our goal here was really to
00:05:40
operationalize feminism for data science
00:05:42
so
00:05:43
to provide models that might guide the
00:05:45
people working with data
00:05:46
or who want to work with data or people
00:05:49
who want to refuse to work with data
00:05:52
so i'm just in the second half of the
00:05:54
talk going to do
00:05:56
uh three just quick examples um
00:05:59
so you get the sense of what we mean by
00:06:02
these principles and how we see them
00:06:03
play out
00:06:04
in again sort of data sets data systems
00:06:07
and data products
00:06:12
so in the book we tell the story of mimi
00:06:14
onuoha's efforts to collect what she
00:06:16
calls missing data sets
00:06:18
these are data sets that a reasonable
00:06:20
person might expect to exist
00:06:22
um you know like the number of citizens
00:06:23
killed by the police or the number of
00:06:25
women versus men with cases of
00:06:27
coronavirus
00:06:28
um but these data sets do not exist and
00:06:31
what onuoha does
00:06:32
is undertake an analysis of power this
00:06:35
is the first principle of data feminism
00:06:37
in her art project to ask why for
00:06:40
instance we have detailed data on things
00:06:42
like
00:06:43
the length of guinea pig teeth which we
00:06:45
do but we don't have data on
00:06:48
police violence but feminism also
00:06:51
involves action if you can remember that
00:06:53
second definition of feminism
00:06:55
and so in the chapter about challenging
00:06:57
power
00:06:58
we also describe ways to push back
00:07:00
against unequal power structures
00:07:03
in the data systems that we encounter
00:07:05
for example
00:07:06
the issue of femicide in mexico and
00:07:09
actually in pretty much every other
00:07:10
country
00:07:11
this is another case of missing data
00:07:12
sets and
00:07:14
in the book we tell the story of one
00:07:16
woman maría salguero
00:07:18
who resolved to head straight towards
00:07:20
this problem and collect the missing
00:07:22
data herself
00:07:23
um and this is what might be called a
00:07:25
feminist counter data collection
00:07:27
strategy so collecting counter data
00:07:29
in the absence of state or government or
00:07:32
institutional
00:07:34
desire or will to collect data on an
00:07:36
important issue
00:07:37
so if the state fails to collect data
00:07:40
you can collect counter data
00:07:42
in order to challenge that power and
00:07:44
there's sort of lots of caveats about
00:07:46
the good that data collection can do
00:07:48
because it's certainly not true
00:07:50
that more data is always better in all
00:07:53
cases
00:07:53
but you know for time's
00:07:55
sake i'm going to leave it there
00:07:59
but feminism doesn't just help us
00:08:01
identify issues to address
00:08:03
it also informs the process of data
00:08:05
science work
00:08:07
and this principle embrace pluralism
00:08:10
derives from the feminist philosopher
00:08:11
donna haraway's idea of situated
00:08:13
knowledge
00:08:14
um this is her view that the most
00:08:16
complete knowledge
00:08:17
comes from bringing together multiple
00:08:19
perspectives
00:08:20
um so in this model knowledge is not top
00:08:23
down actually not like me just lecturing
00:08:25
at you
00:08:26
but it's actually created through
00:08:27
dialogue and exchange
00:08:29
and her argument which we believe is that
00:08:32
this ultimately results in a more
00:08:34
complete picture
00:08:35
of the problem at hand and we see this
00:08:37
in the example of the anti-eviction
00:08:39
mapping
00:08:40
project um this is this large image that
00:08:42
you see on the left it's also known as
00:08:44
the aemp
00:08:46
and they are a self-described collective
00:08:48
of quote housing justice activists
00:08:50
researchers data nerds artists and oral
00:08:53
historians and since 2013
00:08:56
the aemp has worked to quantify and
00:08:58
organize around the housing crisis in
00:09:00
san francisco
00:09:01
that's in the bay area in the united
00:09:02
states where silicon valley is
00:09:04
and it's been a real problem over the
00:09:06
years of people working for tech
00:09:08
companies coming in making high salaries
00:09:11
the rents going up
00:09:12
and everyone else being kicked out um
00:09:15
and so this group works in collaboration
00:09:17
with tenants rights organizations and
00:09:18
community groups
00:09:20
and then they also actually create oral
00:09:21
histories which is what you see here
00:09:23
this little screenshot
00:09:24
right here um in this narratives of
00:09:27
resistance and displacement map
00:09:29
um so on this map each of the blue dots
00:09:32
that you see leads to a video story from
00:09:34
a single person
00:09:35
or a family who is facing displacement
00:09:38
from their home
00:09:40
and in the book we contrast this map
00:09:42
with the map created
00:09:44
by the eviction lab which is the smaller
00:09:46
map over here
00:09:47
um this is based at princeton university
00:09:50
and the eviction lab's goal is to
00:09:51
present a national picture
00:09:54
of the eviction crisis and i should say
00:09:56
at the outset
00:09:57
um this is a worthy goal and a valuable
00:10:00
project i'm not criticizing the project
00:10:02
what i'm trying to call attention to is
00:10:04
the difference in terms of process
00:10:06
which is substantial so you could take a
00:10:08
look at this map this one over here
00:10:10
which depicts the whole country of the
00:10:11
united states and you might think
00:10:13
oh um they're working with seemingly
00:10:15
bigger data
00:10:16
right um and i'm looking at what is
00:10:19
seemingly a more comprehensive picture
00:10:21
of the problem of eviction
00:10:22
in the united states but they significantly
00:10:24
undercount evictions
00:10:26
um because if you are working in the
00:10:28
real estate industry you know and your
00:10:29
business is to resell homes um
00:10:32
it is not in your interest to count any
00:10:33
more evictions than you have to
00:10:35
but working instead with local tenants
00:10:37
rights organizations
00:10:39
the aemp has gathered messier but
00:10:41
actually much more accurate and more
00:10:43
contextualized data
00:10:44
that documents a greater extent of the
00:10:46
problem at hand
00:10:48
this is because they actually hear from
00:10:49
tenants who say help me
00:10:51
i'm being evicted and it may not be that
00:10:53
they are served with an official notice
00:10:55
of eviction that you need to get by
00:10:56
going to the local government
00:10:58
filling out a form etc it may be just
00:11:00
that like
00:11:01
the landlord hasn't fixed their toilet
00:11:02
for two months or is lurking in their
00:11:04
lobby or you know all the other ways in
00:11:05
which you can
00:11:06
get someone to move out without actually
00:11:09
formally beginning eviction proceedings
00:11:12
so just one more thing um data feminism
00:11:15
principles apply not only to collecting
00:11:17
data
00:11:17
or even analyzing data but also
00:11:20
visualizing and communicating data
00:11:22
um so one of the key contributions of
00:11:24
feminist thinking is to dismantle false
00:11:26
binaries
00:11:28
so feminist philosophers start with a
00:11:30
gender binary
00:11:31
but as we say behind a binary there's
00:11:33
always a hierarchy
00:11:35
and the gender binary with men on top
00:11:37
and non-binary folks erased
00:11:40
this one is no different but there are
00:11:42
many other false binaries that are
00:11:44
gendered and show up in our work
00:11:46
so you might think of the false binary
00:11:48
between reason and emotion
00:11:50
um and this goes back to the early
00:11:52
enlightenment when there actually was
00:11:54
a gendered valence to this idea that
00:11:55
sort of only men were capable of
00:11:57
exhibiting
00:11:58
reason and women fell on the emotional
00:12:01
side
00:12:02
and clearly in this binary right this
00:12:04
division the hierarchy
00:12:06
is that reason is somehow better than
00:12:08
emotion
00:12:09
and in the book we use the two charts
00:12:11
that you see here this uh
00:12:13
periscopic uh visualization it's
00:12:15
actually an animated visualization
00:12:17
of gun deaths in the united states
00:12:19
versus this bar
00:12:20
chart here that was shown in the
00:12:21
washington post using actually very
00:12:23
similar data
00:12:25
um but we use these two charts in order
00:12:27
to show how emotion has really been
00:12:28
exiled
00:12:29
from data communication thanks to edward
00:12:32
tufte mostly
00:12:33
um but both feminist philosophy
00:12:36
and information visualization
00:12:38
research have shown how emotion is
00:12:40
actually central to perception
00:12:42
um to recall to learning to
00:12:44
understanding all of these things
00:12:47
there's all sorts of user studies to
00:12:48
back this up
00:12:51
so this just brings me to the final sort
00:12:53
of major point that i want to make
00:12:54
before the q&a
00:12:56
which may already be obvious from these
00:12:58
examples but
00:12:59
it's that data feminism insists on an
00:13:02
expanded definition
00:13:04
of data science um so the data science
00:13:07
that we describe in the book
00:13:08
isn't defined by the size of the data
00:13:10
set or by the credentials of the people
00:13:12
undertaking the work
00:13:14
because these concerns are continually
00:13:16
used to exclude women
00:13:17
and people of color from the field as
00:13:19
well as to exclude work that makes a
00:13:21
contribution
00:13:22
that is socio-technical rather than
00:13:24
purely technical or methodological
00:13:27
um but if we expand our
00:13:29
definition of data science
00:13:31
then we can clearly see that some of the
00:13:33
most exciting work in the field today
00:13:35
is being undertaken by artists by
00:13:38
journalists
00:13:39
by humanists by community organizers and
00:13:42
by activists
00:13:43
and you know some of this work actually
00:13:45
does look like traditional data science
00:13:47
and so
00:13:48
here we want to give a shout out to
00:13:49
margaret mitchell and her team at google
00:13:51
for their research on bias in natural
00:13:53
language processing
00:13:55
that's the paper that you see on the far
00:13:56
left but then right here in the middle
00:13:59
you see something entirely different
00:14:01
this is an interactive ai sculpture
00:14:04
by the artist stephanie dinkins um and
00:14:06
she trained it sort of like an alexa
00:14:09
um but it was trained on an
00:14:11
intergenerational dialogue between black
00:14:13
women in her family and so when you
00:14:14
interact with it you get a very specific
00:14:17
conversation
00:14:18
and intentionally so and then on the
00:14:21
right
00:14:22
you see uh in a sort of more fun data
00:14:25
visualization project some data
00:14:27
journalism by the pudding
00:14:29
which examines gender bias in
00:14:31
hollywood's screenplays and then down
00:14:33
here
00:14:33
this is actually a mural by the group
00:14:36
data therapy
00:14:37
they work with community-based
00:14:39
organizations to create what they call
00:14:40
data murals
00:14:42
for their own communities and we have
00:14:44
just you know actually hundreds of
00:14:45
examples like this in the book
00:14:47
um which we selected to sort of
00:14:49
illustrate our points and inspire our
00:14:51
readers
00:14:52
but you know what we were doing when we
00:14:54
were picking these examples
00:14:56
was to sort of try to hold two different
00:14:58
things in our hands um
00:14:59
because on the one hand we recognize
00:15:01
that data is at the root of so many
00:15:03
problems today
00:15:04
but we also believe very firmly and very
00:15:06
strongly that
00:15:07
when data is wielded intentionally and
00:15:09
with care and with attention to the
00:15:11
lives and the people that it represents
00:15:13
so just to sum up um here are some of
00:15:15
the main takeaways of data feminism that
00:15:17
we talk about in the book
00:15:18
um data feminism is a data science that
00:15:20
sort of at its core exposes and
00:15:22
challenges power
00:15:23
um it's led by and centers
00:15:25
minoritized people
00:15:27
it can function as a counter data
00:15:28
science about the injustices created by
00:15:31
mainstream data science that um
00:15:32
in many cases functions in this way um
00:15:35
it looks at many axes of inequality
00:15:38
including but not limited to gender
00:15:40
race and class it considers process
00:15:45
and thinks about how inequality
00:15:47
permeates all stages of a data science
00:15:49
project
00:15:50
from asking the research questions to
00:15:52
how you get funding to conduct the
00:15:54
research to how that project is deployed
00:15:57
and then it credits the labor involved
00:15:59
in data work acknowledging how data
00:16:00
science
00:16:01
is the work of many hands
00:16:04
and so more concretely here are some
00:16:07
things that you can do if you sort of
00:16:08
want to inhabit these principles or take
00:16:10
them to your workplace or
00:16:11
wherever it is that you you do your data
00:16:13
work um
00:16:15
so do work that interrogates and exposes
00:16:17
sexism racism
00:16:18
and other forces of oppression examine
00:16:21
how these
00:16:22
forces show up in data and in the world
00:16:24
you can
00:16:25
collect counter data and missing data
00:16:26
like in some of the projects that we've
00:16:28
seen
00:16:29
you can introduce new communities to
00:16:30
data science um
00:16:32
you can use data to advocate for equity
00:16:34
at your institution
00:16:36
you can experiment with creative forms
00:16:38
of data presentation and communication
00:16:40
we've seen some of these but they also
00:16:41
include
00:16:42
quilts sculptures vr fashion shows
00:16:46
you can include more people in data
00:16:47
driven projects especially
00:16:49
impacted communities and then you can
00:16:51
make sure you credit your sources and
00:16:53
your research support staff
00:16:55
make your process transparent and
00:16:57
reflect on your own identity
00:17:00
so that's all i've got for the
00:17:01
presentation again sort of here's the
00:17:03
book
00:17:03
it's available online if it looks
00:17:05
interesting to you and you can also
00:17:07
learn more about both my work at my
00:17:09
website
00:17:10
and catherine's work at her website we both
00:17:13
run research labs which you can see the
00:17:15
urls here
00:17:16
we're on twitter um github instagram
00:17:20
what have you um so yeah so thanks so
00:17:23
much for listening