00:00:00
[Applause]
00:00:04
I'm going to be talking about perceived
00:00:07
performance today and some of the recent
00:00:09
work that we've been doing at Mozilla I
00:00:11
am Heather I am based in Toronto and I
00:00:15
am a user researcher on the Firefox
00:00:17
team this is not Toronto I realized
00:00:21
after the fact this is like perpetuating
00:00:25
stereotypes with that photo but hi I'm
00:00:33
Gemma and I'm also a Firefox researcher
00:00:35
and based in Seattle and this is not
00:00:39
Seattle either this is actually also
00:00:42
a Canadian photo yes so are we getting
00:00:47
feedback or is that just me okay so as
00:01:02
user researchers we often evaluate
00:01:04
products by focusing on how they fit
00:01:06
into users' lives whether they
00:01:08
provide value and if they're usable for
00:01:10
people with a range of
00:01:12
abilities user perceived performance
00:01:14
matters because it can have a
00:01:17
significant impact on a user's overall
00:01:18
experience with a product assessing user
00:01:21
perceived performance helps inform
00:01:23
decisions not only for engineering but
00:01:25
also for our product and UX teams so
00:01:29
Mozilla is fairly old compared to most
00:01:31
modern tech companies while Firefox 1.0
00:01:35
was released in 2004 the Mozilla project
00:01:38
actually started several years earlier
00:01:39
and our company history page includes
00:01:42
the following explanation of our origin
00:01:44
and it's unsurprisingly pretty
00:01:46
engineering focused it reads the Mozilla
00:01:49
project was created in 1998 with the
00:01:51
release of the Netscape Browser suite
00:01:53
source code it was intended to harness
00:01:55
the creative power of thousands of
00:01:56
programmers on the Internet and fuel
00:01:59
unprecedented levels of innovation in
00:02:01
the browser market while Firefox remains
00:02:05
a strong engineering led organization
00:02:07
we've grown our UX and product teams
00:02:10
in the last several years in 2017 we
00:02:13
came together to focus on a major
00:02:15
desktop browser update for a Firefox 57
00:02:18
release which we named Quantum it
00:02:21
included numerous improvements to the
00:02:23
Gecko browser engine and a large visual
00:02:25
refresh of the browser all with the goal
00:02:30
of making sure Firefox felt superfast
00:02:32
our user research team recognized this
00:02:35
as an opportunity to integrate these two
00:02:37
efforts by actually measuring our user
00:02:39
perceived performance for the very first
00:02:41
time we decided to conduct a
00:02:43
benchmarking study using unbranded
00:02:45
builds of both Chrome and Firefox there
00:02:50
are many factors that influence browser
00:02:52
usage but performance is one of the core
00:02:54
motivators this is an example of one of
00:02:57
the advertisements we used when Quantum
00:02:59
eventually launched in this ad Reggie
00:03:03
Watts is interpreting the contrast
00:03:05
between a slow and a fast browsing
00:03:07
experience so what do we actually mean
00:03:09
when we're talking about performance
00:03:13
perceived performance refers to how
00:03:15
quickly software appears to perform a
00:03:17
given task and is an important
00:03:19
element of a user's overall experience
00:03:21
it's important to note that perception
00:03:23
of time should not be assumed to be an
00:03:25
accurate measurement of time the actual
00:03:28
duration reflects objective time and
00:03:30
perceived duration reflects subjective
00:03:32
psychological time which is susceptible
00:03:34
to varying degrees of distortion however
00:03:37
perceived duration should not be
00:03:39
regarded as any less important for user
00:03:41
facing products another consideration
00:03:45
with perceived time is active versus
00:03:47
passive time active time is
00:03:50
characterized by an engaged mental
00:03:52
activity for the user and passive time
00:03:54
is characterized by a lack of engagement
00:03:56
these are two modes that users typically
00:03:58
move between when they're engaging with
00:04:00
a product or service
00:04:01
so one real-world example of active
00:04:04
versus passive time is choosing to drive
00:04:07
the back roads instead of waiting in
00:04:09
heavy highway traffic even if they take
00:04:11
the same amount of time it'll feel
00:04:13
faster to be moving research shows that on
00:04:16
average people engaged in a passive
00:04:18
wait overestimate their waiting time
00:04:20
by about 36%
00:04:22
therefore it's essential for designers
00:04:24
and developers to find ways to limit the
00:04:26
amount of time users are spending in
00:04:28
this passive phase another example years
00:04:32
ago in Houston airline customers were
00:04:34
complaining about baggage claim wait
00:04:36
times in response the airport doubled
00:04:39
the baggage staff but the complaints
00:04:41
kept coming after some research they
00:04:43
found that it was actually the idle time
00:04:45
waiting for the baggage that was the
00:04:48
true issue so in response they actually
00:04:50
just moved the baggage claim further
00:04:51
down the hall so that even though it
00:04:54
took the same amount of time for the
00:04:56
bags to appear on the carousel it just
00:04:58
felt so much shorter to be walking for a
00:05:00
part of that wait versus standing idly
00:05:05
so perceived performance is largely determined by
00:05:08
four main factors duration
00:05:10
responsiveness fluency and tolerance
00:05:13
duration is the actual duration that a
00:05:16
process takes this is the element that's
00:05:18
often referred to as performance in
00:05:20
technical discussions so different
00:05:22
magnitudes of duration require different
00:05:23
treatments to achieve optimal perceived
00:05:25
performance and responsiveness is the
00:05:29
perceived time it takes the system to
00:05:31
respond to user input for example an
00:05:36
empty dialog appearing
00:05:39
immediately after a click and then taking
00:05:41
a second to populate with content feels
00:05:43
faster than the same dialog appearing
00:05:45
with a second delay but fully populated
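As a rough sketch of that pattern (in Python, with hypothetical names like fetch_content standing in for whatever is slow), the version that acknowledges the click immediately and fills in content later tends to feel faster than the version that waits until everything is ready:

```python
import asyncio

async def fetch_content():
    # Stand-in for the slow part (network request, disk read, rendering work).
    await asyncio.sleep(1.0)
    return "dialog content"

async def responsive_dialog():
    # Show the empty dialog shell right away, then populate it when ready.
    print("dialog shown (empty)")
    content = await fetch_content()
    print("dialog populated:", content)

async def blocking_dialog():
    # Wait until everything is ready before showing anything at all.
    content = await fetch_content()
    print("dialog shown fully populated:", content)

# Both take about one second end to end, but the first acknowledges
# the user's click immediately, so it is perceived as faster.
asyncio.run(responsive_dialog())
asyncio.run(blocking_dialog())
```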
00:05:50
fluency is the perceived smoothness of a
00:05:53
process it could also be described as a
00:05:55
measure for how hard the machine appears
00:05:57
to be working so for example a
00:06:01
stuttering progress indicator gives the
00:06:03
impression of lower performance
00:06:05
regardless of the actual duration of the
00:06:07
process tolerance is a measure of how
00:06:13
long the user expects a process to take
00:06:15
and at what point they will abandon or
00:06:17
cancel the process for example the
00:06:21
tolerated duration for loading a web
00:06:23
page is going to be much different than
00:06:25
for saving a bookmark so next we'll be
00:06:31
taking you through how to actually set
00:06:33
up a perceived performance study
00:06:35
using the Quantum Firefox research that
00:06:37
I mentioned at the beginning of this
00:06:38
talk and some recent work on an
00:06:40
experimental mobile browser as case
00:06:42
studies so I'm going to turn over to
00:06:44
Heather now to introduce the mobile
00:06:46
research and to talk about our desktop
00:06:48
and mobile perceived performance
00:06:49
research great so last year the Firefox
00:06:58
team started working on our new mobile
00:07:00
browser powered by our new
00:07:02
rendering engine GeckoView we knew how
00:07:06
critical great performance was for the
00:07:08
mobile experience especially coming off
00:07:09
of what we had just launched on
00:07:12
desktop so we wanted to understand and
00:07:14
prioritize performance efforts that
00:07:16
would have the most impact on users and
00:07:18
identify opportunities for the user
00:07:19
experience to feel fast I'm going to
00:07:23
walk through the high-level steps of
00:07:25
running a perceived performance study as
00:07:26
well as some of the findings and impact
00:07:28
this research has had for us so for both
00:07:37
desktop and mobile we needed to identify
00:07:39
our stakeholders in these types of
00:07:42
projects stakeholders might extend
00:07:43
beyond the performance team and beyond
00:07:45
the engineering team so for example for
00:07:48
us our stakeholders were product
00:07:50
managers engineering managers engineers
00:07:53
and designers and even content strategy
00:08:02
so oh and sorry one thing I also wanted
00:08:05
to mention is it went across product
00:08:07
teams too so yeah it spanned a few
00:08:10
different areas that we're looking at so
00:08:14
once the stakeholders have been
00:08:15
identified work with them to identify the goals
00:08:19
of the perceived performance study and
00:08:22
this goes back to the intent that you've
00:08:23
heard a few times today so really
00:08:25
understanding like what is the purpose
00:08:26
of what we're going to do because this
00:08:29
will inform a lot of things to
00:08:31
start it will inform what the research
00:08:33
program looks like so for example with
00:08:37
our Quantum desktop study we had two
00:08:39
goals the first was to identify the
00:08:41
greatest areas of improvement for
00:08:43
perceived performance and user
00:08:45
preference and the second was to
00:08:47
evaluate what perceived performance
00:08:48
improvements had been made to this new
00:08:51
version of Firefox so this meant that we
00:08:53
needed to benchmark the product against
00:08:56
itself over time but we also needed to
00:08:59
do a comparative study against our
00:09:01
competitor Chrome and benchmark that
00:09:03
over time as well so once the research
00:09:10
goals and approach are identified you
00:09:12
can figure out when the study should be
00:09:13
scheduled and you can do this by
00:09:15
understanding two things upfront the
00:09:17
first is when the builds will be
00:09:18
available for you to evaluate and the
00:09:21
second is when the findings from the
00:09:23
research will have had impact so it's
00:09:25
important to make sure that the team
00:09:27
will have time to act on the results
00:09:28
knowing the release schedule and the
00:09:30
product roadmap can help determine when
00:09:32
the findings from the research will be
00:09:34
actionable by the team once it was time
00:09:40
for us to run the study we ran a kickoff
00:09:42
meeting with our stakeholders the
00:09:44
project kickoff is a critical step in a
00:09:46
process and it should cover what the
00:09:48
purpose of the study is as well as what
00:09:50
questions or hypotheses your
00:09:52
stakeholders have so this is a really
00:09:55
good time to have team alignment this
00:09:57
will help inform what your research
00:09:59
questions are which will then inform
00:10:01
what participants to recruit or what's
00:10:03
your criteria for your participants and
00:10:05
what tasks and questions to include in
00:10:07
your study you'll also want to find out
00:10:10
during a kickoff what the team already
00:10:12
knows and this
00:10:13
can come from many places it might be
00:10:16
previous performance tests or metrics
00:10:18
that the company has or recurring themes
00:10:21
from product reviews or customer support
00:10:24
you should also find out if there's any
00:10:27
upcoming decisions changes or milestones
00:10:29
that might
00:10:33
impact the product and identify what
00:10:37
devices and builds are needed for the
00:10:38
study so for example we wanted to run
00:10:45
the study with unbranded builds of
00:10:46
Firefox and Chrome so we worked with the
00:10:48
engineering teams to prepare those
00:10:50
builds both Chromium and Firefox are
00:10:52
open source projects so the engineers
00:10:54
were able to take the code and remove
00:10:56
all of the branding from it and create
00:10:59
the browser builds that we then
00:11:01
installed on the test devices yeah so
00:11:04
you can see that this
00:11:07
is one of the test devices and those
00:11:09
are the two versions of unbranded
00:11:11
Firefox and unbranded Chrome so by
00:11:17
identifying the questions the team had
00:11:18
and looking at what we already knew
00:11:21
we identified a set of tasks to evaluate
00:11:24
and these included browser specific
00:11:26
tasks like the speed of opening the
00:11:28
browser and site-specific tasks like
00:11:30
filling out a web form to select which
00:11:33
websites to include in our site specific
00:11:35
tasks we looked at secondary research
00:11:38
and we also worked with our stakeholders
00:11:43
during our study planning we also
00:11:45
identified the controls we needed to put
00:11:47
in place these are the parts of the
00:11:48
study that you either need to keep
00:11:50
consistent or vary in order to have
00:11:52
valid results for our studies we were
00:11:55
thinking about which devices to select
00:11:57
so for example in our mobile studies we
00:12:00
chose a high end high bandwidth mobile
00:12:02
phone as well as a low end low
00:12:05
bandwidth mobile phone this was really
00:12:07
helpful during analysis because we were
00:12:09
able to compare the data collected by
00:12:11
those two different groups and see some
00:12:13
like interesting differences
00:12:17
we also alternated the order of both the
00:12:19
browsers and the tasks that participants
00:12:22
were given which helped us mitigate an
00:12:24
order bias and finally we used the
00:12:27
unbranded builds for a study to mitigate
00:12:30
a brand effect with Chrome or Firefox
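As an illustration of what those controls can look like in a session schedule, here is a small Python sketch (the task names are made up for the example, not our actual task list) that alternates which browser comes first and rotates the task order from one participant to the next:

```python
from itertools import cycle

# Hypothetical task names; the real study used browser- and site-specific tasks.
tasks = ["open the browser", "load a news site", "fill out a web form", "open a new tab"]
browsers = ["Browser A", "Browser B"]  # the two unbranded builds

def build_schedule(num_participants):
    """Alternate browser order and rotate task order so that neither
    browser nor any single task always benefits from practice effects."""
    schedule = []
    first = cycle([0, 1])
    for p in range(num_participants):
        b = next(first)
        browser_order = [browsers[b], browsers[1 - b]]
        rotation = p % len(tasks)
        task_order = tasks[rotation:] + tasks[:rotation]
        schedule.append({"participant": p + 1,
                         "browser_order": browser_order,
                         "task_order": task_order})
    return schedule

for row in build_schedule(4):
    print(row)
```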
00:12:37
at this point we were able to define
00:12:38
criteria for who we needed to
00:12:42
run this study with so we wanted to have
00:12:45
two groups of participants those that
00:12:46
used Chrome as their primary browser and
00:12:48
those that used Firefox as their primary
00:12:50
browser when we were planning our
00:12:52
desktop study we also wanted to have a
00:12:55
group of participants living in a rural
00:12:56
location and a group of participants
00:12:58
living in an urban location to think
00:13:02
about like high and low bandwidth we
00:13:06
also knew that we needed to be able to
00:13:07
see statistical significance in our data
00:13:10
so that helped us determine how many
00:13:12
participants to include in our study so
00:13:14
for example with our desktop research
00:13:16
because we wanted to have statistical
00:13:18
significance between urban and rural
00:13:20
locations we had 20 participants for
00:13:22
each
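As an aside, one common way to sanity-check a per-group sample size is a power calculation; the sketch below is purely illustrative and is not how the number was actually chosen, but with a large assumed effect size and the 90% confidence level used in the study it lands in the neighborhood of 20 per group:

```python
# Illustrative only: a standard two-sample power calculation.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.8,  # assumed large effect (Cohen's d); an assumption for the example
    alpha=0.10,       # 90% confidence level, matching the study's threshold
    power=0.80,       # 80% chance of detecting the assumed effect
)
print(f"participants needed per group: {n_per_group:.1f}")  # roughly 20 under these assumptions
```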
00:13:28
once the questions and tasks were defined we could create our guide for
00:13:29
running the sessions we ran our studies
00:13:32
in a lab which allowed us to
00:13:35
control for the test environment and
00:13:37
each session was conducted one-on-one
00:13:39
with our participant and ran about 45
00:13:42
minutes during that time we interviewed
00:13:44
them about their current attitudes and
00:13:46
experiences with the browser and then
00:13:48
had them complete the set of tasks and
00:13:50
rate the responsiveness on a seven-point
00:13:52
Likert scale each participant went
00:13:55
through the set of tasks on the two
00:13:56
different browsers that we were
00:13:57
comparing we then wrapped up with their
00:14:00
preference and overall perception of
00:14:02
speed and a discussion and once we were
00:14:05
done with the sessions we were ready for
00:14:06
analysis
00:14:10
so during analysis we focused on
00:14:12
responsiveness rating
00:14:14
averages as well as the percentage of
00:14:17
people that preferred one of the
00:14:19
browsers over another we looked at
00:14:22
whether the responsiveness rating
00:14:24
averages between the two browsers they
00:14:26
used were significant overall but also
00:14:28
we looked by task by browser order
00:14:30
bandwidth and whether they were
00:14:32
primarily a Firefox user or primarily a
00:14:35
Chrome user we also looked at whether
00:14:37
the percentage of people that preferred
00:14:40
one browser over the other was
00:14:41
significant along those same variables
00:14:45
we used significance testing to assess
00:14:48
comparative outcomes to determine
00:14:50
whether the difference between them was caused by
00:15:07
something other than random chance so
00:15:09
our team used a 90% confidence level to
00:15:11
test for statistical significance which
00:15:13
meant that we could be 90% sure that the
00:15:15
difference in percentage was not due to
00:15:17
random chance
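As a concrete sketch of that kind of check (with made-up numbers, not our actual analysis pipeline), a paired t-test can compare responsiveness ratings, since each participant rated both browsers, and a binomial test can ask whether the preference split differs from 50/50, both at a 90% confidence level:

```python
# Illustrative analysis sketch with made-up data.
from scipy import stats

ALPHA = 0.10  # 90% confidence level

# Hypothetical 7-point Likert responsiveness ratings; each participant rated both browsers.
ratings_browser_a = [6, 5, 6, 7, 5, 6, 4, 6, 5, 6]
ratings_browser_b = [5, 4, 5, 6, 5, 4, 4, 5, 4, 5]

# Paired t-test: is the difference in average responsiveness ratings significant?
t_stat, p_value = stats.ttest_rel(ratings_browser_a, ratings_browser_b)
print(f"ratings: p = {p_value:.3f}, significant at 90%: {p_value < ALPHA}")

# Preference: did significantly more than half of participants prefer one browser?
preferred_a, total = 14, 20  # hypothetical preference counts
result = stats.binomtest(preferred_a, total, p=0.5)  # requires SciPy >= 1.7
print(f"preference: p = {result.pvalue:.3f}, significant at 90%: {result.pvalue < ALPHA}")
```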
00:15:21
so I want to share with you a few of the things that we were
00:15:23
able to learn from the research on
00:15:25
mobile in our initial benchmarking of
00:15:28
Firefox against Chrome on Android we
00:15:30
found Chrome outperformed Firefox in
00:15:32
most categories in statistically
00:15:34
significant ways the average
00:15:36
responsiveness rating was
00:15:39
higher on Chrome and we found that the
00:15:41
majority of tasks where Chrome had felt
00:15:43
more responsive involved page loads or
00:15:45
interactions done while the page was
00:15:46
still loading when comparing our old
00:15:50
rendering engine WebView against our new
00:15:52
one GeckoView we found that the old one
00:15:55
outperformed GeckoView in a number of
00:15:57
statistically significant ways as well
00:15:59
we also found that participants didn't
00:16:01
perceive page load accurately so as long
00:16:05
as the structure the main content and
00:16:07
the key actions were loaded participants
00:16:09
considered the page loaded even if a
00:16:11
long tail of things filled in after that
00:16:13
in other words if they weren't blocked
00:16:15
from their intended goal they didn't
00:16:17
consider that to be an issue
00:16:21
interestingly when we looked at the
00:16:23
results on both mobile studies we saw a
00:16:25
low bandwidth effect so what we noticed
00:16:28
is like when we separated the data
00:16:31
between the low bandwidth device
00:16:34
and the high
00:16:35
bandwidth device there was
00:16:38
statistical significance between the
00:16:42
responsiveness ratings on the low
00:16:44
bandwidth and statistical significance
00:16:47
for the preference over one browser over
00:16:49
the other on the low bandwidth but that
00:16:51
disappeared when we were looking at the
00:16:53
high bandwidth on desktop in our
00:16:59
initial benchmark Chrome outperformed
00:17:01
Firefox on the majority of site-specific
00:17:03
and browser specific tasks and was the
00:17:06
preferred browser by most participants a
00:17:08
few big things that we worked with
00:17:12
the engineers to address were to improve
00:17:14
the speed of opening the browser improve
00:17:16
animations to better represent loading
00:17:18
and page load completion prioritize the
00:17:21
foreground tabs over background tabs
00:17:23
and prioritize content that people care
00:17:26
about so for example if we're loading a
00:17:28
page with an article loading the article
00:17:30
first while delaying scripts with ads
00:17:32
and tracking domains
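To make the prioritization idea concrete, here is a toy Python sketch, nothing like Gecko's actual scheduler and with a made-up blocklist, that simply orders resources so that structure and main content are fetched before scripts from ad and tracking domains:

```python
import heapq
from urllib.parse import urlparse

# Toy example; real browser engines schedule resources very differently.
TRACKING_DOMAINS = {"ads.example.com", "tracker.example.net"}  # hypothetical blocklist

def priority(url, resource_type):
    """Lower number = fetched sooner."""
    if urlparse(url).netloc in TRACKING_DOMAINS:
        return 3                 # delay ad/tracking scripts
    if resource_type in ("document", "stylesheet"):
        return 0                 # structure and main content first
    if resource_type == "image":
        return 1
    return 2                     # everything else (fonts, misc scripts, ...)

resources = [
    ("https://news.example.org/article.html", "document"),
    ("https://ads.example.com/ad.js", "script"),
    ("https://news.example.org/style.css", "stylesheet"),
    ("https://tracker.example.net/pixel.js", "script"),
    ("https://news.example.org/photo.jpg", "image"),
]

queue = [(priority(url, kind), url) for url, kind in resources]
heapq.heapify(queue)
while queue:
    _, url = heapq.heappop(queue)
    print("fetch:", url)
```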
00:17:36
when we followed up three months later with phase two to
00:17:38
measure the progress we found that
00:17:40
Firefox had statistically significantly
00:17:43
improved responsiveness including in the
00:17:46
majority of site-specific tasks which
00:17:48
are circled
00:17:50
we also found users prefer the new tab
00:17:53
experience on Firefox because of its
00:17:54
significant speed advantage over Chrome
00:17:56
as well as the UI updates that had been
00:17:58
made and interestingly we also found
00:18:02
that the close tab dialog box in Firefox
00:18:04
received many favorable comments
00:18:08
despite lengthening the time to close
00:18:10
the browser in fact we observed many
00:18:13
times when feelings of like user
00:18:15
friendliness or a good user experience
00:18:16
contributed to the participants
00:18:18
preference rating so we've learned a ton
00:18:24
from running these studies and starting
00:18:26
to build out this program in Firefox and
00:18:29
it's been a really nice complementary
00:18:31
addition to the performance tests that
00:18:33
are already happening at the company
00:18:35
this
00:18:38
sounds so basic but it's been
00:18:40
really useful to actually be able to
00:18:42
watch our users
00:18:43
do these tasks and talk about
00:18:46
them and be able to share that back with
00:18:48
engineering following the results of the
00:18:51
studies the team filed a number of
00:18:52
performance related bugs to
00:18:55
improve perceived performance on our
00:18:56
product and it's been helpful to run
00:18:59
studies over time to measure the
00:19:00
improvements that have been made and
00:19:02
where we need to continue focusing one
00:19:05
of the biggest benefits from
00:19:07
understanding the perceived performance
00:19:08
has been the ability to identify where
00:19:10
we can make the biggest impact for our
00:19:12
users experience our engineering team
00:19:15
knew some of the major areas where
00:19:17
performance needed to be improved and
00:19:19
much of what we found out was
00:19:20
complementary but being able to point to
00:19:23
the biggest perceptible issues helped us
00:19:26
prioritize what work
00:19:30
will give our users the most impactful
00:19:32
improvements in their experience ok so
00:19:39
running a perceived performance
00:19:40
benchmarking study before a major
00:19:43
release like Firefox Quantum is an
00:19:45
important way to assess how users
00:19:47
experience design and engineering
00:19:48
updates and once the initial research
00:19:51
design has been created you should
00:19:52
consider maximizing this investment by
00:19:55
running perceived performance
00:19:55
benchmarking studies on a regular basis
00:19:58
this might be at every major release or
00:20:00
on a different cadence that works best
00:20:02
for your organization but as long as you
00:20:04
can carve out a little bit of time and
00:20:05
the budget for recruitment a lot of the
00:20:07
heavy lifting will have already been
00:20:08
done because the study will be almost
00:20:10
identical so how does Firefox think
00:20:16
about perceived performance today it's
00:20:18
been about a year and a half since we
00:20:20
ran our first perceived performance
00:20:21
study on Quantum and we've now run
00:20:23
perceived performance studies on all of
00:20:25
our major browsing products perceived
00:20:28
performance has become widely used
00:20:29
terminology for defining product goals
00:20:31
and for measuring success and we also
00:20:34
recently hired a dedicated perceived
00:20:36
performance product manager and we also
00:20:39
have user research staffed to a variety of
00:20:41
perceived performance projects in the
00:20:44
future we'd like to hire a dedicated
00:20:46
perceived performance user researcher
00:20:47
and extend our perceived performance
00:20:49
research to locations outside of North
00:20:52
America and finally we would really like
00:20:55
to establish a regular perceived
00:20:56
performance research cadence and ideally
00:20:58
one that's not only tied to major
00:21:00
releases so if you would like to learn
00:21:05
more about researching perceived
00:21:07
performance these are a few resources
00:21:09
we'd recommend to start the first two
00:21:11
are short articles that provide a good
00:21:13
basis for learning the first is shorter
00:21:16
wait times the effects of various
00:21:18
loading screens on perceived performance
00:21:19
and subjective versus objective time
00:21:22
measures a note on the perception of
00:21:24
time and consumer behavior and then there
00:21:27
are books Designing with the Mind
00:21:29
in Mind Thinking Fast and Slow which is
00:21:32
pretty popular I'm sure most people have
00:21:34
heard of that one and then Designing and
00:21:36
Engineering Time the Psychology of Time
00:21:38
Perception in Software we'd also
00:21:42
recommend checking out our Firefox
00:21:44
Photon design system which you can find
00:21:46
at design.firefox.com there is a really great
00:21:50
short section on designing for
00:21:51
performance as well as a number of other
00:21:53
useful resources and those short video
00:21:56
clips I shared earlier you can also find
00:21:58
those at this website which are great to
00:22:00
demonstrate some of the different ways
00:22:01
that people are thinking about perceived
00:22:03
performance so thank you so much for
00:22:07
listening to us today you can access a
00:22:09
copy of these slides at the bitly URL at
00:22:12
the top and if you aren't already we
00:22:14
hope that you will consider introducing
00:22:15
user perceived performance research to
00:22:18
your organizations please find us
00:22:20
tonight at the after party and we would
00:22:23
love to keep talking about this thank
00:22:25
you