Hello everyone. I think today you have heard a lot about hexagons, H3, and so on. Next I'm going to talk about how we process hexagon data: in particular, how we smooth hexagon data, and then how we can use hexagon data for forecasting. By the way, my name is TR; I'm an engineer on the Marketplace Forecasting team.
First, let me show you some examples. Here is one illustration of Uber trip requests in San Francisco. In this example, notice the different colors: red means we have more requests, and blue means we have fewer requests. What I really want to show here is that we have different categories of hexagons: for some hexagons we have a lot of requests, and for some hexagons we have very few.
Now let me come to the problem. Actually, before the problem, let me talk a bit more about hexagon data. As Cameron already mentioned, we are moving from the old kind of geofences to hexagons, with the big advantage that hexagons are more flexible. Basically, we are going to have many small hexagons, and all kinds of Uber data will be aggregated over these small hexagons, so we can make very flexible business decisions. Take pricing, for example: we can change the price for each hexagon instead of using one price for the entire city.
Because of the hexagonal representation, we can have thousands of hexagons; in San Francisco, for example, we have more than 50,000. The problem, as shown in the previous example, is that some hexagons have lots of data, which is good, but with this many hexagons the data is also very sparse: some hexagons may see just a few requests for the whole day. When we have this kind of data sparsity across hexagons, the big question is how we can make our decisions stable and solid.
Because I'm from the forecasting team, next I'm going to talk about the data sparsity issue in the forecasting problem. For the sake of time I'm not going to get into all the details of real-time forecasting; let me just give you a very high-level overview of the problem and how we try to solve it. At Uber we want to predict the future: we want to predict the future marketplace state, so that the predicted state can help us make decisions in the present.
Apparently this is a very classical machine learning problem: we want to build prediction models that take in data features and are trained to make predictions. Some common features: we can use the historical data of the hexagon, for example what happened in the last few weeks for this hexagon; we also want to use recent data, say the last 60 minutes: how many trips, how much demand, how much supply in this hexagon. We also want to consider external signals: maybe there are events in this hexagon, maybe today is a holiday, maybe there is bad weather such as rain or a snowstorm, and so on. The basic idea is that we want to use all of this data to build and train our model to make predictions.
Now the problem comes in. Because we are doing hexagon-level prediction, if one hexagon has very sparse data, maybe just a few data points over a long history, then when we build a model its performance is generally very bad. Think about it: if there have been almost no requests in the last few days, what am I going to predict for the next 60 minutes? So this problem is very hard.
Another observation is that neighboring hexagons usually share similar patterns: if one hexagon has very few trips, there are probably not many trips in the nearby hexagons either. As a result, one solution we have been taking is to smooth the data. The basic idea is: how can we fold the data from neighboring hexagons into the current value of a hexagon, so that we can use it for building models and making predictions? Generally, there are two approaches to data smoothing.
00:04:33
two approach for data smoothing one is
00:04:36
the we call the he classing the basic
00:04:39
idea is very simple we try to first
00:04:42
group High guns into clusters and then
00:04:47
uh what we do for this SM in basally we
00:04:50
are we are going to use all the values
00:04:52
of the hasun in one has cluster make
00:04:55
some aggregation and use the value as
00:04:57
the value as a value of the Hason we
00:04:59
looking at it so for this approach uh
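As a rough illustration (not code from the talk), cluster smoothing can be sketched in a few lines of Python. `value_of` and `cluster_of` are hypothetical inputs; how the clusters are built is not specified here:

```python
from collections import defaultdict

def cluster_smooth(value_of, cluster_of):
    """Replace each hexagon's value by the mean over its cluster.

    value_of:   dict mapping hexagon id -> observed value
    cluster_of: dict mapping hexagon id -> cluster id (hypothetical;
                the talk does not specify how clusters are produced)
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for h, v in value_of.items():
        totals[cluster_of[h]] += v
        counts[cluster_of[h]] += 1
    # Every hexagon in a cluster gets the same aggregated value.
    return {h: totals[cluster_of[h]] / counts[cluster_of[h]] for h in value_of}
```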
For this approach, the advantage is that we reduce the computation: we have much less data after the aggregation, so it's much easier to train the model and much easier to do prediction. The trade-off is that the quality of the prediction model depends heavily on the clustering quality; I think Cameron also talked about geofences at the beginning, and if you do not have a good clustering, your predictions will suffer accordingly. Another problem is that neighboring hexagons may belong to different clusters, and then we may use different models for them. As a result we can end up with arbitrary clustering boundaries, similar to what was shown in Cameron's talk: very close neighbors sitting on opposite sides of a boundary and therefore getting different prices.
Because of these disadvantages of the clustering approach, a lot of applications at Uber instead use k-ring smoothing. Let me use one example to show what we mean by k-ring smoothing. Assume we have this green hexagon, and around it we have different layers of neighbors, which I label one, two, three. The rough idea of k-ring smoothing is that whenever we look at one hexagon, we also look at all of its k-ring neighbors (k can take different values), take all the values of those neighbors, aggregate them with, say, an average or some other dynamic weighting, and then use that value for the green hexagon here.
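To make this concrete, here is a minimal sketch of the naive per-hexagon version (essentially the "basic implementation" benchmarked later in the talk). It assumes the h3-py v3 API, where `h3.k_ring(cell, k)` returns all cells within grid distance k; in the v4 API the equivalent call is `h3.grid_disk`:

```python
import h3  # h3-py, v3 API assumed

def k_ring_smooth(value_of, k=2, default=0.0):
    """Average each hexagon's value over its k-ring neighborhood.

    value_of: dict mapping H3 cell id -> observed value
    """
    smoothed = {}
    for cell in value_of:
        # All cells within grid distance k, origin included.
        ring = h3.k_ring(cell, k)
        vals = [value_of.get(n, default) for n in ring]
        smoothed[cell] = sum(vals) / len(vals)
    return smoothed
```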
Compared with the clustering approach, this approach is much more flexible, because we process each hexagon separately instead of relying on a predefined clustering. Another big advantage is that, because of this flexibility, there are no arbitrary clustering boundaries; we avoid that kind of boundary issue entirely.
So what's the trade-off here? The trade-off is much more computation, simply because for each hexagon we take all of its neighbors, and in many Uber applications we may use a k-ring of 10 or even 20, which means that for each hexagon we need to aggregate maybe hundreds of neighboring hexagon values. And just looking at San Francisco, we may have more than 50,000 hexagons in total. Remember that we want to do real-time forecasting, as do many other real-time applications, which means that every minute we need to do a lot of computation for the smoothing. And it's not only one city: Uber currently has 700-plus cities in production, so all of them together is a lot more computation. In the rest of the talk, I will focus a little more on how we can do this k-ring smoothing efficiently.
I'm going to introduce the hexagon convolution approach. Before I talk about hexagon convolution, let me first bring back the general concept of convolution. I'm sure most of you are familiar with convolution, because convolution is a foundational component of convolutional neural networks, which have become so popular. In case you're not familiar with it, let me use one example with two-dimensional data to show how general convolution works. The rough idea is this: assume we have an input matrix, and we define a small kernel matrix (or filter), which is just a small matrix, and we want to run the convolution to get the output matrix.
The basic idea is that we slide the small kernel matrix over the big input matrix; at each position we do a pairwise multiplication and addition to produce one result. Let me use an example to show how it works. Assume the green one is the kernel matrix, the small matrix, and we slide it over the big input matrix. For example, in the blue area, we do the pairwise value multiplication and then take the sum. Here is just another position: we slide the kernel across the input matrix and collect all the convolution results. Of course there are many details to convolution that I'm not going to get into; you can find a lot of tutorials on how to do convolution.
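For example, the whole slide-multiply-sum procedure is a one-liner with an off-the-shelf implementation. Here is a sketch using SciPy (with a symmetric kernel like this one, convolution and the cross-correlation used in CNNs give the same result):

```python
import numpy as np
from scipy.signal import convolve2d

inp = np.arange(25, dtype=float).reshape(5, 5)   # the big input matrix
kernel = np.array([[0., 1., 0.],
                   [1., 1., 1.],
                   [0., 1., 0.]])                # the small kernel / filter

# Slide the kernel over the input; at each position, multiply the
# overlapping entries pairwise and add them up.
out = convolve2d(inp, kernel, mode="same")
print(out)
```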
So what are the nice properties of convolution? One nice property is that it is highly parallelizable. Another is that, because it is such a common, fundamental component of so many applications and computations, there are lots of mature implementations in different packages: for example in SciPy, TensorFlow, PyTorch, and so on. Because it is so fundamental, a lot of optimization work has gone into making convolution efficient, and we can also use GPUs to accelerate the computation.
So we have convolution on grid matrices; the question is, how can we do the same thing on hexagon data? The idea is actually very similar: we want to run the convolution on hexagon data. Assume we have an input set of hexagons; we also define our own kernel, for example a k-ring kernel, and we want to run the convolution using that kernel on top of the input.
So what's the challenge? The challenge is that there is no existing, available package that takes in hexagon coordinate data, because most convolution packages assume you have a grid: a matrix, an array, or a tensor. The question for us is how to leverage existing convolution implementations, so that we get all the benefits of their optimizations and GPU acceleration. Next I'm going to talk about how we do the transformation that lets us leverage those existing convolution implementations.
Before I talk about how we do the transformation and implement hexagon convolution, let me first introduce a few hexagon coordinate systems, because we need a representation of each hexagon that lets us do the transformation. Here are a few very commonly used coordinate systems. The first is called cube coordinates: for each hexagon we use three-dimensional data, three axes (basically x, y, z), to represent the hexagon. There are some trade-offs with this approach. First, using three-dimensional data to represent two-dimensional data is not very memory efficient. Second, because we have three values, it does not map directly to a grid system. That's why we are not using this coordinate system.
Another coordinate system is called double coordinates. I'm not going to get into the details, but the rough idea is that for each hexagon we use two numbers, two axes, to represent it. Theoretically it's easy to map this to a square grid matrix, but after we do the direct mapping we find that neighbors in the hexagon data may not be direct neighbors in the square grid. Recall that when we run convolution we normally want a solid area; if there are lots of gaps, it's not efficient, and we would also have to do lots of filtering to remove the values we do not need. So we are not using this one either.
This is another coordinate system: axial coordinates. With this system we use two numbers to represent each hexagon, and the good thing is that after the mapping we can clearly see that neighbors in the hexagon data are also roughly neighbors in the grid matrix. More importantly, this gives a solid area, with only a few cells in the corners, which we can easily filter out.
So the coordinate system we have chosen is basically based on axial coordinates. Next, let me talk about how we use this system to do hexagon convolution. Assume we have a big hexagon input dataset and we have defined our own small filter. The first step is to use the axial coordinates to represent each hexagon. Once we have the axial coordinates, the next step is to do the transformation, just as shown with the axial data earlier, to map into a grid matrix, or rather a square grid matrix. Once we have this square grid matrix, things become much easier: we can leverage all the available implementations. We just run the convolution on the square grid matrix and get the convolution result. Once we have the convolution result in the grid, the last step is to easily map it back to hexagon data.
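Here is a minimal end-to-end sketch of those four steps. It uses h3-py's local IJ coordinates as the "axial coordinates or something similar" (the exact coordinate transform Uber uses is not spelled out here); the function name is from the h3-py v3 API, where it is marked experimental:

```python
import numpy as np
from scipy.signal import convolve2d
import h3  # v3 API; in v4 this call is h3.cell_to_local_ij

def hex_convolve(value_of, kernel, origin):
    """Run a grid convolution over hexagon data and map the result back.

    value_of: dict mapping H3 cell id -> value
    kernel:   2-D numpy array expressed in the same two-axis layout
    origin:   an H3 cell serving as the anchor for local coordinates
    """
    # Step 1: represent each hexagon with two-axis coordinates.
    ij = {h: h3.experimental_h3_to_local_ij(origin, h) for h in value_of}

    # Step 2: shift the coordinates so they index a dense matrix.
    # Cells with no data contribute zero.
    imin = min(i for i, _ in ij.values())
    jmin = min(j for _, j in ij.values())
    rows = max(i for i, _ in ij.values()) - imin + 1
    cols = max(j for _, j in ij.values()) - jmin + 1
    grid = np.zeros((rows, cols))
    for h, (i, j) in ij.items():
        grid[i - imin, j - jmin] = value_of[h]

    # Step 3: run an off-the-shelf grid convolution.
    out = convolve2d(grid, kernel, mode="same")

    # Step 4: map the convolution result back onto the hexagons.
    return {h: out[i - imin, j - jmin] for h, (i, j) in ij.items()}
```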
So far I've talked about how we do hexagon convolution. To recap: we have hexagon data represented using the H3 library; the first step is to use the axial coordinates, or something similar, to map each hexagon to its corresponding representation; then we do the transformation into the square grid matrix; then we run the convolution; and after we get the convolution result, we just map it back.
Next, let me talk a little more about how we use hexagon convolution for k-ring data smoothing. Recall how we do the k-ring smoothing: the only thing we need to change is to set different filters. Here I show some examples. The first one is a one-ring filter with equal weights; if we use this filter to do the hexagon convolution, we are basically just doing a sum. But beyond that, we can also use dynamic filters with different weights: looking at the example, by using such a filter we are essentially doing a weighted version of the sum when we do the k-ring smoothing.
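To make the filters concrete: in the two-axis grid layout, a hexagon's one-ring occupies a 3x3 patch minus two opposite corners, the cells that are not true hex neighbors, which is exactly what the corner filtering above removes. A sketch, with the caveat that which two corners are zeroed depends on the axis orientation convention:

```python
import numpy as np

# One-ring filter with equal weights: the center plus its 6 hex
# neighbors; the two 3x3 corners that are not hex neighbors get 0.
sum_1ring = np.array([[0., 1., 1.],
                      [1., 1., 1.],
                      [1., 1., 0.]])

# Normalizing the equal weights turns the sum into an average ...
avg_1ring = sum_1ring / sum_1ring.sum()

# ... and uneven weights give a weighted (dynamic) version of the sum,
# e.g. keeping more of the center cell's own value (these weights are
# illustrative, not the ones used in the talk):
weighted_1ring = np.array([[0. , 0.5, 0.5],
                           [0.5, 4. , 0.5],
                           [0.5, 0.5, 0. ]])
weighted_1ring /= weighted_1ring.sum()
```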
Another thing I want to mention: when we talk about the advantage of hexagons being more flexible, we want to handle each hexagon separately, which means we may want to use different k-ring sizes for different hexagons. For example, for some hexagons in the city center we may use a small k-ring, while for some hexagons in non-urban areas we may want to use a large k-ring.
So how can we handle this? Actually, the solution is very brute force: because the hexagon convolution is so fast, we just run the k-ring smoothing for all the sizes in parallel and then stitch the data together afterward.
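A sketch of that brute-force idea: build one kernel per k, run them all, and pick each hexagon's result according to a per-hexagon k assignment (`k_map` below is a hypothetical assignment, e.g. small k downtown, large k elsewhere):

```python
import numpy as np
from scipy.signal import convolve2d

def hex_ring_kernel(k):
    """Equal-weight k-ring kernel in the two-axis layout: cells at hex
    distance > k are zeroed.  In axial coordinates the hex distance
    from the origin is (|q| + |r| + |q + r|) / 2."""
    size = 2 * k + 1
    kern = np.zeros((size, size))
    for r in range(-k, k + 1):
        for q in range(-k, k + 1):
            if (abs(q) + abs(r) + abs(q + r)) / 2 <= k:
                kern[r + k, q + k] = 1.0
    return kern / kern.sum()

grid = np.random.rand(64, 64)                      # axial-mapped values
k_map = np.random.choice([1, 10, 20], grid.shape)  # hypothetical per-cell k
ks = (1, 10, 20)
results = {k: convolve2d(grid, hex_ring_kernel(k), mode="same") for k in ks}
smoothed = np.select([k_map == k for k in ks], [results[k] for k in ks])
```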
Next, let me show you how fast the convolution is. Here are the performance results. For the comparison we implemented three approaches. The first is the basic implementation: for each hexagon we get its neighbors and then do the aggregation, the sum, the mean, and so on. For the second implementation we used TensorFlow (it could be any available package, but we chose TensorFlow) to implement the hexagon convolution, and we ran it on CPU. The third is the same code run on GPU. We can clearly see that by using the TensorFlow implementation of hexagon convolution with a GPU, we achieve almost a 100x performance improvement; even on CPU, we still achieve about a 10x performance improvement.
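The shape of the TensorFlow variant looks roughly like this (a sketch, not the actual benchmark code); the same graph runs on CPU or GPU, which is where the 10x and ~100x numbers come from:

```python
import numpy as np
import tensorflow as tf

# Axial-mapped hexagon values in NHWC layout, and a hex one-ring
# kernel in HWIO layout (the masked 3x3 kernel from earlier).
grid = np.random.rand(1, 512, 512, 1).astype(np.float32)
kernel = np.array([[0., 1., 1.],
                   [1., 1., 1.],
                   [1., 1., 0.]], dtype=np.float32)
kernel = (kernel / kernel.sum()).reshape(3, 3, 1, 1)

# On a machine with a visible GPU, TensorFlow places this op on the
# GPU automatically; otherwise it runs on CPU.
smoothed = tf.nn.conv2d(grid, kernel, strides=1, padding="SAME")
```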
To summarize a little: by using hexagon convolution we can dramatically improve our efficiency. Because this kind of data smoothing is so fundamental for many applications, this really helps us reduce the computation bottleneck, so we can put more of our effort into the business logic. I've talked about how we do data smoothing using hexagon convolution, but hexagon convolution is actually more than that, because convolution is such a fundamental component. Think about deep learning: convolutional neural networks have been a foundational component of many deep learning network structures.
It's similar for our forecasting problem. Recall that for the real-time forecasting problem, we try to predict, for every hexagon in the world, for every minute, for the next 60 minutes into the future, different quantities: for example supply, demand, driver waiting time, and so on. The input is all kinds of data (historical data, recent data, external signals), and the output is basically our prediction.
If we just look at each hexagon separately, the big problem is that, because of data sparsity, it's very hard to really improve the performance. Here we can clearly use hexagon convolution. This is just one neural network structure I'm showing as an example; we have tried different architectures. In this architecture, the first step is to run the hexagon convolution, which basically aggregates all kinds of signals together; the aggregated signals are then fed into the later parts of the deep neural network. For the sake of time I'm not going to talk through the details of this network structure, but roughly, it was inspired by residual networks, which let us increase the number of network layers, and a lot of ideas are borrowed from WaveNet, basically for how to process sequence data.
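As a toy illustration of hexagon convolution as a building block inside a network (a sketch only, not Uber's actual architecture, which also predicts distributions and borrows from ResNet and WaveNet): learned 3x3 kernels can be kept hex-shaped with a constraint that zeroes the two non-neighbor corners after each update:

```python
import tensorflow as tf

# Mask for 3x3 kernels in the two-axis layout: corners that are not
# hex neighbors are zeroed (broadcast over in/out channels).
HEX_MASK = tf.constant([[0., 1., 1.],
                        [1., 1., 1.],
                        [1., 1., 0.]])[:, :, None, None]

class HexKernel(tf.keras.constraints.Constraint):
    """Keep learned 3x3 kernels hexagonal by masking the corners."""
    def __call__(self, w):
        return w * HEX_MASK

# Hypothetical model: hexagon-convolution layers aggregate the input
# signals, then a 1x1 layer produces a per-hexagon prediction.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 64, 8)),  # axial grid with 8 signal channels
    tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu",
                           kernel_constraint=HexKernel()),
    tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu",
                           kernel_constraint=HexKernel()),
    tf.keras.layers.Conv2D(1, 1),
])
```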
There are some other interesting parts: in this network we are not really predicting one value, we are predicting a distribution. I'm not going to go into the details, but the rough point I want to make is that hexagon convolution is such a fundamental component for this deep learning model, and it actually opens up many opportunities for us. If you look at the deep learning literature, research, or even industry, so many components are built on top of convolution. Before, because we use hexagon data, it was very hard for us to do convolution, and there was a lot we could not leverage; now, hexagon convolution opens the door, so we can leverage a lot of existing work.
With that, let me briefly summarize the talk. At Uber we built H3, and all kinds of Uber business data are aggregated at the hexagon level. Because we use small hexagons, we end up with some busy hexagons and some non-busy hexagons, and data sparsity can become an issue when we want to make solid, stable decisions; that's why we need a lot of data smoothing algorithms. To make this data smoothing more efficient, we built hexagon convolution, which really improves the computation cost. Using hexagon convolution also opens the door for us to do fancier things, building machine learning and deep learning models, and we are using hexagon convolution to build our real-time forecasting models. With that, I would just like to thank you all for coming to the talk.