[Uber Open Summit 2018] Hexagon Convolution for Data Smoothing & Forecasting

00:22:18
https://www.youtube.com/watch?v=z3PaGIQTFSE

Summary

TLDR: In this presentation, TR from Uber's marketplace forecasting team discusses how hexagon (H3) data is processed, focusing on data smoothing techniques and forecasting methods. The talk highlights the challenges posed by data sparsity, where some hexagons receive very few requests. Two main approaches to data smoothing are introduced, clustering and kernel smoothing, with an emphasis on the latter for its flexibility. The speaker explains how hexagon convolution can be implemented to aggregate data from neighboring hexagons efficiently, improving prediction models and computational performance. The presentation concludes by discussing how hexagon convolution enables advanced machine learning and deep learning applications in real-time forecasting.

Takeaways

  • 📊 Understanding hexagon data is crucial for accurate forecasting.
  • 🔍 Data sparsity poses significant challenges in prediction models.
  • 🔄 Smoothing techniques help stabilize predictions by aggregating data.
  • 🧩 Kernel smoothing is preferred for its flexibility over clustering.
  • ⚙️ Hexagon convolution enhances computational efficiency in real-time forecasting.
  • 🌐 Neighboring hexagons often share similar patterns, aiding predictions.
  • 📈 External signals like events and weather improve forecasting accuracy.
  • ⚡ GPU acceleration can dramatically speed up hexagon convolution.
  • 🔄 Different kernel sizes can be applied to different hexagons for better results.
  • 🚀 Hexagon convolution opens doors for advanced machine learning applications.

Timeline

  • 00:00:00 - 00:05:00

    The speaker introduces the topic of processing hexagon data, focusing on smoothing techniques and forecasting methods. They highlight the transition from traditional geofences to a more flexible hexagon-based approach, which allows for better business decisions, such as dynamic pricing based on localized demand.

  • 00:05:00 - 00:10:00

    The speaker discusses the challenges of data sparsity in hexagon data, particularly for forecasting. They explain the importance of using historical data, recent demand, and external signals to build predictive models. The issue arises when there is insufficient data for certain hexagons, making predictions unreliable.

  • 00:10:00 - 00:15:00

    Two approaches to data smoothing are presented: clustering and kernel smoothing. Clustering groups hexagons into clusters for aggregation, which reduces computation but can lead to arbitrary boundaries. Kernel smoothing, on the other hand, considers each hexagon's neighbors individually, providing more flexibility and avoiding boundary issues, though it requires more computational resources.

  • 00:15:00 - 00:22:18

    The speaker introduces the concept of hexagon convolution, which leverages existing convolution implementations to process hexagon data efficiently. They explain the transformation of hexagon data into a grid format for convolution, which improves computational efficiency and opens up opportunities for advanced machine learning and deep learning applications in real-time forecasting.

Video Q&A

  • What is hexagon data?

    Hexagon data refers to data aggregated over hexagonal grid cells (H3 hexagons) covering Uber's service areas, which can be used for forecasting demand and supply.

  • What are the challenges with hexagon data?

    The main challenge is data sparsity: some hexagons have very few requests, making it difficult to build reliable prediction models.

  • What is the purpose of data smoothing?

    Data smoothing aggregates information from neighboring hexagons to improve the stability and accuracy of forecasting models.

  • What are the two main approaches to data smoothing discussed?

    The two main approaches are clustering and kernel (k-ring) smoothing, with kernel smoothing preferred for its flexibility.

  • How does hexagon convolution improve efficiency?

    Hexagon convolution aggregates data from neighboring hexagons efficiently by leveraging existing convolution implementations, which greatly improves computational speed.

  • What is the significance of using neighboring hexagon data?

    Neighboring hexagons often share similar patterns, which can be used to improve predictions for hexagons with sparse data.

  • What is the role of external signals in forecasting?

    External signals, such as events or weather conditions, are considered to enhance the accuracy of predictions.

  • How does the speaker propose to handle different kernel sizes for hexagons?

    The speaker suggests running multiple smoothing processes in parallel with different kernel sizes to accommodate varying data characteristics.

  • What is the impact of using GPU for hexagon convolution?

    Using a GPU for hexagon convolution can significantly improve performance, achieving up to a hundred times faster computation compared to the basic implementation.

  • What future opportunities does hexagon convolution open up?

    Hexagon convolution enables the use of advanced machine learning and deep learning models, enhancing the ability to process and predict data effectively.

Transcript (en)
  • 00:00:00
    Hello everyone. I think today you have already heard a lot about hexagons, H3, and so on, so next I'm going to talk about how we process hexagon data. In particular, I'm going to focus on how we smooth hexagon data and how we use it for forecasting. By the way, my name is TR, and I'm an engineer on the marketplace forecasting team.
  • 00:00:25
    First, let me show you an example. Here is an illustration of Uber trip requests in San Francisco. The colors matter: red means we have more requests, and blue means we have fewer. What I really want to show here is that we have different categories of hexagons: some hexagons have lots of requests, while others have very few.
  • 00:00:58
    Now let me come to the problem. Actually, before the problem, let me say a bit more about hexagon data. As Cameron already mentioned, we are moving from geofences to hexagons, and the big advantage is flexibility. We have many small hexagons, and all kinds of Uber data are aggregated over these small hexagons, so we can make very flexible business decisions. For example, we can set a different price for each hexagon instead of using one price for the entire city.
  • 00:01:30
    Because of the hexagonal partitioning, we can have thousands of hexagons; in San Francisco, for example, we have more than 50,000. The problem, as shown in the previous example, is that while some hexagons have lots of data, which is good, many hexagons have very sparse data; some may see only a few requests in a whole day. When we have this kind of data sparsity, the big question is how we can make decisions that are stable and solid.
  • 00:02:06
    Because I'm from the forecasting team, I'm going to talk about the data sparsity issue in the context of forecasting. For the sake of time I won't go into all the details of real-time forecasting; let me just give a very high-level overview of the problem and how we try to solve it. At Uber we want to predict the future marketplace state, so that the predicted state can help us make better decisions in the present.
  • 00:02:41
    This is essentially a very classical machine learning problem: we want to build prediction models that take in data features, train on them, and produce predictions. Some common features: we can use the historical data of a hexagon, for example what happened over the last few weeks, and we also want to use recent data, say the last 60 minutes, such as how many trips, how much demand, and how much supply there was in this hexagon. We also want to consider external signals, for example whether there is an event in this hexagon, whether today is a holiday, or whether there is bad weather such as rain or a snowstorm. The basic idea is that we use all of this data to build and train our model and make predictions.
  • 00:03:32
    Now the problem comes in. Because we are doing prediction at the hexagon level, if one hexagon has very sparse data, maybe just a few data points in a long history, the model's performance becomes very bad. Think about it: if there have been almost no requests in the last few days, what am I going to predict for the next 60 minutes? This problem is very hard.
  • 00:03:58
    Another observation is that neighboring hexagons normally share similar patterns: if one hexagon has very few trips, there are probably not many trips in the nearby hexagons either. As a result, the solution we have been taking is to smooth the data. The basic idea is to use the data from neighboring hexagons when computing the current value of a hexagon, so that we can use those smoothed values to build models for prediction.
  • 00:04:33
    Generally there are two approaches to data smoothing. The first we call hexagon clustering. The basic idea is simple: we first group hexagons into clusters, and then for the smoothing we take the values of all the hexagons in a cluster, aggregate them, and use the aggregate as the value of the hexagon we are looking at. The advantage of this approach is that we reduce computation: we have much less data after aggregation, so it is much easier to train the model and to do prediction. The trade-off is that the quality of the prediction model depends heavily on the quality of the clustering; I think Cameron also talked about geofences at the beginning, and if you do not have a good clustering, your predictions will suffer. Another problem is that some neighboring hexagons will belong to different clusters and may therefore use different models, so we end up with arbitrary clustering boundaries, similar to what was shown in Cameron's talk, where very close neighbors can end up using different prices.
  • 00:05:56
    Because of the disadvantages of the clustering approach, a lot of applications at Uber use the other approach, k-ring smoothing. Let me use one example to show what we mean. Assume we have this green hexagon, and we have different layers of neighbors, which I label one, two, three. The rough idea of k-ring smoothing is that whenever we look at one hexagon, we also look at all of its k-ring neighbors, where k can take different values. We take all the neighbors' values, aggregate them with, say, an average or some other dynamic weighting, and use that value for the green hexagon.
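
As a concrete illustration of this per-hexagon smoothing loop, here is a minimal sketch; it is not Uber's code, it assumes the h3-py 3.x API (where `k_ring(cell, k)` returns every cell within k rings; newer releases call this `grid_disk`), and the demand values are invented.

```python
import random

import h3  # pip install h3 (3.x API assumed; in 4.x, k_ring is called grid_disk)

# Build a toy sparse demand dictionary around a documented example H3 index.
origin = "8928308280fffff"                      # res-9 cell from the H3 docs
cells = sorted(h3.k_ring(origin, 3))            # a small neighborhood to play with
rng = random.Random(0)
demand = {c: float(rng.randint(1, 5)) for c in rng.sample(cells, 10)}

def kring_smooth(values, k=2):
    """Naive k-ring smoothing: replace each hexagon's value with the mean
    over every hexagon within k rings (hexagons with no data count as 0)."""
    smoothed = {}
    for cell in values:
        neighbors = h3.k_ring(cell, k)          # includes the cell itself
        total = sum(values.get(n, 0.0) for n in neighbors)
        smoothed[cell] = total / len(neighbors)
    return smoothed

print(kring_smooth(demand, k=2))
```

With k around 10 or 20, each `k_ring` call returns hundreds of cells, which is exactly the per-hexagon cost discussed next.
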
  • 00:06:51
    Compared with the clustering approach, k-ring smoothing is much more flexible, because we process each hexagon separately instead of relying on a predefined clustering. Another big advantage is that, because of this flexibility, there are no arbitrary clustering boundaries, so we avoid that boundary issue. The trade-off is that the computation grows considerably: for each hexagon we have to gather its neighbors, and in many Uber applications we may use a k-ring of 10 or even 20, which means that for each hexagon we need to aggregate the values of perhaps hundreds of hexagons.
  • 00:07:38
    And just looking at San Francisco, we may have more than 50,000 hexagons in total. Remember that we want to do this for real-time forecasting, and actually for many other real-time applications, which means that every minute we need to do a lot of computation for the smoothing. And it is not only one city: Uber currently has 700-plus cities in production, so all of them together is a lot more computation. So in the rest of the talk I will focus a little more on how we can do this k-ring smoothing efficiently.
  • 00:08:19
    I'm going to introduce the hexagon convolution approach. Before I talk about hexagon convolution, let me first bring up the general convolution concept again. I'm sure most of you are familiar with convolution, because convolution is a foundational component of convolutional neural networks, which have become so popular. In case you're still not familiar, let me use one example to show how general convolution works.
  • 00:08:45
    I'll use two-dimensional data to show how it works. The rough idea of convolution is that we have an input matrix and we define a small kernel matrix, or filter, and we want to run the convolution to get an output matrix. The basic idea is to slide the small kernel matrix over the big matrix; at each position we do a pairwise multiplication and addition, and that gives us one result. For example, assume the green one is the kernel matrix, the small matrix, and we slide it over the big input matrix; in the blue area we just do the pairwise value multiplication and then sum the results. In the next example, we keep sliding the kernel over the input matrix to get all the convolution results. Of course there are many details to convolution that I'm not going to get into; you can find a lot of tutorials on how to do convolution.
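
To make the slide-multiply-add arithmetic concrete, here is a small NumPy sketch with invented numbers; it implements the "valid" cross-correlation form used in CNNs, which is the same arithmetic optimized libraries compute far faster.

```python
import numpy as np

# Toy input matrix and a 3x3 kernel.
x = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[0, 1, 0],
                   [1, 1, 1],
                   [0, 1, 0]], dtype=float)

kh, kw = kernel.shape
oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
out = np.zeros((oh, ow))

for i in range(oh):                 # slide the kernel over every valid position
    for j in range(ow):
        window = x[i:i + kh, j:j + kw]
        out[i, j] = np.sum(window * kernel)   # pairwise multiply, then add

print(out)
```
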
  • 00:09:55
    What are the nice properties of convolution? The first is that it is highly parallelizable. Another is that, because it is such a common and fundamental component of many applications and computations, there are mature implementations in many different packages, for example SciPy, TensorFlow, PyTorch, and so on. And because it is so fundamental, a lot of optimization work has gone into making convolution efficient, and we can also use GPUs to accelerate the computation.
  • 00:10:30
    So we have convolution on grid matrices; the question is how we can do the same thing on hexagon data. The idea is actually very similar: we want to run a convolution on hexagon data. Assume we have an input set of hexagons; we also want to define our own kernel, basically a k-ring kernel, and run the convolution with that kernel over the input.
  • 00:11:03
    So what is the challenge? The challenge is that there is no existing, readily available package that takes in hexagon coordinate data, because most convolution packages assume you have a grid matrix, array, or tensor. The question for us is how we can leverage existing convolution implementations, so that we get all the benefits of their optimizations and GPU acceleration. Next I'm going to talk about how we do the transformation so that we can leverage existing convolution implementations.
  • 00:11:48
    Before I talk about how we do the transformation and implement hexagon convolution, let me first introduce a few hexagon coordinate systems, because we need a way to represent each hexagon in order to do the transformation. Here are a few very commonly used coordinate systems. The first is called cube coordinates: each hexagon is represented by three-dimensional data, three axes, basically x, y, z. There are some trade-offs with this approach. First, using three-dimensional data to represent two-dimensional data is not very memory efficient. Second, because we have three values, it does not map directly to a grid system, so we are not using this coordinate system.
  • 00:12:38
    Another coordinate system is called doubled coordinates. I'm not going to get into all the details, but the rough idea is that each hexagon is represented by two numbers, or two axes. Theoretically it is easy to map this to a square grid matrix, but after we do the direct mapping we find that neighbors in the hexagon data may not be direct neighbors in the square grid. Recall that when we run a convolution we normally want a solid area; if there are lots of gaps it is not efficient, and we also have to do a lot of filtering to remove values we do not need. So we are not using this one either.
  • 00:13:29
    This is another coordinate system, the axial coordinates. Here we also use two numbers to represent each hexagon, and the good thing is that after the mapping, neighbors in the hexagon data are also roughly neighbors in the grid matrix. More importantly, it gives a solid area, and we only have a few cells in the corners, which we can easily filter out. So the coordinate system we have chosen is basically based on the axial coordinates.
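
For reference, here is a short sketch of the standard hexagonal-grid math behind that choice (generic textbook formulas, not Uber-specific code): cube coordinates carry a redundant third axis, axial coordinates drop it, and grid distance has a simple closed form in axial coordinates.

```python
def cube_to_axial(x, y, z):
    """Cube coordinates satisfy x + y + z == 0, so one axis is redundant;
    axial coordinates simply keep two of them (here q = x, r = z)."""
    assert x + y + z == 0
    return x, z

def axial_to_cube(q, r):
    return q, -q - r, r

def hex_distance(q1, r1, q2, r2):
    """Grid distance between two hexagons given in axial coordinates."""
    dq, dr = q1 - q2, r1 - r2
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2

# The k-ring around a hexagon is everything with hex_distance <= k. Inside the
# (2k+1) x (2k+1) axial bounding box, only a few corner cells fail this test,
# which is why the axial layout maps to a dense square grid so cleanly.
print(hex_distance(0, 0, 2, -1))   # -> 2
```
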
  • 00:14:07
    Next, let me talk about how we use this system to do hexagon convolution. Assume we have a big input of hexagon data, and we also define our own small filter. The first step is to use the axial coordinates to represent each hexagon. After we have the axial coordinates, the next step is to do the transformation, similar to what we showed for the axial example, and map the data into a square grid matrix. Once we have the square grid matrix, things become much easier: we can leverage all the available implementations, run the convolution on the square grid matrix, and get the convolution result. After we have the result on the grid, the last step is simply to map it back to hexagon data.
  • 00:15:10
    So far I have talked about how we do hexagon convolution. Just to recall: we have hexagon data represented using the H3 library; the first step is to map each hexagon to its corresponding representation using the axial coordinates or something similar; then we do the transformation to map it into a square grid matrix; then we run the convolution; and after we get the convolution result, we just map it back.
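
Here is a minimal end-to-end sketch of that recipe. It assumes we already have one value per hexagon keyed by axial coordinates (q, r) (the values and coordinates below are invented) and uses SciPy's stock 2-D convolution; the 1-ring kernel is symmetric under 180-degree rotation, so convolution and cross-correlation give the same result here.

```python
import numpy as np
from scipy.signal import convolve2d

# Hypothetical input: one value per hexagon, keyed by axial coordinates (q, r).
values = {(0, 0): 5.0, (1, 0): 2.0, (0, 1): 1.0, (-1, 1): 4.0, (2, -1): 3.0}

# 1. Pack the axial coordinates into a dense grid (shift so indices start at 0).
qs, rs = zip(*values)
q0, r0 = min(qs), min(rs)
grid = np.zeros((max(qs) - q0 + 1, max(rs) - r0 + 1))
for (q, r), v in values.items():
    grid[q - q0, r - r0] = v

# 2. Run an ordinary grid convolution with a 1-ring hexagon kernel.
#    In axial coordinates the six neighbors of (0, 0) are the marked cells below;
#    the two corner cells of the 3x3 box are not hexagon neighbors, so they stay 0.
kernel = np.array([[0, 1, 1],
                   [1, 1, 1],
                   [1, 1, 0]], dtype=float)
smoothed = convolve2d(grid, kernel, mode="same")

# 3. Map the result back onto the original hexagons.
result = {(q, r): smoothed[q - q0, r - r0] for (q, r) in values}
print(result)
```
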
  • 00:15:42
    Next, let me talk a little more about how we use hexagon convolution for this k-ring data smoothing. Recall how we do the k-ring smoothing: the rough idea is that the only thing we need to change is the filter. Here are some examples. The first one is a one-ring filter with equal weights; if we use this filter for the hexagon convolution, it basically means we are computing an average, sorry, just a sum. Beyond that, we can also use dynamic weights in the filter; looking at that example, by using such a filter we are essentially doing a weighted version of the sum when we do the k-ring smoothing.
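
Generalizing the one-ring example, a k-ring filter of any size can be generated directly from the axial-coordinate hex distance; the inverse-distance weighting below is just one illustrative choice, not necessarily what Uber uses.

```python
import numpy as np

def hex_distance(dq, dr):
    """Hexagon grid distance from the center, in axial-coordinate offsets."""
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2

def kring_kernel(k, weight=lambda d: 1.0):
    """Build a (2k+1) x (2k+1) filter whose nonzero entries cover the k-ring.
    With weight(d) == 1 everywhere, convolving with it yields a plain sum over
    the 3*k*(k+1) + 1 cells; dividing by kernel.sum() afterwards gives the mean."""
    size = 2 * k + 1
    kernel = np.zeros((size, size))
    for dq in range(-k, k + 1):
        for dr in range(-k, k + 1):
            d = hex_distance(dq, dr)
            if d <= k:                    # corner cells of the box fall outside the ring
                kernel[dq + k, dr + k] = weight(d)
    return kernel

equal = kring_kernel(1)                                          # 1-ring, equal weights
inverse_dist = kring_kernel(2, weight=lambda d: 1.0 / (1 + d))   # a weighted variant
print(equal)
print(inverse_dist.round(2))
```
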
  • 00:16:32
    Another thing I want to mention: when we talk about the advantage of hexagons being more flexible, we want to handle hexagons separately, which means we may want to use different k-ring sizes for different hexagons. For example, for some hexagons in the city core we may use a small k-ring, while for some hexagons in less busy areas we may want to use a larger one. How can we handle this? The solution is actually quite brute force: because the hexagon convolution is so fast, we simply run the smoothing for multiple k-ring sizes in parallel and combine the results afterwards. Next, let me show you how fast the convolution is.
  • 00:17:15
    Here are the performance results. For the comparison we implemented three approaches. The first is the basic implementation: for each hexagon we get its neighbors and then do the aggregation, the sum, the mean, and so on. For the second implementation we used TensorFlow (it could be any available package, but we chose TensorFlow) to implement the hexagon convolution, running on CPU. The third is the same code run on GPU. We can clearly see that by using the TensorFlow implementation of hexagon convolution with a GPU we achieve almost a hundred-times performance improvement, and even on CPU we still achieve roughly a ten-times improvement.
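
A TensorFlow sketch along these lines is shown below; it is my illustration rather than Uber's implementation. The dense axial grid goes in as a one-channel image, two k-ring kernels are stacked as output channels so that several smoothing radii are computed in a single pass (the brute-force parallel idea from earlier), and the identical code runs on CPU or GPU depending on where TensorFlow places it.

```python
import numpy as np
import tensorflow as tf  # TensorFlow 2.x

def kring_kernel(k):
    """(2k+1) x (2k+1) equal-weight k-ring mask in axial coordinates."""
    size = 2 * k + 1
    kern = np.zeros((size, size), dtype=np.float32)
    for dq in range(-k, k + 1):
        for dr in range(-k, k + 1):
            if (abs(dq) + abs(dr) + abs(dq + dr)) // 2 <= k:
                kern[dq + k, dr + k] = 1.0
    return kern

# Toy stand-in for the dense axial-coordinate grid produced by the packing step.
grid = np.random.rand(64, 64).astype(np.float32)

# Stack k=1 and k=2 filters as two output channels of a single conv2d call;
# the smaller kernel is zero-padded so both fit one [h, w, in_ch, out_ch] tensor.
k1 = np.pad(kring_kernel(1), 1)                 # 3x3 -> 5x5, zeros outside the 1-ring
k2 = kring_kernel(2)                            # 5x5
filters = np.stack([k1, k2], axis=-1)[:, :, np.newaxis, :]   # shape (5, 5, 1, 2)

x = grid[np.newaxis, :, :, np.newaxis]          # NHWC: (1, 64, 64, 1)
smoothed = tf.nn.conv2d(x, filters, strides=1, padding="SAME")
print(smoothed.shape)                           # (1, 64, 64, 2): one channel per k
```

On a machine with a GPU, TensorFlow places this computation there automatically, which is where the reported order-of-magnitude speedups come from.
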
  • 00:18:09
    To summarize a little: by using hexagon convolution we can dramatically improve our efficiency, and because this kind of data smoothing is so fundamental for many applications, it really helps us reduce the computation bottleneck, so that we can put more of our effort into the business-logic part.
  • 00:18:32
    I have talked about how we do data smoothing using hexagon convolution, but hexagon convolution is actually more than that, because it is such a fundamental component. Think about deep learning: convolutional neural networks have been a foundational component of many deep-learning network structures.
  • 00:18:59
    It is similar for our forecasting problem. Recall that for the real-time forecasting problem we try to predict, for every hexagon in the world, for every minute, for the next 60 minutes into the future, different quantities, for example supply, demand, driver waiting time, and so on. The input is all kinds of data, historical data, recent data, external signals, and the output is our prediction. If we just look at each hexagon separately, the big problem is that, because of data sparsity, it is very hard to really improve performance, and here we can clearly use hexagon convolution.
  • 00:19:30
    This is just one neural-network structure I'm showing as an example; we have tried different architectures. In this architecture, the first step is to run the hexagon convolution, so that we can aggregate all kinds of signals together, and the aggregated signal is then fed into the later parts of the deep neural network. For the sake of time I'm not going to go into the details of this network structure, but roughly it was inspired by residual networks, where we can increase the number of network layers, and a lot of the ideas are borrowed from WaveNet, basically for how to process sequence data.
  • 00:20:26
    There are some other interesting parts: in this network we are not really predicting a single value, we are predicting a distribution. I'm not going to go into the details, but the rough idea I want to mention here is that hexagon convolution is such a fundamental component for this deep-learning model, and it actually opens up many opportunities for us. If you look at the deep-learning literature, research, and industry, so many components are built on top of convolution. Before, because we use hexagon data, it was very hard for us to do convolution, and there was a lot we could not leverage; now, with hexagon convolution, the door is open and we can leverage a lot of existing work.
  • 00:21:15
    With that, let me briefly summarize the talk. At Uber we built H3, and we aggregate all kinds of Uber business data at the hexagon level. Because we also use small hexagons, we end up with some busy hexagons and some non-busy ones, and data sparsity can become an issue when we want to make solid, stable decisions; that is why we need a lot of data smoothing algorithms. To make this data smoothing more efficient, we built hexagon convolution, which really improves the computation cost, and it also opens the door for us to do a lot of fancier things, building machine learning and deep learning models; we are using hexagon convolution to build our real-time forecasting models. With that, I would just like to thank you all for coming to the talk.
Tags
  • hexagon data
  • data smoothing
  • forecasting
  • Uber
  • kernel smoothing
  • clustering
  • data sparsity
  • hexagon convolution
  • machine learning
  • deep learning