My Favorite Model with Daniel Lee - nyhackr November Meetup

00:52:29
https://www.youtube.com/watch?v=5QE_feHJOjk

Summary

TL;DR: The monthly meetup, originally planned as a lighthearted session, turned into a technical discussion of statistical models, particularly Bayesian inference, applied to the music industry. The speaker, Daniel, explained how he used these models to forecast online music streams by digital service provider (DSP) and by country. Using Stan, a probabilistic programming language, he models stream counts as a music (ADSR) envelope, which makes the results easier to communicate to music professionals. He also covers the challenges posed by incorrect track metadata. Finally, the meetups usually end with a social gathering, this time at Malt House, where attendees can discuss the talk more informally.

Key Takeaways

  • 📅 The meetup planned as a lighthearted session centered on technical discussion instead.
  • 📊 Daniel uses Stan for statistical models in the music industry.
  • 🎶 He uses a music-envelope model to forecast streams.
  • 📉 The models help explain how streams rise and decline.
  • 🌍 Data is analyzed by country and by digital service provider (DSP).
  • 🔄 Incorrect metadata is a major challenge.
  • 🔍 User behavior is understood to differ between Apple Music and Spotify.
  • 👥 Communicating technical results to non-experts is crucial.
  • 🍕 The meetup ends with a social gathering at Malt House.
  • 🔢 The models help manage music releases and marketing.

Timeline

  • 00:00:00 - 00:05:00

    The session opens on a light note, then turns more serious with a discussion of Bayesian statistics and job postings.

  • 00:05:00 - 00:10:00

    Attendees are pointed to a Slack channel for posting job openings, and the food is rated during the meetup.

  • 00:10:00 - 00:15:00

    Thanks go to NYU for supporting the event, followed by news of upcoming videos and the next monthly meetups.

  • 00:15:00 - 00:20:00

    The fill-in speaker talks about turntables, their features, and his personal preferences.

  • 00:20:00 - 00:25:00

    He shares his experience with Stan, an open-source language for probabilistic modeling, and describes various statistical models he has worked on.

  • 00:25:00 - 00:30:00

    The speaker explains his favorite statistical model for forecasting music streams, which uses sound envelopes for the modeling.

  • 00:30:00 - 00:35:00

    The model relates stream counts to the shape of an ADSR envelope, which helps forecast future streams given current trends.

  • 00:35:00 - 00:40:00

    The speaker shows forecast results and discusses challenges from factors not represented in the data, such as artist tours.

  • 00:40:00 - 00:45:00

    The benefits of reframing the model in simple terms that non-statisticians can understand are highlighted.

  • 00:45:00 - 00:52:29

    A lively Q&A covers audience questions about applying the models, motivation, and challenges in the music industry.

Frequently Asked Questions

  • What does the speaker, Daniel, do?

    Daniel currently works on estimating and projecting the skills of sports players, at present in the NBA, using statistical models. The streaming work he presents was done earlier at Warner Music.

  • What topic was added to the discussion this month?

    Instead of a lighthearted talk, a technical topic on Bayesian and statistical models was covered.

  • Why was a fill-in speaker needed?

    There was a late cancellation, so a replacement was needed to keep the session going.

  • Which statistical model is his favorite?

    His favorite is the model used to forecast online music streams, because it is intuitive for music professionals.

  • What challenges does Daniel face with music metadata?

    Incorrect metadata is a major problem because it affects the accuracy of the stream forecasting models.

  • Was there a change from the planned theme for this meetup?

    Yes, a lighthearted theme was planned but was replaced by a detailed technical discussion.

  • How does the meetup usually end?

    Attendees tend to gather at Malt House afterward to continue the discussion informally.

  • What is the main purpose of the model Daniel developed?

    To estimate and forecast music streams as a function of factors such as new track releases and marketing.

Transcript (en)
  • 00:00:00
    so we're going going so it is now
  • 00:00:02
    November right okay uh we promised you a
  • 00:00:06
    lighthearted talk for this month but it
  • 00:00:07
    all came crashing
  • 00:00:09
    down he's right there no he's cool he's
  • 00:00:12
    going he's okay physically fine he's in
  • 00:00:14
    the room uh and we're going to try to
  • 00:00:16
    get him to speak again sometime next
  • 00:00:18
    year so you know we all be excited for
  • 00:00:19
    when that happens so we're going to make
  • 00:00:21
    that happen right cool absolutely all
  • 00:00:23
    right so so we're going to have a deep
  • 00:00:25
    heavy statistical nerdy talk today then
  • 00:00:28
    all right so anyone here is expecting
  • 00:00:30
    fuzzy monsters it's not going to happen
  • 00:00:32
    we're going to get a different type of
  • 00:00:32
    monster a Bayesian monster that's the
  • 00:00:35
    stuff in nightmares all right so first
  • 00:00:37
    things first though jobs is Anybody
  • 00:00:39
    hiring this
  • 00:00:41
    month Anybody
  • 00:00:43
    hiring no okay is anyone
  • 00:00:48
    firing yeah it seems like everyone is
  • 00:00:51
    right okay well in case anyone who is
  • 00:00:53
    watching virtually is hiring you know
  • 00:00:56
    you can go to the nyhackr Slack if
  • 00:00:58
    you want to find that you go to nyhackr.org
  • 00:01:00
    and click on the Slack hashtag and
  • 00:01:03
    then you could go to the job postings
  • 00:01:04
    Channel and post your job I think a job
  • 00:01:06
    was posted there fairly recently and
  • 00:01:08
    over the past 15 years we've gotten
  • 00:01:10
    numerous people jobs at this meet up
  • 00:01:12
    asking this question used to have a lot
  • 00:01:14
    more people hiring but hopefully that
  • 00:01:15
    will pick up again soon so if you want a
  • 00:01:17
    job check out the job postings Channel
  • 00:01:18
    if you want to hire someone post it in
  • 00:01:20
    the job postings Channel all right
  • 00:01:23
    second order of business the pizza today
  • 00:01:25
    is from Cello's pizza so everyone is in
  • 00:01:28
    the room get out your phone go to bit.
  • 00:01:31
    lee/ pizzle and rank it on the fiveo
  • 00:01:34
    scale and let us know how it is I see a
  • 00:01:37
    bunch of people eating but not taking
  • 00:01:38
    out their phones I will start calling
  • 00:01:40
    you out George all right you weren't
  • 00:01:43
    eating but you I saw you sitting there
  • 00:01:45
    without your phone everyone here great
  • 00:01:46
    great great if you're watching from home
  • 00:01:49
    uh go in the chat and let us know what
  • 00:01:50
    you're eating food is a big part of this
  • 00:01:52
    me not food pizza is a big part of this
  • 00:01:54
    Meetup but if you made poor life choices
  • 00:01:56
    and you don't have pizza let us know
  • 00:01:57
    what you're eating uh if if you want to
  • 00:02:00
    chat there are multiple ways to chat you
  • 00:02:01
    could chat right in the YouTube live
  • 00:02:03
    stream which you're watching or you
  • 00:02:05
    could chat in the nyhackr Slack at the
  • 00:02:08
    monthly Meetup chat channel so go there
  • 00:02:11
    or go on YouTube and tell us what you're
  • 00:02:13
    eating or if you have questions from the
  • 00:02:15
    speaker and you're watching virtually
  • 00:02:17
    later on you can go in there and we will
  • 00:02:20
    compile the questions asked virtually
  • 00:02:22
    and we will ask them intermingled with
  • 00:02:24
    the inperson
  • 00:02:27
    questions while everyone is voting I
  • 00:02:29
    will give a big big thank you to NYU
  • 00:02:31
    prism and George for making this happen
  • 00:02:33
    every month so everyone big round of
  • 00:02:34
    applause for George and NYU
  • 00:02:37
    prism so we we have a big thank you to
  • 00:02:40
    him for both enabling the space and
  • 00:02:42
    enabling the projector on the screen
  • 00:02:44
    that comes from George well he gets a
  • 00:02:46
    special shout out for that uh last month
  • 00:02:49
    we had the R in Government conference
  • 00:02:51
    the videos for that will be up ready for
  • 00:02:54
    everyone to view within the week it'll
  • 00:02:56
    be a nice Thanksgiving treat when you're
  • 00:02:57
    home avoiding your family you can watch
  • 00:02:59
    those videos
  • 00:03:00
    instead so we'll announce that when that
  • 00:03:02
    comes
  • 00:03:03
    out next month's meet up will be
  • 00:03:06
    December 3rd and it will be about
  • 00:03:09
    population density of different cities
  • 00:03:12
    so that'll be a nice urban planner thing
  • 00:03:14
    then in January George will be giving a
  • 00:03:17
    talk about some something controversial
  • 00:03:20
    statistical theory he has it'll make all
  • 00:03:23
    the causal inference people go
  • 00:03:25
    nuts with these one three cool
  • 00:03:28
    tricks right
  • 00:03:30
    all right but that'll be George speaking
  • 00:03:31
    January then in February Eric will be
  • 00:03:34
    hopefully giving that talk and if not
  • 00:03:36
    that talk some other talk you have
  • 00:03:38
    another talk you'll have to give instead
  • 00:03:39
    if you can't give that one all right
  • 00:03:41
    cool cool cool then in March will'll be
  • 00:03:43
    vickram cool then in April we're going
  • 00:03:45
    to have a talk about assuming we can
  • 00:03:47
    confirm this with them about uh mapping
  • 00:03:50
    they they have a new thing called Subway
  • 00:03:51
    stories it's about mapping you know
  • 00:03:53
    people getting on and off the train oh
  • 00:03:55
    where they start where they stop
  • 00:03:56
    remember a few months ago we had an MTA
  • 00:03:58
    person come talk they have a model for
  • 00:03:59
    where people get off these people are
  • 00:04:01
    using that data to make visualizations
  • 00:04:04
    so it's a very nice tie into our
  • 00:04:05
    previous talk all right then after the
  • 00:04:08
    talk today we are going to go to Malt
  • 00:04:11
    House George came up with this he said
  • 00:04:13
    it's a great
  • 00:04:15
    place it's bad it's Jared's fa there you
  • 00:04:17
    go but Nicole said she likes it so we're
  • 00:04:19
    going to trust her more than George
  • 00:04:21
    right and we're going to go to Malt House
  • 00:04:22
    which is located at where is it located
  • 00:04:25
    I think it's on Thompson Thompson a
  • 00:04:27
    little below the park you all have
  • 00:04:28
    Google Maps Malt House find it we're
  • 00:04:31
    gonna go there hopefully it'll be good
  • 00:04:32
    it's good to try a new place um so we'll
  • 00:04:34
    do that and anyone is watching virtually
  • 00:04:36
    in New York City we've had people show
  • 00:04:37
    up at the bar after the talk so come by
  • 00:04:39
    Thompson Thompson and what uh Thompson
  • 00:04:41
    and Bleecker Thompson and Bleecker all right
  • 00:04:43
    folks Thompson and Bleecker that's where
  • 00:04:44
    we're heading all right oh that's right
  • 00:04:46
    by oh Fior is gone but it's right by um
  • 00:04:49
    Arturo's which we've had for the meetup
  • 00:04:50
    before closer it's closer oh closer than
  • 00:04:52
    that all right cool cool cool oh yeah
  • 00:04:53
    that's on Houston thank you for correcting
  • 00:04:55
    thank you all right anyway so we had a
  • 00:04:58
    cancellation today so we're not learning
  • 00:04:59
    about pix we want to have a big thank
  • 00:05:01
    you for our fill in speaker who we
  • 00:05:04
    literally filled in last night around
  • 00:05:06
    10:30 at night I think we got him to
  • 00:05:07
    fill in so everyone both thank him and
  • 00:05:09
    just welcome to the stage welcome
  • 00:05:12
    [Applause]
  • 00:05:17
    Daniel come on
  • 00:05:20
    down and a big round of applause
  • 00:05:21
    seriously it was
  • 00:05:26
    great I'm going to disappoint Jared this
  • 00:05:29
    talk is really light-hearted it's not
  • 00:05:31
    going to be tight I figured it's a long
  • 00:05:34
    day it's uh what day is it today Tuesday
  • 00:05:37
    um yeah so I'm going to be talking about
  • 00:05:40
    turntables um on the left here we have
  • 00:05:44
    the Technics 1200 Mk2s and on the right
  • 00:05:48
    we have the M3Ds and I'm going to tell
  • 00:05:51
    you about my favorite model here um if
  • 00:05:54
    you don't know these were the Workhorse
  • 00:05:56
    turntables used in all clubs for many
  • 00:05:59
    decades
  • 00:06:00
    they're awesome um but there was a major
  • 00:06:03
    Improvement that happened in
  • 00:06:05
    97 um that little onoff switch right
  • 00:06:08
    there and the difference is really
  • 00:06:11
    subtle but on the left the whole top
  • 00:06:14
    turned right so you can easily hit it
  • 00:06:17
    and the whole top turned but on the right
  • 00:06:19
    had this little casing that went all the
  • 00:06:21
    way to the top so you actually had to be
  • 00:06:23
    um intentional about turning it there
  • 00:06:26
    were a couple other improvements here
  • 00:06:28
    the the pitch fader is different where
  • 00:06:30
    there's a little um depression in the
  • 00:06:33
    middle on the Mk2s um which made
  • 00:06:36
    it it's like a rounding error if you get
  • 00:06:37
    close to zero it's going to just go to
  • 00:06:39
    zero which is not good for trying to mix
  • 00:06:42
    but this is really the thing um and so
  • 00:06:46
    my favorite model is the Mk2s the
  • 00:06:49
    M3Ds just because of
  • 00:06:52
    usability and that's probably not why
  • 00:06:54
    you're
  • 00:06:55
    here all right I'm I'm going to talk to
  • 00:06:58
    you about my favorite statistical model
  • 00:07:02
    um just to give you some background I've
  • 00:07:05
    worked on Stan for a long time uh Stan's
  • 00:07:08
    an open source probabilistic programming
  • 00:07:10
    language it's known for Bayesian inference
  • 00:07:12
    but you can use it for other things uh
  • 00:07:14
    it's open source it's used across
  • 00:07:16
    Industries um there are all these uh
  • 00:07:18
    secondary packages that make it easier
  • 00:07:20
    to use RStan rstanarm brms um PyStan
  • 00:07:25
    CmdStanPy uh if you look at sort
  • 00:07:28
    of the R thing this by the way this talk
  • 00:07:30
    was prepared as a backup to the NYR so
  • 00:07:33
    this is a bunch of R type of stuff
  • 00:07:37
    here um I've been working with Stan for
  • 00:07:40
    about 13 years um did a lot of the
  • 00:07:43
    community building much like Jared has
  • 00:07:45
    been doing here online meetups in New
  • 00:07:47
    York City uh given a lot of talks um
  • 00:07:52
    some of the different models that I've
  • 00:07:54
    worked on with Stan and you could ask me
  • 00:07:56
    questions about any of this stuff and
  • 00:07:57
    I'll I'll be happy to answer them or
  • 00:07:59
    what I can
  • 00:08:01
    um Quality Control across different
  • 00:08:04
    sites for Diamond rating um there's
  • 00:08:07
    price elasticity um for Consumer
  • 00:08:11
    consumer pricing uh media mix
  • 00:08:14
    modeling is pretty big uh if you're into
  • 00:08:17
    pharmacokinetics or pharmacodynamics
  • 00:08:19
    Stan is used a lot there um one of the
  • 00:08:23
    things is that this those types of
  • 00:08:25
    models were the hardest for me to
  • 00:08:26
    implement sorry most effort to implement
  • 00:08:29
    I wouldn't say hardest but it it
  • 00:08:30
    definitely took a lot of time to
  • 00:08:31
    implement so working with ordinary
  • 00:08:33
    differential
  • 00:08:35
    equations um survival models um that was
  • 00:08:40
    the hardest math it took months of
  • 00:08:43
    pencil and paper to do integrals so that
  • 00:08:45
    I didn't have to put it into Stan uh you
  • 00:08:48
    know actual neural Nets using a brain
  • 00:08:51
    epilepsy model
  • 00:08:54
    um my current job is estimating and
  • 00:08:57
    projecting skills of sports players um
  • 00:09:00
    right now I'm working in the NBA so
  • 00:09:02
    that's what I
  • 00:09:03
    do uh and we'll talk about my favorite
  • 00:09:06
    model uh this work was not done alone
  • 00:09:10
    this was was done while at Warner Music
  • 00:09:12
    and with a team of great people um so
  • 00:09:16
    these are the people that I worked with
  • 00:09:17
    at the time and what we were trying to
  • 00:09:20
    do was estimate uh for any track a music
  • 00:09:24
    track estimate how many streams you
  • 00:09:26
    would get for the track in the future um
  • 00:09:29
    some of the reasons that we care the
  • 00:09:31
    first is it drives Revenue the more
  • 00:09:33
    streams you have the more money you have
  • 00:09:34
    great um the second thing that people
  • 00:09:37
    were interested in was how can we affect
  • 00:09:41
    the stream counts can we do anything to
  • 00:09:43
    boost it can we um you know advertise a
  • 00:09:47
    little
  • 00:09:48
    more um so what we were trying to do was
  • 00:09:51
    forecast and estimate these tracks into
  • 00:09:54
    the future and what we're interested in
  • 00:09:57
    is by what they call DSP which is a
  • 00:09:59
    digital service provider that's Apple
  • 00:10:01
    music uh
  • 00:10:03
    Spotify Deezer um there are a bunch of
  • 00:10:06
    them and we we were also interested in
  • 00:10:09
    by country um some of the things that
  • 00:10:11
    people were interested in were if it
  • 00:10:14
    takes off in the US is Europe going to
  • 00:10:16
    follow later if it if it takes off in
  • 00:10:18
    Paris in France does it go to Belgium
  • 00:10:21
    next where where's it
  • 00:10:23
    going
  • 00:10:24
    um so let's take a guess can you guess
  • 00:10:28
    where these songs are going to go based
  • 00:10:30
    on the Stream So the stream counts are
  • 00:10:32
    on are going
  • 00:10:34
    vertical uh Spotify is in blue Apple
  • 00:10:37
    music is in green and Amazon is in Red
  • 00:10:41
    so if you take a look this is a this is
  • 00:10:43
    an afro Afrobeats type of song here and
  • 00:10:49
    you know it's going there Jack Harlow
  • 00:10:51
    First Class if you're into like popish
  • 00:10:54
    hip-hop stuff that was released here and
  • 00:10:57
    it's it's going down
  • 00:11:00
    um so take a second in your mind try to
  • 00:11:04
    take a guess of where it's going to go
  • 00:11:06
    I'll show you what what actually
  • 00:11:13
    happened
  • 00:11:15
    oh I'll I'll show you in a little
  • 00:11:18
    bit oh actually it's right here okay
  • 00:11:22
    cool so yeah um if you look you'll see
  • 00:11:27
    this sort of pattern it goes up and
  • 00:11:29
    comes down pretty easy except for Mother
  • 00:11:32
    Mother Hayloft II right and you'll see that
  • 00:11:35
    it was going down and then something
  • 00:11:37
    happened and you got a lift um some
  • 00:11:40
    other things to note are the y-axis here
  • 00:11:45
    Jack Harlow is in the 4 million at the
  • 00:11:47
    top range uh CKay was at 300K right so we
  • 00:11:53
    have completely different types of
  • 00:11:55
    artists here that are represented in
  • 00:11:57
    this data set
  • 00:12:00
    um the forecasting
  • 00:12:02
    model so this model um like I said it's
  • 00:12:07
    my favorite model here is representing
  • 00:12:10
    the streams of music as or streams of um
  • 00:12:13
    stream counts as music
  • 00:12:16
    envelopes so how many of you play
  • 00:12:19
    instruments and have seen something like
  • 00:12:21
    this
  • 00:12:24
    before
  • 00:12:25
    okay so um all right I'll take a second
  • 00:12:29
    to explain what's going on um if you
  • 00:12:31
    imagine this is the envelope of sound um
  • 00:12:34
    amplitude goes up that's how loud a
  • 00:12:36
    sound is and as it comes back down it's
  • 00:12:38
    it gets quieter if you play a note on a
  • 00:12:40
    piano you press it that that's what
  • 00:12:43
    happens here it's going to get louder
  • 00:12:46
    right and then piano is is kind of
  • 00:12:47
    percussive so it gets loud really fast
  • 00:12:49
    if you play a different instrument it's
  • 00:12:50
    going to get loud over time um so that
  • 00:12:53
    that happens here it gets to a maximum
  • 00:12:56
    and as you hold it it's going to Decay a
  • 00:12:58
    little bit and sorry it's going to Decay
  • 00:13:00
    a little bit naturally and then as you
  • 00:13:02
    hold it it's going to
  • 00:13:03
    sustain and then at some point you're
  • 00:13:05
    going to let go of the the key and it's
  • 00:13:07
    going to release and you know this this
  • 00:13:09
    pattern happens differently with
  • 00:13:11
    different musical instruments um you can
  • 00:13:13
    do this with electronic instruments as
  • 00:13:15
    well so ADSR attack decay sustain
  • 00:13:21
    release for our forecasting model what
  • 00:13:24
    we're trying to do was estimate from the
  • 00:13:26
    data first of all a delay so between the
  • 00:13:30
    time that you release a song um we're
  • 00:13:33
    going to estimate the time before it
  • 00:13:36
    starts ramping up why are we trying to
  • 00:13:39
    estimate that part of the problem is
  • 00:13:41
    there's incorrect data it'll say that a
  • 00:13:44
    song was released a week prior two weeks
  • 00:13:47
    prior to when it actually was released
  • 00:13:49
    on the
  • 00:13:50
    platforms um we're trying to we're
  • 00:13:52
    trying to estimate this maximum the
  • 00:13:54
    distance between zero and the maximum in
  • 00:13:56
    the number of
  • 00:13:57
    streams uh we want to know the sustain
  • 00:14:01
    amplitude here so how high it stays
  • 00:14:05
    pretty
  • 00:14:06
    flat um we also want to know the attack
  • 00:14:09
    duration right so how long does it take
  • 00:14:11
    to get to its peak and then the Decay
  • 00:14:13
    duration how long does it go from its
  • 00:14:15
    peak back down to a place where it just
  • 00:14:18
    kind of flat lines for a while um in our
  • 00:14:21
    model we we ignore this release part
  • 00:14:24
    because if you look back at
  • 00:14:26
    the sorry if you look back at the
  • 00:14:31
    data um most of these songs if you look
  • 00:14:35
    further down in time especially a a big
  • 00:14:38
    pop song like First Class you'll see
  • 00:14:40
    that it just stays people listen to it
  • 00:14:43
    at about the same rate for a long time
  • 00:14:46
    so we can ignore
  • 00:14:48
    this
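A minimal sketch of the envelope described above, assuming straight-line segments between the phases (the talk does not state the exact functional form at this point). The function name `envelope` and its parameter names are hypothetical; they mirror the five quantities listed: delay, attack duration, decay duration, maximum, and sustain amplitude. The release phase is left out, as in the talk.

```python
def envelope(t, delay, attack, decay, peak, sustain):
    """Expected daily streams t days after the recorded release date.

    Assumed piecewise-linear shape (an illustration, not the fitted model):
    zero during `delay`, a linear rise to `peak` over `attack` days,
    a linear fall to `sustain` over `decay` days, then flat at `sustain`.
    """
    if attack <= 0 or decay <= 0:
        raise ValueError("attack and decay durations must be positive")
    if t < delay:
        return 0.0
    if t < delay + attack:
        return peak * (t - delay) / attack
    if t < delay + attack + decay:
        frac = (t - delay - attack) / decay
        return peak + (sustain - peak) * frac
    return float(sustain)


# Hypothetical numbers: release date recorded a week early, 4M peak, 1M sustain.
curve = [envelope(day, delay=7, attack=3, decay=10, peak=4_000_000, sustain=1_000_000)
         for day in range(60)]
```

Each argument maps to something a music professional can reason about directly, which is the point the talk makes about this parameterization.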
  • 00:14:50
    um so here's here's how the forecasting
  • 00:14:53
    model does right so if we truncated the
  • 00:14:56
    data at this dotted line and so we
  • 00:14:58
    didn't show show the model we didn't
  • 00:15:00
    show the the model the data after that
  • 00:15:03
    it kind of estimates things like that
  • 00:15:05
    and how does it do that it's building
  • 00:15:07
    out this model for this attack Decay um
  • 00:15:12
    and you know something that's not uh
  • 00:15:15
    captured in the data and this is why
  • 00:15:17
    Mother Mother Hayloft II is weird is
  • 00:15:21
    because Mother Mother went on a global
  • 00:15:23
    tour right here and made another release
  • 00:15:27
    so they they got a bump because they did
  • 00:15:30
    something right and just looking at the
  • 00:15:33
    streams absent of that context you're
  • 00:15:35
    going to miss all
  • 00:15:41
    that all right
  • 00:15:45
    so streams as music envelopes once again
  • 00:15:48
    we're estimating from the data the delay
  • 00:15:50
    which is not pictured a max a sustain
  • 00:15:52
    amplitude an attack duration a decay
  • 00:15:55
    duration all right um
  • 00:15:59
    question is why is this particular model
  • 00:16:01
    my favorite and we kind of have to look
  • 00:16:03
    at the alternative here so I could have
  • 00:16:08
    described this as a compartmental model
  • 00:16:10
    in epidemiology and try to explain this
  • 00:16:12
    to a musician or musician turned music
  • 00:16:16
    executive right this exe this resembles
  • 00:16:18
    an SIR model how many of you are
  • 00:16:21
    familiar with this sort of model oh cool
  • 00:16:24
    so we got a bunch of people in here that
  • 00:16:26
    are familiar with this model but not you
  • 00:16:29
    know not music envelopes right so within
  • 00:16:33
    this community I can talk about you know
  • 00:16:35
    you have a population of susceptible
  • 00:16:37
    people um that's the population of
  • 00:16:39
    people that listen to this type of music
  • 00:16:41
    right and then you get the Infectious
  • 00:16:43
    and they're going to start spreading out
  • 00:16:45
    who listens to models oh one thing I
  • 00:16:49
    should mention is that if you look at
  • 00:16:51
    the Infectious curve that sort of looks
  • 00:16:54
    like what we were looking at right so
  • 00:16:56
    the the two other curves are are sort of
  • 00:16:58
    hidden from us are
  • 00:17:00
    latent um we could describe this as an
  • 00:17:02
    ordinary differential equation right so
  • 00:17:05
    it takes a little bit of
  • 00:17:07
    math
  • 00:17:09
    um and what we're trying to estimate
  • 00:17:12
    here are two different rate variables
  • 00:17:14
    that kind of describe that'll give
  • 00:17:16
    different shapes to this eye
  • 00:17:18
    curve we're trying to estimate these
  • 00:17:20
    population sizes um and we're we'll end
  • 00:17:24
    up talking about derivatives and rates
  • 00:17:26
    of change right that that comes natural
  • 00:17:30
    from these like these equations here
  • 00:17:33
    it's because of the way the math is
  • 00:17:36
    um uh another thing that comes naturally
  • 00:17:39
    out of a model like this is half lives
  • 00:17:41
    like when do you get to half of the the
  • 00:17:44
    population and you're interested in
  • 00:17:46
    sometimes steady States so if you have a
  • 00:17:49
    more complex version of this sort of
  • 00:17:50
    model you talk about steady state
  • 00:17:54
    Behavior so the reason why this
  • 00:17:57
    particular model model is my favorite is
  • 00:18:00
    because I was able to reframe the model
  • 00:18:02
    into something that's intuitive to
  • 00:18:04
    people in the music
  • 00:18:06
    industry um it's easier to communicate
  • 00:18:09
    for them right so I can talk about if I
  • 00:18:12
    talk to a musician about here's a Max
  • 00:18:14
    amplitude here's the max number of
  • 00:18:15
    streams that I expect to get here's the
  • 00:18:17
    the Decay here's the time to take it'll
  • 00:18:20
    take for it to like be
  • 00:18:22
    steady um so I can communicate all that
  • 00:18:25
    and the the switch in
  • 00:18:27
    nomenclature to something that's very
  • 00:18:29
    familiar to the musicians is um and and
  • 00:18:33
    communicate
  • 00:18:35
    communicable was
  • 00:18:37
    um was really the the key and so much
  • 00:18:42
    like you know the it's it's a little
  • 00:18:44
    subtle change in usage right it's it's
  • 00:18:47
    not the model is any different we could
  • 00:18:49
    have done the same thing with the SIR
  • 00:18:52
    model here but by changing and reframing
  • 00:18:54
    it we get something that's really uh
  • 00:18:57
    useful in communicating um useful for the
  • 00:19:06
    purpose I ran through those slides
  • 00:19:09
    really fast Jared so first I'll take
  • 00:19:12
    questions sure let me see this question
  • 00:19:15
    line any questions from the crowd here
  • 00:19:19
    so are the is is there some view of the
  • 00:19:22
    two models that they're potentially
  • 00:19:24
    equivalent under the hood or can be yeah
  • 00:19:26
    they they can be so um
  • 00:19:29
    um I've used this trick a couple of
  • 00:19:32
    other times as well so you change the
  • 00:19:35
    quantities of interest from
  • 00:19:39
    um from the natural parameters of the
  • 00:19:42
    statistical model so instead of thinking
  • 00:19:44
    about this as
  • 00:19:46
    um beta and gamma and trying to estimate
  • 00:19:49
    these and communicate beta and gamma to
  • 00:19:52
    people that you know what does beta mean
  • 00:19:54
    what does gamma mean when you start
  • 00:19:56
    talking about what priors do you put
  • 00:19:58
    on a beta
  • 00:19:59
    you know people that work in music
  • 00:20:01
    aren't really going to be able to tell
  • 00:20:03
    you anything about it and you start
  • 00:20:07
    framing that as all right this is um
  • 00:20:11
    this is Jack Harlow what do you think the
  • 00:20:13
    maximum number of streams you're going
  • 00:20:15
    to get the first week is there some way
  • 00:20:18
    you can bump that to 2x that how long do
  • 00:20:21
    you think they're going to be at the top
  • 00:20:22
    of the charts for what does that mean
  • 00:20:24
    when are they going to fall off the top
  • 00:20:26
    of the charts his last song did 10 weeks
  • 00:20:29
    at the top of the chart so we expect it
  • 00:20:31
    to be 10 weeks or 12 or eight right you
  • 00:20:34
    can talk about that and those terms but
  • 00:20:37
    if you try talking about them in these
  • 00:20:39
    terms minus beta these gammas you're you
  • 00:20:43
    know it's the same model right it's
  • 00:20:47
    just well you you can you can basic of
  • 00:20:49
    the curve you can impute the the priors
  • 00:20:53
    for the parameters by looking at it in a
  • 00:20:55
    different frame of reference
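One way to picture setting priors in this frame of reference is the sketch below. It assumes an analyst turns answers to the questions above (expected peak, weeks at the top for the last release) into distributions over envelope parameters. Every distribution, range, and name here (`draw_envelope_priors`, `last_peak_streams`, `last_weeks_at_top`) is a hypothetical illustration, and mapping "weeks at the top" onto the decay duration is an assumption rather than something stated in the talk.

```python
import random


def draw_envelope_priors(rng, last_peak_streams, last_weeks_at_top):
    """Draw one set of envelope parameters from priors stated in music terms.

    All distributions below are illustrative assumptions.
    """
    peak = last_peak_streams * rng.lognormvariate(0.0, 0.3)  # median equals the last release
    sustain = peak * rng.uniform(0.1, 0.5)                   # settles at 10-50% of the peak
    attack_days = rng.uniform(1.0, 7.0)                      # peaks within the first week
    decay_days = 7.0 * last_weeks_at_top * rng.lognormvariate(0.0, 0.2)
    delay_days = rng.uniform(0.0, 14.0)                      # release date off by up to two weeks
    return {
        "peak": peak,
        "sustain": sustain,
        "attack_days": attack_days,
        "decay_days": decay_days,
        "delay_days": delay_days,
    }


# Hypothetical inputs: last single peaked at 4M daily streams and spent 10 weeks at the top.
rng = random.Random(2024)
draws = [draw_envelope_priors(rng, 4_000_000, 10) for _ in range(1000)]
```

Plotting envelopes built from such draws would give a prior predictive check that a music executive can critique without ever seeing beta or gamma.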
  • 00:21:00
    is there a functional limit to how far
  • 00:21:02
    in the future you can predict
  • 00:21:05
    streams we're we're pretty bad
  • 00:21:08
    well two things um streams for most
  • 00:21:13
    songs are really consistent once they
  • 00:21:16
    hit this like sustain they just people
  • 00:21:19
    will listen to it at that same
  • 00:21:21
    rate things that have changed that are
  • 00:21:24
    TikTok and um what they call it
  • 00:21:30
    um it's when they they play the old
  • 00:21:33
    songs on new shows and then it picks up
  • 00:21:37
    and then it gets back on TikTok
  • 00:21:38
    Instagram and then all of a sudden you
  • 00:21:40
    had um Kate Bush Running Up That Hill made
  • 00:21:42
    it to a number one chart for the first
  • 00:21:44
    time and like that was the longest one
  • 00:21:45
    it was like 30-year gap since the release to
  • 00:21:48
    being number one right stuff like that
  • 00:21:52
    like good luck predicting that I you
  • 00:21:56
    know I'd like to publish a song that
  • 00:21:58
    years later gets picked up and Fs
  • 00:22:00
    royalties
  • 00:22:02
    for
  • 00:22:08
    hello I like how you the show here's
  • 00:22:12
    situation where
  • 00:22:15
    model
  • 00:22:19
    external
  • 00:22:21
    ATT represent these models one thing I
  • 00:22:24
    was wondering though is what the model
  • 00:22:26
    specification was
  • 00:22:39
    model
  • 00:22:42
    usor yeah so those are great questions
  • 00:22:45
    it started out as a SIR model with the
  • 00:22:48
    derivatives uh specified that way and
  • 00:22:51
    then we were having a lot of trouble
  • 00:22:54
    with that model and fitting it it was
  • 00:22:57
    written in a different language we won't
  • 00:22:59
    talk about all the problems but that's
  • 00:23:01
    how it originally started and we were
  • 00:23:04
    actually having trouble
  • 00:23:06
    specifying
  • 00:23:08
    um reasonable priors for these things
  • 00:23:11
    because these you know it depends on the
  • 00:23:13
    size of the the different populations
  • 00:23:16
    right the the priors are context driven
  • 00:23:18
    in that sense um so at the end of the
  • 00:23:22
    day the thing that we that we actually
  • 00:23:24
    fit for this work here which has been
  • 00:23:27
    expanded on since was actually a linear
  • 00:23:29
    model a piecewise linear model
  • 00:23:32
    yeah but um it was done Bayesian so if you
  • 00:23:36
    look there you actually do
  • 00:23:38
    see um like a bit of uncertainty there
  • 00:23:42
    because we're not just optimizing it oh
  • 00:23:44
    yeah and
  • 00:23:48
    um that's right so um you know it has
  • 00:23:52
    this bit of like a change Point type of
  • 00:23:55
    property to it as well because we're
  • 00:23:56
    estimating the time to to when these
  • 00:23:58
    things happen so those were variable in
  • 00:24:01
    time um the other thing was it was
  • 00:24:04
    hierarchical so we were um assuming that
  • 00:24:09
    uh all three of these curves we actually
  • 00:24:12
    had like a dozen um DSPs stacked
  • 00:24:15
    together but they we assume that they
  • 00:24:18
    were moving together in some fashion
  • 00:24:21
    across different
  • 00:24:22
    countries um it turns out that that
  • 00:24:24
    isn't quite
  • 00:24:26
    true um if you start looking into user
  • 00:24:29
    Behavior Spotify users are very
  • 00:24:31
    different than Apple users and both
  • 00:24:34
    those are very different than people in
  • 00:24:35
    France on Deezer and those were just
  • 00:24:38
    some of the major ones and we were
  • 00:24:40
    seeing things like house music
  • 00:24:43
    absolutely pop off in um in Europe in
  • 00:24:47
    Deezer and it wouldn't it maybe but it
  • 00:24:52
    wasn't it wasn't it wasn't popping off
  • 00:24:54
    here and it turned out it was in like uh
  • 00:24:57
    was it Beatsource or some other
  • 00:24:59
    DSP that was like didn't have streams
  • 00:25:02
    otherwise
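The piecewise-linear curve described in this answer can be sketched in a few lines of Python. This is only an illustration of the shape, not the model that was fitted: the function name, the parameter names, and every number except the roughly 4 million first-week figure quoted later in the talk are assumptions made for the example. In the actual work the change points were estimated with priors rather than fixed by hand.

```python
def envelope(t, peak, sustain, t_peak, t_sustain, t_release, t_end):
    """Piecewise-linear stream curve evaluated at week t (weeks since release).

    Hypothetical parameterisation: the four t_* arguments are the change
    points, and peak/sustain are stream counts a label could reason about.
    """
    if not 0 < t_peak < t_sustain <= t_release < t_end:
        raise ValueError("change points must be ordered")
    if t < 0 or t >= t_end:
        return 0.0
    if t < t_peak:  # attack: climb from zero to the peak
        return peak * t / t_peak
    if t < t_sustain:  # decay: fall from the peak to the sustain level
        return peak + (sustain - peak) * (t - t_peak) / (t_sustain - t_peak)
    if t < t_release:  # sustain: steady listening rate
        return float(sustain)
    # release: fade from the sustain level to zero
    return sustain * (t_end - t) / (t_end - t_release)


# Illustrative values only: evaluate the curve every five weeks.
weeks = range(0, 31, 5)
curve = [envelope(w, peak=4_000_000, sustain=1_000_000,
                  t_peak=1, t_sustain=5, t_release=20, t_end=30) for w in weeks]
```

Each argument maps to a question a music executive can answer directly (how high, how long at the top, where it settles), which is the reframing the speaker describes.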
  • 00:25:09
    like yeah so I mean this this work right
  • 00:25:13
    every time you work on a model like this
  • 00:25:14
    it all these questions are good we were
  • 00:25:16
    thinking about a lot of these it's like
  • 00:25:17
    where can we do better right we have
  • 00:25:19
    this hierarchical model tying these
  • 00:25:22
    different DSPs together in a certain way
  • 00:25:24
    assuming that they should all function
  • 00:25:26
    similarly but then no they don't so you
  • 00:25:30
    know we we were battling that there was
  • 00:25:33
    what other things oh um some of the
  • 00:25:35
    other things that we were battling was
  • 00:25:38
    uh the reason why we had to estimate the
  • 00:25:40
    decay in there was because I think the
  • 00:25:44
    most egregious one was Lizzo About Damn
  • 00:25:47
    Time was um the metadata had the the
  • 00:25:52
    track released two years prior to the
  • 00:25:54
    actual release date and if you're
  • 00:25:56
    thinking like you're working inside
  • 00:25:58
    music you this is your artist don't you
  • 00:26:01
    don't you know the actual date you like
  • 00:26:04
    handed this over to Apple
  • 00:26:06
    like um it turns out that that's a very
  • 00:26:10
    hard problem um for unknown reasons to
  • 00:26:14
    me
  • 00:26:15
    um and some of the other things
  • 00:26:19
    were each of these
  • 00:26:21
    songs so we think of these as
  • 00:26:24
    songs um we think of this as a recording
  • 00:26:28
    and you might think that one recording
  • 00:26:30
    gets blasted out to everywhere it turns
  • 00:26:33
    out if you look under the hood this one
  • 00:26:36
    recording might have 12 different uh
  • 00:26:39
    equivalent of ISBNs like SKUs so why does
  • 00:26:43
    it do that every time it's on a release
  • 00:26:45
    it gets another number so if it's on an
  • 00:26:48
    album cut if it's on a on a this cut
  • 00:26:51
    it's this release this compilation it
  • 00:26:53
    gets another number attached to it um it
  • 00:26:57
    turns out that people people are very
  • 00:26:58
    careful about how they spell things and
  • 00:27:01
    want features and whether or not they're
  • 00:27:03
    listed as a as a artist so something
  • 00:27:08
    titled with featuring someone else is
  • 00:27:10
    different than a song that says that
  • 00:27:13
    with a featured artist is different than
  • 00:27:16
    you know all sorts of other things
  • 00:27:20
    sometimes the Publishers are in there
  • 00:27:21
    sometimes the the writers get credit in
  • 00:27:24
    the actual song so each of those are
  • 00:27:26
    different um
  • 00:27:28
    it turns out that there it was actually
  • 00:27:30
    very difficult for us to find a master
  • 00:27:32
    list of um which
  • 00:27:36
    recording had how many like SKU numbers
  • 00:27:39
    attached to it and so we had to do this
  • 00:27:42
    like an inverse NLP problem instead of
  • 00:27:45
    sitting on the data NLP is you know it's
  • 00:27:49
    awesome but it's not it's hard when you
  • 00:27:51
    have to when you think you when you
  • 00:27:53
    assume that this stuff is
  • 00:27:56
    known yeah that's when that's when it
  • 00:27:58
    gets you like we didn't discover some of
  • 00:28:01
    these problems until months in which is
  • 00:28:03
    you know the only reason we knew that is
  • 00:28:06
    because we were looking at a very
  • 00:28:09
    well-known track and then realized we
  • 00:28:12
    were missing Spotify streams for a
  • 00:28:13
    couple months
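The matching problem described here, one recording carrying many release identifiers and inconsistently spelled titles, can be sketched as a normalise-then-group step. This is a minimal sketch and not the team's pipeline: the identifiers, artist, and titles below are invented, the rules for stripping "featuring" credits are assumptions, and the ASCII-only cleanup would need extending for non-English titles.

```python
import re
from collections import defaultdict

# Feature credits written in brackets, e.g. "(feat. X)" or "[ft. X]".
_BRACKETED_FEATURE = re.compile(r"[\(\[]\s*(?:feat\.?|ft\.?|featuring)\s[^\)\]]*[\)\]]")
# Feature credits appended to the end of a title, e.g. "... featuring X".
_TRAILING_FEATURE = re.compile(r"\s(?:feat\.?|ft\.?|featuring)\s.*$")
# Anything that is not a lowercase ASCII letter or digit.
_NON_ALNUM = re.compile(r"[^a-z0-9]+")


def normalize_title(title):
    """Lowercase, drop feature credits, and collapse punctuation to spaces."""
    text = title.lower()
    text = _BRACKETED_FEATURE.sub(" ", text)
    text = _TRAILING_FEATURE.sub(" ", text)
    return _NON_ALNUM.sub(" ", text).strip()


def group_release_ids(rows):
    """Group (release_id, artist, title) rows by normalised artist and title."""
    groups = defaultdict(list)
    for release_id, artist, title in rows:
        key = (_NON_ALNUM.sub(" ", artist.lower()).strip(), normalize_title(title))
        groups[key].append(release_id)
    return dict(groups)


# Hypothetical rows: three spellings of one recording plus an unrelated track.
rows = [
    ("ID-001", "Example Artist", "Night Drive"),
    ("ID-002", "Example Artist", "Night Drive (feat. Someone Else)"),
    ("ID-003", "EXAMPLE ARTIST", "Night-Drive featuring Someone Else"),
    ("ID-004", "Example Artist", "Morning Walk"),
]
groups = group_release_ids(rows)
```

Summing streams over each group, rather than over a single identifier, is what guards against the kind of missing months the speaker mentions. Whether a remix or a shortened edit belongs in the same group is a business decision this sketch does not make.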
  • 00:28:16
    the uh if you see CK and
  • 00:28:20
    Milana how many different ways is that
  • 00:28:23
    spelled um how many
  • 00:28:26
    hyphenations how many different
  • 00:28:28
    compilations does it get
  • 00:28:30
    on um turns out people play games like
  • 00:28:34
    chop the song by a couple seconds so it
  • 00:28:37
    fits on Apple to get onto their top
  • 00:28:40
    playlists like because there's a limit
  • 00:28:43
    so there are lots and lots and lots of
  • 00:28:45
    games to play and you know just looking
  • 00:28:48
    at the data we're not in control
  • 00:28:51
    yet bre first
  • 00:28:55
    repeat yeah some of the challenge was we
  • 00:28:58
    were dealing right at the time was the
  • 00:29:01
    shift over from like having a lot of
  • 00:29:03
    high-fidelity cookie data to less
  • 00:29:06
    fidelity so everything was
  • 00:29:09
    aggregated um but we did have a lot of
  • 00:29:11
    that um this model didn't incorporate a
  • 00:29:14
    lot of that I'm hoping that they've done
  • 00:29:16
    more
  • 00:29:21
    since how did you decide when it
  • 00:29:23
    switched what what when it switched from
  • 00:29:29
    we were estimating that um so that was
  • 00:29:31
    one of the parameters that we were
  • 00:29:33
    estimating in the
  • 00:29:35
    model um when when we were starting out with
  • 00:29:39
    the SIR model it it naturally comes out
  • 00:29:43
    of the the beta and the gamma so
  • 00:29:46
    combination of that will determine where
  • 00:29:48
    the peak is
  • 00:29:51
    but um once we once we change it up we
  • 00:29:55
    can actually specify where that is and
  • 00:29:57
    put priors on it
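To see why the peak "naturally comes out of" beta and gamma in a SIR model, a small simulation helps. The sketch below integrates the textbook SIR equations with a forward-Euler step. The population size, rates, and function names are assumptions chosen for illustration, and reading the "infected" compartment as people currently streaming a song is an interpretation, not something spelled out in this part of the talk.

```python
def simulate_sir(beta, gamma, s0, i0, r0=0.0, days=200, dt=0.1):
    """Integrate the classic SIR equations with a simple forward-Euler step."""
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    t = 0.0
    trajectory = [(t, s, i, r)]
    for _ in range(int(round(days / dt))):
        new_infections = beta * s * i / n * dt  # S -> I
        new_recoveries = gamma * i * dt         # I -> R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        t += dt
        trajectory.append((t, s, i, r))
    return trajectory


def peak_of_infected(trajectory):
    """Return the (t, s, i, r) row at which the infected compartment is largest."""
    return max(trajectory, key=lambda row: row[2])


# Illustrative rates only: the peak arrives when S has fallen to about gamma / beta * N.
t_peak, s_peak, i_peak, _ = peak_of_infected(simulate_sir(0.5, 0.1, s0=999, i0=1))
```

The timing and height of the peak are consequences of the two rates and the population size, so a prior on "when does it peak" has to be expressed indirectly. With explicit change points, that quantity becomes a parameter in its own right.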
  • 00:29:59
    was it one model for everything you have
  • 00:30:01
    a model for the change points a model
  • 00:30:02
    for each section individually it was one
  • 00:30:04
    model for everything which was actually
  • 00:30:06
    really cool
  • 00:30:09
    um go back to
  • 00:30:13
    the
  • 00:30:15
    yeah would you say that with Mother
  • 00:30:18
    Mother was it like another ADSR
  • 00:30:25
    happening yeah so um back to
  • 00:30:29
    communication to um music Executives and
  • 00:30:33
    people that know music the thing that we
  • 00:30:35
    were going with this model was
  • 00:30:38
    um if you look at something like this if
  • 00:30:42
    you look at MIDI instruments you can
  • 00:30:43
    press a key hold it down and press a key
  • 00:30:45
    again and it'll just stack on and we
  • 00:30:49
    were thinking the first thing we're
  • 00:30:51
    going to do was add um do additive a
  • 00:30:54
    second thing happens it's sustaining and
  • 00:30:57
    then you have a second Peak so you just
  • 00:30:59
    add it to that um there are nonlinear
  • 00:31:02
    ways to think about that as well but the
  • 00:31:04
    first step is linear so that was one of
  • 00:31:06
    the thinking one of the things that
  • 00:31:08
    we're thinking when we're talking to
  • 00:31:09
    people it's like
  • 00:31:12
    um for example uh like they were pushing
  • 00:31:18
    um there was a lot of work around Ed
  • 00:31:20
    Sheeran with a lot of remixes for a
  • 00:31:22
    particular song and the question was if
  • 00:31:25
    I do a remix and release it it on this
  • 00:31:28
    date and then wait a week and release it
  • 00:31:30
    on a second date and then wait a week
  • 00:31:32
    and release it on a third date now you
  • 00:31:33
    have three events is it better to do one
  • 00:31:36
    event or is it better to do three um
  • 00:31:39
    does that add to the first song or does
  • 00:31:42
    it just kind of peter out so that that
  • 00:31:45
    was sort of the thinking behind that
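The additive "press the key again" idea can be sketched by summing one response curve per release event on top of a baseline. Everything here is an assumption made for illustration: the triangular bump shape, the event sizes, and the choice to give three staggered remixes one third of the height of a single large event. It shows only the linear first step the speaker mentions, not the nonlinear alternatives.

```python
def bump(t, start, height, rise, fall):
    """Triangular response to one release event: linear rise, then linear fall."""
    dt = t - start
    if dt < 0 or dt >= rise + fall:
        return 0.0
    if dt < rise:
        return height * dt / rise
    return height * (rise + fall - dt) / fall


def stacked_streams(t, baseline, events):
    """Baseline listening plus the sum of every event's bump at week t."""
    return baseline + sum(bump(t, *event) for event in events)


# Hypothetical events as (start_week, height, rise_weeks, fall_weeks).
staggered = [(0, 300, 1, 3), (1, 300, 1, 3), (2, 300, 1, 3)]
single = [(0, 900, 1, 3)]

for week in range(7):
    print(week, stacked_streams(week, 1000, staggered), stacked_streams(week, 1000, single))
```

Comparing the two printed columns week by week is the "one event or three" question in miniature. Under strict additivity the choice only reshapes the curve, so any real advantage of staggering would have to come from effects this sketch leaves out.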
  • 00:32:22
    that's a great question and I'll say um
  • 00:32:26
    for the music industry in particular
  • 00:32:28
    the actual methods mattered
  • 00:32:32
    less um than say the pharma industry so at
  • 00:32:36
    the Pharma
  • 00:32:37
    level uh the the stakeholders were PhD
  • 00:32:42
    biostatisticians that we were talking to
  • 00:32:44
    that understood their methods and wanted
  • 00:32:46
    to know you know the questions he was
  • 00:32:49
    asking in the corner about like hey
  • 00:32:53
    about about like what you know what what
  • 00:32:55
    are these different is it a spline based
  • 00:32:58
    method is it what what do you have going
  • 00:33:00
    on in the model can you explain it to me
  • 00:33:03
    um here for the music industry most of
  • 00:33:07
    the people that make it to the top and
  • 00:33:08
    are decision
  • 00:33:10
    makers have a they're there because they
  • 00:33:14
    love music at some point um even though
  • 00:33:18
    it may not always seem that way but at
  • 00:33:20
    some point people love music and that's
  • 00:33:22
    how they got into it
  • 00:33:24
    um but they're not all statisticians
  • 00:33:26
    they're not all machine learning people
  • 00:33:29
    they want to know um what you know how
  • 00:33:33
    these how Jack Harlow was going to do why
  • 00:33:36
    because they wanted to to know whether
  • 00:33:38
    or not they should push more for Jack
  • 00:33:40
    Harlow or Dua Lipa right they have a
  • 00:33:43
    budget problem they don't care how these
  • 00:33:46
    numbers came out which is why it goes
  • 00:33:48
    back to like most of the things in the
  • 00:33:51
    model um we kind of hid away from people
  • 00:33:54
    at that point it's there right we're not
  • 00:33:56
    we're not hiding it for the purpose of
  • 00:33:57
    hiding it we're hiding it for the
  • 00:33:59
    purpose of us talking to the
  • 00:34:00
    stakeholders on their terms and that's
  • 00:34:04
    what this model gave to us
  • 00:34:06
    um yeah I mean yeah we we tried talking
  • 00:34:09
    to them
  • 00:34:10
    about you know these populations that
  • 00:34:13
    didn't exist and they kind
  • 00:34:16
    of they don't want to hear that not in a
  • 00:34:20
    quick meeting
  • 00:34:21
    um but yeah in general I think it
  • 00:34:24
    depends you have to read the room
  • 00:34:29
    quick
  • 00:34:31
    question how far the long time horiz you
  • 00:34:34
    have
  • 00:34:39
    to um for for
  • 00:34:44
    the for music if I remember correctly it
  • 00:34:47
    was
  • 00:34:48
    um they had three different time
  • 00:34:51
    Horizons that they were looking at one
  • 00:34:53
    was a weekly time Horizon they want to
  • 00:34:55
    be on the charts um to be on charts
  • 00:34:58
    you're looking one week
  • 00:34:59
    ahead um the second is I think about two
  • 00:35:03
    months out because then you you get if I
  • 00:35:08
    remember right it's like the amount of
  • 00:35:10
    Revenue you get in the first two months
  • 00:35:12
    is really most of the revenue that most
  • 00:35:14
    artists will get so that's that's what
  • 00:35:16
    they were concerned with and whether or
  • 00:35:18
    not they can increase that value and
  • 00:35:20
    then third was like do you have a back
  • 00:35:23
    catalog that's worth anything so can you
  • 00:35:25
    have rights on this and like people were
  • 00:35:27
    thinking about at that point people were
  • 00:35:29
    thinking about buying and selling rights
  • 00:35:30
    to music
  • 00:35:35
    so do you like characteristics of the
  • 00:35:38
    song there was another team working
  • 00:35:40
    directly on the characteristics of the
  • 00:35:42
    song listening to or um looking at the
  • 00:35:45
    lyrics trying to see if the content made
  • 00:35:47
    a difference um they were breaking down
  • 00:35:50
    like beats per minute feel vibes
  • 00:35:53
    whatever you want to call it um they
  • 00:35:56
    were trying to group similar songs
  • 00:35:58
    together to see if that had an effect
  • 00:36:01
    but I I worked on you know a different
  • 00:36:02
    team that was just looking at the the
  • 00:36:04
    streams
  • 00:36:06
    themselves um and yeah every other every
  • 00:36:09
    like 10th song would have some weird
  • 00:36:11
    blip like this so if you end up on a
  • 00:36:14
    random Spotify playlist and you're only
  • 00:36:16
    getting like let's say you're only
  • 00:36:17
    getting a thousand streams a week that might
  • 00:36:20
    boost you up to like 20,000 one week and
  • 00:36:23
    then you go back down to a thousand but you know
  • 00:36:27
    for someone like Jack Harlow that doesn't
  • 00:36:28
    make any difference but for you know
  • 00:36:31
    someone with very few streams you get
  • 00:36:34
    random blips here and
  • 00:36:49
    there yeah the question was when do you
  • 00:36:51
    make the the projections um and yeah the
  • 00:36:55
    they were done pre-release um a lot of
  • 00:36:58
    it was asking pre-release a lot of the
  • 00:37:01
    questions that were sort of really hard
  • 00:37:03
    for us to get to by the time we're we
  • 00:37:05
    wrapped up the project was
  • 00:37:08
    um uh
  • 00:37:11
    like
  • 00:37:14
    how you have two different songs that an
  • 00:37:16
    artist wants to release how high are
  • 00:37:18
    they going to be in the maximum number
  • 00:37:19
    of streams for the first week and then
  • 00:37:21
    on top of that how do you keep an artist
  • 00:37:24
    at their Peak for many weeks at a time
  • 00:37:27
    is that through advertising and and it's
  • 00:37:29
    like it's really hard to tell what we
  • 00:37:31
    didn't have were marketing numbers
  • 00:37:34
    attached to these which made it really
  • 00:37:36
    hard to tell what interventions people
  • 00:37:38
    were doing like is it natural is it
  • 00:37:40
    spreading because people like the music
  • 00:37:43
    is it spreading because Spotify put it
  • 00:37:45
    on the top like radar you know whatever
  • 00:37:49
    playlist that is um so yeah anyway there
  • 00:37:53
    was a lot of
  • 00:37:55
    that maximum
  • 00:38:04
    um I think the the the Decay part um so
  • 00:38:12
    the reason why the maximum wasn't that
  • 00:38:14
    hard was like you could be off by 2X or
  • 00:38:18
    half and it's not going to matter in the
  • 00:38:20
    long scheme of things right it matters
  • 00:38:22
    whether or not you're going to make it
  • 00:38:23
    to the charts if you don't have that big
  • 00:38:25
    boom you don't get to top 40s
  • 00:38:29
    ever
  • 00:38:30
    but you know the the amount of time that
  • 00:38:33
    you stay relatively strong um there lots
  • 00:38:39
    of different behaviors that drive that
  • 00:38:41
    um for a for a song with low tracks you
  • 00:38:45
    can we've seen like a couple accounts
  • 00:38:48
    that have streamed it like as many times
  • 00:38:50
    as humanly possible which probably
  • 00:38:52
    indicates not a
  • 00:38:54
    human um but at at the bigger artist
  • 00:38:58
    that doesn't make a difference um so
  • 00:39:01
    it's like people are consuming it stream
  • 00:39:04
    at a time like stream 10 times a
  • 00:39:07
    day
  • 00:39:11
    um so that that's it like how do
  • 00:39:15
    you how do you keep something relevant
  • 00:39:19
    is a very hard question to
  • 00:39:21
    ask I don't know the answer to that I
  • 00:39:23
    Don't Know Remix remixes is definitely
  • 00:39:27
    one y was there a hand this side I was
  • 00:39:31
    ask
  • 00:39:34
    question were any
  • 00:39:36
    general
  • 00:39:38
    like insights one could
  • 00:39:43
    like take
  • 00:39:46
    away how
  • 00:39:54
    increase there there wasn't um
  • 00:39:58
    um
  • 00:40:00
    and the reason I'm I'm saying that there
  • 00:40:03
    were there were lots of little takeaways
  • 00:40:05
    when you subset things into um different
  • 00:40:08
    bins
  • 00:40:09
    but
  • 00:40:12
    um it turns out dealing with
  • 00:40:14
    stakeholders is often difficult when
  • 00:40:17
    they're asking questions that are too
  • 00:40:19
    broad um and you don't have when you
  • 00:40:21
    don't have a lot of time with them it it
  • 00:40:23
    makes it even more difficult to try to
  • 00:40:25
    give them a truthful answer because
  • 00:40:27
    they'll they'll be asking stuff like
  • 00:40:30
    um you know for all the artists in this
  • 00:40:33
    genre what's going to happen it's
  • 00:40:36
    like is
  • 00:40:39
    that
  • 00:40:42
    EAS artist or a specific song so you
  • 00:40:46
    know so some of the really cool
  • 00:40:48
    questions that we dug into were um
  • 00:40:50
    there's this like sub genre that someone
  • 00:40:52
    was really interested in as an
  • 00:40:54
    A&R and um so an A&R is someone that that uh
  • 00:40:59
    scouts out new artists and try anyway
  • 00:41:03
    they were looking at the sub genre they
  • 00:41:05
    were trying to figure out whether or not
  • 00:41:07
    that had a positive trend because if
  • 00:41:09
    they did they would sign more artists to
  • 00:41:11
    that genre and try to make that genre
  • 00:41:13
    bigger if the genre gets bigger their
  • 00:41:15
    back catalog gets worth more like
  • 00:41:17
    everyone in that little
  • 00:41:19
    subfield um those were cool questions to
  • 00:41:22
    ask and we could start seeing a trend of
  • 00:41:25
    you know they're they're getting a
  • 00:41:26
    little more popular
  • 00:41:29
    um but yeah a lot of the questions that
  • 00:41:32
    they were asking were about these pop
  • 00:41:34
    stars um Dua Lipas Lizzos um Jack Harlows of
  • 00:41:39
    the world who are massive and so when
  • 00:41:42
    you're talking like it was actually
  • 00:41:44
    pretty easy to say like Jack Harlow did
  • 00:41:46
    four million streams on um was that
  • 00:41:50
    Spotify right that first week if you
  • 00:41:53
    looked at his last release it was about
  • 00:41:55
    4 million and I could tell you that the
  • 00:41:57
    next release he's going to do is going
  • 00:41:59
    to be about 4 million
  • 00:42:02
    um I'm good there it's it's really like
  • 00:42:06
    you know they want to know how's the
  • 00:42:08
    album going to do and it's like I don't
  • 00:42:11
    know people are releasing music piece at
  • 00:42:13
    a time there this song is really popular
  • 00:42:16
    the 12th cut on that
  • 00:42:19
    album who's listening to the whole album
  • 00:42:21
    these days like you know I do but
  • 00:42:30
    I don't I have recogniz these artists at
  • 00:42:32
    all completely that's all
  • 00:42:36
    right that's all right
  • 00:42:39
    um yeah I started I was excited to work
  • 00:42:42
    with Warner Music because I I've been a
  • 00:42:44
    DJ for most half more than half my life
  • 00:42:48
    and um just kind of knowing about
  • 00:42:51
    music's really cool um but then digging
  • 00:42:55
    into it it gets really messy
  • 00:42:57
    especially when it came to like
  • 00:43:00
    um you'd think that you'd have metadata
  • 00:43:03
    control of your own metadata and they
  • 00:43:05
    just don't and it's wild too busy
  • 00:43:08
    po POA is
  • 00:43:11
    gone um you
  • 00:43:13
    know
  • 00:43:16
    supposedly is immedate or
  • 00:43:22
    netive oh yeah so what they were trying
  • 00:43:24
    to do was actually um
  • 00:43:27
    if you think about it as your record
  • 00:43:30
    label you have a number of
  • 00:43:32
    artists uh two things you can do you can
  • 00:43:35
    stagger releases right so instead of
  • 00:43:38
    everyone competing for the same uh user
  • 00:43:40
    base the question is whether or not
  • 00:43:43
    releasing two things at the same time is
  • 00:43:44
    better for you or Worse first of all it
  • 00:43:46
    could be better people might listen to
  • 00:43:48
    more of your music in
  • 00:43:50
    aggregate um but the second thing is
  • 00:43:53
    where does a marketing budget go if you
  • 00:43:55
    got someone with 4 million streams and
  • 00:43:57
    you could boost that to 5 million
  • 00:43:59
    streams that's a lot better than someone
  • 00:44:02
    with 10,000 streams and boosting them to
  • 00:44:05
    12,000 streams right and so they had
  • 00:44:08
    that question that was top of mind for
  • 00:44:13
    them
  • 00:44:17
    dat yeah we didn't get to that though at
  • 00:44:20
    the at this point we were interested in
  • 00:44:23
    so many things but you know
  • 00:44:36
    um the the honest answer to that was um
  • 00:44:41
    at the at the major record label they're
  • 00:44:44
    interested in the the very first few
  • 00:44:48
    weeks they want to know the peak and the
  • 00:44:51
    first few weeks that they're they're
  • 00:44:52
    going to be there's there's a very big
  • 00:44:55
    difference between
  • 00:44:58
    not charting and charting yeah so if you
  • 00:45:01
    can chart you win if you don't chart
  • 00:45:03
    you're in a secondary class of music so
  • 00:45:08
    at the record label um not for the
  • 00:45:10
    musician the musician wants to release
  • 00:45:13
    good music and be heard but if you're
  • 00:45:15
    the record label there's you know a lot
  • 00:45:18
    of things that happen once you get to
  • 00:45:21
    that level um so the question is always
  • 00:45:24
    like if you're close how do you
  • 00:45:27
    get
  • 00:45:32
    there
  • 00:45:34
    um the we were using things like artist
  • 00:45:38
    um past
  • 00:45:39
    releases um similarity in terms of uh
  • 00:45:42
    genre and different things so we're
  • 00:45:45
    doing okay
  • 00:45:50
    but we were plugging that stuff into
  • 00:45:54
    yeah we were we were trying to scrape
  • 00:45:56
    like Twitter followers Insta
  • 00:45:57
    followers the thing that we didn't have
  • 00:45:59
    at that time because it was too early
  • 00:46:01
    was TikTok like we didn't have control
  • 00:46:04
    we didn't have a lot of TikTok data but
  • 00:46:06
    TikTok was blowing up at the
  • 00:46:08
    time so that that would have been really
  • 00:46:11
    interesting to follow like some of these
  • 00:46:14
    drivers of like these other other tracks
  • 00:46:16
    that you would see that just are flat
  • 00:46:18
    and and pop up all of a sudden a lot of
  • 00:46:21
    that was coming from TikTok and if we
  • 00:46:23
    were tracking TikTok we would have
  • 00:46:24
    known but we were just looking at the um DSPs
  • 00:46:29
    themselves we didn't have an indication
  • 00:46:32
    of
  • 00:46:36
    that I got
  • 00:46:38
    online there's a few questions online
  • 00:46:40
    but might just be more jokes check out
  • 00:46:43
    of them
  • 00:46:44
    yeah how many songs are time coded at
  • 00:46:47
    the Unix epoch
  • 00:46:56
    other questions we still have a little
  • 00:46:58
    bit of time is there is there any like
  • 00:47:01
    motivation to clean up the
  • 00:47:04
    metadata yeah I I spent a lot of time I
  • 00:47:07
    was motivated yeah in terms of like do
  • 00:47:11
    they see that as like a business opportunity
  • 00:47:15
    and you know they work with say
  • 00:47:19
    the DSPs to to make that better or is there
  • 00:47:23
    there some business reason that the
  • 00:47:25
    metadata sucks
  • 00:47:28
    um I think there's always a business
  • 00:47:30
    reason metadata sucks which is probably
  • 00:47:33
    the business doesn't value enough um I I
  • 00:47:36
    think the truth is that the uh record
  • 00:47:38
    industry is a very old industry
  • 00:47:41
    relatively um everything was built
  • 00:47:43
    around physical sales so the inventory
  • 00:47:46
    system the you know the numbering of
  • 00:47:48
    these things the it was all based on
  • 00:47:50
    physical counting of like I ship a crate
  • 00:47:55
    of Records to to a record store and
  • 00:47:58
    after a month none of them came back we
  • 00:48:01
    assume they're
  • 00:48:02
    sold yeah and then we'll send them some
  • 00:48:04
    more
  • 00:48:06
    um you know and and the the way the
  • 00:48:11
    music industry was set up is um like
  • 00:48:14
    Warner Records has a bunch of sublabels
  • 00:48:17
    under it and they all operate their own
  • 00:48:20
    P&L um as far as I could tell and that
  • 00:48:24
    meant they had their own marketing
  • 00:48:25
    department that meant that there were
  • 00:48:26
    data scientists working you know imagine
  • 00:48:30
    like hundreds of data scientists working
  • 00:48:32
    off of the same data and learning to
  • 00:48:35
    collect it and process it in different
  • 00:48:36
    ways so even if we asked two sets of
  • 00:48:39
    people or three sets of people to look
  • 00:48:40
    at the same thing and ask about the
  • 00:48:43
    results of this thing we get widely
  • 00:48:45
    different forecasts just because we're
  • 00:48:47
    counting things
  • 00:48:48
    differently um so this project was done
  • 00:48:52
    under the the the record uh the head
  • 00:48:55
    record label the the central place
  • 00:48:59
    um but yeah the the metadata cleanup is
  • 00:49:03
    I
  • 00:49:03
    think it you know there just too many
  • 00:49:06
    cooks in the kitchen and it's it's hard
  • 00:49:08
    to fix
  • 00:49:10
    that I don't know do the DSPs have
  • 00:49:16
    better
  • 00:49:18
    I the DSPs have
  • 00:49:22
    metadata they know what they have um
  • 00:49:26
    they're motiv they're using it to like
  • 00:49:31
    to push stuff yeah I I don't know I
  • 00:49:34
    maybe you can get someone from Spotify
  • 00:49:36
    out here to talk about the internals but
  • 00:49:41
    um yeah I'm sure it's it's very similar
  • 00:49:44
    my guess is that they they just need to
  • 00:49:46
    count and for compliance reasons count
  • 00:49:48
    how many times something was streamed so
  • 00:49:50
    that they can send royalties back and
  • 00:49:52
    they're not concerned with you know who
  • 00:49:55
    they pay is like the individual artist
  • 00:49:58
    is not on them they pay
  • 00:50:00
    the the record label and let the record
  • 00:50:03
    it's you know just gets passed down
  • 00:50:06
    sound
  • 00:50:08
    and Arbitron or whatever it is they took
  • 00:50:10
    care of a lot of these problems back in
  • 00:50:12
    the '90s so yeah they probably like
  • 00:50:14
    cleaner data which they for but it's
  • 00:50:17
    like everything else if you stop
  • 00:50:19
    cleaning your data it gets messy real
  • 00:50:21
    quick yeah so clean your data everyone
  • 00:50:28
    okay any other
  • 00:50:30
    questions there are any L
  • 00:50:33
    questions then if there's no other
  • 00:50:35
    questions um did you see that Lego
  • 00:50:37
    released a turntable yeah I did yeah so
  • 00:50:40
    it's perfect for you a little tiny Lego
  • 00:50:41
    turntable but this big it looks a lot
  • 00:50:43
    bigger than the picture but it's perfect
  • 00:50:44
    for him you need to get that all right
  • 00:50:47
    well thank you very much thank you
  • 00:50:49
    for connect with me if you want um I'm
  • 00:50:53
    around I'm in New York he just had a
  • 00:50:56
    baby maybe 3 months ago though so you
  • 00:50:58
    know he's going to go fall
  • 00:51:01
    over okay with that um again you you
  • 00:51:06
    filled in as of like last night so thank
  • 00:51:08
    you for pulling this together I know you
  • 00:51:09
    had this ready for the NYR conference
  • 00:51:11
    but always great to have you always
  • 00:51:12
    always great to have you but also thank
  • 00:51:13
    you very much for doing this uh Dan's
  • 00:51:15
    been a member of this community for 15
  • 00:51:19
    years probably let's go 15ish right
  • 00:51:21
    towards near the beginning um so awesome
  • 00:51:25
    then I guess we will say we will go to
  • 00:51:28
    the bar before we go to the bar remember
  • 00:51:30
    uh thank you to NYU of course for making
  • 00:51:32
    this happen and next month December 3rd
  • 00:51:35
    we'll get that announced tomorrow
  • 00:51:37
    hopefully it'll be right George we can
  • 00:51:38
    have announced tomorrow you think we'll
  • 00:51:39
    have a room so we'll have that announced
  • 00:51:40
    tomorrow or Thursday he said he could
  • 00:51:42
    probably by tomorrow probably by
  • 00:51:43
    tomorrow but maybe Thursday we'll get
  • 00:51:44
    that announced and uh that'll be
  • 00:51:46
    December 3rd somewhere here NYU campus
  • 00:51:48
    likely this building maybe even this
  • 00:51:50
    room we'll see and then we'll announce
  • 00:51:52
    January February March and April as we
  • 00:51:54
    go we have things lined up we'll get
  • 00:51:56
    there the r and government videos will
  • 00:51:58
    be up online by Thanksgiving so you
  • 00:52:00
    could enjoy those um when you're eating
  • 00:52:02
    too much we're going to pack up here uh
  • 00:52:05
    and we're going to go over to Malt House
  • 00:52:06
    and have a post meet up beverage to talk
  • 00:52:09
    more to have a what a malt a malt yes
  • 00:52:11
    you get all malt balls right so enjoy
  • 00:52:14
    that thank you all for coming and I'll
  • 00:52:15
    see you next month
  • 00:52:17
    [Applause]
Tags
  • statistics
  • Bayesian models
  • music
  • streams
  • data science
  • metadata
  • forecasting
  • analysis
  • DSP
  • communication