My Favorite Model with Daniel Lee - nyhackr November Meetup
Summary
TLDR: The monthly meetup, originally planned as a lighthearted session, turned into a technical talk on statistical models, particularly Bayesian ones, applied in the music industry. The speaker, Daniel Lee, explained how he used these models to forecast online music streams by digital service provider (DSP), such as Apple Music, Spotify, and Deezer, and by country. Drawing on his long experience with Stan, a probabilistic programming language, he frames stream counts as a music envelope (attack, decay, sustain), which makes the results easier to communicate to music professionals. He also covers the challenges posed by incorrect track metadata. The meetup ends with a social gathering at Malt House, where attendees can continue the discussion informally.
Key takeaways
- 📅 The meetup was planned as a lighthearted session but centered on technical discussion.
- 📊 Daniel uses Stan for statistical modeling, including work in the music industry.
- 🎶 He uses a music envelope model to predict streams.
- 📉 The model helps explain how streams ramp up and decline.
- 🌍 The data is analyzed by country and by DSP.
- 🔄 Incorrect metadata is a major challenge.
- 🔍 User behavior differs between Apple Music and Spotify.
- 👥 Communicating technical results to non-experts is crucial.
- 🍕 The meetup ends with a social gathering at Malt House.
- 🔢 The models help with managing music releases and marketing.
Timeline
- 00:00:00 - 00:05:00
The discussion starts out lighthearted but turns more serious, with talk of Bayesian statistics and job postings.
- 00:05:00 - 00:10:00
Attendees are pointed to a Slack channel for posting job openings, and the food is rated during the meetup.
- 00:10:00 - 00:15:00
NYU is thanked for supporting the event, followed by a preview of upcoming videos and the next monthly events.
- 00:15:00 - 00:20:00
The fill-in speaker talks about turntables, their features, and his personal preferences.
- 00:20:00 - 00:25:00
He shares his experience with Stan, an open-source language for probabilistic modeling, and describes various statistical models he has worked on.
- 00:25:00 - 00:30:00
The speaker explains his favorite statistical model, which predicts music streams by modeling them as sound envelopes.
- 00:30:00 - 00:35:00
The model relates music streams to the shape of an ADSR envelope, which helps forecast future streams from current trends.
- 00:35:00 - 00:40:00
The speaker shows forecast results and discusses challenges from factors not represented in the data, such as artist tours.
- 00:40:00 - 00:45:00
The benefits of reframing the model in simple terms that people without a statistics background can understand are highlighted.
- 00:45:00 - 00:52:29
A lively Q&A covers audience questions about applying the models, the motivation behind them, and the challenges of the music industry.
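As a rough illustration of the envelope idea in the timeline above: the talk describes estimating a delay before streams start ramping up, an attack duration, a peak, a decay duration, and a sustain level, and mentions that the model actually fitted was piecewise linear. The sketch below is only one way to write such a curve in Python. The function name, parameter names, and example numbers are assumptions made for illustration, not the team's actual code, and the release phase is left out because the talk says it was ignored.

```python
def stream_envelope(t, delay, attack, decay, peak, sustain):
    """Expected daily streams t days after the listed release date.

    Hypothetical piecewise-linear envelope: flat at zero during the delay,
    a linear ramp up to the peak (attack), a linear fall to the sustain
    level (decay), then flat at the sustain level.
    """
    if t <= delay:
        return 0.0
    if t <= delay + attack:
        return peak * (t - delay) / attack
    if t <= delay + attack + decay:
        frac = (t - delay - attack) / decay
        return peak + frac * (sustain - peak)
    return float(sustain)


# Illustrative parameters only; these are not estimates from the talk.
example = dict(delay=7, attack=5, decay=30, peak=4_000_000, sustain=1_000_000)

for day in (3, 12, 27, 200):
    print(day, stream_envelope(day, **example))
```

In a Bayesian version of this sketch, the five parameters would get priors and be estimated from observed stream counts; the code above only evaluates the curve for fixed values.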
Frequently asked questions
What does the speaker, Daniel, do?
Daniel's current job is estimating and projecting the skills of sports players, right now in the NBA, using statistical models. The streaming model he presents was built earlier, with a team at Warner Music.
What topic was added to the discussion this month?
Instead of a lighthearted talk, a technical topic on Bayesian and statistical models was covered.
Why was a fill-in speaker needed?
There was a late cancellation, so a fill-in speaker was needed to keep the session going.
Which statistical model is his favorite?
His favorite is the model used to predict online music streams, because it is intuitive for music professionals.
What challenges does Daniel face with music metadata?
Incorrect metadata is a major problem because it affects the accuracy of the stream forecasting models.
Was there a planned change of theme for this meetup?
Yes, a lighthearted theme was planned but was replaced by a detailed technical talk.
How does the meetup usually end?
Attendees tend to gather at Malt House afterward to continue the discussion informally.
What is the main purpose of the model Daniel developed?
To estimate and predict music streams as a function of factors such as new track releases and marketing.
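In the talk, the envelope framing is contrasted with an SIR compartmental model from epidemiology, whose infectious curve has a similar rise-and-fall shape and is governed by two rate parameters, beta and gamma. The sketch below is a textbook SIR model integrated with simple Euler steps, included only to show that shape. The function name, the values of beta and gamma, and the population sizes are illustrative assumptions, not values from the talk. Note that the plain SIR infectious curve decays toward zero rather than flattening at a sustain level, so it only resembles the stream curves.

```python
def simulate_sir(beta, gamma, s0, i0, days, dt=0.1):
    """Return the infectious count at the end of each day (index 0 = start).

    Textbook SIR dynamics, integrated with fixed-size Euler steps:
      dS/dt = -beta * S * I / N
      dI/dt =  beta * S * I / N - gamma * I
      dR/dt =  gamma * I
    """
    s, i, r = float(s0), float(i0), 0.0
    n = s + i + r
    steps_per_day = round(1 / dt)
    infectious = [i]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        infectious.append(i)
    return infectious


# Illustrative values only: "susceptible" stands in for potential listeners.
curve = simulate_sir(beta=0.5, gamma=0.1, s0=999_000, i0=1_000, days=120)
peak_day = max(range(len(curve)), key=curve.__getitem__)
print("peak day:", peak_day, "peak size:", round(curve[peak_day]))
```

Priors on beta and gamma are hard to elicit from music professionals, which is the motivation the talk gives for reparameterizing in terms of peak streams, time to peak, and sustain level.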
- 00:00:00so we're going going so it is now
- 00:00:02November right okay uh we promised you a
- 00:00:06lighthearted talk for this month but it
- 00:00:07all came crashing
- 00:00:09down he's right there no he's cool he's
- 00:00:12going he's okay physically fine he's in
- 00:00:14the room uh and we're going to try to
- 00:00:16get him to speak again sometime next
- 00:00:18year so you know we all be excited for
- 00:00:19when that happens so we're going to make
- 00:00:21that happen right cool absolutely all
- 00:00:23right so so we're going to have a deep
- 00:00:25heavy statistical nerdy talk today then
- 00:00:28all right so anyone here is expecting
- 00:00:30fuzzy monsters it's not going to happen
- 00:00:32we're going to get a different type of
- 00:00:32monster a Bayesian monster that's the
- 00:00:35stuff in nightmares all right so first
- 00:00:37things first though jobs is Anybody
- 00:00:39hiring this
- 00:00:41month Anybody
- 00:00:43hiring no okay is anyone
- 00:00:48firing yeah it seems like everyone is
- 00:00:51right okay well in case anyone who is
- 00:00:53watching virtually is hiring you know
- 00:00:56you can go to the nyhackr Slack if
- 00:00:58you want to find that you go to nyhackr.org
- 00:01:00and click on the Slack hashtag and
- 00:01:03then you could go to the job postings
- 00:01:04Channel and post your job I think a job
- 00:01:06was posted there fairly recently and
- 00:01:08over the past 15 years we've gotten
- 00:01:10numerous people jobs at this meet up
- 00:01:12asking this question used to have a lot
- 00:01:14more people hiring but hopefully that
- 00:01:15will pick up again soon so if you want a
- 00:01:17job check out the job postings Channel
- 00:01:18if you want to hire someone post it in
- 00:01:20the job postings Channel all right
- 00:01:23second order of business the pizza today
- 00:01:25is from Cello's pizza so everyone is in
- 00:01:28the room get out your phone go to bit.
- 00:01:31lee/ pizzle and rank it on the fiveo
- 00:01:34scale and let us know how it is I see a
- 00:01:37bunch of people eating but not taking
- 00:01:38out their phones I will start calling
- 00:01:40you out George all right you weren't
- 00:01:43eating but you I saw you sitting there
- 00:01:45without your phone everyone here great
- 00:01:46great great if you're watching from home
- 00:01:49uh go in the chat and let us know what
- 00:01:50you're eating food is a big part of this
- 00:01:52me not food pizza is a big part of this
- 00:01:54Meetup but if you made poor life choices
- 00:01:56and you don't have pizza let us know
- 00:01:57what you're eating uh if if you want to
- 00:02:00chat there are multiple ways to chat you
- 00:02:01could chat right in the YouTube live
- 00:02:03stream which you're watching or you
- 00:02:05could chat in the nyhackr Slack at the
- 00:02:08monthly Meetup chat channel so go there
- 00:02:11or go on YouTube and tell us what you're
- 00:02:13eating or if you have questions from the
- 00:02:15speaker and you're watching virtually
- 00:02:17later on you can go in there and we will
- 00:02:20compile the questions asked virtually
- 00:02:22and we will ask them intermingled with
- 00:02:24the inperson
- 00:02:27questions while everyone is voting I
- 00:02:29will give a big big thank you to NYU
- 00:02:31PRIISM and George for making this happen
- 00:02:33every month so everyone big round of
- 00:02:34applause for George and NYU
- 00:02:37PRIISM so we we have a big thank you to
- 00:02:40him for both enabling the space and
- 00:02:42enabling the projector on the screen
- 00:02:44that comes from George well he gets a
- 00:02:46special shout out for that uh last month
- 00:02:49we had the R in Government conference
- 00:02:51the videos for that will be up ready for
- 00:02:54everyone to view within the week it'll
- 00:02:56be a nice Thanksgiving treat when you're
- 00:02:57home avoiding your family you can watch
- 00:02:59those videos
- 00:03:00instead so we'll announce that when that
- 00:03:02comes
- 00:03:03out next month's meet up will be
- 00:03:06December 3rd and it will be about
- 00:03:09population density of different cities
- 00:03:12so that'll be a nice urban planner thing
- 00:03:14then in January George will be giving a
- 00:03:17talk about some something controversial
- 00:03:20statistical theory he has it'll make all
- 00:03:23the causal inference people go
- 00:03:25nuts with these one three cool
- 00:03:28tricks right
- 00:03:30all right but that'll be George speaking
- 00:03:31January then in February Eric will be
- 00:03:34hopefully giving that talk and if not
- 00:03:36that talk some other talk you have
- 00:03:38another talk you'll have to give instead
- 00:03:39if you can't give that one all right
- 00:03:41cool cool cool then in March we'll have
- 00:03:43Vickram cool then in April we're going
- 00:03:45to have a talk about assuming we can
- 00:03:47confirm this with them about uh mapping
- 00:03:50they they have a new thing called Subway
- 00:03:51stories it's about mapping you know
- 00:03:53people getting on and off the train oh
- 00:03:55where they start where they stop
- 00:03:56remember a few months ago at an MTA
- 00:03:58person come talk they have a model for
- 00:03:59where people get off these people are
- 00:04:01using that data to make visualizations
- 00:04:04so it's a very nice tie into our
- 00:04:05previous talk all right then after the
- 00:04:08talk today we are going to go to Malt
- 00:04:11House George came up with this he said
- 00:04:13it's a great
- 00:04:15place it's bad it's Jared's fa there you
- 00:04:17go but Nicole said she likes it so we're
- 00:04:19going to trust her more than George
- 00:04:21right and we're going to go to Malt House
- 00:04:22which is located at where is it located
- 00:04:25I think it's on Thompson Thompson a
- 00:04:27little below the park you all have
- 00:04:28Google Maps Malt House find it we're
- 00:04:31gonna go there hopefully it'll be good
- 00:04:32it's good to try a new place um so we'll
- 00:04:34do that and anyone is watching virtually
- 00:04:36in New York City we've had people show
- 00:04:37up at the bar after the talk so come by
- 00:04:39Thompson Thompson and what uh Thompson
- 00:04:41and Bleecker Thompson and Bleecker all right
- 00:04:43folks Thompson and Bleecker that's where
- 00:04:44we're heading all right oh that's right
- 00:04:46by oh Fior is gone but it's right by um
- 00:04:49Arturo's which we've had for the Meetup
- 00:04:50before closer it's closer oh closer than
- 00:04:52that all right cool cool cool oh yeah
- 00:04:53that's on Houston thank you for correcting
- 00:04:55thank you all right anyway so we had a
- 00:04:58cancellation today so we're not learning
- 00:04:59about pix we want to have a big thank
- 00:05:01you for our fill in speaker who we
- 00:05:04literally filled in last night around
- 00:05:0610:30 at night I think we got him to
- 00:05:07fill in so everyone both thank him and
- 00:05:09just welcome to the stage welcome
- 00:05:12[Applause]
- 00:05:17Daniel come on
- 00:05:20down and a big round of applause
- 00:05:21seriously it was
- 00:05:26great I'm going to disappoint Jared this
- 00:05:29talk is really light-hearted it's not
- 00:05:31going to be tight I figured it's a long
- 00:05:34day it's uh what day is it today Tuesday
- 00:05:37um yeah so I'm going to be talking about
- 00:05:40turntables um on the left here we have
- 00:05:44the Technics 1200 MK2s and on the right
- 00:05:48we have the M3Ds and I'm going to tell
- 00:05:51you about my favorite model here um if
- 00:05:54you don't know these were the Workhorse
- 00:05:56turntables used in all clubs for many
- 00:05:59decades
- 00:06:00they're awesome um but there was a major
- 00:06:03Improvement that happened in
- 00:06:0597 um that little on/off switch right
- 00:06:08there and the difference is really
- 00:06:11subtle but on the left the whole top
- 00:06:14turned right so you can easily hit it
- 00:06:17and the whole top turned but on the right
- 00:06:19had this little casing that went all the
- 00:06:21way to the top so you actually had to be
- 00:06:23um intentional about turning it there
- 00:06:26were a couple other improvements here
- 00:06:28the the pitch fader is different where
- 00:06:30there's a little um depression in the
- 00:06:33middle on the MK2s um which made
- 00:06:36it it's like a rounding error if you get
- 00:06:37close to zero it's going to just go to
- 00:06:39zero which is not good for trying to mix
- 00:06:42but this is really the thing um and so
- 00:06:46my favorite model is the MK2s the
- 00:06:49M3Ds just because of
- 00:06:52usability and that's probably not why
- 00:06:54you're
- 00:06:55here all right I'm I'm going to talk to
- 00:06:58you about my favorite statistical model
- 00:07:02um just to give you some background I've
- 00:07:05worked on Stan for a long time uh Stan's
- 00:07:08an open source probabilistic programming
- 00:07:10language it's known for Bayesian inference
- 00:07:12but you can use it for other things uh
- 00:07:14it's open source it's used across
- 00:07:16industries um there are all these uh
- 00:07:18secondary packages that make it easier
- 00:07:20to use RStan rstanarm brms um PyStan
- 00:07:25CmdStanPy uh if you look at sort
- 00:07:28of the r thing this by the way this talk
- 00:07:30was prepared as a backup to the nyr so
- 00:07:33this is a bunch of R type of stuff
- 00:07:37here um I've been working with Stan for
- 00:07:40about 13 years um did a lot of the
- 00:07:43community building much like Jared has
- 00:07:45been doing here online meetups in New
- 00:07:47York City uh given a lot of talks um
- 00:07:52some of the different models that I've
- 00:07:54worked on with Stan and you could ask me
- 00:07:56questions about any of this stuff and
- 00:07:57I'll I'll be happy to answer them or
- 00:07:59what I can
- 00:08:01um Quality Control across different
- 00:08:04sites for Diamond rating um there's
- 00:08:07price elasticity um for Consumer
- 00:08:11consumer pricing uh media mix
- 00:08:14modeling is pretty big uh if you're into
- 00:08:17pharmacokinetics or pharmacodynamics
- 00:08:19Stan is used a lot there um one of the
- 00:08:23things is that this those types of
- 00:08:25models were the hardest for me to
- 00:08:26implement sorry most effort to implement
- 00:08:29I wouldn't say hardest but it it
- 00:08:30definitely took a lot of time to
- 00:08:31implement so working with ordinary
- 00:08:33differential
- 00:08:35equations um survival models um that was
- 00:08:40the hardest math it took months of
- 00:08:43pencil and paper to do integrals so that
- 00:08:45I didn't have to put it into Stan uh you
- 00:08:48know actual neural Nets using a brain
- 00:08:51epilepsy model
- 00:08:54um my current job is estimating and
- 00:08:57projecting skills of sports players um
- 00:09:00right now I'm working in the NBA so
- 00:09:02that's what I
- 00:09:03do uh and we'll talk about my favorite
- 00:09:06model uh this work was not done alone
- 00:09:10this was was done while at Warner Music
- 00:09:12and with a team of great people um so
- 00:09:16these are the people that I worked with
- 00:09:17at the time and what we were trying to
- 00:09:20do was estimate uh for any track a music
- 00:09:24track estimate how many streams you
- 00:09:26would get for the track in the future um
- 00:09:29some of the reasons that we care the
- 00:09:31first is it drives Revenue the more
- 00:09:33streams you have the more money you have
- 00:09:34great um the second thing that people
- 00:09:37were interested in was how can we affect
- 00:09:41the stream counts can we do anything to
- 00:09:43boost it can we um you know advertise a
- 00:09:47little
- 00:09:48more um so what we were trying to do was
- 00:09:51forecast and estimate these tracks into
- 00:09:54the future and what we're interested in
- 00:09:57is by what they call DSP which is a
- 00:09:59digital service provider that's Apple
- 00:10:01music uh
- 00:10:03Spotify Deezer um there are a bunch of
- 00:10:06them and we we were also interested in
- 00:10:09by country um some of the things that
- 00:10:11people were interested in were if it
- 00:10:14takes off in the US is Europe going to
- 00:10:16follow later if it if it takes off in
- 00:10:18Paris in France does it go to Belgium
- 00:10:21next where where's it
- 00:10:23going
- 00:10:24um so let's take a guess can you guess
- 00:10:28where these songs are going to go based
- 00:10:30on the Stream So the stream counts are
- 00:10:32on are going
- 00:10:34vertical uh Spotify is in blue Apple
- 00:10:37music is in green and Amazon is in Red
- 00:10:41so if you take a look this is a this is
- 00:10:43an Afrobeats type of song here and
- 00:10:49you know it's going there Jack Harlow
- 00:10:51First Class if you're into like pop-ish
- 00:10:54hip-hop stuff that was released here and
- 00:10:57it's it's going down
- 00:11:00um so take a second in your mind try to
- 00:11:04take a guess of where it's going to go
- 00:11:06I'll show you what what actually
- 00:11:13happen
- 00:11:15oh I'll I'll show you in a little
- 00:11:18bit oh actually it's right here okay
- 00:11:22cool so yeah um if you look you'll see
- 00:11:27this sort of pattern it goes up and
- 00:11:29comes down pretty easy except for Mother
- 00:11:32Mother Hayloft II right and you'll see that
- 00:11:35it was going down and then something
- 00:11:37happened and you got a lift um some
- 00:11:40other things to note are the y-axis here
- 00:11:45Jack Harlow is in the 4 million at the
- 00:11:47top range uh CKay was at 300K right so we
- 00:11:53have completely different types of
- 00:11:55artists here that are represented in
- 00:11:57this data set
- 00:12:00um the forecasting
- 00:12:02model so this model um like I said it's
- 00:12:07my favorite model here is representing
- 00:12:10the streams of music as or streams of um
- 00:12:13stream counts as music
- 00:12:16envelopes so how many of you play
- 00:12:19instruments and have seen something like
- 00:12:21this
- 00:12:24before
- 00:12:25okay so um all right I'll take a second
- 00:12:29to explain what's going on um if you
- 00:12:31imagine this is the envelope of sound um
- 00:12:34amplitude goes up that's how loud a
- 00:12:36sound is and as it comes back down it's
- 00:12:38it gets quieter if you play a note on a
- 00:12:40piano you press it that that's what
- 00:12:43happens here it's going to get louder
- 00:12:46right and then piano is is kind of
- 00:12:47percussive so it gets loud really fast
- 00:12:49if you play a different instrument it's
- 00:12:50going to get loud over time um so that
- 00:12:53that happens here it gets to a maximum
- 00:12:56and as you hold it it's going to Decay a
- 00:12:58little bit and sorry it's going to Decay
- 00:13:00a little bit naturally and then as you
- 00:13:02hold it it's going to
- 00:13:03sustain and then at some point you're
- 00:13:05going to let go of the the key and it's
- 00:13:07going to release and you know this this
- 00:13:09pattern happens differently with
- 00:13:11different musical instruments um you can
- 00:13:13do this with electronic instruments as
- 00:13:15well so ADSR attack decay sustain
- 00:13:21release for our forecasting model what
- 00:13:24we're trying to do was estimate from the
- 00:13:26data first of all a delay so between the
- 00:13:30time that you release a song um we're
- 00:13:33going to estimate the time before it
- 00:13:36starts ramping up why are we trying to
- 00:13:39estimate that part of the problem is
- 00:13:41there's incorrect data it'll say that a
- 00:13:44song was released a week prior two weeks
- 00:13:47prior to when it actually was released
- 00:13:49on the
- 00:13:50platforms um we're trying to we're
- 00:13:52trying to estimate this maximum the
- 00:13:54distance between zero and the maximum in
- 00:13:56the number
- 00:13:57streams uh we want to know the sustain
- 00:14:01amplitude here so how high it stays
- 00:14:05pretty
- 00:14:06flat um we also want to know the attack
- 00:14:09duration right so how long does it take
- 00:14:11to get to its peak and then the Decay
- 00:14:13duration how long does it go from its
- 00:14:15peak back down to a place where it just
- 00:14:18kind of flat lines for a while um in our
- 00:14:21model we we ignore this release part
- 00:14:24because if you look back at
- 00:14:26the sorry if you look back at the
- 00:14:31data um most of these songs if you look
- 00:14:35further down in time especially a a big
- 00:14:38pop song like first class you'll see
- 00:14:40that it just stays people listen to it
- 00:14:43at about the same rate for a long time
- 00:14:46so we can ignore
- 00:14:48this
- 00:14:50um so here's here's how the forecasting
- 00:14:53model does right so if we truncated the
- 00:14:56data at this dotted line and so we
- 00:14:58didn't show show the model we didn't
- 00:15:00show the the model the data after that
- 00:15:03it kind of estimates things like that
- 00:15:05and how does it do that it's building
- 00:15:07out this model for this attack Decay um
- 00:15:12and you know something that's not uh
- 00:15:15captured in the data and this is why
- 00:15:17Mother Mother Hayloft II is weird is
- 00:15:21because Mother Mother went on a global
- 00:15:23tour right here and made another release
- 00:15:27so they they got a bump because they did
- 00:15:30something right and just looking at the
- 00:15:33streams absent of that context you're
- 00:15:35going to miss all
- 00:15:41that all right
- 00:15:45so streams as music envelopes once again
- 00:15:48we're estimating from the data the delay
- 00:15:50which is not pictured a Max a sustain
- 00:15:52amplitude attack duration decay
- 00:15:55duration all right um
- 00:15:59question is why is this particular model
- 00:16:01my favorite and we kind of have to look
- 00:16:03at the alternative here so I could have
- 00:16:08described this as a compartmental model
- 00:16:10in epidemiology and try to explain this
- 00:16:12to a musician or musician-turned-music
- 00:16:16executive right this exe this resembles
- 00:16:18an SIR model how many of you are
- 00:16:21familiar with this sort of model oh cool
- 00:16:24so we got a bunch of people in here that
- 00:16:26are familiar with this model but not you
- 00:16:29know not music envelopes right so within
- 00:16:33this community I can talk about you know
- 00:16:35you have a population of susceptible
- 00:16:37people um that's the population of
- 00:16:39people that listen to this type of music
- 00:16:41right and then you get the Infectious
- 00:16:43and they're going to start spreading out
- 00:16:45who listens to models oh one thing I
- 00:16:49should mention is that if you look at
- 00:16:51the Infectious curve that sort of looks
- 00:16:54like what we were looking at right so
- 00:16:56the the two other curves are are sort of
- 00:16:58hidden from us are
- 00:17:00latent um we could describe this as an
- 00:17:02ordinary differential equation right so
- 00:17:05it takes a little bit of
- 00:17:07math
- 00:17:09um and what we're trying to estimate
- 00:17:12here are two different rate variables
- 00:17:14that kind of describe that'll give
- 00:17:16different shapes to this I
- 00:17:18curve we're trying to estimate these
- 00:17:20population sizes um and we're we'll end
- 00:17:24up talking about derivatives and rates
- 00:17:26of change right that that comes natural
- 00:17:30from these like these equations here
- 00:17:33it's because of the way the math is
- 00:17:36um uh another thing that comes naturally
- 00:17:39out of a model like this is half lives
- 00:17:41like when do you get to half of the the
- 00:17:44population and you're interested in
- 00:17:46sometimes steady States so if you have a
- 00:17:49more complex version of this sort of
- 00:17:50model you talk about steady state
- 00:17:54Behavior so the reason why this
- 00:17:57particular model model is my favorite is
- 00:18:00because I was able to reframe the model
- 00:18:02into something that's intuitive to
- 00:18:04people in the music
- 00:18:06industry um it's easier to communicate
- 00:18:09for them right so I can talk about if I
- 00:18:12talk to a musician about here's a Max
- 00:18:14amplitude here's the max number of
- 00:18:15streams that I expect to get here's the
- 00:18:17the Decay here's the time to take it'll
- 00:18:20take for it to like be
- 00:18:22steady um so I can communicate all that
- 00:18:25and the the switch in
- 00:18:27nomenclature to something that's very
- 00:18:29familiar to the musicians is um and and
- 00:18:33communicate
- 00:18:35communicable was
- 00:18:37um was really the the key and so much
- 00:18:42like you know the it's it's a little
- 00:18:44subtle change in usage right it's it's
- 00:18:47not the model is any different we could
- 00:18:49have done the same thing with the SIR
- 00:18:52model here but by changing and reframing
- 00:18:54it we get something that's really uh
- 00:18:57useful in communicated um useful for the
- 00:19:06purpose I ran through those slides
- 00:19:09really fast Jared so first I'll take
- 00:19:12questions sure let me see this question
- 00:19:15line any questions from the crowd here
- 00:19:19so are the is is there some view of the
- 00:19:22two models that they're potentially
- 00:19:24equivalent under the hood or can be yeah
- 00:19:26they they can be so um
- 00:19:29um I've used this trick a couple of
- 00:19:32other times as well so you change the
- 00:19:35quantities of interest from
- 00:19:39um from the natural parameters of the
- 00:19:42statistical model so instead of thinking
- 00:19:44about this as
- 00:19:46um beta and gamma and trying to estimate
- 00:19:49these and communicate beta and gamma to
- 00:19:52people that you know what does beta mean
- 00:19:54what does gamma mean when you start
- 00:19:56talking about what priors do you put
- 00:19:58on a beta
- 00:19:59you know people that work in music
- 00:20:01aren't really going to be able to tell
- 00:20:03you anything about it and you start
- 00:20:07framing that as all right this is um
- 00:20:11this is Jack Harlow what do you think the
- 00:20:13maximum number of streams you're going
- 00:20:15to get the first week is there some way
- 00:20:18you can bump that to 2x that how long do
- 00:20:21you think they're going to be at the top
- 00:20:22of the charts for what does that mean
- 00:20:24when are they going to fall off the top
- 00:20:26of the charts his last song did 10 weeks
- 00:20:29at the top of the chart so we expect it
- 00:20:31to be 10 weeks or 12 or eight right you
- 00:20:34can talk about that and those terms but
- 00:20:37if you try talking about them in these
- 00:20:39terms minus beta these gammas you're you
- 00:20:43know it's the same model right it's
- 00:20:47just well you you can you can basic of
- 00:20:49the curve you can impute the the priors
- 00:20:53for the parameters by looking at it in a
- 00:20:55different frame of reference
- 00:21:00is there a functional limit to how far
- 00:21:02in the future you can predict
- 00:21:05streams we're we're pretty bad
- 00:21:08well two things um streams for most
- 00:21:13songs are really consistent once they
- 00:21:16hit this like sustain they just people
- 00:21:19will listen to it at that same
- 00:21:21rate things that have changed that are
- 00:21:24Tik Tok and um what they call it
- 00:21:30um it's when they they play the old
- 00:21:33songs on new shows and then it picks up
- 00:21:37and then it gets back on Tik Tok
- 00:21:38Instagram and then all of a sudden you
- 00:21:40had um Kate Bush Running Up That Hill made
- 00:21:42it to a number one chart for the first
- 00:21:44time and like that was the longest one
- 00:21:45it was like 30-year gap since the release to
- 00:21:48being number one right stuff like that
- 00:21:52like good luck predicting that I you
- 00:21:56know I'd like to publish a song that
- 00:21:58years later gets picked up and Fs
- 00:22:00royalties
- 00:22:02for
- 00:22:08hello I like how you the show here's
- 00:22:12situation where
- 00:22:15model
- 00:22:19external
- 00:22:21ATT represent these models one thing I
- 00:22:24was wondering though is what the model
- 00:22:26specification was
- 00:22:39model
- 00:22:42usor yeah so those are great questions
- 00:22:45it started out as an SIR model with the
- 00:22:48derivatives uh specified that way and
- 00:22:51then we were having a lot of trouble
- 00:22:54with that model and fitting it it was
- 00:22:57written in a different language we won't
- 00:22:59talk about all the problems but that's
- 00:23:01how it originally started and we were
- 00:23:04actually having trouble
- 00:23:06specifying
- 00:23:08um reasonable priors for these things
- 00:23:11because these you know it depends on the
- 00:23:13size of the the different populations
- 00:23:16right the the priors are context driven
- 00:23:18in that sense um so at the end of the
- 00:23:22day the thing that we that we actually
- 00:23:24fit for this work here which has been
- 00:23:27expanded on since was actually a linear
- 00:23:29model a piecewise linear model model
- 00:23:32yeah but um it was done Bayesian so if you
- 00:23:36look there you actually do
- 00:23:38see um like a bit of uncertainty there
- 00:23:42because we're not just optimizing it oh
- 00:23:44yeah and
- 00:23:48um that's right so um you know it has
- 00:23:52this bit of like a change Point type of
- 00:23:55property to it as well because we're
- 00:23:56estimating the time to to when these
- 00:23:58things happen so those were variable in
- 00:24:01time um the other thing was it was
- 00:24:04hierarchical so we were um assuming that
- 00:24:09uh all three of these curves we actually
- 00:24:12had like a dozen um DSPs stacked
- 00:24:15together but they we assume that they
- 00:24:18were moving together in some fashion
- 00:24:21across different
- 00:24:22countries um it turns out that that
- 00:24:24isn't quite
- 00:24:26true um if you start looking into user
- 00:24:29Behavior Spotify users are very
- 00:24:31different than Apple users and both
- 00:24:34those are very different than people in
- 00:24:35France on Deezer and those were just
- 00:24:38some of the major ones and we were
- 00:24:40seeing things like house music
- 00:24:43absolutely pop off in um in Europe in
- 00:24:47Deezer and it wouldn't it maybe but it
- 00:24:52wasn't it wasn't it wasn't popping off
- 00:24:54here and it turned out it was in like uh
- 00:24:57was it Beatsource or some other other
- 00:24:59DSP that was like didn't have streams
- 00:25:02otherwise
- 00:25:09like yeah so I mean this this work right
- 00:25:13every time you work on a model like this
- 00:25:14it all these questions are good we were
- 00:25:16thinking about a lot of these it's like
- 00:25:17where can we do better right we have
- 00:25:19this hierarchical model tying these
- 00:25:22different DSPs together in a certain way
- 00:25:24assuming that they should all function
- 00:25:26similarly but then no they don't so you
- 00:25:30know we we were battling that there was
- 00:25:33what other things oh um some of the
- 00:25:35other things that were were battling was
- 00:25:38uh the reason why we had to estimate the
- 00:25:40decay in there was because I think the
- 00:25:44most egregious one was Lizzo's About Damn
- 00:25:47Time was um the metadata had the
- 00:25:52track released two years prior to the
- 00:25:54actual release date and if you're
- 00:25:56thinking like you're working inside
- 00:25:58music you this is your artist don't you
- 00:26:01don't you know the actual date you like
- 00:26:04handed this over to Apple
- 00:26:06like um it turns out that that's a very
- 00:26:10hard problem um for unknown reasons to
- 00:26:14me
- 00:26:15um and some of the other things
- 00:26:19were each of these
- 00:26:21songs so we think of these as
- 00:26:24songs um we think of this as a recording
- 00:26:28and you might think that one recording
- 00:26:30gets blasted out to everywhere it turns
- 00:26:33out if you look under the hood this one
- 00:26:36recording might have 12 different uh
- 00:26:39equivalent of ISBNs like SKUs so why does
- 00:26:43it do that every time it's on a release
- 00:26:45it gets another number so if it's on an
- 00:26:48album cut if it's on a on a this cut
- 00:26:51it's this release this compilation it
- 00:26:53gets another number attached to it um it
- 00:26:57turns out that people people are very
- 00:26:58careful about how they spell things and
- 00:27:01want features and whether or not they're
- 00:27:03listed as an artist so something
- 00:27:08titled with featuring someone else is
- 00:27:10different than a song that says that
- 00:27:13with a featured artist is different than
- 00:27:16you know all sorts of other things
- 00:27:20sometimes the Publishers are in there
- 00:27:21sometimes the the writers get credit in
- 00:27:24the actual song so each of those are
- 00:27:26different um
- 00:27:28it turns out that there it was actually
- 00:27:30very difficult for us to find a master
- 00:27:32list of um which
- 00:27:36recording had how many like SKU numbers
- 00:27:39attached to it and so we had to do this
- 00:27:42like an inverse NLP problem instead of
- 00:27:45sitting on the data NLP is you know it's
- 00:27:49awesome but it's not it's hard when you
- 00:27:51have to when you think you when you
- 00:27:53assume that this stuff is
- 00:27:56known yeah that's when that's when it
- 00:27:58gets you like we didn't discover some of
- 00:28:01these problems until months in which is
- 00:28:03you know the only reason we knew that is
- 00:28:06because we were looking at a very
- 00:28:09well-known track and then realized we
- 00:28:12were missing Spotify streams for a
- 00:28:13couple months
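The identifier problem described above (one recording carrying many release-level numbers, with titles that differ by spelling, hyphenation, and featured-artist credits) can be illustrated with a small sketch. This is not the team's method: the release IDs, titles, and the `normalize_title` helper are all hypothetical, and plain string normalization like this is only a first pass that can wrongly merge genuinely different recordings.

```python
import re
import unicodedata
from collections import defaultdict

# Hypothetical pattern for "feat." / "featuring" / "ft." credits, with or without brackets.
FEATURE = re.compile(
    r"[\(\[]?\b(?:feat\.?|featuring|ft\.?)\s+[^\)\]]*[\)\]]?",
    re.IGNORECASE,
)

def normalize_title(title):
    """Reduce a track title to a crude matching key (illustrative only)."""
    # Split accented characters into base letter + combining mark, then drop the marks.
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove featured-artist credits, then collapse punctuation and case.
    text = FEATURE.sub(" ", text)
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()

# Made-up release-level rows: (release identifier, title as delivered).
releases = [
    ("REL-001", "Example Song"),
    ("REL-002", "Example Song (feat. Artist B)"),
    ("REL-003", "Example-Song featuring Artist B"),
    ("REL-004", "Éxample Song [ft. Artist B]"),
    ("REL-005", "Another Track"),
]

# Group release identifiers under one normalized recording key.
groups = defaultdict(list)
for release_id, title in releases:
    groups[normalize_title(title)].append(release_id)

for key, ids in groups.items():
    print(key, ids)
```

Streams would then be summed over every identifier in a group, which is the step that silently loses data (for example, months of missing streams) when a variant spelling is not caught.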
- 00:28:16the uh if you see CK and
- 00:28:20Milana how many different ways is that
- 00:28:23spelled um how many
- 00:28:26hyphenations how many different
- 00:28:28compilations does it get
- 00:28:30on um turns out people play games like
- 00:28:34chop the song by a couple seconds so it
- 00:28:37fits on Apple to get onto their top
- 00:28:40playlists like because there's a limit
- 00:28:43so there are lots and lots and lots of
- 00:28:45games to play and you know just looking
- 00:28:48at the data we're not in control
- 00:28:51yet bre first
- 00:28:55repeat yeah some of the challenge was we
- 00:28:58were dealing right at the time was the
- 00:29:01shift over from like having a lot of
- 00:29:03High Fidelity cookie data to less
- 00:29:06Fidelity so everything was
- 00:29:09aggregated um but we did have a lot of
- 00:29:11that um this model didn't incorporate a
- 00:29:14lot of that I'm hoping that they've done
- 00:29:16more
- 00:29:21since how did you decide when it
- 00:29:23switched what what when it switched from
- 00:29:29we were estimating that um so that was
- 00:29:31one of the parameters that we were
- 00:29:33estimating in the
- 00:29:35model um when when we starting out with
- 00:29:39the SIR model it naturally comes out
- 00:29:43of the the beta and the gamma so
- 00:29:46combination of that will determine where
- 00:29:48the peak is
- 00:29:51but um once we once we change it up we
- 00:29:55can actually specify where that is and
- 00:29:57put priors on it
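To illustrate the answer above, here is a minimal sketch assuming the model referred to is the standard SIR compartmental model, where beta is the spread rate and gamma is the recovery rate. The function names, step size, and parameter values are illustrative assumptions, not the Stan model from the talk. It shows that the peak is not set directly: it falls out of the combination of beta and gamma.

```python
def sir_curve(beta, gamma, i0=0.001, days=200, dt=0.1):
    """Euler-integrate the SIR equations; return the 'infected' fraction once per day."""
    s, i = 1.0 - i0, i0
    steps_per_day = round(1 / dt)
    daily = [i]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infected = beta * s * i * dt   # susceptible -> infected
            new_recovered = gamma * i * dt     # infected -> recovered
            s -= new_infected
            i += new_infected - new_recovered
        daily.append(i)
    return daily

def peak_day(curve):
    """Index (in days) of the largest value in the curve."""
    return max(range(len(curve)), key=curve.__getitem__)

# Same gamma, different beta: the peak moves and changes height as a side effect.
fast = sir_curve(beta=0.6, gamma=0.1)
slow = sir_curve(beta=0.3, gamma=0.1)
print("peak day (beta=0.6):", peak_day(fast))
print("peak day (beta=0.3):", peak_day(slow))
```

Treating the peak time as its own parameter, as described in the answer, is what makes it possible to put a prior directly on it instead of reasoning about it through beta and gamma.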
- 00:29:59was it one model for everything you have
- 00:30:01a model for the change points a model
- 00:30:02for each section individually it was one
- 00:30:04model for everything which was actually
- 00:30:06really cool
- 00:30:09um go back to
- 00:30:13the
- 00:30:15yeah would you say that with mother
- 00:30:18mother was it like another ADSR
- 00:30:25happening yeah so um back to
- 00:30:29communication to um music Executives and
- 00:30:33people that know music the thing that we
- 00:30:35were going with this model was
- 00:30:38um if you look at something like this if
- 00:30:42you look at MIDI instruments you can
- 00:30:43press a key hold it down and press a key
- 00:30:45again and it'll just stack on and we
- 00:30:49were thinking the first thing we're
- 00:30:51going to do was add um do additive a
- 00:30:54second thing happens it's sustaining and
- 00:30:57then you have a second Peak so you just
- 00:30:59add it to that um there are nonlinear
- 00:31:02ways to think about that as well but the
- 00:31:04first step is linear so that was one of
- 00:31:06the thinking one of the things that
- 00:31:08we're thinking when we're talking to
- 00:31:09people it's like
- 00:31:12um for example uh like they were pushing
- 00:31:18um there was a lot of work around Ed
- 00:31:20Sheeran with a lot of remixes for a
- 00:31:22particular song and the question was if
- 00:31:25I do a remix and release it on this
- 00:31:28date and then wait a week and release it
- 00:31:30on a second date and then wait a week
- 00:31:32and release it on a third date now you
- 00:31:33have three events is it better to do one
- 00:31:36event or is it better to do three um
- 00:31:39does that add to the first song or does
- 00:31:42it just kind of peter out so that
- 00:31:45was sort of the thinking behind that
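The additive idea described above (a second key press stacking on top of a note that is still sustaining) can be sketched as a sum of shifted envelopes. This is a simplified illustration: the piecewise-linear ADSR shape, the parameter names, and the numbers are assumptions, whereas the model in the talk estimated its timings and decay from data.

```python
def adsr(t, attack, decay, sustain_level, sustain, release, peak=1.0):
    """Piecewise-linear ADSR envelope at time t, measured from the event start."""
    if t < 0:
        return 0.0                      # event has not happened yet
    if t < attack:
        return peak * t / attack        # ramp up to the peak
    t -= attack
    if t < decay:
        return peak + (sustain_level - peak) * t / decay  # fall to sustain level
    t -= decay
    if t < sustain:
        return sustain_level            # hold
    t -= sustain
    if t < release:
        return sustain_level * (1 - t / release)          # fade out
    return 0.0

def stacked(t, event_starts, **shape):
    """Linear (additive) combination of one envelope per release event."""
    return sum(adsr(t - start, **shape) for start in event_starts)

# Hypothetical shape, in days; levels are relative to a peak of 1.0.
shape = dict(attack=2, decay=5, sustain_level=0.4, sustain=20, release=30)

one_release = [stacked(day, [0], **shape) for day in range(60)]
three_remixes = [stacked(day, [0, 7, 14], **shape) for day in range(60)]

print("total, one event:   ", round(sum(one_release), 2))
print("total, three events:", round(sum(three_remixes), 2))
```

Under this purely additive assumption three staggered events always add streams; the open question raised in the talk is whether real remixes behave additively or whether later events add less than the first.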
- 00:32:22that's a great question and I'll say um
- 00:32:26for the music industry in particular
- 00:32:28the actual methods mattered
- 00:32:32less um than say the pharma industry so at
- 00:32:36the Pharma
- 00:32:37level uh the the stakeholders were PhD
- 00:32:42biostatisticians that we were talking to
- 00:32:44that understood their methods and wanted
- 00:32:46to know you know the questions he was
- 00:32:49asking in the corner about like hey
- 00:32:53about about like what you know what what
- 00:32:55are these different is it a spline based
- 00:32:58method is it what what do you have going
- 00:33:00on in the model can you explain it to me
- 00:33:03um here for the music industry most of
- 00:33:07the people that make it to the top and
- 00:33:08are decision
- 00:33:10makers have a they're there because they
- 00:33:14love music at some point um even though
- 00:33:18it may not always seem that way but at
- 00:33:20some point people love music and that's
- 00:33:22how they got into it
- 00:33:24um but they're not all statisticians
- 00:33:26they're not all machine learning people
- 00:33:29they want to know um what you know how
- 00:33:33these how Jack Harlow was going to do why
- 00:33:36because they wanted to to know whether
- 00:33:38or not they should push more for Jack
- 00:33:40Harlow or Dua Lipa right they have a
- 00:33:43budget problem they don't care how these
- 00:33:46numbers came out which is why it goes
- 00:33:48back to like most of the things in the
- 00:33:51model um we kind of hid away from people
- 00:33:54at that point it's there right we're not
- 00:33:56we're not hiding it for the purpose of
- 00:33:57hiding it we're hiding it for the
- 00:33:59purpose of us talking to the
- 00:34:00stakeholders on their terms and that's
- 00:34:04what this model gave to us
- 00:34:06um yeah I mean yeah we we tried talking
- 00:34:09to them
- 00:34:10about you know these populations that
- 00:34:13didn't exist and they kind
- 00:34:16of they don't want to hear that not in a
- 00:34:20quick meeting
- 00:34:21um but yeah in general I think it
- 00:34:24depends you have to read the room
- 00:34:29quick
- 00:34:31question how far the long time horizon you
- 00:34:34have
- 00:34:39to um for for
- 00:34:44the for music if I remember correctly it
- 00:34:47was
- 00:34:48um they had three different time
- 00:34:51Horizons that they were looking at one
- 00:34:53was a weekly time Horizon they want to
- 00:34:55be on the charts um to be on charts
- 00:34:58you're looking one week
- 00:34:59ahead um the second is I think about two
- 00:35:03months out because then you you get if I
- 00:35:08remember right it's like the amount of
- 00:35:10Revenue you get in the first two months
- 00:35:12is really most of the revenue that most
- 00:35:14artists will get so that's that's what
- 00:35:16they were concerned with and whether or
- 00:35:18not they can increase that value and
- 00:35:20then third was like do you have a back
- 00:35:23catalog that's worth anything so can you
- 00:35:25have rights on this and like people were
- 00:35:27thinking about at that point people were
- 00:35:29thinking about buying and selling rights
- 00:35:30to music
- 00:35:35so do you like characteristics of the
- 00:35:38song there was another team working
- 00:35:40directly on the characteristics of the
- 00:35:42song listening to or um looking at the
- 00:35:45lyrics trying to see if the content made
- 00:35:47a difference um they were breaking down
- 00:35:50like beats per minute feel vibes
- 00:35:53whatever you want to call it um they
- 00:35:56were trying to group similar songs
- 00:35:58together to see if that had an effect
- 00:36:01but I I worked on you know a different
- 00:36:02team that was just looking at the the
- 00:36:04streams
- 00:36:06themselves um and yeah every other every
- 00:36:09like 10th song would have some weird
- 00:36:11blip like this so if you end up on a
- 00:36:14random Spotify playlist and you're only
- 00:36:16getting like let's say you're only
- 00:36:17getting a thousand streams a week that might
- 00:36:20boost you up to like 20,000 one week and
- 00:36:23then you go back down to a thousand but you know
- 00:36:27for someone like Jack Harlow that doesn't
- 00:36:28make any difference but for you know
- 00:36:31someone with very few streams you get
- 00:36:34random blips here and
- 00:36:49there yeah the question was when do you
- 00:36:51make the the projections um and yeah the
- 00:36:55they were done pre-release um a lot of
- 00:36:58it was asking pre-release a lot of the
- 00:37:01questions that were sort of really hard
- 00:37:03for us to get to by the time we
- 00:37:05wrapped up the project was
- 00:37:08um uh
- 00:37:11like
- 00:37:14how you have two different songs that an
- 00:37:16artist wants to release how high are
- 00:37:18they going to be in the maximum number
- 00:37:19of streams for the first week and then
- 00:37:21on top of that how do you keep an artist
- 00:37:24at their Peak for many weeks at a time
- 00:37:27is that through advertising and and it's
- 00:37:29like it's really hard to tell what we
- 00:37:31didn't have were marketing numbers
- 00:37:34attached to these which made it really
- 00:37:36hard to tell what interventions people
- 00:37:38were doing like is it natural is it
- 00:37:40spreading because people like the music
- 00:37:43is it spreading because Spotify put it
- 00:37:45on the top like radar you know whatever
- 00:37:49playlist that is um so yeah anyway there
- 00:37:53was a lot of
- 00:37:55that maximum
- 00:38:04um I think the the the Decay part um so
- 00:38:12the reason why the maximum wasn't that
- 00:38:14hard was like you could be off by 2X or
- 00:38:18half and it's not going to matter in the
- 00:38:20long scheme of things right it matters
- 00:38:22whether or not you're going to make it
- 00:38:23to the charts if you don't have that big
- 00:38:25boom you don't get to top 40s
- 00:38:29ever
- 00:38:30but you know the the amount of time that
- 00:38:33you stay relatively strong um there are lots
- 00:38:39of different behaviors that drive that
- 00:38:41um for a for a song with low tracks you
- 00:38:45can we've seen like a couple accounts
- 00:38:48that have streamed it like as many times
- 00:38:50as humanly possible which probably
- 00:38:52indicates not a
- 00:38:54human um but at at the bigger artist
- 00:38:58that doesn't make a difference um so
- 00:39:01it's like people are consuming it stream
- 00:39:04at a time like stream 10 times a
- 00:39:07day
- 00:39:11um so that that's it like how do
- 00:39:15you how do you keep something relevant
- 00:39:19is a very hard question to
- 00:39:21ask I don't know the answer to that I
- 00:39:23don't know remixes is definitely
- 00:39:27one y was there a hand this side I was
- 00:39:31ask
- 00:39:34question were any
- 00:39:36general
- 00:39:38like insights one could
- 00:39:43like take
- 00:39:46away how
- 00:39:54increase there there wasn't um
- 00:39:58um
- 00:40:00and the reason I'm I'm saying that there
- 00:40:03were there were lots of little takeaways
- 00:40:05when you subset things into um different
- 00:40:08bins
- 00:40:09but
- 00:40:12um it turns out dealing with
- 00:40:14stakeholders is often difficult when
- 00:40:17they're asking questions that are too
- 00:40:19broad um and you don't have when you
- 00:40:21don't have a lot of time with them it it
- 00:40:23makes it even more difficult to try to
- 00:40:25give them a truthful answer because
- 00:40:27they'll they'll be asking stuff like
- 00:40:30um you know for all the artists in this
- 00:40:33genre what's going to happen it's
- 00:40:36like is
- 00:40:39that
- 00:40:42EAS artist or a specific song so you
- 00:40:46know so some of the really cool
- 00:40:48questions that we dug into were um
- 00:40:50there's this like sub genre that someone
- 00:40:52was really interested in as an
- 00:40:54A&R and um so an A&R is someone that uh
- 00:40:59scouts out new artists and try anyway
- 00:41:03they were looking at the sub genre they
- 00:41:05were trying to figure out whether or not
- 00:41:07that had a positive trend because if
- 00:41:09they did they would sign more artists to
- 00:41:11that genre and try to make that genre
- 00:41:13bigger if the genre gets bigger their
- 00:41:15back catalog gets worth more like
- 00:41:17everyone in that little
- 00:41:19subfield um those were cool questions to
- 00:41:22ask and we could start seeing a trend of
- 00:41:25you know they're they're getting a
- 00:41:26little more popular
- 00:41:29um but yeah a lot of the questions that
- 00:41:32they were asking were about these pop
- 00:41:34stars um Dua Lipas Lizzos um Jack Harlows of
- 00:41:39the world who are massive and so when
- 00:41:42you're talking like it was actually
- 00:41:44pretty easy to say like Jack Harlow did
- 00:41:46four million streams on um was that
- 00:41:50Spotify right that first week if you
- 00:41:53looked at his last release it was about
- 00:41:554 million and I could tell you that the
- 00:41:57next release he's going to do is going
- 00:41:59to be about 4 million
- 00:42:02um I'm good there it's it's really like
- 00:42:06you know they want to know how's the
- 00:42:08album going to do and it's like I don't
- 00:42:11know people are releasing music piece at
- 00:42:13a time there this song is really popular
- 00:42:16the 12th cut on that
- 00:42:19album who's listening to the whole album
- 00:42:21these days like you know I do but
- 00:42:30I don't I haven't recognized these artists at
- 00:42:32all completely that's all
- 00:42:36right that's all right
- 00:42:39um yeah I started I was excited to work
- 00:42:42with Warner Music because I've been a
- 00:42:44DJ for more than half my life
- 00:42:48and um just kind of knowing about
- 00:42:51music's really cool um but then digging
- 00:42:55into it it gets really messy
- 00:42:57especially when it came to like
- 00:43:00um you'd think that you'd have
- 00:43:03control of your own metadata and they
- 00:43:05just don't and it's wild too busy
- 00:43:08po POA is
- 00:43:11gone um you
- 00:43:13know
- 00:43:16supposedly is immedate or
- 00:43:22netive oh yeah so what they were trying
- 00:43:24to do was actually um
- 00:43:27if you think about it as your record
- 00:43:30label you have a number of
- 00:43:32artists uh two things you can do you can
- 00:43:35stagger releases right so instead of
- 00:43:38everyone competing for the same uh user
- 00:43:40base the question is whether or not
- 00:43:43releasing two things at the same time is
- 00:43:44better for you or Worse first of all it
- 00:43:46could be better people might listen to
- 00:43:48more of your music in
- 00:43:50aggregate um but the second thing is
- 00:43:53where does a marketing budget go if you
- 00:43:55got someone with 4 million streams and
- 00:43:57you could boost that to 5 million
- 00:43:59streams that's a lot better than someone
- 00:44:02with 10,000 streams and boosting them to
- 00:44:0512,000 streams right and so they had
- 00:44:08that question that was top of mind for
- 00:44:13them
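The budget comparison above comes down to simple arithmetic, worked through here with the figures quoted in the talk. The `uplift` helper is a hypothetical name used only for this illustration, and the sketch ignores what each boost would cost.

```python
def uplift(before, after):
    """Absolute and relative change in streams from a marketing push."""
    gain = after - before
    return {"absolute": gain, "relative": gain / before}

# Figures quoted in the talk: a large artist and a small one.
big = uplift(4_000_000, 5_000_000)
small = uplift(10_000, 12_000)

print("big artist:  ", big)
print("small artist:", small)
```

The relative boosts are close (25% versus 20%), but the absolute gain for the large artist is 1,000,000 streams against 2,000, which is why the label's budget question centers on the biggest names.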
- 00:44:17dat yeah we didn't get to that though at
- 00:44:20the at this point we were interested in
- 00:44:23so many things but you know
- 00:44:36um the the honest answer to that was um
- 00:44:41at the at the major record label they're
- 00:44:44interested in the the very first few
- 00:44:48weeks they want to know the peak and the
- 00:44:51first few weeks that they're they're
- 00:44:52going to be there's there's a very big
- 00:44:55difference between
- 00:44:58not charting and charting yeah so if you
- 00:45:01can chart you win if you don't chart
- 00:45:03you're in a secondary class of music so
- 00:45:08at the record label um not for the
- 00:45:10musician the musician wants to release
- 00:45:13good music and be heard but if you're
- 00:45:15the record label there's you know a lot
- 00:45:18of things that happen once you get to
- 00:45:21that level um so the question is always
- 00:45:24like if you're close how do you
- 00:45:27get
- 00:45:32there
- 00:45:34um the we were using things like artist
- 00:45:38um past
- 00:45:39releases um similarity in terms of uh
- 00:45:42genre and different things so we're
- 00:45:45doing okay
- 00:45:50but we were plugging that stuff into
- 00:45:54yeah we were we were trying to scrape
- 00:45:56like Twitter followers Insta
- 00:45:57followers the thing that we didn't have
- 00:45:59at that time because it was too early
- 00:46:01was TikTok like we didn't have control
- 00:46:04we didn't have a lot of TikTok data but
- 00:46:06TikTok was blowing up at the
- 00:46:08time so that that would have been really
- 00:46:11interesting to follow like some of these
- 00:46:14drivers of like these other other tracks
- 00:46:16that you would see that just are flat
- 00:46:18and and pop up all of a sudden a lot of
- 00:46:21that was coming from TikTok and if we
- 00:46:23were tracking TikTok we would have
- 00:46:24known but we were just looking at the um DSPs
- 00:46:29themselves we didn't have an indication
- 00:46:32of
- 00:46:36that I got
- 00:46:38online there's a few questions online
- 00:46:40but might just be more jokes check out
- 00:46:43of them
- 00:46:44yeah how many songs are time coded at
- 00:46:47the Unix epoch
- 00:46:56other questions we still have a little
- 00:46:58bit of time is there is there any like
- 00:47:01motivation to clean up the
- 00:47:04metadata yeah I I spent a lot of time I
- 00:47:07was motivated yeah in terms of like do
- 00:47:11they see that as like a business opportunity
- 00:47:15and you know they work with say
- 00:47:19the DSPs to make that better or is there
- 00:47:23some business reason that the
- 00:47:25metadata sucks
- 00:47:28um I think there's always a business
- 00:47:30reason metadata sucks which is probably
- 00:47:33the business doesn't value it enough um I I
- 00:47:36think the truth is that the uh record
- 00:47:38industry is a very old industry
- 00:47:41relatively um everything was built
- 00:47:43around physical sales so the inventory
- 00:47:46system the you know the numbering of
- 00:47:48these things the it was all based on
- 00:47:50physical counting of like I ship a crate
- 00:47:55of Records to to a record store and
- 00:47:58after a month none of them came back we
- 00:48:01assume they're
- 00:48:02sold yeah and then we'll send them some
- 00:48:04more
- 00:48:06um you know and and the the way the
- 00:48:11music industry was set up is um like
- 00:48:14Warner Records has a bunch of sublabels
- 00:48:17under it and they all operate their own
- 00:48:20P&L um as far as I could tell and that
- 00:48:24meant they had their own marketing
- 00:48:25department that meant that there were
- 00:48:26data scientists working you know imagine
- 00:48:30like hundreds of data scientists working
- 00:48:32off of the same data and learning to
- 00:48:35collect it and process it in different
- 00:48:36ways so even if we asked two sets of
- 00:48:39people or three sets of people to look
- 00:48:40at the same thing and ask about the
- 00:48:43results of this thing we get widely
- 00:48:45different forecasts just because we're
- 00:48:47counting things
- 00:48:48differently um so this project was done
- 00:48:52under the the the record uh the head
- 00:48:55record label the the central place
- 00:48:59um but yeah the the metadata cleanup is
- 00:49:03I
- 00:49:03think it you know there are just too many
- 00:49:06cooks in the kitchen and it's it's hard
- 00:49:08to fix
- 00:49:10that I don't know do the dsps have
- 00:49:16better
- 00:49:18I the dsps have
- 00:49:22metadata they know what they have um
- 00:49:26they're motiv they're using it to like
- 00:49:31to push stuff yeah I I don't know I
- 00:49:34maybe you can get someone from Spotify
- 00:49:36out here to talk about the internals but
- 00:49:41um yeah I'm sure it's it's very similar
- 00:49:44my guess is that they they just need to
- 00:49:46count and for compliance reasons count
- 00:49:48how many times something was streamed so
- 00:49:50that they can send royalties back and
- 00:49:52they're not concerned with you know who
- 00:49:55they pay is like the individual artist
- 00:49:58is not on them they pay
- 00:50:00the the record label and let the record
- 00:50:03it's you know just gets passed down
- 00:50:06sound
- 00:50:08and Arbitron or whatever it is they took
- 00:50:10care of a lot of these PRS back in
- 00:50:12the '90s so yeah they probably like
- 00:50:14cleaner data which they for but it's
- 00:50:17like everything else if you stop
- 00:50:19cleaning your data it gets messy real
- 00:50:21quick yeah so clean your data everyone
- 00:50:28okay any other
- 00:50:30questions there are any L
- 00:50:33questions then if there's no other
- 00:50:35questions um did you see that Lego
- 00:50:37released a turntable yeah I did yeah so
- 00:50:40it's perfect for you a little tiny Lego
- 00:50:41turntable but this big it looks a lot
- 00:50:43bigger than the picture but it's perfect
- 00:50:44for him you need to get that all right
- 00:50:47well thank you very much thank you
- 00:50:49for connect with me if you want um I'm
- 00:50:53around I'm in New York he just had a
- 00:50:56baby maybe 3 months ago though so you
- 00:50:58know he's going to go fall
- 00:51:01over okay with that um again you you
- 00:51:06filled in as of like last night so thank
- 00:51:08you for pulling this together I know you
- 00:51:09had this ready for the NYR conference
- 00:51:11but always great to have you always
- 00:51:12always great to have you but also thank
- 00:51:13you very much for doing this uh Dan's
- 00:51:15been a member of this community for 15
- 00:51:19years probably let's go 15ish right
- 00:51:21towards near the beginning um so awesome
- 00:51:25then I guess we will say we will go to
- 00:51:28the bar before we go to the bar remember
- 00:51:30uh thank you to NYU of course for making
- 00:51:32this happen and next month December 3rd
- 00:51:35we'll get that announced tomorrow
- 00:51:37hopefully it'll be right George we can
- 00:51:38have announced tomorrow you think we'll
- 00:51:39have a room so we'll have that announced
- 00:51:40tomorrow or Thursday he said he could
- 00:51:42probably by tomorrow probably by
- 00:51:43tomorrow but maybe Thursday we'll get
- 00:51:44that announced and uh that'll be
- 00:51:46December 3rd somewhere here at NYU campus
- 00:51:48likely this building maybe even this
- 00:51:50room we'll see and then we'll announce
- 00:51:52January February March and April as we
- 00:51:54go we have things lined up we'll get
- 00:51:56there the R and Government videos will
- 00:51:58be up online by Thanksgiving so you
- 00:52:00could enjoy those um when you're eating
- 00:52:02too much we're going to pack up here uh
- 00:52:05and we're going to go over to Malt House
- 00:52:06and have a post meet up beverage to talk
- 00:52:09more to have a what a malt a malt yes
- 00:52:11you get all malt balls right so enjoy
- 00:52:14that thank you all for coming and I'll
- 00:52:15see you next month
- 00:52:17[Applause]
- statistics
- Bayesian models
- music
- streams
- data science
- metadata
- forecasting
- analysis
- DSP
- communication