How Naive Bayes Classifier Works 1/2.. Understanding Naive Bayes and Example

00:11:21
https://www.youtube.com/watch?v=XcwH9JGfZOU

Summary

TLDR: The video explains how the Naive Bayes classifier works, based on Bayes' theorem. It discusses the independence assumptions between predictor variables and how to compute posterior probabilities. An example with weather data is used to demonstrate the process of building frequency tables and calculating probabilities. The video also covers the zero-frequency problem and a method for resolving it.

Takeaways

  • 📊 Naïve Bayes is based on Bayes' theorem.
  • 📈 The independence assumption between predictors is important.
  • 📉 The posterior probability is computed by multiplying probabilities (see the formula sketch after this list).
  • 🗃️ Frequency tables are essential for computing the probabilities.
  • 🔄 The zero-frequency problem can be solved by adding one to the counts.
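For reference, the decision rule these takeaways describe can be written compactly (a standard formulation; this exact notation is my addition, not shown in the video):

```latex
\hat{C} \;=\; \arg\max_{C}\; P(C)\prod_{i=1}^{n} P(x_i \mid C)
```

where x_1, ..., x_n are the attribute values of the day being classified and the class with the highest posterior is the prediction.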

Timeline

  • 00:00:00 - 00:05:00

    In this part the Naive Bayes classifier is introduced. It is built on Bayes' theorem, with independence assumptions between the predictors. The model is easy to build and works well with large data sets. The core of the classifier is the calculation of the posterior probability of a class given the predictor data, which combines the probability of the class before seeing the data (the prior probability) and the probability of the data given the class (the likelihood). This makes it a popular choice in the research community because of its simplicity and effectiveness, even compared with more sophisticated methods.

  • 00:05:00 - 00:11:21

    The video then explains how to compute the posterior probability by building a frequency table and converting it into a probability table. A weather-data example is used to demonstrate the calculation of a posterior probability. A method is also discussed for addressing the zero-frequency problem: add 1 to all counts when an attribute value does not occur for a class. These steps illustrate the practical application of the Naive Bayes classifier and the calculation of the probabilities used to decide, for example, whether to play or not.


Video Q&A

  • What is a Naïve Bayes classifier?

    It is a statistical classifier based on Bayes' theorem with independence assumptions between the predictor variables.

  • How do you compute the posterior probability in a Naïve Bayes classifier?

    By multiplying the probability of the data given the class by the prior probability of the class, and dividing by the probability of the data.

  • What is the zero-frequency problem?

    It occurs when a particular attribute value does not occur for a class, which leads to a probability of zero. To prevent this, you can add one to all the counts (a smoothed form of this fix is sketched after this list).

  • How does the independence assumption work?

    It means that knowing the value of one attribute gives no information about the value of another attribute.

  • Why is Naïve Bayes easy to understand and explain?

    The method is simple and involves little complexity, which makes it easy to implement and to debug.
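The add-one fix mentioned above is often written as a smoothed estimate (a standard Laplace-smoothing form; the video only says to add one to the counts, so this exact formula is an assumption):

```latex
P(x = v \mid C) \;=\; \frac{\operatorname{count}(x = v,\, C) + 1}{N_C + k}
```

where N_C is the number of training examples in class C and k is the number of distinct values of attribute x.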

Subtitles (en)
  • 00:00:01
    Welcome back. In this video I'll be explaining the Naive Bayes classifier, how it works, and we'll take a simple example. The Naive Bayes classifier, as we mentioned before, is based on the frequency table. It is based on Bayes' theorem with independence assumptions between predictors. I hope you're familiar with Bayes' theorem; it's very nice and easy and quite a nice way of explaining how things work. We assume that our predictors are independent. What that means is that knowing the value of one attribute does not tell us anything about the value of another attribute, or another predictor. A Naive Bayes model is usually easy to build, with no complicated iterative parameter estimation, and that makes it particularly useful for very large data sets. Naive Bayes is quite well known and well liked in the research community. It's quite simple, but it actually performs really well; quite often it even outperforms sophisticated methods. The good thing about the Naive Bayes classifier is that it's easy to understand, easy to explain, and easy to debug.
  • 00:01:20
    Now, the way it works: as we mentioned before, it's based on Bayes' theorem. Bayes' theorem provides a way of calculating the posterior probability, the probability of C given X, where C is our class and X is our data, our predictors, our attributes. It is calculated from the probability of C (the probability of the class before seeing any data), the probability of the data, and the probability of the data given the class. The Naive Bayes classifier assumes that the effect of the value of a predictor X on a given class C is independent of the values of the other predictors; in other words, the predictors are independent of each other. This assumption is called class conditional independence. Now let's look at the equation: the probability of the class given the data equals the probability of the data given the class, times the probability of the class (before seeing any data), divided by the probability of X, i.e. the probability of the data itself. The probability of C given X is called the posterior probability. The probability of X given C is called the likelihood. The probability of the class is called the class prior probability; again, this is the probability of the class before seeing any data. And the probability of the data is called the predictor prior probability. So the probability of C given X is the posterior probability of the class (or target) given the predictor (or attribute). The probability of C is the prior probability of the class; prior means before seeing any data. The probability of the data given the class is the likelihood, which is the probability of the predictor given the class, and the probability of X is the prior probability of the predictor, the probability of the data itself. It's not always possible to know the probability of X, but there's a way around that. So the probability of C given X, where X is our attributes, is the probability of the first attribute given the class, times the probability of the second attribute given the class, and so on over all n attributes, times the probability of the class itself. The reason we multiply probabilities here is that, as we mentioned before, the predictors are independent, so the independence assumption lets us solve this by multiplying probabilities.
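In symbols, the relationships just described are (a standard way to write them; the video shows this on a slide rather than in the captions):

```latex
P(C \mid X) \;=\; \frac{P(X \mid C)\,P(C)}{P(X)},
\qquad
P(C \mid x_1,\dots,x_n) \;=\; \frac{P(x_1 \mid C)\,P(x_2 \mid C)\cdots P(x_n \mid C)\,P(C)}{P(X)}
```

Here P(C | X) is the posterior, P(X | C) the likelihood, P(C) the class prior, and P(X) the predictor prior.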
  • 00:03:50
    Let's take an example. The posterior probability can be calculated by first constructing a frequency table for each attribute against the target. Remember, if our data is numerical we can transform it into categorical data, or I'll show you another technique in the next video for dealing with numerical variables when building a Naive Bayesian classifier. After we build the frequency tables, we transform them into likelihood tables, or probability tables, and finally we use the Naive Bayes equation to calculate the posterior probability for each class. The class with the highest posterior probability is the outcome of the prediction. Let's have a look at an example. If you remember the weather data, we have four categorical attributes and we have our class. What we do first is calculate the probability of the class; this is the prior probability. The probability of yes is 9 over 14 and the probability of no is 5 over 14. Then we build a frequency table; you've seen this before in the 1R classifier.
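As a rough illustration of the table-building step described above, here is a minimal Python sketch (not code from the video; the function name, argument layout, and dictionary structure are my own assumptions) that computes the class priors and the per-attribute likelihood tables from a list of labelled categorical rows:

```python
from collections import Counter, defaultdict

def fit_naive_bayes(rows, labels):
    """Build class priors and per-attribute likelihood tables from categorical data.

    rows   -- list of dicts, e.g. {"outlook": "sunny", "windy": "true"}
    labels -- list of class values, e.g. "yes" / "no"
    """
    n = len(labels)
    class_counts = Counter(labels)                       # e.g. {"yes": 9, "no": 5}
    priors = {c: class_counts[c] / n for c in class_counts}

    # freq[attribute][(value, class)] = number of rows with that value and class
    freq = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for attr, value in row.items():
            freq[attr][(value, label)] += 1

    # likelihood[attribute][(value, class)] = P(value | class)
    likelihood = {
        attr: {vc: count / class_counts[vc[1]] for vc, count in counts.items()}
        for attr, counts in freq.items()
    }
    return priors, likelihood
```

On the weather data this would reproduce the priors 9/14 and 5/14 and the per-column fractions read out in the next part of the transcript.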
  • 00:04:59
    From the frequency table, from these counts, we extract the probabilities: the probability of sunny given it's a yes, the probability of sunny given it's a no, the probability of overcast given yes, and so on. These are just the frequency over the sum of the column, so 3 over 9, 4 over 9, 2 over 9; this column sums to 9 because we have nine yeses. And here 2 over 5, 0 over 5, and 3 over 5, because this column sums to 5; we have five no's. Likewise, for the two values of humidity, high and normal, we compute their corresponding probabilities, and likewise for windy and for temperature. If you look at Outlook, for example, we have the frequency table and we have the probabilities. The probability of X given C, the probability of the variable given the class, is read from the table as follows: the probability of sunny given the class is yes is 3 over 9, or 0.33, and the probability of sunny given it's a no is 2 over 5. The probability of just sunny, regardless of the class (this is the probability of X), is 5 over 14, the probability of overcast is 4 over 14, and likewise for rainy. At the bottom we have the probability of yes and the probability of no; as mentioned, one column sums to nine and the other to five, the number of yeses and the number of no's. I hope this makes sense.
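For reference, here are the Outlook tables reconstructed from the numbers read out above (the on-screen table itself is not in the transcript; the 5/14 for rainy is inferred from the remaining counts):

    Outlook     P(value | yes)   P(value | no)   P(value)
    sunny       3/9              2/5             5/14
    overcast    4/9              0/5             4/14
    rainy       2/9              3/5             5/14

with P(yes) = 9/14 and P(no) = 5/14.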
  • 00:06:39
    Now let's say we want to compute the probability of yes given that the day is sunny (this is the probability of C given X). We multiply the probability of sunny given yes, which is 3 over 9, by the probability of C, i.e. of yes, which is 9 over 14 or 0.64, and we divide by the probability of X: the probability of sunny is 5 over 14, which is 0.36. That gives 0.60, just a direct application of the equation. And, as mentioned, the probabilities multiply because we have independence.
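Written out, the arithmetic just described is:

```latex
P(\text{yes} \mid \text{sunny})
  = \frac{P(\text{sunny} \mid \text{yes})\,P(\text{yes})}{P(\text{sunny})}
  = \frac{(3/9)\,(9/14)}{5/14}
  \approx \frac{0.33 \times 0.64}{0.36}
  \approx 0.60
```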
  • 00:07:30
    We can now use a simple example from this table. Say we have a random day, some input, and we want to decide whether to play or not, yes or no. Let's say the day has outlook rainy, temperature mild, humidity normal, and windy true, and we want to decide whether to play using the Naive Bayes classifier. What we do is compute the likelihood of yes and the likelihood of no. The likelihood of yes is the probability of outlook equals rainy given yes, times the probability of temperature equals mild given yes, times the probability of humidity equals normal given yes, times the probability of windy equals true given yes, times the probability of yes. We can extract these values easily from our frequency and probability tables. For example, the probability of outlook equals rainy given yes is 2 over 9; likewise, the probability of temperature equals mild given yes is 4 over 9. We multiply these together, multiply by the probability of yes, which is 9 over 14, and we get a number. This number is not a probability; it is just the likelihood of the yes. In the same way we compute the likelihood of the no: the probability of outlook equals rainy given no, times the probability of temperature equals mild given no, likewise for humidity normal and windy true given no, times the probability of the no. We extract these from the tables as well; for example, the probability of windy equals true given no is 3 over 5, and we multiply by 5 over 14, the probability of the no. Now we can normalize to get the probabilities: the probability of yes is the likelihood of yes over (the likelihood of yes plus the likelihood of no), and the probability of no is the likelihood of no over the same sum. Notice that the probability of yes is larger than the probability of no, so we would probably decide to play on that day.
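A minimal sketch of the scoring and normalization steps just described, assuming the `priors` and `likelihood` tables from the earlier fitting sketch (hypothetical helper names, not code from the video):

```python
def predict_naive_bayes(query, priors, likelihood):
    """Score one categorical query row and return normalized class probabilities.

    query -- dict such as {"outlook": "rainy", "temperature": "mild",
                           "humidity": "normal", "windy": "true"}
    """
    scores = {}
    for c, prior in priors.items():
        score = prior
        for attr, value in query.items():
            # Unseen (value, class) pairs have zero count; see the
            # zero-frequency discussion below for the usual fix.
            score *= likelihood.get(attr, {}).get((value, c), 0.0)
        scores[c] = score        # likelihood of the class, not yet a probability

    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()} if total else scores
```

The class with the larger normalized value is the prediction, matching the "play / don't play" decision in the transcript.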
  • 00:09:57
    This is how the Naive Bayes classifier works. You may have noticed something: because we multiply probabilities, zero frequencies are a problem. If we have zero counts, then multiplying by zero gives zero for that value. There's a way around this. It is called the zero-frequency problem, and it happens when an attribute value (for example, outlook equals overcast) doesn't occur with every class value (for example, with play golf equals no). Here the count for outlook equals overcast with play equals no is zero; it doesn't occur, and that causes a problem. The way around it is to add one to all the counts. So instead of 3, 4, 2 and 2, 0, 3, the counts become 4, 5, 3 and 3, 1, 4, and we do the same for everything else. The probabilities will change slightly, but that's just a way around the zero-frequency problem.
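As a small illustration of that fix, here are the Outlook counts read out in the captions with one added to each (the dictionary layout is my own, not from the video):

```python
# Outlook counts against the class, as read out in the video.
outlook_counts = {
    ("sunny", "yes"): 3, ("overcast", "yes"): 4, ("rainy", "yes"): 2,
    ("sunny", "no"):  2, ("overcast", "no"):  0, ("rainy", "no"):  3,
}

# Add-one fix for the zero-frequency problem.
smoothed = {key: count + 1 for key, count in outlook_counts.items()}
# -> the yes column becomes 4, 5, 3 and the no column becomes 3, 1, 4,
#    so P(overcast | no) is no longer zero; the column totals grow and
#    the resulting probabilities shift slightly.
```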
  • 00:11:12
    Thanks for watching. In the next video I'll show you how to deal with numerical data when building a Naive Bayesian classifier.
Tags
  • Naïve Bayes
  • Bayes' theorem
  • classification
  • frequency tables
  • probabilities
  • probability calculation
  • software development
  • data science
  • machine learning