Math 13X Lesson 23 Goodness of Fit Test

00:17:11
https://www.youtube.com/watch?v=0TDttbhbc30

Summary

TL;DR: The video presents the chi-square goodness-of-fit test, a statistical method for checking whether an observed distribution matches a claimed distribution. In the first example, the grades A, B, C, D and F are claimed to be evenly distributed (the same number of students in each grade). The total number of students (55) is divided by the number of categories (5) to get the expected count in each category (11). A table gives the critical value (9.488) for 4 degrees of freedom with 0.05 in the right tail. If the test statistic exceeds this value, the claim is rejected. A second example uses a claim that gives a specific percentage for each category. Finally, the video stresses the need for accuracy when entering the data.
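
The expected-count step for an evenly distributed claim can be sketched in a few lines of Python. The observed counts (12, 16, 22, 3, 2) are the values typed into the calculator in the transcript; the variable names are illustrative assumptions and do not come from the video.

```python
# Observed counts for grades A, B, C, D, F, as entered in the video.
observed = {"A": 12, "B": 16, "C": 22, "D": 3, "F": 2}

n = sum(observed.values())      # total sample size
k = len(observed)               # number of categories
expected_each = n / k           # expected count per category if evenly distributed
degrees_of_freedom = k - 1      # used to pick the row of the chi-square table

print(f"n = {n}, k = {k}, expected per category = {expected_each}, df = {degrees_of_freedom}")
```

If the division does not come out even, the decimal value is used as the expected count, as the video notes.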

Takeaways

  • 🔍 The chi-square goodness-of-fit test checks how well an observed distribution matches an expected distribution.
  • 📊 For an evenly distributed claim, every category has the same expected count.
  • 📈 The critical value determines whether a statistical claim can be rejected.
  • 🔢 The degrees of freedom determine which critical value applies.
  • 🧮 A table or a calculator is used to find the critical value for this test.
  • ⚖️ A test statistic larger than the critical value means the data do not fit the claimed distribution.
  • 🔄 The test can also check a claim that gives a specific proportion for each category.
  • 🔗 The test compares observed counts with expected counts to decide whether the claim can be rejected.
  • 🚫 A data-entry error can lead to a wrong conclusion.
  • ✔️ Accurate data entry and calculation are emphasized.
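
For reference, the statistic that the comparison above relies on is the standard chi-square goodness-of-fit statistic. The video reads it off the calculator and does not write the formula out, so the symbols $O_i$ (observed count) and $E_i$ (expected count) are notation assumed here.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad \text{degrees of freedom} = k - 1

% First example: observed 12, 16, 22, 3, 2 with E_i = 11 for every category
\chi^2 = \frac{1^2 + 5^2 + 11^2 + (-8)^2 + (-9)^2}{11} = \frac{292}{11} \approx 26.545
```

The claim is rejected when $\chi^2$ is larger than the critical value from the table.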

Timeline

  • 00:00:00 - 00:05:00

    The chi-square goodness-of-fit test is a statistical method for checking whether categorical data follow a claimed distribution. The example in this lesson uses class grades, with the claim that the grades are evenly distributed. The process consists of adding up the total sample size, dividing it evenly among the categories to get the expected counts, and looking up the critical value in a chi-square distribution table. Comparing the observed counts with the expected counts then determines whether the claim of an even distribution can be rejected.

  • 00:05:00 - 00:10:00

    The observed and expected values are entered into a calculator, which runs the chi-square goodness-of-fit test. In this example the test statistic is 26.545. Since the critical value is 9.488, the test statistic is well past it, so the claim of evenly distributed grades is rejected: the grades are not evenly distributed.

  • 00:10:00 - 00:17:11

    A second kind of goodness-of-fit test is illustrated with a claim that gives a specific percentage for each category. Each percentage is multiplied by the sample size to get the expected count, and the expected counts are compared with the observed counts. Here the test statistic is 1.80, which is below the critical value of 9.488, so there is not enough evidence to reject the claim that the proportions match those stated. Along the way, a mistyped entry (222 instead of 22) produces an implausibly large statistic of about 181, which shows why the entries should be double-checked.
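
Both worked examples can be reproduced with a short Python sketch. The video uses a calculator's built-in goodness-of-fit test rather than code, so the function name `chi_square_statistic` and the sum check inside it are assumptions added for illustration. The critical value 9.488 is copied from the table lookup in the video, not computed.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of categories")
    # Added safeguard (not in the video): expected counts should add up to the
    # sample size, so a mismatch usually points to a typing error.
    if abs(sum(observed) - sum(expected)) > 1e-9:
        raise ValueError("expected counts do not add up to the sample size")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CRITICAL_VALUE = 9.488  # table value: right-tail area 0.05, 4 degrees of freedom

observed = [12, 16, 22, 3, 2]   # grades A, B, C, D, F
n = sum(observed)               # 55 students

# Claim 1: grades are evenly distributed, so each expected count is n / k.
uniform_expected = [n / len(observed)] * len(observed)
stat_uniform = chi_square_statistic(observed, uniform_expected)

# Claim 2: 20% A, 25% B, 40% C, 10% D, 5% F.
claimed_proportions = [0.20, 0.25, 0.40, 0.10, 0.05]
pct_expected = [p * n for p in claimed_proportions]
stat_pct = chi_square_statistic(observed, pct_expected)

for label, stat in [("evenly distributed", stat_uniform), ("percentages", stat_pct)]:
    decision = "reject the claim" if stat > CRITICAL_VALUE else "cannot reject the claim"
    print(f"{label}: chi-square = {stat:.3f} -> {decision}")
```

The two statistics come out to about 26.545 and 1.80, matching the calculator results in the video. Passing the mistyped list `[11, 13.75, 222, 5.5, 2.75]` as the expected counts makes this sketch raise a `ValueError` instead of returning the inflated statistic the calculator showed.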

Video Q&A

  • What is the chi-square goodness-of-fit test?

    The chi-square goodness-of-fit test is a statistical method for checking whether an observed distribution of data matches an expected distribution.

  • How is the sample size found for the chi-square test?

    The sample size is found by adding up the observed counts of all the categories.

  • What does a critical value mean in the context of the chi-square test?

    A critical value is the cutoff beyond which the test statistic is considered unusual, so that the claim can be rejected.

  • When can the claim be rejected in a chi-square goodness-of-fit test?

    The claim can be rejected if the test statistic is greater than the critical value for the chosen level of confidence (95% in the video, which leaves 0.05 in the right tail).

  • What are the degrees of freedom in the chi-square test?

    The degrees of freedom are the number of categories minus one (k - 1).

  • Why is it important to check the entries when doing chi-square calculations?

    Checking the entries makes sure the calculation is based on the right numbers; a single mistyped value can distort the test statistic and the conclusion.

  • What does the chi-square distribution look like?

    The chi-square distribution is skewed: it starts at zero and has a tail that stretches out to the right.

  • What is the difference between an evenly distributed claim and a percentage claim in chi-square tests?

    An evenly distributed claim assumes every category has the same expected count, while a percentage claim gives a specific proportion for each category.
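
As a cross-check on the table lookup, the right-tail area for 4 degrees of freedom has a simple closed form, exp(-x/2) · (1 + x/2). This is a property of the chi-square distribution with 4 degrees of freedom only, and it does not appear in the video. The sketch below uses it to confirm that the table value 9.488 leaves about 0.05 in the right tail; the function name is a hypothetical one chosen for illustration.

```python
import math


def right_tail_area_df4(x):
    """Area to the right of x under the chi-square curve with 4 degrees of freedom.

    Uses the closed form exp(-x/2) * (1 + x/2), which holds only for df = 4.
    """
    if x < 0:
        raise ValueError("chi-square values are never negative")
    return math.exp(-x / 2) * (1 + x / 2)


critical_value = 9.488  # table value used in the video
print(f"area beyond {critical_value}: {right_tail_area_df4(critical_value):.4f}")

# Test statistics from the two examples in the video.
for statistic in (26.545, 1.80):
    area = right_tail_area_df4(statistic)
    verdict = "reject the claim" if area < 0.05 else "cannot reject the claim"
    print(f"area beyond {statistic}: {area:.6f} -> {verdict}")
```

Comparing the tail area with 0.05 gives the same decisions as comparing the statistic with the critical value. For other degrees of freedom a table or a statistics library would be needed.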

Subtitles
  • 00:00:00
    hello and welcome to lesson
  • 00:00:02
    23 the goodness of fit test the idea of
  • 00:00:06
    the goodness of fit test is somebody
  • 00:00:08
    makes a claim like for example I claim
  • 00:00:10
    that the grades in the class are evenly
  • 00:00:13
    distributed so basically that's saying
  • 00:00:15
    you'll have the same number of A's the
  • 00:00:17
    same number of B's the same number of
  • 00:00:19
    C's as you can see my claim is not even
  • 00:00:24
    close they're not evenly
  • 00:00:26
    distributed but I'm still going to make
  • 00:00:28
    the claim that they're evenly
  • 00:00:31
    distributed so one thing we need to do
  • 00:00:33
    is count the number of categories so
  • 00:00:35
    there's a b c d f so that's 1 2 3 four
  • 00:00:39
    five and for some reason they use the
  • 00:00:43
    letter k for categories so categories
  • 00:00:46
    there's five of
  • 00:00:47
    them the next thing I need to do is find
  • 00:00:50
    out what the sample size is so add up
  • 00:00:52
    all of these students and see how many
  • 00:00:54
    that
  • 00:00:57
    is calculator is looking a little bit
  • 00:00:59
    bright right let me turn that
  • 00:01:02
    down okay that's better so a 12 + a 16
  • 00:01:09
    and a 22 and a three and a two means
  • 00:01:13
    there's 55
  • 00:01:16
    students so
  • 00:01:19
    n equals
  • 00:01:22
    55 and when the claim is that they're
  • 00:01:25
    evenly
  • 00:01:26
    distributed basically you just take the
  • 00:01:28
    n and divide by the K so basically 55
  • 00:01:32
    students total divide them evenly into
  • 00:01:35
    five categories and
  • 00:01:39
    55 / 5 that should be an 11, 55 divided by 5
  • 00:01:47
    that's an 11 and if this turned out to
  • 00:01:49
    be a decimal just go ahead and use the
  • 00:01:52
    decimal so this turned out to be a nice
  • 00:01:55
    even 11 so that's what's expected so if
  • 00:01:59
    the claim is true and they are evenly
  • 00:02:02
    distributed then there should have been
  • 00:02:04
    11 in each
  • 00:02:06
    category so this is 11 11 and in theory
  • 00:02:11
    that's what it would be if they're
  • 00:02:13
    evenly
  • 00:02:15
    distributed you can also say that the
  • 00:02:18
    claim is that the proportion of people
  • 00:02:21
    that get A's equals the proportion of
  • 00:02:24
    people to get B's etc for
  • 00:02:27
    C's and D's and
  • 00:02:31
    Fs so there's the claim and with this
  • 00:02:34
    there is no H0 and H1 we just go
  • 00:02:37
    with the claim just by
  • 00:02:41
    itself okay now this is going to be a
  • 00:02:44
    chi-square test and the chi-square
  • 00:02:46
    distribution looks like
  • 00:02:49
    this so it starts at zero and then it's
  • 00:02:54
    skewed looking like
  • 00:02:57
    this and in order to find the critical
  • 00:03:00
    value well for one thing this is a right
  • 00:03:02
    tail test only so there's only a right
  • 00:03:11
    tail that makes it nice because then you
  • 00:03:13
    don't have to decide left tail right
  • 00:03:16
    tail this is always right tail and then
  • 00:03:20
    if we use 95% level of confidence then
  • 00:03:23
    this little tail over here would be 5%
  • 00:03:27
    or 1 minus .95 is the
  • 00:03:30
    .05 and then we need to look up the
  • 00:03:32
    critical
  • 00:03:34
    value so for many of these videos I've
  • 00:03:36
    been using the calculator to find the
  • 00:03:38
    critical values using the distribution
  • 00:03:41
    right here but they don't happen to have
  • 00:03:43
    this one in the calculator so we
  • 00:03:45
    actually have to rely on a
  • 00:03:48
    table
  • 00:03:50
    and wherever you clicked on the
  • 00:03:52
    video I did put a link to this table
  • 00:03:55
    right
  • 00:03:56
    here so this is the chi-square
  • 00:04:00
    distribution and the ones over here are
  • 00:04:02
    if you have a left tail and the ones
  • 00:04:04
    over here are for a right tail so we're
  • 00:04:06
    actually only going to be using the
  • 00:04:09
    right
  • 00:04:12
    side and what I'm looking for is on the
  • 00:04:15
    right side there's
  • 00:04:16
    .05 so I use
  • 00:04:20
    .05 and also we need degrees of freedom
  • 00:04:22
    so I know how far down to go in this
  • 00:04:25
    list well the degrees of freedom just
  • 00:04:28
    comes from the k
  • 00:04:31
    - 1 for the t-test it's n minus 1 but for
  • 00:04:34
    this one it's k
  • 00:04:36
    minus 1 so the degrees of
  • 00:04:39
    freedom so in general degrees of freedom
  • 00:04:43
    is K minus one so in this case it's
  • 00:04:47
    going to be 5 - 1 it's
  • 00:04:50
    four okay now back to my piece of
  • 00:04:55
    paper so over here they have the degrees
  • 00:04:58
    of freedom and degrees of Freedom Four
  • 00:05:00
    is right
  • 00:05:02
    here and then I like to make a line
  • 00:05:06
    right
  • 00:05:08
    there so this is degrees of freedom of
  • 00:05:10
    four this is the column that said
  • 00:05:13
    .05 so then I go right here and that says
  • 00:05:17
    a
  • 00:05:18
    9.48 so that's the critical value
  • 00:05:25
    9.48 so the critical value is
  • 00:05:28
    9.488
  • 00:05:30
    all right now we're almost done now
  • 00:05:34
    in order to get the test statistic we're
  • 00:05:36
    going to use the calculator and I'm
  • 00:05:38
    going to put these observed values in
  • 00:05:40
    list one and these expected values in
  • 00:05:43
    list
  • 00:05:47
    two so go to stat and then edit the list
  • 00:05:53
    and I happen to have some stuff left
  • 00:05:54
    over so I need to go to the top of the
  • 00:05:57
    list and then use clear to clear out the
  • 00:06:01
    list and L2 also clear it
  • 00:06:05
    out and now L1 is going to be the
  • 00:06:08
    observed values so that's a
  • 00:06:11
    12 16
  • 00:06:14
    22 a three and a
  • 00:06:18
    two and then the expected values all of
  • 00:06:22
    those 11s go in list two so 11 just keep
  • 00:06:25
    typing 11 five
  • 00:06:28
    times there we
  • 00:06:31
    go and then for the last part you go
  • 00:06:34
    to you go to
  • 00:06:37
    stat move over to
  • 00:06:39
    test and this is called the goodness of
  • 00:06:42
    fit test so you go down to where it says
  • 00:06:46
    GOF for goodness of fit so that's letter
  • 00:06:53
    D so D the chi-squared goodness
  • 00:07:00
    of fit
  • 00:07:02
    test is what I'm using on the
  • 00:07:10
    calculator and when you hit enter it's
  • 00:07:13
    then going to ask you where are the
  • 00:07:15
    lists so the observed should be in list
  • 00:07:18
    one so if it doesn't say it put second
  • 00:07:21
    one the expected values should be in
  • 00:07:23
    list two and put second two for
  • 00:07:27
    L2 and then there they want to know
  • 00:07:30
    degrees of freedom which is
  • 00:07:33
    four and then just go down to
  • 00:07:37
    calculate and from this all we need is
  • 00:07:39
    the chi-squared which is the
  • 00:07:46
    26.545 so the test
  • 00:07:52
    statistic chi-squared equals a
  • 00:07:58
    26.545
  • 00:08:02
    so this is the cutoff for things
  • 00:08:04
    being unusual anything that goes past
  • 00:08:06
    9.4 is unusual this goes way past
  • 00:08:10
    that so that means that we can reject
  • 00:08:13
    the
  • 00:08:15
    claim so the
  • 00:08:17
    claim is
  • 00:08:20
    false in other words grades are not
  • 00:08:23
    evenly distributed if they were they
  • 00:08:25
    would have all been 11s They are not
  • 00:08:28
    close to 11
  • 00:08:30
    so my claim was
  • 00:08:33
    false okay then we need to do one more
  • 00:08:36
    example so basically with the goodness
  • 00:08:38
    of fit test there's two types one is it
  • 00:08:40
    says evenly distributed so you just take
  • 00:08:44
    the total number of people divide by the
  • 00:08:46
    categories and then that is what you use
  • 00:08:48
    for all of the expected
  • 00:08:50
    values then with the second type of
  • 00:08:54
    example it could give you
  • 00:08:57
    specific percentages to use for each
  • 00:09:01
    category so we need to start off the
  • 00:09:03
    same the number of categories count how
  • 00:09:05
    many that is and that's still five
  • 00:09:09
    categories and then add up the number of
  • 00:09:12
    people and these are the same
  • 00:09:14
    numbers as the last example all I'm
  • 00:09:17
    doing is changing the claim to show you
  • 00:09:19
    the idea of the two possibilities for
  • 00:09:22
    the claim so when you add these up that
  • 00:09:24
    still equals n equals 55
  • 00:09:28
    people
  • 00:09:30
    and then this one is saying that for the
  • 00:09:32
    A's there's going to be
  • 00:09:35
    20% so change the 20% to a decimal which
  • 00:09:40
    is .20 and then multiply with the n
  • 00:09:43
    multiply with
  • 00:09:47
    55 and then go on to the B's and I said
  • 00:09:50
    for the B's it would be 25% so as a
  • 00:09:54
    decimal this is going to be 0.25 and
  • 00:09:57
    then it's always going to be the total
  • 00:09:59
    times the total number of people so next
  • 00:10:01
    is a .25 * 55
  • 00:10:06
    people and for the C's I said it was
  • 00:10:08
    going to be 40% that's going to
  • 00:10:12
    be .40 *
  • 00:10:16
    55 for the D's I said it was going to be
  • 00:10:23
    10% and then finally for the fs I said
  • 00:10:26
    it would be 5% and 5% is
  • 00:10:34
    .05 it looks like I should have made the
  • 00:10:36
    Box a little bit bigger because now I
  • 00:10:37
    need to see what these numbers
  • 00:10:39
    equal I'll just write it right below the
  • 00:10:42
    box that's
  • 00:10:44
    okay so let me clear out that old
  • 00:10:46
    problem and then we've got .2 *
  • 00:10:50
    55 so that's an
  • 00:10:54
    11 and then next is .25 * 55
  • 00:11:00
    that's a
  • 00:11:06
    13.75 and then next is going to
  • 00:11:10
    be .40 *
  • 00:11:13
    55 so that's
  • 00:11:16
    22 and right here you might say it's not
  • 00:11:19
    possible to have .75 of a person that's
  • 00:11:22
    true but right now we're not talking
  • 00:11:24
    about actual people we're just saying in
  • 00:11:27
    general in theory how many people would
  • 00:11:29
    that be so you go ahead and use the
  • 00:11:32
    13.75 and then onto the next one 10% of
  • 00:11:38
    55 that is
  • 00:11:42
    5.5 and then
  • 00:11:46
    last .05 * 55 people and that's
  • 00:11:53
    2.75 so this last one is
  • 00:11:58
    2.75 okay now for the
  • 00:12:01
    claim so basically I said that 20% of
  • 00:12:04
    people will get an A so that's saying
  • 00:12:06
    the proportion of A's will be in decimal
  • 00:12:10
    form that's
  • 00:12:12
    .20 the proportion for
  • 00:12:15
    B's is 25% so that's in decimal form
  • 00:12:21
    .25 the proportion of people that will
  • 00:12:23
    get C's is 40% so that's .40
  • 00:12:30
    and then for the D's I said that it was
  • 00:12:32
    going to be 10% so that's .10 and then
  • 00:12:35
    last for the fs I said it would be 5% so
  • 00:12:38
    that's
  • 00:12:41
    .05 all right now we just need the
  • 00:12:45
    picture so the distribution looks like
  • 00:12:52
    this so it starts at zero and then it's
  • 00:12:56
    it's
  • 00:12:57
    skewed and degrees of freedom is going
  • 00:12:59
    to be the same as last time so that's
  • 00:13:01
    going to be K -1 is
  • 00:13:06
    4 and we're going to use 95% level of
  • 00:13:10
    confidence which means that this right
  • 00:13:12
    tail because remember we only use the
  • 00:13:14
    right tail for this test is
  • 00:13:18
    .05 and that's going to be the same
  • 00:13:21
    critical value as last
  • 00:13:23
    time so it's still
  • 00:13:25
    .05 and degrees of freedom is four so if
  • 00:13:29
    you go down that is the
  • 00:13:33
    9.48 so the critical value equals
  • 00:13:40
    9.48 and then let's see if we can reject
  • 00:13:43
    my claim so these are going to go in
  • 00:13:46
    list one and these are going these are
  • 00:13:48
    going to go in list
  • 00:13:54
    two so just go to stat and edit and
  • 00:13:59
    these numbers are actually the same so I
  • 00:14:02
    can just leave those in there and then
  • 00:14:04
    these are supposed to be an 11 and then
  • 00:14:06
    a
  • 00:14:09
    13.75 and then a
  • 00:14:11
    22 a
  • 00:14:14
    5.5 and a
  • 00:14:17
    2.75 and just go to stat tests and then
  • 00:14:22
    scroll down to the goodness of fit test
  • 00:14:26
    which is letter
  • 00:14:28
    d
  • 00:14:31
    and because I'm still using list one and
  • 00:14:33
    list two I can just leave that there
  • 00:14:36
    degrees of freedom is actually the same
  • 00:14:38
    so I can just leave that
  • 00:14:40
    there color doesn't matter because I'm
  • 00:14:43
    not graphing
  • 00:14:45
    it and then it says that chi-squared equals
  • 00:14:51
    181 that seems very very large did I
  • 00:14:55
    mistype
  • 00:14:57
    something oh look it right
  • 00:15:01
    there when I went to type the 22 I
  • 00:15:04
    accidentally typed
  • 00:15:06
    222 so the reason I thought it was
  • 00:15:09
    strange is because on the first
  • 00:15:12
    example these numbers okay that's close
  • 00:15:15
    but these aren't close these aren't
  • 00:15:16
    close that's not close that's not close
  • 00:15:19
    and so you should get a big test
  • 00:15:21
    statistic in order to say that it's
  • 00:15:23
    false well when I just did it right now
  • 00:15:26
    I got like 181 which I thought was weird
  • 00:15:29
    and that's because I mistyped this one
  • 00:15:31
    which of course I did that on
  • 00:15:33
    purpose just to show you that nobody's
  • 00:15:36
    perfect sometimes you have to go back
  • 00:15:38
    and check your work so let me double
  • 00:15:41
    check 11
  • 00:15:44
    13.75 a regular 22 a 5.5 and a
  • 00:15:48
    2.75 okay good now go back to stat tests
  • 00:15:54
    and the goodness of fit
  • 00:15:57
    test
  • 00:15:59
    and this is all the same so I can just
  • 00:16:01
    skip down to
  • 00:16:02
    calculate that's more reasonable chi-squared
  • 00:16:05
    equals a
  • 00:16:08
    1.8 so for the last part the
  • 00:16:11
    test
  • 00:16:14
    statistic chi-squared equals a
  • 00:16:19
    1.8 and that was actually a 1.80 so you
  • 00:16:23
    could leave it at 1.8 or to emphasize
  • 00:16:25
    you could say
  • 00:16:26
    1.80 so if it were to go past 9.4
  • 00:16:30
    that would be considered unusual this is
  • 00:16:33
    not even close it's actually Landing
  • 00:16:36
    more like right here so that means we do
  • 00:16:39
    not have enough evidence to reject the
  • 00:16:41
    claim cuz if you look at these numbers a
  • 00:16:44
    2.7 compared to two that's not off by
  • 00:16:47
    very much this one is perfect this one's
  • 00:16:50
    only off by one this one's off by two
  • 00:16:53
    and a fourth this one's off by one and a half
  • 00:16:56
    so they're not off by that much so we
  • 00:17:02
    cannot reject the
  • 00:17:10
    claim
Tags
  • goodness-of-fit test
  • chi-square
  • distribution
  • statistics
  • claim
  • critical value
  • degrees of freedom
  • hypothesis
  • specific percentages
  • calculation