Math 13X Lesson 23 Goodness of Fit Test

00:17:11

https://www.youtube.com/watch?v=0TDttbhbc30

Summary

TLDR: The video presents the chi-square goodness-of-fit test, a statistical method for checking whether an observed distribution matches a claimed (expected) distribution. In the first example, grades in five categories (A, B, C, D, F) are claimed to be evenly distributed, meaning the same number of students in each grade. The total number of students (55) is divided by the number of categories (5) to get the expected count for each category (11). A table gives the critical value (9.488) for the degrees of freedom, here 5 - 1 = 4. If the observed test statistic exceeds this value, the claim is rejected. A second example uses a specific percentage for each category. Finally, the video stresses the need for accuracy in data entry and calculation.
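The even-distribution example can be sketched in a few lines of Python. This is a minimal illustration, not the calculator procedure used in the video: the observed counts (12, 16, 22, 3, 2) are the values entered in the transcript, the critical value 9.488 is the table entry for 4 degrees of freedom with a 0.05 right tail, and `chi_square_statistic` is a hypothetical helper name.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [12, 16, 22, 3, 2]   # A, B, C, D, F counts entered in the video
n = sum(observed)               # sample size
k = len(observed)               # number of categories
expected = [n / k] * k          # even distribution: n / k in every category

statistic = chi_square_statistic(observed, expected)
critical_value = 9.488          # table value for df = k - 1 = 4, right tail 0.05

print(f"n = {n}, expected per category = {n / k}")
print(f"chi-square = {statistic:.3f}")
print("reject the claim" if statistic > critical_value else "cannot reject the claim")
```

The statistic works out to 292 / 11, about 26.545, which matches the value reported in the video and lies well beyond the critical value.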

Key takeaways

🔍 The chi-square goodness-of-fit test evaluates how well an observed distribution matches an expected distribution.
📊 For an even distribution, every category has the same expected count.
📈 The critical value determines whether a statistical claim can be rejected.
🔢 The degrees of freedom (k - 1) determine which critical value applies.
🧮 A table or a calculator is used to find the critical value for this test.
⚖️ A test statistic larger than the critical value indicates that the data do not fit the claimed distribution.
🔄 The test can also check claims that assign a specific proportion to each category.
🔗 The test compares observed counts with expected counts to decide whether the claim can be rejected.
🚫 An error in the inputs can lead to incorrect conclusions.
✔️ The video stresses accuracy in data entry and calculation.
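For reference, the quantities behind these takeaways can be written out as follows. The symbols $O_i$ (observed count), $E_i$ (expected count), $p_i$ (claimed proportion) and $\alpha$ (right-tail area) are notation assumed here; the video itself only names $n$ and $k$.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad
E_i = n\,p_i \quad \left(p_i = \tfrac{1}{k} \text{ for an even distribution}\right),
\qquad
\mathrm{df} = k - 1 .
% Reject the claim when \chi^2 exceeds the critical value \chi^2_{\alpha,\,\mathrm{df}}.
```

In the video's first example, $n = 55$ and $k = 5$, so every $E_i = 11$ and $\mathrm{df} = 4$.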

Timeline

00:00:00 - 00:05:00
The chi-square goodness-of-fit test is a statistical method for checking whether categorical data follow a claimed distribution. The lesson's example is class grades, with the claim that the grades are evenly distributed. The process consists of computing the total sample size, dividing it evenly among the categories to get the expected counts, and using a chi-square distribution table to find the critical value. Comparing the observed counts with the expected counts then determines whether the claim of an even distribution can be rejected.
00:05:00 - 00:10:00
With the observed and expected values entered in two lists, a chi-square goodness-of-fit test is run on a calculator. In this example the test statistic is 26.545. Because this exceeds the critical value of 9.488, the claim of evenly distributed grades is rejected: the grades are not evenly distributed.
00:10:00 - 00:17:11
To illustrate the other type of goodness-of-fit test, a scenario with a specific proportion for each category is examined. The percentages are converted into expected counts (proportion times sample size) and compared with the observed counts. In this example the test statistic is 1.80, below the critical value of 9.488, so there is not enough evidence to reject the claim that the proportions are as stated. A data-entry slip made along the way (222 typed instead of 22) shows why the inputs should be double-checked.
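The percentage-based example from the last segment can be reproduced in the same way. This is a sketch under the assumption that the observed counts are the same five values as in the first example, as the transcript states; `chi_square_statistic` is a hypothetical helper name.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [12, 16, 22, 3, 2]                   # A, B, C, D, F
proportions = [0.20, 0.25, 0.40, 0.10, 0.05]    # claimed share of each grade
n = sum(observed)

# Expected count = claimed proportion times sample size (decimals are kept).
expected = [p * n for p in proportions]

statistic = chi_square_statistic(observed, expected)
critical_value = 9.488                          # df = 4, right tail 0.05 (table value)

print([round(e, 2) for e in expected])
print(f"chi-square = {statistic:.2f}")
print("reject the claim" if statistic > critical_value else "cannot reject the claim")
```

The expected counts come out as 11, 13.75, 22, 5.5 and 2.75, and the statistic of about 1.80 stays far below the critical value, so the claim cannot be rejected.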

Video Q&A

What is the chi-square goodness-of-fit test?
The chi-square goodness-of-fit test is a statistical method for checking whether an observed distribution of categorical data matches an expected distribution.
How is the sample size calculated for the chi-square test?
The sample size n is the sum of the observed counts across all categories.
What does a critical value mean in the context of the chi-square test?
The critical value is the threshold beyond which the observed test statistic is considered unusual enough to reject the claim (the null hypothesis).
When can the claim be rejected in a chi-square goodness-of-fit test?
The claim can be rejected if the test statistic is greater than the critical value for the chosen significance level (0.05 in the video, which describes it as a 95% level of confidence).
What are the degrees of freedom in the chi-square test?
The degrees of freedom are the number of categories minus one (k - 1).
Why is it important to check the inputs in chi-square calculations?
Checking the inputs ensures the calculation is correct; a single mistyped value can distort the test statistic and the conclusion.
What does the chi-square distribution look like?
The chi-square distribution is skewed: it starts at zero and has a long tail to the right.
What is the difference between an even distribution and a percentage-based distribution in chi-square tests?
An even distribution assumes every category has the same expected count, while a percentage-based distribution assigns a specific proportion to each category.
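The critical value read from the table can also be approximated numerically. The sketch below is an illustration rather than the video's method: it uses the closed-form right-tail area of the chi-square distribution, which holds only when the degrees of freedom are even, together with a simple bisection search. The function names `right_tail_area` and `critical_value` are hypothetical.

```python
import math

def right_tail_area(x, df):
    """P(X > x) for a chi-square variable; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2
    # exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

def critical_value(alpha, df, tol=1e-9):
    """Find x whose right-tail area is alpha by bisection (the area decreases in x)."""
    lo, hi = 0.0, 1.0
    while right_tail_area(hi, df) > alpha:   # widen the bracket until it contains x
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if right_tail_area(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

k = 5                                        # categories A, B, C, D, F
cv = critical_value(alpha=0.05, df=k - 1)
print(f"critical value for df = {k - 1}: {cv:.3f}")
```

For 4 degrees of freedom and a 0.05 right tail, the result should land on the 9.488 table entry used in the video. For odd degrees of freedom a table or a statistics library would be needed instead.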

Transcript

00:00:00
hello and welcome to lesson
00:00:02
23 the goodness of fit test the idea of
00:00:06
the goodness of fit test is somebody
00:00:08
makes a claim like for example I claim
00:00:10
that the grades in the class are evenly
00:00:13
distributed so basically that's saying
00:00:15
you'll have the same number of A's the
00:00:17
same number of B's the same number of
00:00:19
C's as you can see my claim is not even
00:00:24
close they're not evenly
00:00:26
distributed but I'm still going to make
00:00:28
the claim that they're evenly
00:00:31
distributed so one thing we need to do
00:00:33
is count the number of categories so
00:00:35
there's a b c d f so that's 1 2 3 four
00:00:39
five and for some reason they use the
00:00:43
letter k for categories so categories
00:00:46
there's five of
00:00:47
them the next thing I need to do is find
00:00:50
out what the sample size is so add up
00:00:52
all of these students and see how many
00:00:54
that
00:00:57
is calculator is looking a little bit
00:00:59
bright right let me turn that
00:01:02
down okay that's better so a 12 + a 16
00:01:09
and a 22 and a three and a two means
00:01:13
there's 55
00:01:16
students so
00:01:19
n equals
00:01:22
55 and when the claim is that they're
00:01:25
evenly
00:01:26
distributed basically you just take the
00:01:28
n and divide by the K so basically 55
00:01:32
students total divide them evenly into
00:01:35
five categories and
00:01:39
55 / 5 that should be an 11 55 divided by 5
00:01:47
that's an 11 and if this turned out to
00:01:49
be a decimal just go ahead and use the
00:01:52
decimal so this turned out to be a nice
00:01:55
even 11 so that's what's expected so if
00:01:59
the claim is true and they are evenly
00:02:02
distributed then there should have been
00:02:04
11 in each
00:02:06
category so this is 11 11 and in theory
00:02:11
that's what it would be if they're
00:02:13
evenly
00:02:15
distributed you can also say that the
00:02:18
claim is that the proportion of people
00:02:21
that get A's equals the proportion of
00:02:24
people that get B's etc for
00:02:27
C's and D's and
00:02:31
Fs so there's the claim and with this
00:02:34
there is no H sub 0 and H1 we just go
00:02:37
with the claim just by
00:02:41
itself okay now this is going to be a
00:02:44
chi-square test and the chi-square
00:02:46
distribution looks like
00:02:49
this so it starts at zero and then it's
00:02:54
skewed looking like
00:02:57
this and in order to find the critical
00:03:00
value well for one thing this is a right
00:03:02
tail test only so there's only a right
00:03:11
tail that makes it nice because then you
00:03:13
don't have to decide left tail right
00:03:16
tail this is always right tail and then
00:03:20
if we use 95% level of confidence then
00:03:23
this little tail over here would be 5%
00:03:27
or 1 minus .95 is the
00:03:30
.05 and then we need to look up the
00:03:32
critical
00:03:34
value so for many of these videos I've
00:03:36
been using the calculator to find the
00:03:38
critical values using the distribution
00:03:41
right here but they don't happen to have
00:03:43
this one in the calculator so we
00:03:45
actually have to rely on a
00:03:48
table
00:03:50
and wherever you clicked on the the
00:03:52
video I did put a link to this table
00:03:55
right
00:03:56
here so this is the chi-square
00:04:00
distribution and the ones over here are
00:04:02
if you have a left tail and the ones
00:04:04
over here are for a right tail so we're
00:04:06
actually only going to be using the
00:04:09
right
00:04:12
side and what I'm looking for is on the
00:04:15
right side there's
00:04:16
.05 so I use
00:04:20
.05 and also we need degrees of freedom
00:04:22
so I know how far down to go in this
00:04:25
list well the degrees of freedom just
00:04:28
comes from the k
00:04:31
minus 1 for the t-test it's n minus 1 but for
00:04:34
this one it's k
00:04:36
minus 1 so the degrees of
00:04:39
freedom so in general degrees of freedom
00:04:43
is K minus one so in this case it's
00:04:47
going to be 5 - 1 it's
00:04:50
four okay now back to my piece of
00:04:55
paper so over here they have the degrees
00:04:58
of freedom and degrees of Freedom Four
00:05:00
is right
00:05:02
here and then I like to make a line
00:05:06
right
00:05:08
there so this is degrees of freedom of
00:05:10
four this is the column that said
00:05:13
.05 so then I go right here and that says
00:05:17
a
00:05:18
9.48 so that's the critical value
00:05:25
9.48 so the critical value is
00:05:28
9.488
00:05:30
all right now we're almost done now
00:05:34
in order to get the test statistic we're
00:05:36
going to use the calculator and I'm
00:05:38
going to put these observed values in
00:05:40
list one and these expected values in
00:05:43
list
00:05:47
two so go to stat and then edit the list
00:05:53
and I happen to have some stuff left
00:05:54
over so I need to go to the top of the
00:05:57
list and then use clear to clear out the
00:06:01
list and L2 also clear it
00:06:05
out and now L1 is going to be the
00:06:08
observed values so that's a
00:06:11
12 16
00:06:14
22 a three and a
00:06:18
two and then the expected values all of
00:06:22
those 11s go in list two so 11 just keep
00:06:25
typing 11 five
00:06:28
times there we
00:06:31
go and then for the last part you go
00:06:34
to you go to
00:06:37
stat move over to
00:06:39
test and this is called the goodness of
00:06:42
fit test so you go down to where it says
00:06:46
gof for goodness of fit so that's letter
00:06:53
D so D the chi-squared goodness
00:07:00
of fit
00:07:02
test is what I'm using on the
00:07:10
calculator and when you hit enter it's
00:07:13
then going to ask you where are the
00:07:15
lists so the observed should be in list
00:07:18
one so if it doesn't say it put second
00:07:21
one the expected values should be in
00:07:23
list two and put second two for
00:07:27
L2 and then there they want to know
00:07:30
degrees of freedom which is
00:07:33
four and then just go down to
00:07:37
calculate and from this all we need is
00:07:39
the chi-squared which is the
00:07:46
26.545 so the test
00:07:52
statistic chi-squared equals a
00:07:58
26.545
00:08:02
so this is the cutoff for things
00:08:04
being unusual anything that goes past
00:08:06
9.4 is unusual this goes way past
00:08:10
that so that means that we can reject
00:08:13
the
00:08:15
claim so the
00:08:17
claim is
00:08:20
false in other words grades are not
00:08:23
evenly distributed if they were they
00:08:25
would have all been 11s They are not
00:08:28
close to 11
00:08:30
so my claim was
00:08:33
false okay then we need to do one more
00:08:36
example so basically with the goodness
00:08:38
of fit test there's two types one is it
00:08:40
says evenly distributed so you just take
00:08:44
the total number of people divide by the
00:08:46
categories and then that is what you use
00:08:48
for all of the expected
00:08:50
values then with the second type of
00:08:54
example it could give you
00:08:57
specific percentages to to use for each
00:09:01
category so we need to start off the
00:09:03
same the number of categories count how
00:09:05
many that is and that's still five
00:09:09
categories and then add up the number of
00:09:12
people and the these are the same
00:09:14
numbers as the last example all I'm
00:09:17
doing is changing the claim to show you
00:09:19
the idea of the two possibilities for
00:09:22
the claim so when you add these up that
00:09:24
still equals n equals 55
00:09:28
people
00:09:30
and then this one is saying that for the
00:09:32
A's there's going to be
00:09:35
20% so change the 20% to a decimal which
00:09:40
is .20 and then multiply with the n
00:09:43
multiply with
00:09:47
55 and then go on to the B's and I said
00:09:50
for the B's it would be 25% so as a
00:09:54
decimal this is going to be 0.25 and
00:09:57
then it's always going to be the total
00:09:59
times the total number of people so next
00:10:01
is a .25 * 55
00:10:06
people and for the C's I said it was
00:10:08
going to be 40% that's going to
00:10:12
be .40 *
00:10:16
55 for the D's I said it was going to be
00:10:23
10% and then finally for the fs I said
00:10:26
it would be 5% and 5% is
00:10:34
.05 it looks like I should have made the
00:10:36
Box a little bit bigger because now I
00:10:37
need to see what these numbers
00:10:39
equal I'll just write it right below the
00:10:42
box that's
00:10:44
okay so let me clear out that old
00:10:46
problem and then we've got .2 *
00:10:50
55 so that's an
00:10:54
11 and then next is .25 * 55
00:11:00
that's a
00:11:06
13.75 and then next is going to
00:11:10
be .40 *
00:11:13
55 so that's
00:11:16
22 and right here you might say it's not
00:11:19
possible to have .75 of a person that's
00:11:22
true but right now we're not talking
00:11:24
about actual people we're just saying in
00:11:27
general in theory how many people would
00:11:29
that be so you go ahead and use the
00:11:32
13.75 and then onto the next one 10% of
00:11:38
55 that is
00:11:42
5.5 and then
00:11:46
last .05 * 55 people and that's
00:11:53
2.75 so this last one is
00:11:58
2.75 okay now for the
00:12:01
claim so basically I said that 20% of
00:12:04
people will get an A so that's saying
00:12:06
the proportion of A's will be in decimal
00:12:10
form that's
00:12:12
.20 the proportion for
00:12:15
B's is 25 so that's in decimal form
00:12:21
.25 the proportion of people that will
00:12:23
get C's is 40% so that's .40
00:12:30
and then for the D's I said that it was
00:12:32
going to be 10% so that's .10 and then
00:12:35
last for the F's I said it would be 5% so
00:12:38
that's
00:12:41
.05 all right now we just need the
00:12:45
picture so the distribution looks like
00:12:52
this so it starts at zero and then it's
00:12:56
it's
00:12:57
skewed and degrees of freedom is going
00:12:59
to be the same as last time so that's
00:13:01
going to be K -1 is
00:13:06
4 and we're going to use 95% level of
00:13:10
confidence which means that this right
00:13:12
tail because remember we only use the
00:13:14
right tail for this test is
00:13:18
.05 and that's going to be the same
00:13:21
critical value as last
00:13:23
time so it's still
00:13:25
.05 and degrees of freedom is four so if
00:13:29
you go down that is the
00:13:33
9.48 so the critical value equals
00:13:40
9.48 and then let's see if we can reject
00:13:43
my claim so these are going to go in
00:13:46
list one and these are going these are
00:13:48
going to go in list
00:13:54
two so just go to stat and edit and
00:13:59
these numbers are actually the same so I
00:14:02
can just leave those in there and then
00:14:04
these are supposed to be an 11 and then
00:14:06
a
00:14:09
13.75 and then a
00:14:11
22 a
00:14:14
5.5 and a
00:14:17
2.75 and just go to stat tests and then
00:14:22
scroll down to the goodness of fit test
00:14:26
which is letter
00:14:28
d
00:14:31
and because I'm still using list one and
00:14:33
list two I can just leave that there
00:14:36
degrees of freedom is actually the same
00:14:38
so I can just leave that
00:14:40
there color doesn't matter because I'm
00:14:43
not graphing
00:14:45
it and then it says that chi-squared equals
00:14:51
181 that seems very very large did I
00:14:55
mistype
00:14:57
something oh look it right
00:15:01
there when I went to type the 22 I
00:15:04
accidentally typed
00:15:06
222 so the reason I thought it was
00:15:09
strange is because on the first
00:15:12
example these numbers okay that's close
00:15:15
but these aren't close these aren't
00:15:16
close that's not close that's not close
00:15:19
and so you should get a big test
00:15:21
statistic in order to say that it's
00:15:23
false well when I just did it right now
00:15:26
I got like 181 which I thought was weird
00:15:29
and that's because I mistype this one
00:15:31
which of course I did that on a on
00:15:33
purpose just to show you that nobody's
00:15:36
perfect sometimes you have to go back
00:15:38
and check your work so let me double
00:15:41
check 11
00:15:44
13.75 a regular 22 a 5.5 and a
00:15:48
2.75 okay good now go back to stat tests
00:15:54
and the goodness of fit
00:15:57
test
00:15:59
and this is all the same so I can just
00:16:01
skip down to
00:16:02
calculate that's more reasonable chi-squared
00:16:05
equals a
00:16:08
1.8 so for the last part the
00:16:11
test
00:16:14
statistic chi-squared equals a
00:16:19
1.8 and that was actually a 1.80 so you
00:16:23
could leave it at 1.8 or to emphasize
00:16:25
you could say
00:16:26
1.80 so if it were to go past 9.4
00:16:30
that would be considered unusual this is
00:16:33
not even close it's actually Landing
00:16:36
more like right here so that means we do
00:16:39
not have enough evidence to reject the
00:16:41
claim cuz if you look at these numbers a
00:16:44
2.7 compared to two that's not off by
00:16:47
very much this one is perfect this one's
00:16:50
only off by one this one's off by two
00:16:53
and a fourth this one's off by one and a half
00:16:56
so they're not off by that much so we
00:17:02
cannot reject the
00:17:10
claim
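The data-entry slip near the end of the transcript (222 typed in place of 22) is the kind of error a quick consistency check can catch before the test is run: the expected counts should add up to the same total as the observed counts. The following is a minimal sketch; `check_totals` and `chi_square_statistic` are hypothetical helpers, and placing the typo in the expected list is an assumption based on the transcript, where the expected values were being entered in list two.

```python
import math

def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def check_totals(observed, expected):
    """Return True when both lists have the same length and the same total."""
    return len(observed) == len(expected) and math.isclose(sum(observed), sum(expected))

observed = [12, 16, 22, 3, 2]
mistyped = [11, 13.75, 222, 5.5, 2.75]   # 222 entered instead of 22
corrected = [11, 13.75, 22, 5.5, 2.75]

for label, expected in (("mistyped", mistyped), ("corrected", corrected)):
    ok = check_totals(observed, expected)
    stat = chi_square_statistic(observed, expected)
    print(f"{label}: totals match = {ok}, chi-square = {stat:.2f}")
```

With the mistyped list the totals disagree (255 against 55) and the statistic comes out at about 181.98, in line with the implausibly large value noticed in the video; after the correction the totals match and the statistic drops to 1.80.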

Tags

goodness-of-fit test
chi-square
distribution
statistics
claim
critical value
degrees of freedom
hypothesis
specific percentages
calculation