Example Of Chi Square Test For Goodness Of Fit

The Chi-Square Goodness-of-Fit test is a powerful statistical tool used to determine whether observed sample data fits a hypothesized distribution. It's particularly useful when dealing with categorical data, allowing us to assess if the proportions of different categories in a sample match a pre-defined expectation. This test relies on comparing the observed frequencies of each category with the frequencies we would expect if the null hypothesis (the hypothesized distribution) were true. A significant difference between these observed and expected frequencies suggests that the sample data does not support the null hypothesis.

Understanding the Core Concepts

Before diving into examples, let's solidify our understanding of the key components of the Chi-Square Goodness-of-Fit test:

Null Hypothesis (H0): The population follows the hypothesized distribution, so any gap between the observed and expected frequencies is due to sampling variation alone.
Alternative Hypothesis (H1): The population does not follow the hypothesized distribution; at least one category's true proportion differs from its hypothesized value.
Observed Frequencies (O): These are the actual counts of observations in each category from the sample data.
Expected Frequencies (E): These are the counts we expect to see in each category if the null hypothesis were true. They are calculated based on the hypothesized distribution and the total sample size.
Chi-Square Statistic (χ²): This statistic measures the discrepancy between the observed and expected frequencies. It's calculated using the formula:

χ² = Σ [(O - E)² / E]

where Σ represents the sum across all categories.
Degrees of Freedom (df): This value reflects the number of categories that are free to vary. It's calculated as:

df = (number of categories) - (number of estimated parameters) - 1

In the simplest cases where we aren't estimating parameters from the data, df = (number of categories) - 1.
Significance Level (α): This is the probability of rejecting the null hypothesis when it is actually true (Type I error). Common significance levels are 0.05 (5%) and 0.01 (1%).
P-value: This is the probability of obtaining a chi-square statistic as extreme as, or more extreme than, the one calculated from the sample data, assuming the null hypothesis is true. A small p-value (typically less than the significance level) provides evidence against the null hypothesis.
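
The components above can be tied together in a short sketch. This is an illustration under stated assumptions rather than a reference implementation: the function names chi2_sf and goodness_of_fit are made up for this example, the p-value uses the closed-form series that holds for whole-number degrees of freedom, and the sample counts are invented.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term = math.exp(-half)
        total = term
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return total
    # Odd df: erfc(sqrt(x/2)) plus a finite series of (df - 1) / 2 terms
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * k + 1)
    return total


def goodness_of_fit(observed, probabilities, estimated_params=0):
    """Return (chi-square statistic, degrees of freedom, p-value)."""
    n = sum(observed)
    expected = [n * p for p in probabilities]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - estimated_params - 1
    return statistic, df, chi2_sf(statistic, df)


# Invented data: 100 observations tested against a 1:2:1 split.
stat, df, p = goodness_of_fit([30, 50, 20], [0.25, 0.5, 0.25])
print(f"chi2 = {stat:.3f}, df = {df}, p = {p:.3f}")
```

In practice a statistics package would normally supply the p-value; the hand-written tail function is used here only to keep the sketch free of external dependencies.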

Example 1: Testing a Fair Die

Scenario: You suspect that a six-sided die is not fair. To test this, you roll the die 60 times and observe the following frequencies:

1: 8 times
2: 11 times
3: 9 times
4: 14 times
5: 9 times
6: 9 times

Hypotheses:

H0: The die is fair (each number has an equal probability of being rolled).
H1: The die is not fair (the probabilities of rolling each number are not equal).

Calculations:

Expected Frequencies: If the die is fair, we would expect each number to appear approximately 60/6 = 10 times. So, E = 10 for each category.
Chi-Square Statistic: We calculate the contribution to the chi-square statistic for each category:
- (8 - 10)² / 10 = 0.4
- (11 - 10)² / 10 = 0.1
- (9 - 10)² / 10 = 0.1
- (14 - 10)² / 10 = 1.6
- (9 - 10)² / 10 = 0.1
- (9 - 10)² / 10 = 0.1
Summing these values, we get χ² = 0.4 + 0.1 + 0.1 + 1.6 + 0.1 + 0.1 = 2.4
Degrees of Freedom: df = (number of categories) - 1 = 6 - 1 = 5
P-value: Using a chi-square distribution table or a statistical calculator with χ² = 2.4 and df = 5, we find a p-value of approximately 0.79.

Conclusion:

Since the p-value (0.79) is greater than a typical significance level of 0.05, we fail to reject the null hypothesis. There is not enough evidence to conclude that the die is unfair. The observed frequencies are reasonably consistent with what we would expect from a fair die.
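
The die calculation can also be checked with a critical value instead of a p-value. The sketch below assumes the standard tabulated 5% critical value for 5 degrees of freedom (11.070); the variable names are illustrative only.

```python
observed = [8, 11, 9, 14, 9, 9]           # counts for faces 1..6
expected = sum(observed) / len(observed)  # 60 / 6 = 10 under a fair die

contributions = [(o - expected) ** 2 / expected for o in observed]
statistic = sum(contributions)
df = len(observed) - 1

# Tabulated upper 5% point of the chi-square distribution with 5 df.
CRITICAL_5PCT_DF5 = 11.070

# Reject H0 only if the statistic exceeds the critical value.
reject = statistic > CRITICAL_5PCT_DF5
print(f"chi2 = {statistic:.1f}, df = {df}, reject H0 at 5%: {reject}")
```

Comparing against a critical value and comparing the p-value against α are equivalent decisions; the critical-value form is handy when only a printed table is available.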

Example 2: Mendel's Pea Experiment

Scenario: Gregor Mendel, the father of genetics, conducted experiments with pea plants. In one experiment, he crossed pea plants with round, yellow seeds with plants with wrinkled, green seeds. He predicted that the F2 generation would have the following phenotypic ratio:

Round, Yellow: 9/16
Round, Green: 3/16
Wrinkled, Yellow: 3/16
Wrinkled, Green: 1/16

Take the following as the observed counts in the F2 generation (treated here as example data):

Round, Yellow: 315
Round, Green: 108
Wrinkled, Yellow: 101
Wrinkled, Green: 32

Hypotheses:

H0: The observed phenotypic ratios match Mendel's predicted ratios.
H1: The observed phenotypic ratios do not match Mendel's predicted ratios.

Calculations:

Total Number of Peas: 315 + 108 + 101 + 32 = 556
Expected Frequencies: We calculate the expected frequencies for each phenotype based on Mendel's predicted ratios and the total number of peas:
- Round, Yellow: (9/16) * 556 = 312.75
- Round, Green: (3/16) * 556 = 104.25
- Wrinkled, Yellow: (3/16) * 556 = 104.25
- Wrinkled, Green: (1/16) * 556 = 34.75
Chi-Square Statistic:
- (315 - 312.75)² / 312.75 = 0.016
- (108 - 104.25)² / 104.25 = 0.135
- (101 - 104.25)² / 104.25 = 0.101
- (32 - 34.75)² / 34.75 = 0.218
Summing these values, we get χ² = 0.016 + 0.135 + 0.101 + 0.218 = 0.470
Degrees of Freedom: df = (number of categories) - 1 = 4 - 1 = 3
P-value: Using a chi-square distribution table or a statistical calculator with χ² = 0.470 and df = 3, we find a p-value of approximately 0.925.

Conclusion:

Since the p-value (0.925) is much greater than a typical significance level of 0.05, we fail to reject the null hypothesis. The observed counts are consistent with Mendel's predicted ratios. Note that failing to reject does not prove the ratios are correct; it only means this sample provides no evidence against them.
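
As a cross-check on the arithmetic, the per-phenotype terms can be recomputed directly from the four counts and the 9:3:3:1 ratio. This is a minimal sketch with illustrative variable names; the p-value line uses a closed form that is valid only for 3 degrees of freedom.

```python
import math

phenotypes = ["Round, Yellow", "Round, Green", "Wrinkled, Yellow", "Wrinkled, Green"]
observed = [315, 108, 101, 32]
ratio = [9, 3, 3, 1]

n = sum(observed)
expected = [n * r / sum(ratio) for r in ratio]
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
statistic = sum(contributions)

for name, o, e, c in zip(phenotypes, observed, expected, contributions):
    print(f"{name:<17} O = {o:>3}  E = {e:>6.2f}  (O-E)^2/E = {c:.3f}")

# Upper-tail probability for df = 3 only:
# erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
p_value = (math.erfc(math.sqrt(statistic / 2))
           + math.sqrt(2 * statistic / math.pi) * math.exp(-statistic / 2))
print(f"chi2 = {statistic:.3f}, p = {p_value:.3f}")
```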

Example 3: Testing for Uniform Distribution of Birthdays

Scenario: You want to investigate if birthdays are uniformly distributed throughout the year. You collect data on the number of births in each month for a particular year. Let's say you have the following (hypothetical) data:

January: 260
February: 230
March: 280
April: 250
May: 270
June: 240
July: 290
August: 300
September: 260
October: 280
November: 240
December: 280

Hypotheses:

H0: Birthdays are uniformly distributed throughout the year.
H1: Birthdays are not uniformly distributed throughout the year.

Calculations:

Total Number of Births: 260 + 230 + 280 + 250 + 270 + 240 + 290 + 300 + 260 + 280 + 240 + 280 = 3180
Expected Frequencies: If birthdays are uniformly distributed, we would expect approximately the same number of births in each month (treating the months as equal in length for simplicity). So, E = 3180 / 12 = 265 for each month.
Chi-Square Statistic:
- (260 - 265)² / 265 = 0.094
- (230 - 265)² / 265 = 4.623
- (280 - 265)² / 265 = 0.849
- (250 - 265)² / 265 = 0.849
- (270 - 265)² / 265 = 0.094
- (240 - 265)² / 265 = 2.358
- (290 - 265)² / 265 = 2.358
- (300 - 265)² / 265 = 4.623
- (260 - 265)² / 265 = 0.094
- (280 - 265)² / 265 = 0.849
- (240 - 265)² / 265 = 2.358
- (280 - 265)² / 265 = 0.849
Summing these values, we get χ² ≈ 20.00 (the squared deviations total 5300, and 5300 / 265 = 20 exactly before rounding).
Degrees of Freedom: df = (number of categories) - 1 = 12 - 1 = 11
P-value: Using a chi-square distribution table or a statistical calculator with χ² = 20.00 and df = 11, we find a p-value of approximately 0.045.

Conclusion:

Since the p-value (0.045) is less than a typical significance level of 0.05, we reject the null hypothesis. There is evidence to suggest that birthdays are not uniformly distributed throughout the year, although the result is borderline: it would not be significant at the 0.01 level. The test itself does not identify which months are responsible; the largest contributions to the statistic come from February (fewer births than expected) and August (more than expected). The departure could be due to various factors like seasonal trends in conception rates.
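
With twelve categories the hand arithmetic is easy to get wrong, so it is worth recomputing the total, the expected count, and the statistic directly from the monthly counts listed in the scenario. The sketch below assumes the standard tabulated 5% critical value for 11 degrees of freedom (19.675); the variable names are illustrative.

```python
births = {
    "January": 260, "February": 230, "March": 280, "April": 250,
    "May": 270, "June": 240, "July": 290, "August": 300,
    "September": 260, "October": 280, "November": 240, "December": 280,
}

total = sum(births.values())
expected = total / len(births)  # same expected count for every month under H0

contributions = {m: (c - expected) ** 2 / expected for m, c in births.items()}
statistic = sum(contributions.values())
df = len(births) - 1

# Tabulated upper 5% point of the chi-square distribution with 11 df.
CRITICAL_5PCT_DF11 = 19.675

print(f"total = {total}, expected per month = {expected:.2f}")
print(f"chi2 = {statistic:.3f}, df = {df}")
print(f"reject H0 at 5%: {statistic > CRITICAL_5PCT_DF11}")
```

Keeping the per-month contributions in a dictionary makes it simple to see afterwards which months account for most of the statistic.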

Example 4: Preference for Colors

Scenario: A marketing company wants to know if there's a preference for certain colors in packaging. They survey 200 consumers and ask them to choose their favorite color from a list of four options: Red, Blue, Green, and Yellow. The observed results are:

Red: 60
Blue: 55
Green: 45
Yellow: 40

Hypotheses:

H0: There is no preference for any of the colors (i.e., each color is equally likely to be chosen).
H1: There is a preference for at least one of the colors (i.e., the colors are not equally likely to be chosen).

Calculations:

Total Number of Consumers: 200
Expected Frequencies: If there is no preference, we would expect each color to be chosen approximately 200/4 = 50 times. So, E = 50 for each category.
Chi-Square Statistic:
- (60 - 50)² / 50 = 2
- (55 - 50)² / 50 = 0.5
- (45 - 50)² / 50 = 0.5
- (40 - 50)² / 50 = 2
Summing these values, we get χ² = 2 + 0.5 + 0.5 + 2 = 5
Degrees of Freedom: df = (number of categories) - 1 = 4 - 1 = 3
P-value: Using a chi-square distribution table or a statistical calculator with χ² = 5 and df = 3, we find a p-value of approximately 0.172.

Conclusion:

Since the p-value (0.172) is greater than a typical significance level of 0.05, we fail to reject the null hypothesis. There is not enough evidence to conclude that there is a significant preference for any of the colors. The observed differences in frequencies could be due to random chance.
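
The idea that differences this large "could be due to random chance" can be illustrated by simulation. The sketch below is a Monte Carlo check, not the table-based method used in the example: it repeatedly draws 200 choices with all four colors equally likely and counts how often the simulated statistic is at least as large as the observed one. The function names, the number of simulations, and the seed are arbitrary choices for this illustration.

```python
import random


def chi2_statistic(counts, expected):
    """Chi-square statistic when every category has the same expected count."""
    return sum((c - expected) ** 2 for c in counts) / expected


def simulated_p_value(observed, n_sims=5000, seed=0):
    """Estimate P(statistic >= observed statistic) under equal probabilities."""
    rng = random.Random(seed)
    k = len(observed)
    n = sum(observed)
    expected = n / k
    observed_stat = chi2_statistic(observed, expected)
    hits = 0
    for _ in range(n_sims):
        counts = [0] * k
        for _ in range(n):
            counts[rng.randrange(k)] += 1  # one simulated consumer
        if chi2_statistic(counts, expected) >= observed_stat:
            hits += 1
    return hits / n_sims


p_sim = simulated_p_value([60, 55, 45, 40])
print(f"simulated p-value: {p_sim:.3f}")
```

The simulated proportion should land close to the table-based p-value; a small gap is expected because the simulation has sampling noise and the chi-square distribution is itself only an approximation for count data.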

Example 5: Testing Genetic Ratios (More Complex)

Scenario: In a dihybrid cross involving two genes, each with two alleles (A/a and B/b), the expected phenotypic ratio in the F2 generation is 9:3:3:1. Suppose you observe the following phenotypes in a sample of 400 offspring:

A_B_ (Dominant for both traits): 210
A_bb (Dominant for A, recessive for B): 70
aaB_ (Recessive for A, dominant for B): 80
aabb (Recessive for both traits): 40

Hypotheses:

H0: The observed phenotypic ratios match the expected 9:3:3:1 ratio.
H1: The observed phenotypic ratios do not match the expected 9:3:3:1 ratio.

Calculations:

Total Number of Offspring: 400
Expected Frequencies: Calculate the expected frequencies based on the 9:3:3:1 ratio:
- A_B_: (9/16) * 400 = 225
- A_bb: (3/16) * 400 = 75
- aaB_: (3/16) * 400 = 75
- aabb: (1/16) * 400 = 25
Chi-Square Statistic:
- (210 - 225)² / 225 = 1
- (70 - 75)² / 75 = 0.333
- (80 - 75)² / 75 = 0.333
- (40 - 25)² / 25 = 9
Summing these values, we get χ² = 1 + 0.333 + 0.333 + 9 = 10.667 (rounded from 32/3)
Degrees of Freedom: df = (number of categories) - 1 = 4 - 1 = 3
P-value: Using a chi-square distribution table or a statistical calculator with χ² = 10.667 and df = 3, we find a p-value of approximately 0.014.

Conclusion:

Since the p-value (0.014) is less than a typical significance level of 0.05, we reject the null hypothesis. There is evidence to suggest that the observed phenotypic ratios do not match the expected 9:3:3:1 ratio. This could indicate linkage between the genes, epistasis, or other factors influencing the inheritance patterns.
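
A rejected null hypothesis does not say where the misfit lies, but the individual terms give a first hint. The sketch below computes Pearson residuals, (O - E) / √E, whose squares are the chi-square contributions; residuals are an addition for illustration and are not part of the worked example above.

```python
import math

classes = ["A_B_", "A_bb", "aaB_", "aabb"]
observed = [210, 70, 80, 40]
expected = [sum(observed) * r / 16 for r in (9, 3, 3, 1)]

# Pearson residual: signed square root of each chi-square term.
residuals = [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]
statistic = sum(r ** 2 for r in residuals)

for name, o, e, r in zip(classes, observed, expected, residuals):
    share = r ** 2 / statistic
    print(f"{name}: O = {o}, E = {e:.0f}, residual = {r:+.2f}, share of chi2 = {share:.0%}")
print(f"chi2 = {statistic:.3f}")
```

Here the aabb class carries most of the statistic, with far more double-recessive offspring than the 9:3:3:1 ratio predicts, which is the kind of pattern that would prompt a closer look at linkage or epistasis.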

Key Considerations and Cautions

Sample Size: The p-value relies on a large-sample approximation, so small samples can give inaccurate results. A general rule of thumb is that the expected frequency in each category should be at least 5. If this condition is not met, consider combining categories or using an exact test (such as the exact multinomial test, or the binomial test when there are only two categories).
Independence: The observations must be independent of each other. This means that one observation should not influence another.
Mutually Exclusive Categories: The categories must be mutually exclusive, meaning that an observation can only belong to one category.
Interpretation: A statistically significant result (small p-value) indicates that the observed data does not fit the hypothesized distribution. However, it does not tell you why the data doesn't fit. Further investigation is needed to understand the underlying reasons.
Alternatives: If the assumptions of the Chi-Square Goodness-of-Fit test are not met, consider using alternative statistical tests, such as the Kolmogorov-Smirnov test (for continuous data) or an exact multinomial test (for small sample sizes).
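
The expected-frequency rule of thumb is easy to check before running the test. The helper below is a hypothetical sketch (the function name and the sample of 40 offspring are invented for illustration); it lists the categories whose expected count falls below the chosen minimum.

```python
def small_expected_categories(n, probabilities, minimum=5.0):
    """Return (index, expected count) pairs that fall below the minimum."""
    return [(i, n * p) for i, p in enumerate(probabilities) if n * p < minimum]


# Invented example: 40 offspring tested against a 9:3:3:1 ratio.
flagged = small_expected_categories(40, [9 / 16, 3 / 16, 3 / 16, 1 / 16])
print(flagged)  # [(3, 2.5)]: the last class expects only 2.5 offspring
```

An empty list means every category meets the threshold; a non-empty one is a prompt to merge categories or switch to an exact test.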

Advantages of the Chi-Square Goodness-of-Fit Test

Versatility: Can be used with various types of categorical data.
Ease of Calculation: The formula is relatively simple to apply.
Widely Available: Supported by most statistical software packages.

Disadvantages of the Chi-Square Goodness-of-Fit Test

Sensitivity to Sample Size: Requires sufficiently large sample sizes.
Limited Information: Only indicates whether the data fits the hypothesized distribution, not why.
Assumptions: Requires independent observations and mutually exclusive categories.

In conclusion, the Chi-Square Goodness-of-Fit test is a valuable tool for analyzing categorical data and assessing how well observed frequencies align with expected frequencies. By understanding the underlying principles, calculations, and limitations of this test, you can effectively apply it to a wide range of research and practical applications. Always remember to check the assumptions of the test and interpret the results carefully.
