Hypothesis Testing & Correlation Analysis
Amanda Hidalgo
LIS 4273: Adv Stats & Analytics
Part 1
A. State the null and alternative hypothesis:
Null Hypothesis (H0): The machine is producing cookies according to the manufacturer's specifications, i.e., the population mean (μ) breaking strength is 70 pounds.
Alternative Hypothesis (Ha): The machine is not producing cookies according to the manufacturer's specifications, i.e., the population mean (μ) breaking strength is not equal to 70 pounds.
B. Is there evidence that the machine is not meeting the manufacturer's specifications for average strength? Use a 0.05 level of significance:
To determine if there is evidence that the machine is not meeting the manufacturer's specifications, we need to perform a hypothesis test using the sample data.
We have:
- Sample mean (x̄) = 69.1 pounds
- Population standard deviation (σ) = 3.5 pounds
- Sample size (n) = 49
- Level of significance (α) = 0.05
We'll use a two-tailed z-test because we are interested in whether the population mean is not equal to 70 pounds.
Calculating the test statistic (z):
z = (x̄ - μ) / (σ / √n)
z = (69.1 - 70) / (3.5 / √49)
z = (-0.9) / (0.5)
z = -1.8
Now, we compare the calculated z-value to the critical z-value at a 0.05 significance level. For a two-tailed test at α = 0.05, the critical z-values are approximately ±1.96.
Since -1.8 is within the range of -1.96 to 1.96, we do not reject the null hypothesis. Therefore, there is not enough evidence to conclude that the machine is not meeting the manufacturer's specifications for average strength at the 0.05 level of significance.
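The calculation above can be reproduced in a few lines of R. This is a minimal sketch using only the values given in the problem; the variable names (x_bar, mu0, z_crit, reject_h0) are chosen here for illustration and are not part of the assignment.

```r
# Two-tailed one-sample z-test with a known population standard deviation.
x_bar <- 69.1   # sample mean (pounds)
mu0   <- 70     # hypothesized population mean (pounds)
sigma <- 3.5    # population standard deviation (pounds)
n     <- 49     # sample size
alpha <- 0.05   # level of significance

# Test statistic: distance of the sample mean from mu0 in standard errors
z <- (x_bar - mu0) / (sigma / sqrt(n))

# Upper critical value for a two-tailed test (about 1.96 at alpha = 0.05)
z_crit <- qnorm(1 - alpha / 2)

# Reject H0 only when |z| exceeds the critical value
reject_h0 <- abs(z) > z_crit

print(round(z, 2))
print(round(z_crit, 2))
print(reject_h0)
```

Using qnorm() rather than a hard-coded 1.96 makes it easy to rerun the test at a different significance level.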
C. Compute the p-value and interpret its meaning:
The p-value is the probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one calculated (z = -1.8). Because the test is two-tailed, this is P(|Z| ≥ 1.8) = 2 × P(Z ≤ -1.8) = 2 × 0.0359. You can find this value using a standard normal distribution table or calculator. For z = -1.8, the p-value is approximately 0.0719.
Interpretation: If the population mean really were 70 pounds, a sample mean at least 0.9 pounds away from 70 would occur about 7.2% of the time. The p-value of 0.0719 is greater than the chosen significance level of 0.05, so we do not reject the null hypothesis. The data are consistent with the machine producing cookies according to the manufacturer's specifications.
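As a quick check, the two-tailed p-value can be computed with pnorm(), the standard normal cumulative distribution function in base R. The variable names below are illustrative only.

```r
# Two-tailed p-value for the observed z statistic.
z     <- -1.8
alpha <- 0.05

# Probability in both tails beyond |z| under the standard normal distribution
p_value <- 2 * pnorm(-abs(z))

print(round(p_value, 4))   # approximately 0.0719
print(p_value < alpha)     # FALSE, so H0 is not rejected
```

Taking -abs(z) before calling pnorm() keeps the calculation correct whether the observed z is negative or positive.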
D. What would be your answer in (B) if the standard deviation were specified as 1.75 pounds?
If the standard deviation were specified as 1.75 pounds, we would recalculate the z statistic using the new standard deviation while keeping the same sample mean, sample size, and significance level. The steps are the same as in part B, but with σ = 1.75 pounds.
Calculating the test statistic (z):
z = (x̄ - μ) / (σ / √n)
z = (69.1 - 70) / (1.75 / √49)
z = (-0.9) / (0.25)
z = -3.6
Now, we compare the calculated z-value to the critical z-value at a 0.05 significance level. For a two-tailed test at α = 0.05, the critical z-values are approximately ±1.96.
Since -3.6 is less than -1.96, we reject the null hypothesis. Therefore, if the standard deviation were specified as 1.75 pounds, there is enough evidence to conclude that the machine is not meeting the manufacturer's specifications for average strength at the 0.05 level of significance.
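The effect of the smaller standard deviation can be seen by comparing the two standard errors side by side. This is a sketch under the same inputs as the problem; the names se_b, se_d, z_b, and z_d are introduced here for illustration.

```r
# Compare part B (sigma = 3.5) with part D (sigma = 1.75).
x_bar <- 69.1
mu0   <- 70
n     <- 49
alpha <- 0.05

se_b <- 3.5  / sqrt(n)   # standard error in part B: 0.5
se_d <- 1.75 / sqrt(n)   # standard error in part D: 0.25

z_b <- (x_bar - mu0) / se_b
z_d <- (x_bar - mu0) / se_d

z_crit <- qnorm(1 - alpha / 2)

# Halving sigma halves the standard error and doubles |z|
print(round(c(z_b = z_b, z_d = z_d), 2))
print(c(reject_b = abs(z_b) > z_crit, reject_d = abs(z_d) > z_crit))
```

The same 0.9-pound difference from 70 is 1.8 standard errors in part B but 3.6 standard errors in part D, which is why the decision changes.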
E. What would be your answer in (B) if the sample mean were 69 pounds and the standard deviation is 3.5 pounds?
If the sample mean were 69 pounds and the standard deviation remained 3.5 pounds, we would perform the same hypothesis test as in part B, changing only the sample mean to x̄ = 69 pounds.
Calculating the test statistic (z):
z = (x̄ - μ) / (σ / √n)
z = (69 - 70) / (3.5 / √49)
z = (-1) / (0.5)
z = -2
Now, compare the calculated z-value to the critical z-value at a 0.05 significance level. For a two-tailed test at α = 0.05, the critical z-values are approximately ±1.96.
Since -2 is less than -1.96, it falls in the rejection region, so we reject the null hypothesis. Therefore, if the sample mean were 69 pounds and the standard deviation were 3.5 pounds, there is enough evidence to conclude that the machine is not meeting the manufacturer's specifications for average strength at the 0.05 level of significance. This differs from the original scenario in part B, where the null hypothesis was not rejected.
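Because z = -2 lies close to the critical value, it is worth checking the comparison numerically. The sketch below uses the values from this part; the variable names are illustrative.

```r
# Part E: sample mean of 69 pounds with sigma = 3.5 pounds.
x_bar <- 69
mu0   <- 70
sigma <- 3.5
n     <- 49
alpha <- 0.05

z       <- (x_bar - mu0) / (sigma / sqrt(n))   # -1 / 0.5 = -2
z_crit  <- qnorm(1 - alpha / 2)                # about 1.96
p_value <- 2 * pnorm(-abs(z))                  # two-tailed p-value

print(z)
print(round(p_value, 4))
print(abs(z) > z_crit)   # TRUE: |z| = 2 exceeds 1.96
```

The critical-value comparison and the p-value comparison always agree: |z| exceeds the critical value exactly when the p-value is below α.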
Part 2
Part 3
# Load the dataset
data <- read.csv("correlation_data.csv")
# Calculate the correlation coefficient
correlation_coefficient <- cor(data$girls, data$boys)
# Create a scatter plot of the correlation
plot(data$girls, data$boys, xlab = "Girls' Goals", ylab = "Boys' Time on Assignments", main = "Correlation Plot")
# Print the correlation coefficient
print(paste("Correlation Coefficient (Pearson):", round(correlation_coefficient, 2)))