Question 1165670: (a) A survey of 250 adults showed that 100 of them drank alcohol daily. Calculate
the standard error for the proportion of the adult population that drinks alcohol daily, and also calculate
the error bounds at the 95% confidence level. (10 marks)
(b) You randomly sample 500 sexually active individuals and collect information on
their HIV status and History of STDs. Using Chi-square test at the 5%
significance level, is there evidence of a relationship?
| History of STDs | HIV: No | HIV: Yes | Total |
| :---: | :---: | :---: | :---: |
| **No** | 92 | 58 | 150 |
| **Yes** | 70 | 280 | 350 |
| **Total** | 162 | 338 | 500 |
Answer by CPhill(2189):
```python
import numpy as np
from scipy.stats import chi2_contingency
# --- Part (a) Calculations: Standard Error and 95% CI ---
n_a = 250
x_a = 100
p_hat = x_a / n_a
# 1. Standard Error (SE)
se_p_hat = np.sqrt(p_hat * (1 - p_hat) / n_a)
# 2. Z-score for 95% CI
# Z = 1.96 for 95% confidence
Z = 1.96
# 3. Margin of Error (ME)
me = Z * se_p_hat
# 4. Confidence Interval Bounds
ci_lower = p_hat - me
ci_upper = p_hat + me
print(f"Sample Proportion (p_hat): {p_hat}")
print(f"Standard Error (SE): {se_p_hat}")
print(f"Margin of Error (ME): {me}")
print(f"95% CI Lower Bound: {ci_lower}")
print(f"95% CI Upper Bound: {ci_upper}")
# --- Part (b) Calculations: Chi-square Test of Independence ---
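# Observed data (O), laid out as in the question's table
# Rows: History of STDs (No, Yes)
# Cols: HIV status (No, Yes)
observed = np.array([[92, 58],
                     [70, 280]])
# Perform Chi-square test of independence.
# Note: for a 2x2 table, chi2_contingency applies Yates' continuity
# correction by default (correction=True).
chi2_stat, p_value, dof, expected = chi2_contingency(observed)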
print("\nChi-square Test Results:")
print(f"Chi-square statistic: {chi2_stat}")
print(f"P-value: {p_value}")
print(f"Degrees of Freedom (dof): {dof}")
print(f"Expected Frequencies:\n{expected}")
```
```text
Sample Proportion (p_hat): 0.4
Standard Error (SE): 0.030983866769659335
Margin of Error (ME): 0.060728378868532294
95% CI Lower Bound: 0.33927162113146775
95% CI Upper Bound: 0.4607283788685323
Chi-square Test Results:
Chi-square statistic: 80.02645502645503
P-value: 3.694302007277756e-19
Degrees of Freedom (dof): 1
Expected Frequencies:
[[ 48.6 101.4]
[113.4 236.6]]
```
### (a) Standard Error and Confidence Bounds
This calculation estimates the population proportion ($p$) of adults who drink alcohol daily, using the sample proportion ($\hat{p}$).
**Given:**
* Sample size ($n$): 250
* Number who drink daily ($x$): 100
* Sample proportion ($\hat{p}$): $\hat{p} = \frac{100}{250} = \mathbf{0.40}$
#### 1\. Standard Error (SE)
The standard error of the sample proportion measures the typical distance that the sample proportion ($\hat{p}$) falls from the true population proportion.
$$SE_{\hat{p}} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = \sqrt{\frac{0.40 \times 0.60}{250}} = \sqrt{0.00096}$$
$$\mathbf{SE_{\hat{p}} \approx 0.0310}$$
#### 2\. Error Bounds at 95% Confidence Level
The error bounds (Margin of Error, ME) are calculated using the Z-score for 95% confidence, which is $Z_{0.025} = 1.96$.
$$\text{ME} = Z \times SE_{\hat{p}} = 1.96 \times 0.03098$$
$$\mathbf{\text{ME} \approx 0.0607}$$
The 95% confidence interval (CI) is $\hat{p} \pm ME$:
$$\text{CI} = 0.40 \pm 0.0607$$
$$\text{Lower Bound} \approx \mathbf{0.3393}$$
$$\text{Upper Bound} \approx \mathbf{0.4607}$$
**Interpretation:** We are 95% confident that the true percentage of the adult population that drinks alcohol daily is between **33.93% and 46.07%**.
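The steps in part (a) can be wrapped in a small helper. This is only a sketch: the function name `proportion_ci` and its return layout are illustrative assumptions, not part of the original solution. It derives the critical value from the normal quantile function instead of hard-coding 1.96, and it checks the usual large-sample rule of thumb (at least 10 successes and 10 failures) that the normal approximation relies on.

```python
from math import sqrt

from scipy.stats import norm


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation (Wald) interval for a proportion.

    Hypothetical helper for illustration; returns
    (p_hat, standard_error, z, margin, (lower, upper)).
    """
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = norm.ppf(1 - (1 - confidence) / 2)
    margin = z * se
    return p_hat, se, z, margin, (p_hat - margin, p_hat + margin)


x, n = 100, 250
# Rule-of-thumb check for the normal approximation
approx_ok = x >= 10 and (n - x) >= 10

p_hat, se, z, margin, (lower, upper) = proportion_ci(x, n)
print(f"normal approximation reasonable: {approx_ok}")
print(f"p_hat={p_hat:.4f}, SE={se:.4f}, z={z:.4f}, ME={margin:.4f}")
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```

Because the quantile is about 1.95996 rather than exactly 1.96, the bounds can differ from the hand calculation in the fifth decimal place.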
-----
### (b) Chi-square Test of Independence
We use the Chi-square ($\chi^2$) test of independence to determine if there is a statistically significant relationship between HIV status and history of STDs.
**Observed Frequencies ($O$):**
| History of STDs | HIV: No | HIV: Yes | Total |
| :---: | :---: | :---: | :---: |
| **No** | 92 | 58 | 150 |
| **Yes** | 70 | 280 | 350 |
| **Total** | 162 | 338 | 500 |
#### 1\. Hypotheses
* $H_0$: HIV status and History of STDs are **independent** (no relationship).
* $H_a$: HIV status and History of STDs are **dependent** (there is a relationship).
* Significance level ($\alpha$): 0.05
#### 2\. Expected Frequencies ($E$) and Test Statistic
The test compares the observed frequencies ($O$) to the expected frequencies ($E$), which are calculated assuming the null hypothesis of independence is true. Each expected count is $E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$; for example, $\frac{150 \times 162}{500} = 48.6$.
**Expected Frequencies:**
| History of STDs | HIV: No | HIV: Yes |
| :---: | :---: | :---: |
| **No** | 48.6 | 101.4 |
| **Yes** | 113.4 | 236.6 |
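The expected frequencies can be reproduced directly from the row and column totals, since under independence each cell is (row total × column total) / grand total. A minimal sketch, with the array laid out in the same row and column order as the observed table:

```python
import numpy as np

# Observed 2x2 counts, in the same order as the observed table
observed = np.array([[92, 58],
                     [70, 280]])

row_totals = observed.sum(axis=1)   # [150, 350]
col_totals = observed.sum(axis=0)   # [162, 338]
grand_total = observed.sum()        # 500

# Outer product of the margins divided by the grand total
expected = np.outer(row_totals, col_totals) / grand_total
print(expected)
```

All four expected counts are well above 5, so the usual chi-square approximation is reasonable for this table.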
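The **Chi-square test statistic ($\chi^2$)** is calculated from the differences between observed and expected values. The value reported here includes Yates' continuity correction, which SciPy's `chi2_contingency` applies by default to 2×2 tables:
$$\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$$
The calculated $\chi^2$ statistic is:
$$\mathbf{\chi^2 \approx 80.026}$$
Without the correction, the Pearson statistic $\sum \frac{(O - E)^2}{E}$ is about $81.90$; either value leads to the same decision at the 5% level.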
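To see where 80.026 comes from, the statistic can be computed by hand with and without Yates' continuity correction and compared against `chi2_contingency`. This is a sketch for checking the arithmetic; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[92, 58],
                     [70, 280]])
expected = (np.outer(observed.sum(axis=1), observed.sum(axis=0))
            / observed.sum())

# Plain Pearson statistic: sum of (O - E)^2 / E
pearson = ((observed - expected) ** 2 / expected).sum()

# Yates-corrected statistic: shrink each |O - E| by 0.5 before squaring
yates = ((np.abs(observed - expected) - 0.5) ** 2 / expected).sum()

# SciPy applies the correction by default when dof == 1
chi2_default = chi2_contingency(observed)[0]
chi2_uncorrected = chi2_contingency(observed, correction=False)[0]

print(f"Pearson (uncorrected): {pearson:.3f}")
print(f"Yates-corrected:       {yates:.3f}")
print(f"SciPy default:         {chi2_default:.3f}")
print(f"SciPy, no correction:  {chi2_uncorrected:.3f}")
```

The default SciPy result matches the Yates-corrected value, while `correction=False` matches the plain Pearson formula.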
#### 3\. Degrees of Freedom (dof) and P-value
* **Degrees of Freedom (dof):** $\text{dof} = (\text{Rows} - 1)(\text{Columns} - 1) = (2 - 1)(2 - 1) = \mathbf{1}$
* **P-value:** The probability of observing a $\chi^2$ value as extreme as $80.026$ (or more extreme) under the null hypothesis is:
$$\mathbf{p\text{-value} \approx 3.69 \times 10^{-19}}$$
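The p-value is the upper-tail area of the chi-square distribution with 1 degree of freedom, and the same decision can be reached by comparing the statistic with the 5% critical value. A short sketch using `scipy.stats.chi2`, with the statistic typed in to three decimals as reported above:

```python
from scipy.stats import chi2

chi2_stat = 80.026   # statistic reported above (rounded)
dof = 1
alpha = 0.05

# Upper-tail probability P(X >= chi2_stat) under H0
p_value = chi2.sf(chi2_stat, dof)

# Critical value: reject H0 if the statistic exceeds it
critical_value = chi2.ppf(1 - alpha, dof)
reject_h0 = chi2_stat > critical_value

print(f"p-value: {p_value:.3e}")
print(f"critical value at alpha={alpha}: {critical_value:.3f}")
print(f"reject H0: {reject_h0}")
```

The survival function `sf` is used rather than `1 - cdf` because the tail probability is far too small to survive that subtraction in floating point.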
#### 4\. Conclusion
We compare the p-value to the significance level $\alpha$:
$$\text{P-value } (3.69 \times 10^{-19}) \ll \alpha (0.05)$$
Since the p-value is extremely small (far less than $0.05$), we **reject the null hypothesis ($H_0$)**.
**Conclusion:**
At the $5\%$ significance level, there is **strong evidence of a statistically significant relationship** between HIV status and history of STDs. The two variables are dependent.