Question 1162016
```python
import numpy as np
from scipy.stats import chi2_contingency

# Data table: Rows represent preference, Columns represent distance categories
# Distances: <3, 3-6, 7-10, 10-13, >13
observed = np.array([
    [14, 18, 12, 17, 19],  # Prefer Gel
    [5, 5, 8, 12, 8],      # Do not prefer Gel
    [27, 17, 8, 5, 2]      # No opinion
])

# Perform Chi-Square Test of Independence
chi2, p, dof, expected = chi2_contingency(observed)

print(f"{chi2=}")
print(f"{p=}")
print(f"{dof=}")
print(f"expected_frequencies:\n{expected}")


```
```text
chi2=32.86666355621393
p=6.508978280300026e-05
dof=8
expected_frequencies:
[[20.79096045 18.07909605 12.65536723 15.36723164 13.10734463]
 [ 9.87570621  8.58757062  6.01129944  7.29943503  6.2259887 ]
 [15.33333333 13.33333333  9.33333333 11.33333333  9.66666667]]

```
To determine if there is a relationship between weekly running distance and runners' preference for a gel-like product, we perform a **Chi-Square Test of Independence**.

### 1. State the Hypotheses
* **Null Hypothesis ($H_0$):** Running distance per week and gel preference are **independent**. (There is no relationship between the two variables.)
* **Alternative Hypothesis ($H_A$):** Running distance per week and gel preference are **dependent**. (There is a relationship between the two variables.)

### 2. Contingency Table (Observed Frequencies)
The data provided is organized into the following $3 \times 5$ table:

| Preference | <3 mi | 3-6 mi | 7-10 mi | 10-13 mi | >13 mi | **Total** |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| **Prefer Gel** | 14 | 18 | 12 | 17 | 19 | **80** |
| **Do Not Prefer** | 5 | 5 | 8 | 12 | 8 | **38** |
| **No Opinion** | 27 | 17 | 8 | 5 | 2 | **59** |
| **Total** | **46** | **40** | **28** | **34** | **29** | **177** |
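
As a cross-check on the totals above, the following minimal sketch recomputes the row, column, and grand totals in plain Python, then derives the expected counts under independence as (row total × column total) / N. The variable names are illustrative only. The last line checks the common rule of thumb that every expected count should be at least 5; that threshold is an assumption added here, not something stated in the question.

```python
# Observed counts from the table above (rows: preference, columns: distance).
observed = [
    [14, 18, 12, 17, 19],  # Prefer Gel
    [5, 5, 8, 12, 8],      # Do Not Prefer
    [27, 17, 8, 5, 2],     # No Opinion
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: E_ij = (row total * column total) / N.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Smallest expected count, for the "at least 5" rule of thumb (an assumption here).
min_expected = min(min(row) for row in expected)

print(row_totals, col_totals, grand_total)
print(round(min_expected, 2))
```

The expected values produced this way should agree with the `expected_frequencies` array printed by `chi2_contingency` above.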

---

### 3. Calculate the Test Statistic
Using the Chi-Square formula $\chi^2 = \sum \frac{(O - E)^2}{E}$, where $O$ is the observed frequency and $E = \frac{(\text{row total}) \times (\text{column total})}{N}$ is the expected frequency under independence, we obtain:

* **Degrees of Freedom ($df$):** $(\text{rows} - 1) \times (\text{cols} - 1) = (3 - 1) \times (5 - 1) = \mathbf{8}$
* **Chi-Square Statistic ($\chi^2$):** $\approx \mathbf{32.87}$
* **$p$-value:** $\approx \mathbf{0.000065}$
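
The three values above can be reproduced without SciPy as a sanity check. The sketch below sums $(O - E)^2 / E$ over all 15 cells and then uses the closed-form upper-tail probability of the chi-square distribution, which exists only when the degrees of freedom are even (true here, since $df = 8$). The helper name `chi2_sf_even_dof` is hypothetical and is not part of any library.

```python
import math

observed = [
    [14, 18, 12, 17, 19],  # Prefer Gel
    [5, 5, 8, 12, 8],      # Do Not Prefer
    [27, 17, 8, 5, 2],     # No Opinion
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Chi-square statistic: sum of (O - E)^2 / E with E = row total * column total / N.
chi2_stat = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        chi2_stat += (o - e) ** 2 / e

dof = (len(observed) - 1) * (len(observed[0]) - 1)


def chi2_sf_even_dof(x, dof):
    """Upper-tail probability P(X > x) for a chi-square variable; even dof only."""
    if dof % 2 != 0:
        raise ValueError("closed form used here requires even degrees of freedom")
    half = x / 2
    term, total = 1.0, 0.0
    for k in range(dof // 2):
        total += term          # adds (x/2)^k / k!
        term *= half / (k + 1)
    return math.exp(-half) * total


p_value = chi2_sf_even_dof(chi2_stat, dof)
print(chi2_stat, dof, p_value)
```

For odd degrees of freedom this shortcut does not apply, and a library routine such as the one used at the top of this answer is the simpler choice.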

---

### 4. Conclusion at the 1% Level
* **Significance Level ($\alpha$):** 0.01
* **Decision:** Since the $p$-value ($\approx 0.000065$) is far **less than** $\alpha = 0.01$, we **reject the null hypothesis**.
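
The same decision can be reached with the critical-value approach: reject $H_0$ when the test statistic exceeds the chi-square value that leaves an upper-tail area of $\alpha$. The sketch below finds that value for $df = 8$ by bisection on the closed-form tail probability for even degrees of freedom. The function names, search bounds, and tolerance are illustrative assumptions.

```python
import math


def chi2_sf_even_dof(x, dof):
    """Upper-tail probability P(X > x) for a chi-square variable; even dof only."""
    if dof % 2 != 0:
        raise ValueError("closed form used here requires even degrees of freedom")
    half = x / 2
    term, total = 1.0, 0.0
    for k in range(dof // 2):
        total += term
        term *= half / (k + 1)
    return math.exp(-half) * total


def critical_value(alpha, dof, lo=0.0, hi=200.0, tol=1e-9):
    """Bisection search for x such that P(X > x) = alpha (tail is decreasing in x)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_sf_even_dof(mid, dof) > alpha:
            lo = mid   # tail area still too large, move right
        else:
            hi = mid
    return (lo + hi) / 2


alpha = 0.01
chi2_stat = 32.87  # rounded statistic from step 3
crit = critical_value(alpha, dof=8)
reject = chi2_stat > crit
print(round(crit, 3), reject)
```

Both approaches are equivalent: the statistic exceeds the critical value exactly when the $p$-value falls below $\alpha$.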

**Final Result:** There **does appear to be a relationship** between the distance run per week and the runners' preference for the gel product. 

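**Observation:** The pattern is driven mainly by the "No Opinion" row. Among runners covering fewer than 3 miles per week, 27 of 46 (about 59%) have no opinion, compared with 2 of 29 (about 7%) among those running more than 13 miles. Over the same range, the "Prefer Gel" share rises from 14 of 46 (about 30%) to 19 of 29 (about 66%), and the "Do Not Prefer" share also rises, from 5 of 46 (about 11%) to 8 of 29 (about 28%). This suggests that runners who cover more distance are more likely to hold a definite opinion about the gel product, most often a favorable one. The test itself establishes only an association, not its direction or cause.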
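
Raw counts are hard to compare across distance groups of different sizes, so the observation is easier to check by converting each column to within-group shares. The following is a minimal sketch using the counts from the table; the dictionary layout and names are illustrative only.

```python
distances = ["<3", "3-6", "7-10", "10-13", ">13"]
observed = {
    "Prefer Gel":    [14, 18, 12, 17, 19],
    "Do Not Prefer": [5, 5, 8, 12, 8],
    "No Opinion":    [27, 17, 8, 5, 2],
}

# Number of runners in each distance category.
col_totals = [sum(col) for col in zip(*observed.values())]

# Share of each preference within a distance category (each column sums to 1).
shares = {
    pref: [count / total for count, total in zip(counts, col_totals)]
    for pref, counts in observed.items()
}

print("Preference".ljust(14), *(d.rjust(7) for d in distances))
for pref, row in shares.items():
    print(pref.ljust(14), *(f"{s:7.1%}" for s in row))
```

Reading across a row shows how the share of each response changes with weekly distance.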