Question 1164644
```python
import pandas as pd
import numpy as np

# Table 4 data
data4 = {
    'Lower': [15, 25, 35, 45, 55, 65],
    'Upper': [24, 34, 44, 54, 64, 74],
    'Frequency': [21, 55, 10, 91, 80, 39]
}
df4 = pd.DataFrame(data4)
df4['Midpoint'] = (df4['Lower'] + df4['Upper']) / 2
df4['f_x'] = df4['Frequency'] * df4['Midpoint']
df4['f_x2'] = df4['Frequency'] * (df4['Midpoint']**2)
df4['Cumulative_Frequency'] = df4['Frequency'].cumsum()

# Part a Calculations
N4 = df4['Frequency'].sum()
mean4 = df4['f_x'].sum() / N4

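def get_percentile_val(df, target_rank):
    """Interpolate the value at cumulative rank `target_rank` in grouped data."""
    # First class whose cumulative frequency reaches the target rank.
    # With the default RangeIndex, the index label doubles as the row position.
    idx = df[df['Cumulative_Frequency'] >= target_rank].index[0]
    row = df.iloc[idx]
    L = row['Lower'] - 0.5        # lower class boundary
    i = (row['Upper'] + 0.5) - L  # class width between boundaries
    f = row['Frequency']          # frequency of the target class
    # Cumulative frequency of all classes below the target class
    cf_prev = df.iloc[idx-1]['Cumulative_Frequency'] if idx > 0 else 0
    return L + ((target_rank - cf_prev) / f) * i

median4 = get_percentile_val(df4, N4 * 0.5)    # position N/2
q1_4 = get_percentile_val(df4, N4 * 0.25)      # position N/4
d4_4 = get_percentile_val(df4, N4 * 0.4)       # position 4N/10
p25_4 = get_percentile_val(df4, N4 * 0.25)     # position 25N/100, same as Q1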

print("Table 4 Results:")
print(f"Total N: {N4}")
print(f"Mean: {mean4}")
print(f"Median: {median4}")
print(f"Q1: {q1_4}")
print(f"D4: {d4_4}")
print(f"P25: {p25_4}")

# Problem 3 Table
data3 = {
    'Lower': [1, 11, 21, 31, 41, 51, 61],
    'Upper': [10, 20, 30, 40, 50, 60, 70],
    'Frequency': [4, 5, 6, 1, 2, 11, 9]
}
df3 = pd.DataFrame(data3)
df3['Midpoint'] = (df3['Lower'] + df3['Upper']) / 2
df3['f_x'] = df3['Frequency'] * df3['Midpoint']
df3['f_x2'] = df3['Frequency'] * (df3['Midpoint']**2)

N3 = df3['Frequency'].sum()
mean3 = df3['f_x'].sum() / N3
sum_fx2 = df3['f_x2'].sum()
sum_fx = df3['f_x'].sum()

# Sample Variance
variance3 = (sum_fx2 - (sum_fx**2 / N3)) / (N3 - 1)
std_dev3 = np.sqrt(variance3)

print("\nProblem 3 Results:")
print(f"Total N: {N3}")
print(f"Mean: {mean3}")
print(f"Variance (Sample): {variance3}")
print(f"Standard Deviation (Sample): {std_dev3}")
```
```text
Table 4 Results:
Total N: 296
Mean: 48.6554054054054
Median: 51.31318681318682
Q1: 34.13636363636364
D4: 48.06043956043956
P25: 34.13636363636364

Problem 3 Results:
Total N: 38
Mean: 41.55263157894737
Variance (Sample): 473.18634423897583
Standard Deviation (Sample): 21.752846807693373

```
To solve these statistical problems, we first calculate the necessary components for grouped data (midpoints, cumulative frequencies, etc.) for each table.

### **Part a) Analysis of Table 4 (University Project Costs)**

| Cost (\$) | Frequency ($f$) | Midpoint ($x$) | $f \cdot x$ | Cumulative Freq ($CF$) |
| :--- | :--- | :--- | :--- | :--- |
| 15-24 | 21 | 19.5 | 409.5 | 21 |
| 25-34 | 55 | 29.5 | 1622.5 | 76 |
| 35-44 | 10 | 39.5 | 395.0 | 86 |
| 45-54 | 91 | 49.5 | 4504.5 | 177 |
| 55-64 | 80 | 59.5 | 4760.0 | 257 |
| 65-74 | 39 | 69.5 | 2710.5 | 296 |
| **Total** | **$N=296$** | | **$\sum fx = 14402$** | |

**i) Arithmetic Mean**
$$\bar{x} = \frac{\sum fx}{N} = \frac{14402}{296} \approx \$48.66$$

**ii) Median Cost**
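The median position is $N/2 = 296/2 = 148$. The cumulative frequency first reaches 148 in the **45-54** class ($86 < 148 \le 177$), so that is the median class.
Its lower boundary is $L = 44.5$ (halfway between the upper limit 44 of the previous class and the lower limit 45), with $CF_{prev} = 86$, $f = 91$, and class width $i = 54.5 - 44.5 = 10$.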
$$\text{Median} = L + \left( \frac{\frac{N}{2} - CF_{prev}}{f} \right) \times i = 44.5 + \left( \frac{148 - 86}{91} \right) \times 10 \approx \$51.31$$

**iii) $Q_1$ and $D_4$**
* **$Q_1$ (First Quartile):** Position is $N/4 = 74$. Falls in the **25-34** class.
    $$Q_1 = 24.5 + \left( \frac{74 - 21}{55} \right) \times 10 \approx \$34.14$$
* **$D_4$ (Fourth Decile):** Position is $4N/10 = 118.4$. Falls in the **45-54** class.
    $$D_4 = 44.5 + \left( \frac{118.4 - 86}{91} \right) \times 10 \approx \$48.06$$

**iv) 25th Percentile ($P_{25}$)**
The 25th percentile sits at position $25N/100 = N/4 = 74$, the same position as the first quartile ($Q_1$), so the two values are identical.
$$P_{25} \approx \$34.14$$
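The median, quartile, decile, and percentile above all use the same interpolation formula with a different target position. As a minimal sketch (standard library only; the function name `grouped_quantile` and the tuple layout are illustrative assumptions, not part of the original solution), one helper can reproduce all four values:

```python
# Class limits and frequencies copied from Table 4: (lower, upper, frequency)
classes = [(15, 24, 21), (25, 34, 55), (35, 44, 10),
           (45, 54, 91), (55, 64, 80), (65, 74, 39)]


def grouped_quantile(classes, fraction):
    """Interpolated value below which `fraction` of the observations fall."""
    total = sum(freq for _, _, freq in classes)
    target = fraction * total           # position, e.g. N/4 for Q1
    cf_prev = 0                         # cumulative frequency before the class
    for lower, upper, freq in classes:
        if cf_prev + freq >= target:
            boundary = lower - 0.5      # lower class boundary L
            width = (upper + 0.5) - boundary
            return boundary + (target - cf_prev) / freq * width
        cf_prev += freq
    raise ValueError("fraction must lie between 0 and 1")


q1 = grouped_quantile(classes, 1 / 4)
p25 = grouped_quantile(classes, 25 / 100)
d4 = grouped_quantile(classes, 4 / 10)
median = grouped_quantile(classes, 1 / 2)

print(round(q1, 2), round(p25, 2), round(d4, 2), round(median, 2))
```

Running this prints `34.14 34.14 48.06 51.31`, matching the hand calculations and confirming that $Q_1$ and $P_{25}$ coincide.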

---

### **Part b) Categorization of Variables**

1.  **Comments on services: Ordinal.**
    * *Reason:* Comments (like "Poor", "Average", "Good") have a logical rank or order, but the numerical distance between the categories is not defined.
2.  **Price of Shirts: Ratio.**
    * *Reason:* It has a natural zero point (a price of \$0 means it's free), and ratios are meaningful (a \$40 shirt is twice as expensive as a \$20 shirt).
3.  **Income of people: Ratio.**
    * *Reason:* It possesses a true zero point, and mathematical operations like multiplication/division are meaningful (someone earning \$50k earns half as much as someone earning \$100k).
4.  **Driving license number: Nominal.**
    * *Reason:* These are unique identifiers used only for labeling. The numbers do not imply any order, magnitude, or mathematical value.
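The practical consequence of each measurement level is which summary operations make sense. The sketch below illustrates this with the standard library; the three-point scale, the responses, and the licence identifiers are made-up examples, not data from the question:

```python
from statistics import median_low, mode

# Ordinal: only the order of the labels carries information.
# The scale and the responses are hypothetical illustrations.
scale = ["Poor", "Average", "Good"]
comments = ["Good", "Poor", "Average", "Good", "Average"]
ranks = [scale.index(c) for c in comments]
median_comment = scale[median_low(ranks)]    # a median rank is meaningful

# Ratio: a true zero makes ratios meaningful.
price_ratio = 40 / 20                        # the $40 shirt costs twice the $20 one

# Nominal: labels support equality checks and counting, nothing more.
licences = ["DL-4471", "DL-0093", "DL-4471"]  # hypothetical identifiers
most_common = mode(licences)

print(median_comment, price_ratio, most_common)
```

This prints `Average 2.0 DL-4471`. Averaging the licence strings or computing "Good minus Poor" would have no interpretation, which is the distinction the classification above captures.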

---

### **Variance and Standard Deviation (Problem 3)**

| Class | Freq ($f$) | Midpoint ($x$) | $fx$ | $fx^2$ |
| :--- | :--- | :--- | :--- | :--- |
| 1-10 | 4 | 5.5 | 22 | 121 |
| 11-20 | 5 | 15.5 | 77.5 | 1201.25 |
| 21-30 | 6 | 25.5 | 153 | 3901.5 |
| 31-40 | 1 | 35.5 | 35.5 | 1260.25 |
| 41-50 | 2 | 45.5 | 91 | 4140.5 |
| 51-60 | 11 | 55.5 | 610.5 | 33882.75 |
| 61-70 | 9 | 65.5 | 589.5 | 38612.25 |
| **Total** | **$N=38$** | | **$\sum fx = 1579$** | **$\sum fx^2 = 83119.5$** |

**1. Variance ($s^2$):**
Treating the data as a sample (divisor $N - 1$):
$$s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{N}}{N - 1} = \frac{83119.5 - \frac{1579^2}{38}}{37} \approx \frac{83119.5 - 65611.6}{37} \approx 473.19$$

**2. Standard Deviation ($s$):**
$$s = \sqrt{s^2} = \sqrt{473.19} \approx 21.75$$
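As a cross-check, the same variance can be computed from the definitional form $\sum f(x - \bar{x})^2 / (N - 1)$ and compared against Python's `statistics` module. This is a sketch under the assumption that every observation sits at its class midpoint; the population variance (divisor $N$) is included only for comparison, in case the question intends the data as a full population:

```python
from math import sqrt
from statistics import pvariance, variance

# Midpoints and frequencies copied from the Problem 3 table
midpoints = [5.5, 15.5, 25.5, 35.5, 45.5, 55.5, 65.5]
freqs = [4, 5, 6, 1, 2, 11, 9]

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Definitional form: squared deviations from the mean, weighted by frequency
ss = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints))
sample_var = ss / (n - 1)
population_var = ss / n

# Independent check: repeat each midpoint f times and use the standard library
expanded = [x for f, x in zip(freqs, midpoints) for _ in range(f)]

print(round(sample_var, 2), round(sqrt(sample_var), 2))
print(round(variance(expanded), 2), round(pvariance(expanded), 2))
```

The first line prints `473.19 21.75`, agreeing with the shortcut formula; the second prints `473.19 460.73`, showing how much the result changes if the divisor is $N$ instead of $N - 1$.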