Question 1167907
Let's prepare a sample computation and analysis for an observation involving a test of significance using dependent (correlated) data. We'll use example (a): "A Training Director wishes to find out whether or not a unique training program will increase employee efficiency (before and after the training)."

This scenario is perfect for a **paired-samples t-test** (also known as a dependent samples t-test).

---

## Sample Computation and Analysis: Paired-Samples t-test

**Scenario:** A Training Director wants to evaluate the effectiveness of a new training program designed to increase employee efficiency. They measure the efficiency scores of 10 randomly selected employees *before* the training and *after* the training.

**Research Question:** Does the training program significantly increase employee efficiency?

**Hypotheses:**
* **Null Hypothesis ($H_0$):** The training program does not change mean employee efficiency (i.e., $\mu_d = 0$, where $\mu_d$ is the population mean of the differences, after minus before).
* **Alternative Hypothesis ($H_1$):** The training program increases mean employee efficiency (i.e., $\mu_d > 0$, meaning after scores are higher than before scores on average). This is a one-tailed test.

**Significance Level ($\alpha$):** Let's set $\alpha = 0.05$.

**Data Collection:**
Efficiency scores (e.g., tasks completed per hour or an accuracy measure, scaled to a common metric on which higher is better) for 10 employees:

| Employee | Before Training (Score 1) | After Training (Score 2) | Difference ($d = \text{Score 2} - \text{Score 1}$) |
| :------- | :------------------------ | :----------------------- | :----------------------------------------------- |
| 1        | 75                        | 80                       | 5                                                |
| 2        | 80                        | 85                       | 5                                                |
| 3        | 68                        | 70                       | 2                                                |
| 4        | 92                        | 95                       | 3                                                |
| 5        | 70                        | 78                       | 8                                                |
| 6        | 85                        | 83                       | -2                                               |
| 7        | 73                        | 79                       | 6                                                |
| 8        | 78                        | 82                       | 4                                                |
| 9        | 88                        | 90                       | 2                                                |
| 10       | 79                        | 84                       | 5                                                |
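
As a quick check on the difference column, the table can be transcribed into two lists and the subtraction repeated in code. This is only a sketch: the variable names `before`, `after`, and `differences` are illustrative and not part of the original example.

```python
# Scores transcribed from the table above, one entry per employee, in order.
before = [75, 80, 68, 92, 70, 85, 73, 78, 88, 79]
after = [80, 85, 70, 95, 78, 83, 79, 82, 90, 84]

# Pair each employee's two scores and take after - before,
# matching the definition of d in the table header.
differences = [a - b for b, a in zip(before, after)]

print(differences)       # [5, 5, 2, 3, 8, -2, 6, 4, 2, 5]
print(sum(differences))  # 38
```

Keeping the two lists in the same employee order matters: the pairing is what makes this a dependent-samples design.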

**Sample Computation:**

1.  **Calculate the difference ($d$) for each pair:** (Already done in the table above).
    Differences: $5, 5, 2, 3, 8, -2, 6, 4, 2, 5$

2.  **Calculate the mean of the differences ($\bar{d}$):**
    $\bar{d} = \frac{\sum d}{n} = \frac{5+5+2+3+8-2+6+4+2+5}{10} = \frac{38}{10} = 3.8$

3.  **Calculate the standard deviation of the differences ($s_d$):**
    First, calculate $(d - \bar{d})^2$ for each difference:
    * $(5 - 3.8)^2 = (1.2)^2 = 1.44$
    * $(5 - 3.8)^2 = (1.2)^2 = 1.44$
    * $(2 - 3.8)^2 = (-1.8)^2 = 3.24$
    * $(3 - 3.8)^2 = (-0.8)^2 = 0.64$
    * $(8 - 3.8)^2 = (4.2)^2 = 17.64$
    * $(-2 - 3.8)^2 = (-5.8)^2 = 33.64$
    * $(6 - 3.8)^2 = (2.2)^2 = 4.84$
    * $(4 - 3.8)^2 = (0.2)^2 = 0.04$
    * $(2 - 3.8)^2 = (-1.8)^2 = 3.24$
    * $(5 - 3.8)^2 = (1.2)^2 = 1.44$
    Sum of $(d - \bar{d})^2 = 1.44 + 1.44 + 3.24 + 0.64 + 17.64 + 33.64 + 4.84 + 0.04 + 3.24 + 1.44 = 67.6$

    $s_d = \sqrt{\frac{\sum (d - \bar{d})^2}{n-1}} = \sqrt{\frac{67.6}{10-1}} = \sqrt{\frac{67.6}{9}} = \sqrt{7.511...} \approx 2.74$

4.  **Calculate the test statistic (t-value):**
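    $t = \frac{\bar{d}}{s_d / \sqrt{n}} = \frac{3.8}{2.74 / \sqrt{10}} = \frac{3.8}{2.74 / 3.16} = \frac{3.8}{0.867} \approx 4.383$

    (The intermediate values here are rounded; carrying full precision gives $t \approx 4.385$, which does not change the conclusion.)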

5.  **Determine Degrees of Freedom (df):**
    $df = n - 1 = 10 - 1 = 9$

6.  **Find the Critical t-value:**
    For a one-tailed t-test with $\alpha = 0.05$ and $df = 9$, we look up the critical t-value from a t-distribution table.
    Critical $t_{0.05, 9} \approx 1.833$

7.  **Calculate the p-value (optional, but standard in software):**
    Using statistical software or a calculator, for $t = 4.383$ with $df = 9$ (one-tailed), the p-value is approximately $0.0009$.

---

**Analysis and Conclusion:**

* **Comparison of Test Statistic to Critical Value:** Our calculated t-value ($4.383$) is greater than the critical t-value ($1.833$).
* **Comparison of p-value to Significance Level:** Our p-value ($0.0009$) is less than the significance level ($\alpha = 0.05$).

Since our calculated t-value falls into the rejection region (and our p-value is less than $\alpha$), we **reject the null hypothesis**.
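
The hand computation and the critical-value decision can be reproduced with Python's standard library. The sketch below assumes the ten differences from the table and hard-codes the critical value $1.833$ quoted from the t-table; the names `T_CRITICAL` and `reject_null` are illustrative. Because the code carries full precision, the statistic may come out marginally different from the hand-rounded figure.

```python
import math
import statistics

# Differences (after - before) for the 10 employees.
differences = [5, 5, 2, 3, 8, -2, 6, 4, 2, 5]
n = len(differences)

mean_d = statistics.mean(differences)   # mean of the differences
sd_d = statistics.stdev(differences)    # sample SD, uses the n - 1 denominator
standard_error = sd_d / math.sqrt(n)    # standard error of the mean difference
t_stat = mean_d / standard_error        # paired-samples t statistic
df = n - 1                              # degrees of freedom

# One-tailed critical value for alpha = 0.05 and df = 9, taken from the text.
T_CRITICAL = 1.833

# Upper-tailed test: reject H0 when t exceeds the critical value.
reject_null = t_stat > T_CRITICAL

print(f"mean_d = {mean_d:.2f}, s_d = {sd_d:.2f}, t = {t_stat:.3f}, df = {df}")
print("Reject H0" if reject_null else "Fail to reject H0")
```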

**Conclusion:** Based on this analysis, there is sufficient statistical evidence at the 0.05 significance level to conclude that the training program increases employee efficiency. The observed mean increase of 3.8 points is statistically significant: an average gain this large would be very unlikely if the training had no effect.
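
The one-tailed p-value used in the analysis can also be approximated without a statistics package by integrating the Student's t density numerically. The sketch below is one way to do that, using composite Simpson's rule; the helper names `t_pdf` and `upper_tail_p` are hypothetical, and a dedicated statistics library would normally be preferred in practice.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail_p(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using composite Simpson's rule on [0, t]."""
    if steps % 2:  # Simpson's rule needs an even number of intervals
        steps += 1
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    # The density is symmetric about 0, so half the mass lies above 0.
    return 0.5 - area_zero_to_t


differences = [5, 5, 2, 3, 8, -2, 6, 4, 2, 5]
n = len(differences)
t_stat = statistics.mean(differences) / (statistics.stdev(differences) / math.sqrt(n))
p_value = upper_tail_p(t_stat, n - 1)

print(f"t = {t_stat:.3f}, one-tailed p = {p_value:.5f}")
```

The result is a little under $0.001$, far below $\alpha = 0.05$, which agrees with the decision reached from the critical value.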

---

**Why Paired-Samples t-test is appropriate here:**

* **Dependent Data:** The key is that the two sets of observations (before and after scores) are from the *same* individuals. This makes the data dependent or correlated. Each employee acts as their own control.
* **Controlling for Individual Variability:** By focusing on the *difference* in scores for each individual, the paired t-test effectively controls for the natural variability in efficiency that exists between different employees (e.g., some employees are inherently more efficient than others regardless of training). This increases the statistical power of the test to detect a real effect of the training program.
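
To illustrate the gain in power, the sketch below computes the paired statistic and, purely for comparison, the statistic that would result from wrongly treating the two columns as independent groups (a Welch-style standard error). The independent-samples calculation is not part of the original analysis and would be the incorrect test for this design; it is shown only to make the contrast concrete.

```python
import math
import statistics

before = [75, 80, 68, 92, 70, 85, 73, 78, 88, 79]
after = [80, 85, 70, 95, 78, 83, 79, 82, 90, 84]
n = len(before)

# Paired analysis: work with each employee's own change in score.
differences = [a - b for b, a in zip(before, after)]
t_paired = statistics.mean(differences) / (statistics.stdev(differences) / math.sqrt(n))

# Independent-samples analysis (for contrast only): the standard error now
# includes the large employee-to-employee spread in both columns.
se_independent = math.sqrt(statistics.variance(before) / n + statistics.variance(after) / n)
t_independent = (statistics.mean(after) - statistics.mean(before)) / se_independent

print(f"paired t      = {t_paired:.2f}")
print(f"independent t = {t_independent:.2f}")
```

Both approaches see the same mean gain of 3.8 points, but the paired statistic is about 4.38 while the independent one is only about 1.16, because pairing removes the between-employee variability from the standard error.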