Question 1193085
<font color=black size=3>
Given Data
<table border = "1" cellpadding = "5"><tr><td>Group 1</td><td>Group 2</td></tr><tr><td>48.86</td><td>48.88</td></tr><tr><td>50.60</td><td>52.63</td></tr><tr><td>51.02</td><td>52.55</td></tr><tr><td>47.99</td><td>50.94</td></tr><tr><td>54.20</td><td>53.02</td></tr><tr><td>50.66</td><td>50.66</td></tr><tr><td>45.91</td><td>47.78</td></tr><tr><td>48.79</td><td>48.44</td></tr><tr><td>47.76</td><td>48.92</td></tr><tr><td>51.13</td><td>51.63</td></tr></table>


Part (A)


Let's compute the sample mean of Group 1.
To get the sample mean, we first add up the data values
48.86+50.6+51.02+47.99+54.2+50.66+45.91+48.79+47.76+51.13 = 496.92


Then we divide that sum by the sample size n = 10
496.92/n = 496.92/10 = 49.692
This is the sample mean (xbar) for group 1.
I'll refer to this as xbar1.


Follow similar steps for group 2 to find that
xbar2 = 50.545
Note: the 2 in xbar2 is a label for group 2. It does NOT mean xbar squared.
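
If you'd like to check the two sample means with a short program, here's a minimal Python sketch. The names group1, group2 and sample_mean are my own labels for illustration; the data values are copied from the given table.

```python
# Data copied from the "Given Data" table.
group1 = [48.86, 50.60, 51.02, 47.99, 54.20, 50.66, 45.91, 48.79, 47.76, 51.13]
group2 = [48.88, 52.63, 52.55, 50.94, 53.02, 50.66, 47.78, 48.44, 48.92, 51.63]


def sample_mean(values):
    """Add up the data values, then divide by the sample size n."""
    return sum(values) / len(values)


xbar1 = sample_mean(group1)
xbar2 = sample_mean(group2)

# Round only for display; keep full precision for later steps.
print("xbar1 =", round(xbar1, 3))
print("xbar2 =", round(xbar2, 3))
```

The printed values should agree with xbar1 = 49.692 and xbar2 = 50.545 found above.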


Now let's calculate the variance for group 1.
Here are the data values for group 1 only, which I'll call X.
<table border = "1" cellpadding = "5"><tr><td>X</td></tr><tr><td>48.86</td></tr><tr><td>50.6</td></tr><tr><td>51.02</td></tr><tr><td>47.99</td></tr><tr><td>54.2</td></tr><tr><td>50.66</td></tr><tr><td>45.91</td></tr><tr><td>48.79</td></tr><tr><td>47.76</td></tr><tr><td>51.13</td></tr></table>
For each X value, subtract off the value of xbar = 49.692
Then square the difference.
For example, we have (X-xbar)^2 = (48.86-49.692)^2 = 0.692224 in the first row.
<table border = "1" cellpadding = "5"><tr><td>X</td><td>(X-xbar)^2</td></tr><tr><td>48.86</td><td>0.692224</td></tr><tr><td>50.6</td><td>0.824464</td></tr><tr><td>51.02</td><td>1.763584</td></tr><tr><td>47.99</td><td>2.896804</td></tr><tr><td>54.2</td><td>20.322064</td></tr><tr><td>50.66</td><td>0.937024</td></tr><tr><td>45.91</td><td>14.303524</td></tr><tr><td>48.79</td><td>0.813604</td></tr><tr><td>47.76</td><td>3.732624</td></tr><tr><td>51.13</td><td>2.067844</td></tr></table>
Sum everything in the second column and you should get 48.35376
This is the sum of the squared deviations from the mean, which I'll call the Sum of the Squared Errors (SSE)
Divide the SSE value by n-1 = 10-1 = 9 to compute the sample variance


sample variance = (SSE)/(n-1)
sample variance = (48.35376)/9
sample variance = 5.37264
I'll refer to this as V1 to represent the variance of group 1.


Follow similar steps to find that V2 = 3.70316 is the approximate sample variance of group 2.
A calculator with a built-in sample standard deviation function makes quick work of this: compute the standard deviation, then square it to get the variance.
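
The variance steps can also be sketched in Python. This is only an illustration: sample_variance is a helper name I made up, and it follows the same recipe as the table (squared deviations, sum them, divide by n-1). The cross-check uses statistics.variance from Python's standard library, which computes the sample variance.

```python
import statistics

group1 = [48.86, 50.60, 51.02, 47.99, 54.20, 50.66, 45.91, 48.79, 47.76, 51.13]
group2 = [48.88, 52.63, 52.55, 50.94, 53.02, 50.66, 47.78, 48.44, 48.92, 51.63]


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    xbar = sum(values) / n
    sse = sum((x - xbar) ** 2 for x in values)  # the SSE from the text
    return sse / (n - 1)


v1 = sample_variance(group1)
v2 = sample_variance(group2)

print("V1 =", round(v1, 5))
print("V2 =", round(v2, 5))

# The library function should give the same answers (up to float rounding).
print("library V1 =", round(statistics.variance(group1), 5))
print("library V2 =", round(statistics.variance(group2), 5))
```

Both approaches should match V1 = 5.37264 and V2 = 3.70316 from above.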


Now onto the standard error (SE)
SE = sqrt( (V1)/(n1) + (V2)/(n2) )
SE = sqrt( (5.37264)/(10) + (3.70316)/(10) )
SE = 0.95266993234803


The SE helps us find the t statistic. Under the null hypothesis the two population means are equal, so mu1 - mu2 = 0.
t = ((xbar1 - xbar2) - (mu1 - mu2))/(SE)
t = ((49.692 - 50.545) - (0))/(0.95266993234803)
t = -0.89537831628381
t = -0.8954
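
Here's a small Python sketch of the SE and t statistic formulas, plugging in the means and variances already found. The variable names are mine, and hypothesized_diff = 0 is the value of mu1 - mu2 under the null.

```python
import math

# Values computed earlier in part (A).
n1, n2 = 10, 10
xbar1, xbar2 = 49.692, 50.545
v1, v2 = 5.37264, 3.70316

# Standard error for two independent samples (variances not pooled).
se = math.sqrt(v1 / n1 + v2 / n2)

# Under the null hypothesis, mu1 - mu2 = 0.
hypothesized_diff = 0
t = ((xbar1 - xbar2) - hypothesized_diff) / se

print("SE =", round(se, 6))
print("t  =", round(t, 4))
```

The output should agree with SE = 0.95267 and t = -0.8954 (both rounded).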


For the degrees of freedom, a conservative choice is the smaller of n1-1 and n2-1
Because n1 = n2 = 10, we can simply think of it as n-1
The degrees of freedom is df = n-1 = 10-1 = 9


Use a calculator like this one
<a href = "https://stattrek.com/online-calculator/t-distribution.aspx">https://stattrek.com/online-calculator/t-distribution.aspx</a>
to find that P(T < -0.8954) = 0.1969 approximately when we have df = 9
This doubles to 2*0.1969 = 0.3938 because we have a two-sided test (the problem asks for the "P-value for the two-sided alternative").
The result 0.3938 is the approximate P-value.
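
If you don't have the online calculator handy, the tail area can be approximated numerically. The sketch below is my own stand-in for the calculator, not the method the calculator itself uses: it integrates the t distribution's density with Simpson's rule using only Python's standard library. The function names t_pdf and two_sided_p are hypothetical helpers.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided P-value: 1 minus the area between -|t| and +|t|.

    The area from 0 to |t| is approximated with Simpson's rule
    (steps must be even), then doubled by symmetry.
    """
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 == 1 else 2
        total += weight * t_pdf(i * h, df)
    half_area = (h / 3) * total  # area from 0 to |t|
    return 1 - 2 * half_area


p_value = two_sided_p(-0.8954, 9)
print("two-sided P-value =", round(p_value, 4))
```

The result should land within rounding of the 0.3938 found above. A tiny difference in the last digit is possible because the text rounds the one-tail area to 0.1969 before doubling.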


------------------------------


Summary:


xbar1 = 49.692 and xbar2 = 50.545 are the sample means
V1 = 5.37264 and V2 = 3.70316 are the sample variances
t = -0.8954 is the test statistic
df = 9 is the degrees of freedom
p-value = 0.3938


==============================================================================================================
Part (B)


Here are the two original groups of data.
<table border = "1" cellpadding = "5"><tr><td>Group 1</td><td>Group 2</td></tr><tr><td>48.86</td><td>48.88</td></tr><tr><td>50.60</td><td>52.63</td></tr><tr><td>51.02</td><td>52.55</td></tr><tr><td>47.99</td><td>50.94</td></tr><tr><td>54.20</td><td>53.02</td></tr><tr><td>50.66</td><td>50.66</td></tr><tr><td>45.91</td><td>47.78</td></tr><tr><td>48.79</td><td>48.44</td></tr><tr><td>47.76</td><td>48.92</td></tr><tr><td>51.13</td><td>51.63</td></tr></table>
For each row, subtract the values in the form X1 - X2
X1 is from group 1
X2 is from group 2
For instance, the first row has 48.86 - 48.88 = -0.02 as the difference
We'll list the differences in the column labeled "d"
<table border = "1" cellpadding = "5"><tr><td>Group 1</td><td>Group 2</td><td>d</td></tr><tr><td>48.86</td><td>48.88</td><td>-0.02</td></tr><tr><td>50.6</td><td>52.63</td><td>-2.03</td></tr><tr><td>51.02</td><td>52.55</td><td>-1.53</td></tr><tr><td>47.99</td><td>50.94</td><td>-2.95</td></tr><tr><td>54.2</td><td>53.02</td><td>1.18</td></tr><tr><td>50.66</td><td>50.66</td><td>0</td></tr><tr><td>45.91</td><td>47.78</td><td>-1.87</td></tr><tr><td>48.79</td><td>48.44</td><td>0.35</td></tr><tr><td>47.76</td><td>48.92</td><td>-1.16</td></tr><tr><td>51.13</td><td>51.63</td><td>-0.5</td></tr></table>
If you were to compute the sample mean of the d column, you should find that the mean is -0.853
We call this value dbar in much the same way xbar is denoted. The "bar" refers to the horizontal line up top.
dbar = -0.853
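
The d column and its mean can be sketched in Python like so. The list names are my own; zip pairs up each row's Group 1 and Group 2 values so the subtraction is always X1 - X2.

```python
group1 = [48.86, 50.60, 51.02, 47.99, 54.20, 50.66, 45.91, 48.79, 47.76, 51.13]
group2 = [48.88, 52.63, 52.55, 50.94, 53.02, 50.66, 47.78, 48.44, 48.92, 51.63]

# Row-by-row differences in the form X1 - X2.
d = [x1 - x2 for x1, x2 in zip(group1, group2)]

# Mean of the differences.
dbar = sum(d) / len(d)

print("d    =", [round(value, 2) for value in d])
print("dbar =", round(dbar, 3))
```

The rounded list should match the d column in the table, and dbar should come out to -0.853.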


The sample variance of the d column will follow the same type of steps as described in part (A) when I detailed how to compute the variance of group 1.
You should get a sample variance of 1.610668 which leads to the sample standard deviation of sqrt(1.610668) = 1.269121
I'll refer to this standard deviation as <font color=red>S</font><font color=blue>d</font> to indicate "<font color=red>S</font>tandard deviation of the <font color=blue>d</font>ifferences".


Now onto the standard error (SE)
SE = Sd/sqrt(n)
SE = 1.269121/sqrt(10)
SE = 0.40133129863506
SE = 0.401331


The SE allows us to compute the test statistic. Under the null hypothesis the mean difference is mu_d = 0.
t = (dbar - mu_d)/SE
t = (-0.853 - 0)/0.40133129863506
t = -2.12542605797524
t = -2.1254
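
Here's a short Python sketch that strings the part (B) steps together: standard deviation of the differences, then SE, then t. It uses statistics.stdev from the standard library, which returns the sample standard deviation (the n-1 version). The variable names are mine, and mu_d = 0 is the hypothesized mean difference under the null.

```python
import math
import statistics

# The d column from the table (X1 - X2 for each row).
d = [-0.02, -2.03, -1.53, -2.95, 1.18, 0.0, -1.87, 0.35, -1.16, -0.50]

n = len(d)
dbar = statistics.mean(d)
sd = statistics.stdev(d)    # sample standard deviation of the differences
se = sd / math.sqrt(n)      # standard error of dbar

mu_d = 0                    # hypothesized mean difference under the null
t = (dbar - mu_d) / se

print("Sd =", round(sd, 6))
print("SE =", round(se, 6))
print("t  =", round(t, 4))
```

These should agree with Sd = 1.269121, SE = 0.401331 and t = -2.1254 from the steps above.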


The degrees of freedom is n-1 = 10-1 = 9


I'll then use this calculator again
<a href = "https://stattrek.com/online-calculator/t-distribution.aspx">https://stattrek.com/online-calculator/t-distribution.aspx</a>
to find that P(T < -2.1254) = 0.0312 when df = 9 which doubles to 2*0.0312 = 0.0624 since we're doing a two-tailed test.
The result 0.0624 is the approximate P-value.
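
As a cross-check on the calculator, here's a sketch that uses a standard closed-form expression for the t distribution when the degrees of freedom is an odd number (which df = 9 is). This is a technique I'm swapping in for the online calculator, and the function name two_sided_p_odd_df is a hypothetical helper. It only works for odd df.

```python
import math


def two_sided_p_odd_df(t_stat, df):
    """Two-sided P-value for Student's t when df is odd.

    With theta = atan(|t| / sqrt(df)), the area between -|t| and +|t| is
    (2/pi) * (theta + sin(theta) * series), where the series is
    cos + (2/3)cos^3 + (2*4)/(3*5)cos^5 + ... up to cos^(df-2).
    For df = 1 the series is empty.
    """
    if df < 1 or df % 2 == 0:
        raise ValueError("this closed form only covers odd df")
    theta = math.atan(abs(t_stat) / math.sqrt(df))
    s, c = math.sin(theta), math.cos(theta)
    series = 0.0
    coef = 1.0
    power = c
    for k in range((df - 1) // 2):
        if k > 0:
            coef *= (2 * k) / (2 * k + 1)
            power *= c * c
        series += coef * power
    central_area = (2 / math.pi) * (theta + s * series)
    return 1 - central_area


p_value = two_sided_p_odd_df(-2.1254, 9)
print("two-sided P-value =", round(p_value, 4))
```

The result should land within rounding of the 0.0624 found above. The last digit may differ slightly because the text rounds the one-tail area to 0.0312 before doubling.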


------------------------------


Summary:


dbar = -0.853 is the sample mean of the differences (d column)
1.610668 is the approximate sample variance of the differences (d column)
t = -2.1254 is the approximate test statistic
df = 9 = degrees of freedom
P-value = 0.0624


==============================================================================================================
Part (C)


Admittedly there are a lot of numbers and variables to keep track of.
It might be overwhelming if you aren't very familiar with statistics.


Though if I had to pick one variable to focus on, I would say it's the P-value. 
In many scientific journals, the researchers report the P-value to the reader to indicate how (in)significant the results were. 


In part (A), we got a P-value of roughly 0.3938
In part (B), we got a P-value of roughly 0.0624
That's quite a gap.


Recall that comparing the P-value to the significance level alpha determines whether you reject or fail to reject the null.
Let's say the significance level is alpha = 0.05, which is a commonly used default.


At this alpha value, we'd fail to reject the null for both part (A) and part (B). Why? Because the p-value for each is not less than alpha = 0.05
We reject the null only if the p-value is smaller than alpha.


If we set alpha = 0.10, then we'd fail to reject in part (A) but reject the null in part (B)
Sometimes you may see a significance level of alpha = 0.10 (of course it depends on the context).
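As you can see, part (B) leads to a situation where we are more likely to reject the null. Working with the paired differences drops the standard error from about 0.9527 to about 0.4013, so the same mean difference of -0.853 produces a larger t statistic (in absolute value) and a smaller P-value.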
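
The decision rule can be written as a tiny Python sketch. The function name decide and the dictionary of P-values are my own labels; the P-values are the rounded ones from parts (A) and (B).

```python
def decide(p_value, alpha):
    """Reject the null only if the P-value is smaller than alpha."""
    if p_value < alpha:
        return "reject the null"
    return "fail to reject the null"


# Rounded P-values from parts (A) and (B).
p_values = {"Part (A)": 0.3938, "Part (B)": 0.0624}

for alpha in (0.05, 0.10):
    for part, p in p_values.items():
        print(f"alpha = {alpha:.2f}, {part}: {decide(p, alpha)}")
```

Running it lists the decision for each part at each alpha, matching the conclusions described above.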
</font>