SOLUTION: TABLE: https://imagizer.imageshack.com/img924/3557/S7r4QJ.jpg Disk drives have been getting larger. Their capacity is now often given in terabytes (TB) where 1TB=1000 giga

Question 1196643: TABLE: https://imagizer.imageshack.com/img924/3557/S7r4QJ.jpg
Disk drives have been getting larger. Their capacity is now often given in terabytes (TB) where 1TB=1000 gigabytes, or about a trillion bytes. A search of prices for external disk drives on a large shopping website in a recent year found the accompanying data. Find and interpret the value of R^2
Answer by math_tutor2020(3817):

If you are in a hurry, then you can use technology to quickly compute the r and r^2 values.
Two examples would be the LinReg command on a TI83 (or similar) and the CORREL function in a spreadsheet.
There are many other options to choose from. Feel free to search out your favorite.
You should find these approximations:
r = 0.9878
r^2 = 0.9757
Since r^2 is very close to 1, the linear regression is a good fit. Approximately 97.57% of the variation in y (price) is explained by the linear relationship with x (capacity).
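
If you'd rather script it than use a calculator or spreadsheet, here is a minimal Python sketch of the same "use technology" route. It assumes the nine (x, y) pairs listed under "Given Data" in this answer, and the variable names are my own. Because it works from the unrounded data rather than the rounded summary statistics, the last displayed digit may differ slightly from the approximations above.

```python
from math import sqrt

# Data as listed under "Given Data": capacity (TB) and price (dollars).
x = [0.5, 1, 2, 3, 4, 6, 8, 12, 32]
y = [60.99, 77.99, 112.97, 110.99, 151.99, 425.34, 597.11, 1081.99, 4459]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Sums of squared deviations and cross-deviations about the means.
sxx = sum((xi - xbar) ** 2 for xi in x)
syy = sum((yi - ybar) ** 2 for yi in y)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

# Pearson correlation coefficient and its square.
r = sxy / sqrt(sxx * syy)
r_squared = r ** 2

print(f"r   = {r:.4f}")
print(f"r^2 = {r_squared:.4f}")
```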

If you have more time to go over the math, then there are various ways to calculate the correlation coefficient.
I'll go over two slightly different methods.

-------------------------------------------------------------------------------------------------------------

Method 1

x = capacity of hard drive in terabytes (TB)
y = price in dollars

Given info:
n = 9 = sample size = number of x,y pairs
xbar = 7.611
ybar = 786.49
SD(x) = 9.854
SD(y) = 1417.82

The term "xbar" refers to x with a horizontal bar over it, i.e. \bar{x}, which denotes the mean of the x values.
Likewise, ybar is the mean of the y values.

Given Data

x y
0.5 60.99
1 77.99
2 112.97
3 110.99
4 151.99
6 425.34
8 597.11
12 1081.99
32 4459

Form a third column which is the product of the x and y columns
Eg: 0.5*60.99 = 30.495 in the first row

x y xy
0.5 60.99 30.495
1 77.99 77.99
2 112.97 225.94
3 110.99 332.97
4 151.99 607.96
6 425.34 2552.04
8 597.11 4776.88
12 1081.99 12983.88
32 4459 142688

I strongly recommend using spreadsheet software.
It's not only fast and efficient, but also something that is expected in real world applications.
I'm using LibreOffice but you could use Excel or Google Sheets or whichever app you prefer most.

Add up the values in the xy column to get 164,276.155
Then we subtract off the value of n*xbar*ybar = 9*7.611*786.49 = 53,873.77851

So we have:
Sum(xy) - n*xbar*ybar = 164,276.155 - 53,873.77851 = 110,402.37649
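
If a spreadsheet isn't handy, the xy column and the subtraction above can be sketched in Python. This is only an illustration of the arithmetic; the variable names are mine, and xbar and ybar are the rounded values from the given info.

```python
# Data as listed under "Given Data".
x = [0.5, 1, 2, 3, 4, 6, 8, 12, 32]
y = [60.99, 77.99, 112.97, 110.99, 151.99, 425.34, 597.11, 1081.99, 4459]

# Rounded summary statistics from the given info.
n = 9
xbar = 7.611
ybar = 786.49

# Third column: product of each x with its paired y.
xy = [xi * yi for xi, yi in zip(x, y)]
sum_xy = sum(xy)

# Numerator of the correlation formula used in Method 1.
numerator = sum_xy - n * xbar * ybar

print(f"Sum(xy)               = {sum_xy:.3f}")
print(f"Sum(xy) - n*xbar*ybar = {numerator:.5f}")
```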

We'll divide that result by (n-1) times the product of the given standard deviation values.
So,
(n-1)*SD(x)*SD(y) = (9-1)*(9.854)*(1417.82) = 111,769.58624

Therefore,
r = (110,402.37649)/(111,769.58624)
r = 0.98776760480204
r^2 = (0.98776760480204)^2
r^2 = 0.97568484109636
r^2 = 0.9757
which is approximate.

Since r^2 is very close to 1, the linear regression is a good fit. Approximately 97.57% of the variation in y (price) is explained by the linear relationship with x (capacity).
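
For anyone who wants to check Method 1 end to end, here is a small Python sketch. The function name r_from_summary is hypothetical (my own, not from any library); it simply divides Sum(xy) - n*xbar*ybar by (n-1)*SD(x)*SD(y), using the rounded summary statistics from the given info.

```python
def r_from_summary(x, y, xbar, ybar, sd_x, sd_y):
    """Correlation from paired data plus precomputed means and sample SDs."""
    n = len(x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    numerator = sum_xy - n * xbar * ybar
    denominator = (n - 1) * sd_x * sd_y
    return numerator / denominator


# Data as listed under "Given Data".
x = [0.5, 1, 2, 3, 4, 6, 8, 12, 32]
y = [60.99, 77.99, 112.97, 110.99, 151.99, 425.34, 597.11, 1081.99, 4459]

# Rounded summary statistics from the given info.
r = r_from_summary(x, y, xbar=7.611, ybar=786.49, sd_x=9.854, sd_y=1417.82)
r_squared = r ** 2

print(f"r   = {r:.4f}")
print(f"r^2 = {r_squared:.4f}")
```

Passing the means and standard deviations in as arguments mirrors the problem, which supplies them as given values instead of asking you to recompute them.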

Note: The formula I used just now is
r = (Sum(xy) - n*xbar*ybar)/((n-1)*SD(x)*SD(y))

-------------------------------------------------------------------------------------------------------------

Method 2

x = capacity of hard drive in terabytes (TB)
y = price in dollars

Given info:
n = 9 = sample size = number of x,y pairs
xbar = 7.611
ybar = 786.49
SD(x) = 9.854
SD(y) = 1417.82

Given Data

x y
0.5 60.99
1 77.99
2 112.97
3 110.99
4 151.99
6 425.34
8 597.11
12 1081.99
32 4459

Instead of an xy column, we'll form the Zx column
Zx = (x - xbar)/(SD(x))
We're computing the z score for each x term

For instance, in the first row we have
Zx = (x-xbar)/(SD(x))
Zx = (0.5-7.611)/(9.854)
Zx = -0.72163588390501
Zx = -0.721636
Do the same thing for each item in the x column. The values of xbar and SD(x) will remain constant.

This is what the updated table looks like

x y Zx
0.5 60.99 -0.721636
1 77.99 -0.670895
2 112.97 -0.569413
3 110.99 -0.467932
4 151.99 -0.36645
6 425.34 -0.163487
8 597.11 0.039476
12 1081.99 0.445403
32 4459 2.475036
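
The Zx column above can also be generated with a few lines of Python. This is just a sketch of the z score step, using the rounded xbar and SD(x) from the given info; the variable names are my own.

```python
# x values as listed under "Given Data".
x = [0.5, 1, 2, 3, 4, 6, 8, 12, 32]

# Rounded summary statistics from the given info.
xbar = 7.611
sd_x = 9.854

# z score of each x value: how many standard deviations it sits from the mean.
zx = [(xi - xbar) / sd_x for xi in x]

for xi, z in zip(x, zx):
    print(f"{xi:>5} {z:>10.6f}")
```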

Follow similar steps for the Zy column
For example, we'll have the following calculation for the 1st row.
Zy = (y - ybar)/(SD(y))
Zy = (60.99 - 786.49)/(1417.82)
Zy = -0.511701

We have this so far

x y Zx Zy
0.5 60.99 -0.721636 -0.511701
1 77.99 -0.670895 -0.499711
2 112.97 -0.569413 -0.475039
3 110.99 -0.467932 -0.476436
4 151.99 -0.36645 -0.447518
6 425.34 -0.163487 -0.254722
8 597.11 0.039476 -0.133571
12 1081.99 0.445403 0.208419
32 4459 2.475036 2.590251

Then we'll multiply the Zx and Zy items for each row.
Eg: Zx*Zy = (-0.721636)*(-0.511701) = 0.369262 in row one

This is what the fully completed table looks like

x y Zx Zy ZxZy
0.5 60.99 -0.721636 -0.511701 0.369262
1 77.99 -0.670895 -0.499711 0.335254
2 112.97 -0.569413 -0.475039 0.270493
3 110.99 -0.467932 -0.476436 0.22294
4 151.99 -0.36645 -0.447518 0.163993
6 425.34 -0.163487 -0.254722 0.041644
8 597.11 0.039476 -0.133571 -0.005273
12 1081.99 0.445403 0.208419 0.09283
32 4459 2.475036 2.590251 6.410964

Add up the values in the final column and you should get roughly 7.902107

So,
r = Sum(ZxZy)/(n-1)
r = 7.902107/(9-1)
r = 7.902107/8
r = 0.987763375
r^2 = (0.987763375)^2
r^2 = 0.9756764849914
r^2 = 0.9757

Since r^2 is very close to 1, the linear regression is a good fit. Approximately 97.57% of the variation in y (price) is explained by the linear relationship with x (capacity).
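
Method 2 can be sketched in Python as well. The helper name z_scores is hypothetical (my own), and the means and standard deviations are the rounded values from the given info. The code keeps full precision in the z scores instead of rounding each one to six decimals, so the result may differ from the hand computation in the later digits.

```python
def z_scores(values, mean, sd):
    """Return the z score of each value for the supplied mean and SD."""
    return [(v - mean) / sd for v in values]


# Data as listed under "Given Data".
x = [0.5, 1, 2, 3, 4, 6, 8, 12, 32]
y = [60.99, 77.99, 112.97, 110.99, 151.99, 425.34, 597.11, 1081.99, 4459]
n = len(x)

# Rounded summary statistics from the given info.
zx = z_scores(x, mean=7.611, sd=9.854)
zy = z_scores(y, mean=786.49, sd=1417.82)

# Multiply the paired z scores, add them up, and divide by n-1.
sum_zxzy = sum(a * b for a, b in zip(zx, zy))
r = sum_zxzy / (n - 1)
r_squared = r ** 2

print(f"Sum(ZxZy) = {sum_zxzy:.6f}")
print(f"r         = {r:.4f}")
print(f"r^2       = {r_squared:.4f}")
```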

Answer: r^2 = 0.9757 approximately
