SOLUTION: Sorry...this is more of a calculus question but I really need some help. The following experimental data set has been collected {(xi, yi)} = { (0.6, 10), (1.3, 20), (2.1, 30), (

Question 436793: Sorry...this is more of a calculus question but I really need some help.
The following experimental data set has been collected {(xi, yi)} = { (0.6, 10), (1.3, 20), (2.1, 30), (3.6, 50), (7.3, 100) }. Find the best straight line y = mx that fits the data by using a method that minimizes the error squared:
E = Σ_{i=1}^{n} (yi - mxi)^2 = (y1 - mx1)^2 + (y2 - mx2)^2 + ... + (yn - mxn)^2.
The task is to find the slope m of the straight line by minimizing E using the data set given above. Please help. I don't know where to begin.

Answer by Edwin McCravy(20064):

You'll have to see if you can get this to match your teacher's
notation.  I'll explain it from scratch the way I teach it:

If we could find a line y = mx+b that would satisfy the above data
perfectly then we would have

  y = m*x   + b
---------------- 
 10 = m*0.6 + b
 20 = m*1.3 + b
 30 = m*2.1 + b
 50 = m*3.6 + b
100 = m*7.3 + b

But it's unlikely that any line fits all five points exactly,
so we calculate the "errors" by subtracting the right sides
from the left.  Those values are called the "residuals":

 10 - (m*0.6 + b) =  10 - 0.6m - b
 20 - (m*1.3 + b) =  20 - 1.3m - b
 30 - (m*2.1 + b) =  30 - 2.1m - b
 50 - (m*3.6 + b) =  50 - 3.6m - b
100 - (m*7.3 + b) = 100 - 7.3m - b

Next we make a function by adding up the sum of the squares
of these residuals:

S(m,b) = (10-0.6m-b)²+(20-1.3m-b)²+(30-2.1m-b)²+(50-3.6m-b)²+(100-7.3m-b)² 
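
As a quick numerical companion to the formula above, here is a minimal Python sketch of S(m,b).  The names DATA and sum_squared_residuals are made up for this illustration; only the five data points come from the question.

```python
# Data points (x, y) taken from the question.
DATA = [(0.6, 10), (1.3, 20), (2.1, 30), (3.6, 50), (7.3, 100)]


def sum_squared_residuals(m, b, data=DATA):
    """Return S(m, b): the sum of the squared residuals y - (m*x + b)."""
    return sum((y - (m * x + b)) ** 2 for x, y in data)


# Evaluate S for an arbitrary trial line, here y = 13x + 2.
print(sum_squared_residuals(13.0, 2.0))
```

Trying a few different values of m and b this way shows S getting larger or smaller, which is the quantity the derivatives are used to minimize.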

1.  Take the partial derivative with respect to b, which is just 
    like ordinary derivatives considering m to be a constant and 
    b and S to be variables:

∂S(m,b)/∂b = 

2(10-0.6m-b)(-1)+2(20-1.3m-b)(-1)+2(30-2.1m-b)(-1)+

                            2(50-3.6m-b)(-1)+2(100-7.3m-b)(-1)

= -2(10-0.6m-b)-2(20-1.3m-b)-2(30-2.1m-b)-2(50-3.6m-b)-2(100-7.3m-b)  
 
= -2[(10-0.6m-b)+(20-1.3m-b)+(30-2.1m-b)+(50-3.6m-b)+(100-7.3m-b)]

= -2[10-0.6m-b+20-1.3m-b+30-2.1m-b+50-3.6m-b+100-7.3m-b]

= -2[210-14.9m-5b]  

= -420+29.8m+10b

--------------------------------------------------------------

S(m,b) = (10-0.6m-b)²+(20-1.3m-b)²+(30-2.1m-b)²+(50-3.6m-b)²+(100-7.3m-b)² 

2.  Now take the partial derivative considering b to be a constant 
    and m and S to be variables:

∂S(m,b)/∂m = 

2(10-0.6m-b)(-0.6)+2(20-1.3m-b)(-1.3)+2(30-2.1m-b)(-2.1)+

                          2(50-3.6m-b)(-3.6)+2(100-7.3m-b)(-7.3)

= 2[-0.6(10-0.6m-b)-1.3(20-1.3m-b)-2.1(30-2.1m-b)-3.6(50-3.6m-b)-7.3(100-7.3m-b)]
 
= 2[-6+0.36m+0.6b-26+1.69m+1.3b-63+4.41m+2.1b-180+12.96m+3.6b-730+53.29m+7.3b]
                  
= 2[-1005 + 72.71m + 14.9b]

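We set each of those derivatives = 0 to find the minimum sum
(in the second equation the common factor of 2 is divided out,
which does not change the solution):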

 -420 +  29.8m +   10b = 0
-1005 + 72.71m + 14.9b = 0

 29.8m +   10b =  420
72.71m + 14.9b = 1005
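
The coefficients in this system are just sums over the data (the sum of the x's, the y's, the x squared's and the x*y products).  Here is a rough Python sketch that rebuilds them as a check; the variable names are my own choice for illustration.

```python
# Data points (x, y) taken from the question.
DATA = [(0.6, 10), (1.3, 20), (2.1, 30), (3.6, 50), (7.3, 100)]

n = len(DATA)
sum_x = sum(x for x, _ in DATA)        # 0.6 + 1.3 + 2.1 + 3.6 + 7.3
sum_y = sum(y for _, y in DATA)        # 10 + 20 + 30 + 50 + 100
sum_xx = sum(x * x for x, _ in DATA)   # sum of the squared x values
sum_xy = sum(x * y for x, y in DATA)   # sum of the x*y products

# dS/db = 0 gives:                  (2*sum_x)*m + (2*n)*b   = 2*sum_y
# dS/dm = 0, divided by 2, gives:   sum_xx*m    + sum_x*b   = sum_xy
print(round(2 * sum_x, 2), 2 * n, 2 * sum_y)
print(round(sum_xx, 2), round(sum_x, 2), round(sum_xy, 2))
```

Up to floating-point rounding, the two printed rows should agree with the coefficients of the two equations above.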

Solve that system of equations and get:

m = 13.39550657,   b = 2.08139042
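
If you want to check that solution without a calculator's solver, here is one possible sketch using Cramer's rule in Python.  The helper name solve_2x2 is hypothetical, and any other method (substitution, elimination) gives the same values.

```python
def solve_2x2(a11, a12, c1, a21, a22, c2):
    """Solve a11*m + a12*b = c1 and a21*m + a22*b = c2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the system has no unique solution")
    m = (c1 * a22 - a12 * c2) / det
    b = (a11 * c2 - c1 * a21) / det
    return m, b


# Coefficients copied from the two equations above.
m, b = solve_2x2(29.8, 10, 420, 72.71, 14.9, 1005)
print(m, b)
```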

So the regression line is y = mx + b or

y = 13.39550657x + 2.08139042

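That is the correct least-squares line when a y-intercept b
is allowed, but you'll have to put it in your own teacher's
notation.  That's the way I explain it to my students.

Note, though, that your problem states the line as y = mx,
with no y-intercept.  In that case b = 0, only the second
equation applies, and it becomes 72.71m = 1005, so

m = 1005/72.71 = 13.822 (approximately).

The value m = 13.39550657 is the slope only for the line
y = mx + b.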
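
For the no-intercept form y = mx written in the question, setting dE/dm = -2[x1(y1 - mx1) + ... + xn(yn - mxn)] = 0 gives m = (sum of x*y)/(sum of x squared).  A small Python sketch of that computation follows; the function name slope_through_origin is invented for this example.

```python
# Data points (x, y) taken from the question.
DATA = [(0.6, 10), (1.3, 20), (2.1, 30), (3.6, 50), (7.3, 100)]


def slope_through_origin(data):
    """Least-squares slope for y = m*x: m = sum(x*y) / sum(x*x)."""
    sum_xy = sum(x * y for x, y in data)
    sum_xx = sum(x * x for x, _ in data)
    return sum_xy / sum_xx


m0 = slope_through_origin(DATA)
print(round(m0, 4))
```

This slope differs from the m of the y = mx + b line because forcing the line through the origin changes which line fits best.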

Edwin