Question 436793: Sorry...this is more of a calculus question but I really need some help.
The following experimental data set has been collected {(xi, yi)} = { (0.6, 10), (1.3, 20), (2.1, 30), (3.6, 50), (7.3, 100) }. Find the best straight line y = mx that fits the data by using a method that minimizes the error squared:
E = Σ (i = 1 to n) (yi - mxi)^2 = (y1 - mx1)^2 + (y2 - mx2)^2 + ... + (yn - mxn)^2.
The task is to find the slope m of the straight line by minimizing E using the data set given above. Please help. I don't know where to begin.
Answer by Edwin McCravy(20064):
You'll have to see if you can get this to match your teacher's
notation. I'll explain it from scratch the way I teach it:
If we could find a line y = mx+b that would satisfy the above data
perfectly then we would have
y = m*x + b
----------------
10 = m*0.6 + b
20 = m*1.3 + b
30 = m*2.1 + b
50 = m*3.6 + b
100 = m*7.3 + b
But we most likely can't have that, so we calculate the
"errors" by subtracting the right sides from the left. Those
values are called the "residuals":
10 - (m*0.6 + b) = 10 - 0.6m - b
20 - (m*1.3 + b) = 20 - 1.3m - b
30 - (m*2.1 + b) = 30 - 2.1m - b
50 - (m*3.6 + b) = 50 - 3.6m - b
100 - (m*7.3 + b) = 100 - 7.3m - b
Next we make a function by adding up the sum of the squares
of these residuals:
S(m,b) = (10-0.6m-b)²+(20-1.3m-b)²+(30-2.1m-b)²+(50-3.6m-b)²+(100-7.3m-b)²
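As a minimal sketch (Python, with a hypothetical function name that is not part of the original answer), S(m,b) can be written as a function of the two unknowns so that any candidate line can be scored against the data:

```python
# Data from the question.
xs = [0.6, 1.3, 2.1, 3.6, 7.3]
ys = [10, 20, 30, 50, 100]

def sum_squared_residuals(m, b):
    """S(m, b): sum of the squared residuals y - (m*x + b) over the data."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

# With m = 0 and b = 0 each residual is just y, so S is the sum of y squared.
print(sum_squared_residuals(0.0, 0.0))
```

A smaller value of S means the line sits closer to the data points in the least-squares sense.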
1. Take the partial derivative with respect to b, which is just
like ordinary derivatives considering m to be a constant and
b and S to be variables:
∂S(m,b)/∂b =
2(10-0.6m-b)(-1)+2(20-1.3m-b)(-1)+2(30-2.1m-b)(-1)+
2(50-3.6m-b)(-1)+2(100-7.3m-b)(-1)
= -2(10-0.6m-b)-2(20-1.3m-b)-2(30-2.1m-b)-2(50-3.6m-b)-2(100-7.3m-b)
= -2[(10-0.6m-b)+(20-1.3m-b)+(30-2.1m-b)+(50-3.6m-b)+(100-7.3m-b)]
= -2[10-0.6m-b+20-1.3m-b+30-2.1m-b+50-3.6m-b+100-7.3m-b]
= -2[210-14.9m-5b]
= -420+29.8m+10b
--------------------------------------------------------------
S(m,b) = (10-0.6m-b)²+(20-1.3m-b)²+(30-2.1m-b)²+(50-3.6m-b)²+(100-7.3m-b)²
2. Now take the partial derivative considering b to be a constant
and m and S to be variables:
∂S(m,b)/∂m =
2(10-0.6m-b)(-0.6)+2(20-1.3m-b)(-1.3)+2(30-2.1m-b)(-2.1)+
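2(50-3.6m-b)(-3.6)+2(100-7.3m-b)(-7.3)
= -2[0.6(10-0.6m-b)+1.3(20-1.3m-b)+2.1(30-2.1m-b)+3.6(50-3.6m-b)+7.3(100-7.3m-b)]
= -2[6-0.36m-0.6b+26-1.69m-1.3b+63-4.41m-2.1b+180-12.96m-3.6b+730-53.29m-7.3b]
= -2[1005 - 72.71m - 14.9b]
= -2010 + 145.42m + 29.8b
Dividing by 2 (which does not change where the derivative equals 0) leaves
-1005 + 72.71m + 14.9b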
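Both partial derivatives can be written in terms of five sums over the data: ∂S/∂b = -2(Σy - mΣx - nb) and ∂S/∂m = -2(Σxy - mΣx² - bΣx). The sketch below (Python; the function names and the test point are illustrative assumptions, not part of the original answer) evaluates those formulas and compares them with a central-difference estimate as a sanity check:

```python
xs = [0.6, 1.3, 2.1, 3.6, 7.3]
ys = [10, 20, 30, 50, 100]

n = len(xs)
sum_x = sum(xs)                              # about 14.9
sum_y = sum(ys)                              # 210
sum_xx = sum(x * x for x in xs)              # about 72.71
sum_xy = sum(x * y for x, y in zip(xs, ys))  # about 1005

def S(m, b):
    """Sum of squared residuals for the line y = m*x + b."""
    return sum((y - m * x - b) ** 2 for x, y in zip(xs, ys))

def dS_db(m, b):
    """Partial derivative of S with respect to b."""
    return -2 * (sum_y - m * sum_x - n * b)

def dS_dm(m, b):
    """Partial derivative of S with respect to m."""
    return -2 * (sum_xy - m * sum_xx - b * sum_x)

# Central-difference check at an arbitrary point (m0, b0).
m0, b0, h = 1.0, 2.0, 1e-4
print(dS_db(m0, b0), (S(m0, b0 + h) - S(m0, b0 - h)) / (2 * h))
print(dS_dm(m0, b0), (S(m0 + h, b0) - S(m0 - h, b0)) / (2 * h))
```

Each printed pair should agree closely, since S is quadratic in m and b.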
We set each of those derivatives = 0 to find the minimum sum
-420 + 29.8m + 10b = 0
-1005 + 72.71m + 14.9b = 0
29.8m + 10b = 420
72.71m + 14.9b = 1005
Solve that system of equations and get:
m = 13.39550657, b = 2.08139042
So the regression line is y = mx + b or
y = 13.39550657x + 2.08139042
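The 2x2 system can be solved by hand with elimination or substitution. As a quick check, here is a minimal sketch using Cramer's rule (Python; the variable names are illustrative and the coefficients are copied from the two equations above):

```python
# 29.8m  + 10b   = 420
# 72.71m + 14.9b = 1005
a11, a12, c1 = 29.8, 10.0, 420.0
a21, a22, c2 = 72.71, 14.9, 1005.0

det = a11 * a22 - a12 * a21       # determinant of the coefficient matrix
m = (c1 * a22 - a12 * c2) / det   # slope
b = (a11 * c2 - a21 * c1) / det   # y-intercept

print(round(m, 8), round(b, 8))
```

The determinant is nonzero here, so the system has exactly one solution.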
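That is the least-squares line when a y-intercept b is allowed.
You'll have to put it in your own teacher's notation, but that's
the way I explain it to my students. Note that your problem states
the line as y = mx, with no y-intercept. If the line must pass
through the origin, set b = 0 in S and use only the m-derivative:
-1005 + 72.71m = 0, so m = 1005/72.71 ≈ 13.822. If an intercept
is allowed, the slope is m = 13.39550657.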
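Since the question writes the model as y = mx, it may help to see the one-parameter case as well. Assuming the line is forced through the origin (b = 0), minimizing E(m) = Σ(y - mx)² gives m = Σxy / Σx². A minimal sketch in Python, with an illustrative variable name:

```python
xs = [0.6, 1.3, 2.1, 3.6, 7.3]
ys = [10, 20, 30, 50, 100]

sum_xy = sum(x * y for x, y in zip(xs, ys))  # about 1005
sum_xx = sum(x * x for x in xs)              # about 72.71

# Slope of the least-squares line through the origin, y = m*x.
m_origin = sum_xy / sum_xx
print(m_origin)
```

This slope differs from the one obtained with an intercept, because the two models are fitted under different constraints.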
Edwin