SOLUTION: Because it is not practical to weigh bears in the field, researchers sought to develop a model to predict a bear's weight based on its length. Here are the results for a sample:

Question 1196665: Because it is not practical to weigh bears in the field, researchers sought to develop a model to predict a bear's weight based on its length. Here are the results for a sample:
Total Length (cm) Weight (kg)
139.0 110
138.0 60
139.0 90
120.5 60
149.0 85
141.0 100
141.0 95
150.0 85
166.0 155
151.5 140
129.5 105
150.0 110
The residual associated with the bear whose length is 149.0 cm and weight is 85 kg is _______kg. (round your answer to three digits after the decimal)

Answer by math_tutor2020(3816):

Answer: -24.960

============================================================

Explanation:

x = total length (cm)
y = weight (kg)

Use technology to find the equation of the regression line.
I used GeoGebra to get y = 1.69417x - 142.47092 approximately.
You could use a spreadsheet program or any linear regression calculator to get the same thing.
I'll go into further detail about where this equation comes from in the optional section further down.

Plug in x = 149.0 to find that,
y = 1.69417x - 142.47092
y = 1.69417*149.0 - 142.47092
y = 109.96041

A length of x = 149.0 cm leads to a predicted weight of about y = 109.96041 kg.
The observed weight for this bear is y = 85 kg.

The residual is the error between the observed y value and predicted y value
residual = (observed y value) - (predicted y value)
residual = 85 - 109.96041
residual = -24.96041
residual = -24.960
This is approximate and rounded to three decimal places (nearest thousandth)
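
The arithmetic above can be double-checked with a few lines of Python. This is just a sketch that uses the rounded slope and intercept quoted in this answer. The function name predict_weight is hypothetical.

```python
# Rounded regression coefficients quoted in the explanation above.
SLOPE = 1.69417
INTERCEPT = -142.47092

def predict_weight(length_cm):
    """Predicted weight (kg) from the regression line y = mx + b."""
    return SLOPE * length_cm + INTERCEPT

observed = 85                       # actual weight (kg) of the 149.0 cm bear
predicted = predict_weight(149.0)   # weight the line predicts for 149.0 cm

# residual = (observed y value) - (predicted y value)
residual = observed - predicted

print(f"predicted = {predicted:.5f}")
print(f"residual  = {residual:.3f}")
```

A negative residual means the observed point sits below the regression line, i.e. this bear weighs less than the line predicts for its length.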


---------------------------------------------


This section goes into further detail about where the regression line came from.
Depending on your teacher, this section is optional.

In real-world settings, you won't need to know the formulas. It's much more efficient to use calculators, software, or spreadsheets.
However, it's still good to know what's going on under the hood. I'll leave out the proofs and derivations of each formula.
Those are better suited for calculus and linear algebra settings.

Here's the original data set of x and y values paired up together
x      y
139    110
138    60
139    90
120.5  60
149    85
141    100
141    95
150    85
166    155
151.5  140
129.5  105
150    110
We'll form the following columns:
x^2
xy

The x^2 column is where we square each x value.
E.g., 139 squares to 139^2 = 139*139 = 19321

The xy column has us multiply the x and y values together, one row at a time.
E.g., 139.0*110 = 15290 in the first row of this column.

I strongly recommend using spreadsheet software rather than doing it all by hand.

Here's what all that looks like
x      y    x^2       xy
139    110  19321     15290
138    60   19044     8280
139    90   19321     12510
120.5  60   14520.25  7230
149    85   22201     12665
141    100  19881     14100
141    95   19881     13395
150    85   22500     12750
166    155  27556     25730
151.5  140  22952.25  21210
129.5  105  16770.25  13597.5
150    110  22500     16500
Next we add up the values of each column
P = sum of the x values = 1714.5
Q = sum of the y values = 1195
R = sum of the x^2 values = 246447.75
S = sum of the xy values = 173257.5
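
If a spreadsheet isn't handy, the extra columns and the four sums can be sketched in Python as follows. The names P, Q, R, S match the ones used here. The list names are made up for this illustration.

```python
# Data from the question: x = total length (cm), y = weight (kg).
x_values = [139.0, 138.0, 139.0, 120.5, 149.0, 141.0,
            141.0, 150.0, 166.0, 151.5, 129.5, 150.0]
y_values = [110, 60, 90, 60, 85, 100, 95, 85, 155, 140, 105, 110]

# Build the two extra columns, one entry per row of the table.
x_squared = [x * x for x in x_values]
xy_products = [x * y for x, y in zip(x_values, y_values)]

# Column totals.
P = sum(x_values)      # sum of the x values
Q = sum(y_values)      # sum of the y values
R = sum(x_squared)     # sum of the x^2 values
S = sum(xy_products)   # sum of the xy values

print(P, Q, R, S)
```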

The linear regression equation is of the form y = mx+b
m = slope
b = y intercept

To calculate m and b, we use these two formulas
m = (n*S - P*Q)/(n*R - P^2)

b = (Q*R - P*S)/(n*R - P^2)
where P,Q,R,S were mentioned in the previous paragraph above.

The numerators are different, but the denominators are identical.
The n refers to the sample size. It's the number of x,y pairs of values. In this case we have n = 12 such items.

So,
m = (n*S - P*Q)/(n*R - P^2)

m = (12*173257.5 - 1714.5*1195)/(12*246447.75 - (1714.5)^2)

m = 1.6941680312382

m = 1.69417
is the approximate slope

And,
b = (Q*R - P*S)/(n*R - P^2)

b = (1195*246447.75 - 1714.5*173257.5)/(12*246447.75 - (1714.5)^2)

b = -142.470924129823

b = -142.47092
is the approximate y intercept.

Therefore, the template y = mx+b updates to the approximation of y = 1.69417x - 142.47092
This is the linear regression equation (aka line of best fit) used in the first section.
Follow the steps from that section to get the residual of -24.960
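
As a quick check of the slope and intercept formulas, here is a small Python sketch that plugs in the column sums reported above. The names denominator, m, and b are just labels for this illustration.

```python
# Column sums and sample size reported in the explanation.
n = 12            # number of (x, y) pairs
P = 1714.5        # sum of the x values
Q = 1195          # sum of the y values
R = 246447.75     # sum of the x^2 values
S = 173257.5      # sum of the xy values

# Both formulas share the same denominator.
denominator = n * R - P ** 2

m = (n * S - P * Q) / denominator   # slope
b = (Q * R - P * S) / denominator   # y intercept

print(f"m = {m:.5f}")
print(f"b = {b:.5f}")
```

Rounded to five decimal places, these values should reproduce y = 1.69417x - 142.47092.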