SOLUTION: Hi everyone thanks for taking the time to read my question. I was reading this article about residuals (statistics) https://www.statology.org/residuals/ and in their example they

Question 1200118: Hi everyone thanks for taking the time to read my question. I was reading this article about residuals (statistics)
https://www.statology.org/residuals/
and in their example they have x = 12 go to more than one y value.
My question is: what is the residual for x = 12? Is it 3.31 or 0.31? Do I just pick one and ignore the other? Do I list them both at the same time? Do I average them or find the midpoint? I looked through my textbook and notes and can't find anything about what to do about multiple residuals. When I search Google I just keep getting multiple regression, which isn't what I want. This is just regular linear regression.
I'm honestly very lost and would appreciate your insight. Thank you!

Answer by ikleyn(52788):

I understand the source of your confusion: you tend to treat the set of observations as a function.

In reality, a set of observations is NOT NECESSARILY a function: it is only a sequence of (input, output) pairs,
with no requirement that every input have a unique output:
in different pairs, the output CAN BE DIFFERENT for the same input.

I think this is an ordinary situation (not an exceptional one) when people work with observations.

The standard procedure for finding a linear regression allows such pairs and handles them without any problem:
- it treats them as a sequence of pairs, not as a function.


To check, I went to the website
https://www.graphpad.com/quickcalcs/linear1/

and used the free linear regression calculator there.


I entered their numbers as a table (as is) and got precisely the same linear regression formula as on the referenced site
https://www.statology.org/residuals/

without any trouble. This means the procedure handles such cases smoothly.


So my conclusion is that this is an ordinary case: there is nothing to worry about.


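When you consider residuals, each one is the difference between the actual output of a pair
and the prediction of the linear regression for that pair's input value.
Every pair therefore has its own residual. For x = 12 there are two pairs, so there are two residuals,
and both values you mention are listed: you do not pick one, and you do not average them.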
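
To make the idea concrete, here is a minimal sketch in Python. The data below is made up for illustration (it is not the data from the article), and the helper name fit_line is my own. It fits an ordinary least-squares line to pairs in which x = 12 appears twice, then computes one residual per pair.

```python
# Hypothetical data, NOT the article's numbers: x = 12 appears in two pairs
# with different y values, so the observations are not a function of x.
xs = [8.0, 12.0, 12.0, 16.0, 20.0]
ys = [40.0, 42.0, 39.0, 45.0, 48.0]


def fit_line(xs, ys):
    """Ordinary least-squares line y = intercept + slope * x over all pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Both sums run over every pair, so a repeated x simply contributes twice.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)

# One residual per pair: actual y minus the line's prediction at that pair's x.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

for x, y, r in zip(xs, ys, residuals):
    print(f"x = {x:5.1f}  y = {y:5.1f}  residual = {r:+.3f}")
```

The two pairs with x = 12 share the same predicted value, so their residuals differ by exactly the difference of their y values. Both appear in the list of residuals, one line each.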


Again, to be clear: the observations are a set of pairs, not necessarily a function,
while the regression line is a true function.