Question 1117276: Hey there,
I’m not sure if this is the correct category for this question, but I’m not sure where else to post it. Also, this is not technically a homework question, but a question for a personal project I have been working on.
Some background: I am trying to predict the score of MLB baseball games before they occur using team statistics. Right now, to predict the score for a given team, I multiply four “factors” together. The factors are:
1. MLB Average Team Runs Per Game (typically around 4.5)
2. Team’s Hitting Score (a score I give to this team to tell me how much better or worse than average they are at hitting – typically ranges from 0.8 to 1.2)
3. Opponent’s Pitching Score (a score I give to this team’s opponent based on how good or bad their starting pitching and bullpen pitching is – typically ranges from 0.6 to 1.4)
4. Park Factor (how many runs are scored at the ballpark they are playing at compared to the average park – typically ranges from 0.8 to 1.3)
It’s important to note that the hitting score, pitching score, and park factor are all “normalized” to 1.00, meaning the most average team with the most average pitcher at the most average ballpark would have 1.00 for each of those scores. A team that hits 20% worse than average would have a score of 0.8, etc.
For example, I will take today’s Atlanta Braves vs Chicago Cubs projections to show you how this is done.
For Atlanta’s Score Projection:
MLB Average Runs Per Game: 4.46
Hitting Score: 1.20 (about 20% better than the average team)
Opponent Pitching Score: 1.15 (their opponent – Cubs – are pitching about 15% worse than the average team)
Park Factor: 0.98 (2% fewer runs are scored at this ballpark than the average park)
When I multiply these four factors together, I get (4.46 x 1.20 x 1.15 x 0.98) which comes out to a projection of 6.03 runs scored. I have an Excel spreadsheet that tracks all of these factors, projections, and the number of runs that the team actually scored. My question for you is, how can I improve this formula that I am using for more accurate projections? Currently, this assumes that the four factors I am using are equally predictive. Basically, my formula looks like this:
Factor1 * Factor2 * Factor3 * Factor4 = ScoreProjection
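The arithmetic for the Atlanta example can be checked with a short script. This is only a sketch: the function name `project_runs` and its parameter names are made up for illustration, and the numbers are the ones quoted above.

```python
def project_runs(league_avg, hitting, opp_pitching, park):
    """Multiply the four factors to get a run projection (hypothetical helper)."""
    return league_avg * hitting * opp_pitching * park


# Values from the Atlanta vs Chicago example in the post.
atlanta = project_runs(league_avg=4.46, hitting=1.20, opp_pitching=1.15, park=0.98)
print(round(atlanta, 2))
```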
However, I feel like there should be a coefficient for each factor. Maybe the team’s hitting score is not as predictive as their opponent’s pitching score, so I would need the pitching score to be more heavily weighted in my projections. So, I think I need a formula like this:
(X1*Factor1)*(X2*Factor2)*(X3*Factor3)*(X4*Factor4) = ScoreProjection
Where X1, X2, X3, and X4 are my factor coefficients. How can I go about finding these using my data that I have collected? There are obviously no perfect coefficients that would give me perfect score projections every time, so I suppose there is no true “solution” to this problem. I have investigated “multiple regression” through machine learning, but this does not yield the results I would like. Multiple regression gives me a formula like:
(X1*Factor1)+(X2*Factor2)+(X3*Factor3)+(X4*Factor4) = ScoreProjection
Where the factors are being added together as opposed to multiplied like I want. Really, I think I want to know which coefficients would lead me to the most accurate score projection (best fit line?). If this can be done, please let me know how!
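One point worth noting about the multiplied-coefficient form above: (X1*Factor1)*(X2*Factor2)*(X3*Factor3)*(X4*Factor4) equals (X1*X2*X3*X4) times the original product, so the four coefficients collapse into a single overall constant and cannot weight one factor more than another. A possible alternative, offered here as an assumption rather than something from my spreadsheet, is to put the weights in the exponents: Score = C * Hitting^a * Pitching^b * Park^c. Taking logarithms turns that into an ordinary multiple regression on log(Hitting), log(Pitching), and log(Park). The sketch below uses only the standard library; the function names are hypothetical, and the data is synthetic, generated from known exponents just to show that the fit recovers them. Since log(0) is undefined, this approach needs positive targets, so shutouts would have to be handled separately (for example by fitting on averages over several games).

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_log_linear(rows, runs):
    """Least-squares fit of log(runs) on the logs of the factors.

    rows: list of (hitting, pitching, park) tuples, all positive.
    runs: list of positive observed values.
    Returns (scale, [exp_hitting, exp_pitching, exp_park]).
    """
    X = [[1.0] + [math.log(f) for f in row] for row in rows]
    y = [math.log(r) for r in runs]
    k = len(X[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return math.exp(beta[0]), beta[1:]


# Synthetic, noise-free data built from made-up "true" weights (not real results).
true_scale, true_exps = 4.46, (0.8, 1.3, 1.0)
rows = [(0.8, 0.6, 0.8), (1.2, 1.4, 1.3), (1.0, 1.0, 1.0), (0.9, 1.2, 1.1),
        (1.1, 0.7, 0.95), (1.2, 1.15, 0.98), (0.85, 1.3, 1.2)]
runs = [true_scale * h ** true_exps[0] * p ** true_exps[1] * k ** true_exps[2]
        for h, p, k in rows]

scale, exps = fit_log_linear(rows, runs)
print(scale, exps)
```

With real game data the fit would not be exact, and an exponent near 1.0 for every factor would suggest the plain product is already reasonable.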
Let me know of any questions that you might have. I appreciate the help!!
Barrett Gray
BarrettGray11@Yahoo.Com
Answer by ikleyn(52775):
Sorry,
we do not accept this kind of assignment.
While our goal in this forum is to teach students to solve their school Math problems,
it is entirely outside the scope of this forum to do students' or users' own work for them.
Best regards.