SOLUTION: Should I sacrifice near-normality in order to run regression analysis in Excel2007? The first part of my assignment was correcting for missing data. Seeing that some of the var

Question 459853: Should I sacrifice near-normality in order to run regression analysis in Excel 2007?
The first part of my assignment was correcting for missing data. Seeing that some of the variables were anything but normal, I removed outliers. This left me with uneven sample sizes. If I am trying to analyze the factors affecting hospital cost, don't I have to run regression analysis? When I try to do it in Excel, it won't let me, since the sample sizes are not equal.
So, if I am interested in how the different factors affect cost, do I remove the outliers in the cost variable (y), and then remove the accompanying values for those hospitals across the board? That is, listwise deletion, except to correct for outliers?
Or do I just accept that the majority of these are skewed and go from there?
Any guidance would be appreciated. My OCD tendencies are not good when taking an introductory biostats class LOL.
Thanks in advance!
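Here are my descriptive statistics with the original data.
Columns, left to right: total operating cost, beds, services, FTEs, case-mix-adjusted discharges, surgical visits, ER visits, ambulatory visits, live births.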

Mean 125636198.1 85 137 967 3100 18507 20577 2082 651
Median 77698890.47 48 134 498 1251 8582 20833 1296 420
Mode #N/A 15 124 4771 #N/A #N/A 0 #N/A 0
Standard Deviation 163253552.2 102 30 1178 4491 25771 16569 2244 792
Sample Variance 2.66517E+16 10468.92 887.90 1386601.94 20169352.31 664147515.61 274531737.27 5034278.69 626619.76
Kurtosis 19.36996943 7.47 -0.19 4.45 4.97 7.61 1.77 4.61 3.88
Skewness 3.862249824 2.48 0.41 2.24 2.31 2.65 1.08 2.07 2.00
Range 1118414097 537 128 4842 21092 133432 83229 10913 3669
Minimum 11017817.58 2 78 1 49 435 0 98 0
Maximum 1129431915 539 206 4843 21141 133867 83229 11011 3669
Sum 9673987257 6497 10112 73467 229400 1425053 1605018 154057 50098
Count 77 76 74 76 74 77 78 74 77
Below are the descriptive statistics after the removal of outliers and missing data (same column order; cost is in millions here):
Mean 71.79631515 59 133 581 1226 9711 16776 1392 384
Median 66.9061 40 133 431 938 7502.5 16856 1204.5 381
Mode #N/A 40 124 431 #N/A #N/A 0 #N/A 0
Standard Deviation 41.44398741 53 25 453 1016 8129 10960 1135 349
Sample Variance 1717.604093 2859 650 204778 1031616 66077065 120111788 1287801 121824
Kurtosis 0.491434794 0.268559434 0.088756826 0.80076461 2.047540427 0.26399267 -0.992319081 1.308159612 0.93664116
Skewness 0.859322829 1.040215988 0.304137886 1.117717591 1.372018541 1.069535728 -0.069213819 1.286450345 0.974957725
Range 183.414746 222 108 1970 4765 31075 37873 4599 1502
Minimum 11.017818 2 83 8 49 723 0 98 0
Maximum 194.432564 224 191 1978 4814 31798 37873 4697 1502
Sum 3805.204703 3375 7599 31964 62532 524388 956256 75158 21123
Count 53 57 57 55 51 54 57 54 55
Confidence Level(95.0%) 11.42337744 14.18755301 6.765346749 122.3344783 285.6660103 2218.729794 2907.960637 309.7443555 94.35664243
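
The Skewness rows in these tables look like the bias-adjusted sample skewness that Excel's SKEW function computes, n/((n-1)(n-2)) times the sum of cubed standardized deviations. The sketch below assumes that formula and uses the operating costs of hospitals 1 to 10 from the original data set to show how much a single extreme value (hospital 7) moves the statistic. The function name `sample_skewness` is made up for this illustration.

```python
import math


def sample_skewness(values):
    """Bias-adjusted sample skewness: n/((n-1)(n-2)) * sum(((x - mean)/s)**3),
    where s is the sample standard deviation (n - 1 in the denominator)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("all values are identical")
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)


# Total operating cost per year for hospitals 1-10 in the original data set.
cost = [64331896, 122146708, 46481090, 135096200, 233734561,
        159399584, 1129431915, 101344345, 107606586, 52811772]

# Same values with hospital 7 (the largest cost in the data set) left out.
cost_without_7 = [c for c in cost if c != 1129431915]

print("skewness, hospitals 1-10:      ", sample_skewness(cost))
print("skewness, hospital 7 left out: ", sample_skewness(cost_without_7))
```

Only ten hospitals are used here, so the numbers will not match the tables; the point is just the direction and size of the change when one extreme value is dropped.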

Original data set
Columns: Hospital; Total Beds in the Hospital; Total Number of Services offered in the facility; Total operating cost per year; Full-Time Equivalents; Case Mix Adjusted Discharges; Total Surgical Visits per year; Total ER Visits per year; Total Ambulatory Visits per year; Total Live Births per year
1 60 140 64331896 376 1282 6898 24126 1393 702
2 65 140 122146708 582 2277 7075 11402 251 250
3 43 144 46481090 303 661 2970 9753 1287 443
4 122 152 135096200 857 3311 21494 25580 3289 1502
5 79 170 233734561 1395 5756 32258 59382 3076
6 60 149 159399584 734 12978 35053 2621 750
7 78 110 1129431915 572 1468 24319 27641 2911
8 40 124 101344345 496 1220 6402 16856 853 416
9 77 135 107606586 456 1098 8566 33607 1115 462
10 20 136 52811772 503 1579 8582 22893 1544 714
11 121 71332538 476 921 8905 22497 568 129
12 75 149 80100410 886 1645 11186 30815 1483 238
13 40 136 47997004 71 231 6574 0 625 0
14 114 149 106953900 1050 2800 17868 31857 1861 789
15 29 135 66159737 181 406 7530 0 928 0
16 60 128 77698890 374 919 6004 0 749 0
17 86 149 72571020 490 2294 11024 26801 1305 702
18 20 33890360 318 541 435 1874 453 353
19 92 144 90378372 1080 2000 11954 37873 1700 1200
20 40 133 72571812 130 1366 5656 14841 997 449
21 224 198 354097261 2686 12966 71663 30207 8792 0
22 134 197 252896395 1302 7041 22066 35080 5400 0
23 150 187 99809865 1213 4814 26267 23052 3146 1067
24 140 189 291002040 4771 10530 66084 51377 6193 1750
25 210 206 2713 12000 42050 41104 5109 2682
26 250 195 635719011 4609 16636 76728 19670 11011 0
27 150 157 159291211 1418 6437 31798 34890 3908 660
28 155 191 240386978 1694 7856 54653 52290 4695 2956
29 2 83 47824729 8 122 7699 0 325 0
30 2 86 39505252 192 583 7427 0 363 114
31 131 136 168112580 1368 41119 51141 4277 1076
32 30 106 50374425 361 614 1775 12525 869 0
33 101 133 101811566 962 1739 11042 22105 1807 720
34 180 132674596 1374 3504 19576 38521 3165 1619
35 150 145 133246519 1512 3283 23641 26454 3421 1616
36 2 78 42533051 1 170 7602 0 0
37 23 147 42267738 431 643 2375 22863 307 435
38 131 121 110985783 934 938 15479 24475 1723 0
39 106 131 57585450 755 1212 8265 12114 1211 381
40 11 112 20608644 125 155 1230 1624 98 26
41 13 87 11017818 165 175 723 0 177 88
42 29 121 41110232 328 3568 0 1195 301
43 26 124 32987171 311 600 3795 5122 453 199
44 25 102 29925571 328 963 3518 20964 918 418
45 104 133 127582023 1126 2517 27604 5213 604
46 50 132 32715235 22 416 3487 8201 340 183
47 124 31993465 324 508 2954 4556 457 141
48 17 106 24856133 375 285 1997 6104 289 144
49 39 109 35320048 408 794 1306 15178 400 474
50 412 155 268521593 3991 10668 76140 20702 9672 2089
51 203 80725005 3296 13570 21178 2395 1028
52 477 180 449112700 4771 16467 133867 83229 6918 3669
53 539 184 457043818 4843 21141 119776 55583 2673
54 166 115 72454403 499 922 8876 18711 1423 533
55 15 137 87091348 195 972 15498 21008 2355 0
56 30 135 81495413 431 1198 6815 23313 1198 1092
57 6 94 21660247 222 204 2463 0 389 273
58 6 100 24620386 241 281 2008 0 413 222
59 10 107 23872832 229 491 2667 1117 455 253
60 7 106 28000000 330 375 1800 13370 512 270
61 15 120 37778177 415 455 2510 7333 536 266
62 15 154 96849621 190 1521 11434 10001 2092 0
63 2 101 15783121 90 49 792 5190 187 0
64 100 145 78819659 459 1803 7475 14531 1261 481
65 20 130 43505204 622 907 1183 15825 1257 518
66 3 92 18907920 157 1572 7567 420 0
67 15 122 29631482 321 2139 5676 500 255
68 20 122 70935956 725 1358 16250 15760 1387 493
69 15 124 58431299 158 165 2586 11762 344 420
70 9 123 66906100 198 500 10000 14500 1000 0
71 45 170 194432564 1038 3163 25517 21722 2323 432
72 110 186 165151572 1978 6668 26977 32893 4697 742
73 25 124 51467691 697 933 7918 22939 1560 273
74 52 155 143039315 1198 2749 26983 28938 2960 702
75 40 143 87885000 963 2401 18123 26512 2321 850
76 25 131 80641895 735 1660 12840 23097 2658 527
77 245 177 435805108 3888 15682 75068 52645 1801
78 85 183138339 1066 3768 13643 9846 1634 407

Answer by math-vortex(648):
Have you thought of using SPSS rather than Excel? I think you will have much more flexibility and get better information about your data with a true statistics program. I think you can get a free trial download if your STAT Lab doesn't have the software. (Google "SPSS Statistics".)
Good Luck!
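
For what it's worth, the listwise deletion described in the question can be sketched in a few lines of Python. This is only an illustration, not a recommendation for the assignment: it keeps the hospitals that have every field present, so all columns end up the same length, and then fits cost against a single predictor (beds) by ordinary least squares. The helper names `listwise_delete` and `fit_line` are made up here, only three of the nine variables are shown, and treating the blank field for hospital 25 as the missing cost is an assumption based on that row having no cost-sized number.

```python
def listwise_delete(rows):
    """Keep only the rows in which every field is present (not None)."""
    return [r for r in rows if all(v is not None for v in r.values())]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# A few hospitals from the original data set (beds, FTEs, operating cost).
# Assumption: hospital 25's missing field is its operating cost.
rows = [
    {"hospital": 1, "beds": 60, "fte": 376, "cost": 64331896},
    {"hospital": 2, "beds": 65, "fte": 582, "cost": 122146708},
    {"hospital": 3, "beds": 43, "fte": 303, "cost": 46481090},
    {"hospital": 4, "beds": 122, "fte": 857, "cost": 135096200},
    {"hospital": 8, "beds": 40, "fte": 496, "cost": 101344345},
    {"hospital": 25, "beds": 210, "fte": 2713, "cost": None},
]

complete = listwise_delete(rows)  # hospital 25 is dropped as a whole row
beds = [r["beds"] for r in complete]
cost = [r["cost"] for r in complete]
slope, intercept = fit_line(beds, cost)

print("hospitals kept:", [r["hospital"] for r in complete])
print("cost ~ %.0f + %.0f * beds" % (intercept, slope))
```

The same idea applies to outliers: if a hospital is excluded because of one variable, its whole row goes, so every column passed to the regression has the same count.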