Statistics 512: Solution to Homework #5

For the following 6 problems use the computer science data that we have been discussing in class. You can get a copy of the data set csdata.dat from the class website. The variables are: id, a numerical identifier for each student; GPA, the grade point average after three semesters; HSM; HSS; HSE; SATM; SATV, which were all explained in class; and GENDER, coded as 1 for men and 2 for women.

1. In this exercise you will illustrate some of the ideas described in Chapter 7 of the text related to the extra sums of squares.

(a) Create a new variable called SAT which equals SATM + SATV and run the following two regressions:

i. predict GPA using HSM, HSS, and HSE;

Solution: Notice the analysis of variance with respect to the next problem.

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value
Model               3         27.71233       9.23744     18.86
Error             220        107.75046       0.48977
Corrected Total   223        135.46279

Root MSE          0.69984    R-Square   0.2046
Dependent Mean    2.63522    Adj R-Sq   0.1937
Coeff Var        26.55711

ii. predict GPA using SAT, HSM, HSS, and HSE.

Solution: The analysis of variance table is as follows:

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value
Model               4         27.88746       6.97187     14.19
Error             219        107.57533       0.49121
Corrected Total   223        135.46279

Root MSE          0.70086    R-Square   0.2059
Dependent Mean    2.63522    Adj R-Sq   0.1914
Coeff Var        26.59603

Calculate the extra sum of squares for the comparison of these two analyses. Use it to construct the F-statistic (in other words, the general linear test statistic) for testing the null hypothesis that the coefficient of the SAT variable is zero in the model with all four predictors. What are the degrees of freedom for this test statistic?
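The arithmetic the question asks for can be sketched from the two ANOVA tables alone. The Python below is only an illustration (the course itself uses SAS), the variable names are made up for this sketch, and it is not the posted solution. It takes the error sums of squares and degrees of freedom from the tables, treating the three-predictor model as the reduced model and the four-predictor model as the full model.

```python
# General linear test for H0: the SAT coefficient is zero in the full model.
# All inputs are copied from the two ANOVA tables in the homework text.
sse_reduced, df_reduced = 107.75046, 220  # GPA on HSM, HSS, HSE
sse_full, df_full = 107.57533, 219        # GPA on SAT, HSM, HSS, HSE

# Extra sum of squares: drop in error SS when SAT is added to the model.
extra_ss = sse_reduced - sse_full

# Degrees of freedom for the test statistic.
df_num = df_reduced - df_full  # number of coefficients being tested
df_den = df_full               # error df of the full model

# F = (extra SS / numerator df) / (MSE of the full model).
mse_full = sse_full / df_full
f_stat = (extra_ss / df_num) / mse_full

print(f"extra SS = {extra_ss:.5f}")
print(f"F = {f_stat:.4f} on ({df_num}, {df_den}) degrees of freedom")
```

With these inputs the extra sum of squares is about 0.175 and the F-statistic is roughly 0.36 on 1 and 219 degrees of freedom. The same extra sum of squares can be read off as the difference between the two model sums of squares, 27.88746 - 27.71233.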
Please check this section from time to time for updates to the schedule or other information.
02/27: The way we have framed the formulation of the general linear model, and how we will continue to frame it, is not the most efficient. This can all be formulated very conveniently using vectors, matrices, and the tools from linear algebra. Since few, if any, of you will have this background, and there isn't time to teach it to you in ST512, I will not discuss this in class. There is a summary in Chapter 12.9 of Ott & Longnecker and, if you're interested, I can tell you more about it in office hours. It's also pretty easy, at least in R, to do these vector/matrix calculations, and here is some sample code.
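As a rough illustration of what the vector/matrix formulation looks like, the sketch below solves the normal equations (X'X) b = X'y for a one-predictor model. It is written in plain Python rather than R or SAS, it is not the sample code linked above, and the data and helper functions (transpose, matmul, solve) are made up for this example.

```python
# Least squares via the normal equations (X'X) b = X'y, in plain Python.
# The data are invented for illustration; they are not course data.

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(u * v for u, v in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting.

    a is a square matrix; b is a column vector (list of one-element rows).
    """
    n = len(a)
    aug = [row[:] + [b[i][0]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]  # constructed so that y = 1 + 2x exactly

# Design matrix: a column of ones (intercept) and a column for x.
X = [[1.0, xi] for xi in x]
Xt = transpose(X)

XtX = matmul(Xt, X)                   # X'X
Xty = matmul(Xt, [[yi] for yi in y])  # X'y
beta_hat = solve(XtX, Xty)            # [intercept, slope]

print(beta_hat)
```

Because the made-up responses lie exactly on the line y = 1 + 2x, the estimates recover an intercept near 1 and a slope near 2, up to floating-point rounding. Adding more explanatory variables only means adding columns to X; the remaining steps stay the same.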
02/26: The exams have been graded and the scores are recorded on moodle, so I'll return them to you in class tomorrow. As I'll explain, I adjusted the scale to be out of 90 points, instead of 100, and the score you see on moodle is your score out of 90. When I calculate your final grades, I will adjust these scores by multiplying by 100/90, which is approximately 1.11. A summary of the adjusted scores (scaled to 100) is here. Of the 72 students who took the exam, the average adjusted score is 81 and the standard deviation is 9.8.
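The rescaling described above is a single multiplication. The short Python sketch below shows it; the function name adjust_score and the raw score of 72 are hypothetical and used only for illustration.

```python
def adjust_score(raw_score, raw_max=90, target_max=100):
    """Rescale a score out of raw_max to a score out of target_max."""
    return raw_score * target_max / raw_max

# A made-up raw score of 72 out of 90, rescaled to a 100-point scale.
print(adjust_score(72))
```

Using the exact ratio 100/90 rather than a rounded multiplier means a perfect 90 maps to exactly 100.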
02/23: My solutions to Exam 1 are here. I will do my best to have the exams graded and ready to return to you by next Tuesday but, if that doesn't happen, then by next Thursday for sure. Have a nice weekend!
02/18: Just a reminder, for the exam you are allowed a one-page (regular-sized, front and back) sheet of notes with anything you want to write on it. The only condition is that the notes must be handwritten. Also, you should bring a calculator to the exam for some basic arithmetic; nothing fancy is required, and being able to do square roots is enough.
02/18: Solutions to the review problems and old exam are here and here, respectively.
02/13: As promised, here are some review problems for Exam 1 and here is the exam I gave last spring. Next week, both in lecture and in lab, you will have an opportunity to ask questions about these materials. Solutions will be posted over the weekend.
02/06: I apologize for the confusion, but there was a minor issue with the grades entered in the moodle gradebook. I asked the grader to score the homework out of 20 total points but the gradebook item was set with a max of 30 points. I have changed the gradebook item to have a max of 20 points, so it should be fine now. The maximum points on a homework don't really matter anyway, only the percentage.
02/03: Sorry I'm late, but Homework 02 is posted below.
01/26: I apologize for being a little bit behind schedule; the snow-day messed me up a bit more than I had anticipated, but I hope that we can get caught up on Tuesday. For now, let me give you some quick SAS help with the plots we discussed in class on Thursday—my time ran out before I could show you the plots in SAS.
Go to the SAS code I provided below for Example 11-4 in the text. From the output of the first PROC REG statement there, you get a 3x3 grid of plots labeled "Fit Diagnostics". Of the nine plots displayed there, the only ones relevant to us now are the three in the first column. From top to bottom, these are the plot of residuals versus fitted values (for assessing the linearity assumption and, in this one-explanatory-variable case, the constant variance assumption), the Q-Q plot of residuals (for assessing the normality assumption), and a histogram of residuals (also for assessing normality). I'll go through this quickly at the beginning of lecture on Tuesday, but you may need some of these plots for the homework.
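To make it concrete what the first two of these plots display, here is a minimal Python sketch that computes the points behind a residuals-versus-fitted plot and a normal Q-Q plot for a one-variable regression. It does not reproduce the SAS output: the data are made up, the helper simple_fit is hypothetical, and the plotting positions (i - 0.5)/n are one common convention that may differ from what SAS uses.

```python
from statistics import NormalDist, mean

def simple_fit(x, y):
    """Least-squares intercept and slope for a one-predictor regression."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope

# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

b0, b1 = simple_fit(x, y)
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Residuals versus fitted: look for curvature or a funnel shape.
resid_vs_fitted = list(zip(fitted, residuals))

# Q-Q plot: sorted residuals against standard normal quantiles.
# Plotting positions (i - 0.5)/n are an assumption of this sketch.
n = len(residuals)
theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
qq_points = list(zip(theoretical, sorted(residuals)))

for q, r in qq_points:
    print(f"{q:8.3f} {r:8.3f}")
```

If the normality assumption is reasonable, the Q-Q points fall close to a straight line. The histogram in the third panel is just a binned view of the same residuals.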
01/24: I had meant to ask the lab TA to go over, on the first day, how to get the SAS output in a format that you can easily copy-paste into a Word file for homework submissions, but I must have forgotten. Anyway, here is an example of how you can do it. This certainly isn't the only way to do it, and I make no claims that it's the best, but it seems to work reasonably well. The only thing you have to be careful about is keeping track of where the file gets saved so you can find it later; this is machine-specific, so I can't give any general recommendations. Also, I recommend that you write your code and view the output in the regular output screen in SAS; then, after you're satisfied that your code is correct and everything looks good, insert those lines at the beginning and end so that the output gets saved how and where you want it. I found this explanation here and, if you're interested in digging into this a bit more, the navigation bar on the side points you to some additional help pages related to generating document-ready output.
01/22: Hope you enjoyed the snow-day last week. We're back on our regular schedule this week, and the lab assignment is posted below. (The first problem is the same as one from the first lab, which is intentional; when you saw that problem in the first week, you didn't know about the tests and confidence intervals, but now you do so it's a good time to go through that one again quickly.)
01/16: The first homework assignment has been posted below.
01/15: No lab this week (Jan 16th and 17th). I wanted you to get some experience with SAS during the first week but we've not covered enough material yet to do anything interesting in lab. Next week we'll be ready for lab as usual.
01/11: On Tuesday (01/09) I said a few words about philosophy and, in particular, that statistical analysis is a powerful tool for gaining scientific insight, but that there is a lot more to it than plugging data into a statistical software package and hitting "run". Here is a paper I finished recently with a friend of mine that gives some more details related to these important points.
01/09: My expectation is that you will be able to pick up virtually all of the SAS that you need for the homework assignments, etc., from your experiences in the lab and from the examples that I go over in lecture. However, this probably will not be enough for you to really learn SAS. I personally don't know SAS all that well (I especially have trouble remembering the abbreviations for the various options) but, fortunately, there is a wealth of information on the web to help. As we proceed, I will post some stuff from the web that seems helpful. For now, here is some general information that I've found that I think is pretty useful:
Aside from web resources, there are lots of (relatively inexpensive) books available as well. One that I used a long time back, The Little SAS Book, by Delwiche and Slaughter, is pretty good; not all the stuff there is relevant to ST512, but it is a good resource to have if learning SAS is important for your future goals.
01/09: I can confirm that the 7th edition of the textbook IS on reserve at the library. So if you don't have your own copy of the book, you can check the book out for an hour or so to take pictures of the details of the exercises assigned for homework.
01/09: If you are registered for ST512, then you can access SAS on your personal computer by following the instructions here to get to the "virtual teaching lab." I set this up on my (Mac) computer in my office and it seems to work fine. If you're off-campus, then you also need to set up a VPN; there are some instructions on the linked page above for setting this up.
01/09: I'm going to keep a "log" of the material covered each day, along with the relevant sections in Ott & Longnecker's book, in the "Course Outline" section below. This is partly to help me keep a record and partly to help students in case you have to miss a lecture day.
01/09: Homework and exam scores for this course will be posted on moodle, which can be accessed from the WolfWare website here. Please check your grades occasionally to be sure that your scores have been recorded correctly.
01/09: Here are a few important dates—I will remind you of these things in class when the time is near. First, Spring Break is March 5th–9th so there will be no class that week. Second, our midterm exams are tentatively scheduled for Thursday, February 22nd and Thursday, April 5th. Third, the final exam will be held on Thursday, May 3rd, 8–11am, which is based on the schedule set by the university.
01/09: Welcome to ST512!