Project 47
Project description

Use regression diagnostics to assess the strengths and weaknesses (and suitability) of the model in Project 43.
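As a rough illustration of what such diagnostics can look like, the sketch below fits a one-predictor least-squares line and computes residuals, leverage, internally studentized residuals, and Cook's distance. The model in Project 43 is not reproduced here, so regressing weight on height is only an assumption, the data are synthetic placeholders rather than the survey data, and the function name and the cutoffs (2p/n, 2, and 4/n) are illustrative rules of thumb, not requirements of the project.

```python
import math
import random


def simple_regression_diagnostics(x, y):
    """Fit y = intercept + slope * x by least squares and return diagnostics.

    Hypothetical helper for illustration; valid for one predictor only.
    """
    n = len(x)
    p = 2  # fitted coefficients: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in residuals)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    mse = sse / (n - p)  # estimate of the error variance

    # Leverage of each point (diagonal of the hat matrix, one-predictor form).
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Internally studentized residuals.
    studentized = [e / math.sqrt(mse * (1 - h))
                   for e, h in zip(residuals, leverage)]
    # Cook's distance: influence of each point on the fitted line.
    cooks = [r * r * h / (p * (1 - h))
             for r, h in zip(studentized, leverage)]

    return {
        "intercept": intercept,
        "slope": slope,
        "r_squared": 1 - sse / sst,
        "residuals": residuals,
        "leverage": leverage,
        "studentized": studentized,
        "cooks_distance": cooks,
    }


# Synthetic placeholder data (NOT the survey data): height in inches,
# weight in pounds, generated with an arbitrary linear trend plus noise.
random.seed(0)
height = [random.gauss(67, 4) for _ in range(100)]
weight = [-200 + 5.2 * h + random.gauss(0, 20) for h in height]

diag = simple_regression_diagnostics(height, weight)
n, p = len(height), 2

# Common rule-of-thumb flags; the cutoffs are conventions, not hard limits.
high_leverage = [i for i, h in enumerate(diag["leverage"]) if h > 2 * p / n]
large_residual = [i for i, r in enumerate(diag["studentized"]) if abs(r) > 2]
influential = [i for i, d in enumerate(diag["cooks_distance"]) if d > 4 / n]

print("R^2:", round(diag["r_squared"], 3))
print("High-leverage points:", high_leverage)
print("Large studentized residuals:", large_residual)
print("Influential points (Cook's distance):", influential)
```

Flagged points are candidates for a closer look, not automatic deletions. A plot of residuals against fitted values and a normal quantile plot of the residuals would normally accompany these numbers when judging whether the model's assumptions are reasonable.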

Background on the data set

Each semester, students in every Introductory Statistics section at the University of Puget Sound are given a survey during the first week of class. The survey is voluntary and is used as an example data set throughout the class.

The data set given here is a compilation of much of the data collected from the Fall 2002 semester through the Spring 2008 semester. Values that have been determined to be incorrect (such as 8-foot-tall students) have been removed from this data set.

Variables in the data set
The variables in the data set are as follows:
Variable | Units or coding | Description
semester | 1=Fall 2002, 2=Spring 2003, 3=Fall 2003, ..., 12=Spring 2008 | course semester
gender | F=female, M=male | student gender
collegeYear | 1=first-year student (freshman), 2=second-year student (sophomore), etc. | year number in college
height | inches | height of student
weight | pounds | student weight
pulse | beats per minute | student pulse at time of survey
hsGPA | traditional 4-point grade scale points | student grade point average in high school
collegeGPA | traditional 4-point grade scale points | student grade point average to date in college
SATM | SAT Mathematics points | student SAT Mathematics score (200-800 range)
SATV | SAT Verbal points | student SAT Verbal score (200-800 range)
shoeSize | US shoe size units | student shoe size
financialAid | N=no, Y=yes | is the student on financial aid for college?
tvHours | hours per week | average number of hours per week spent watching television during the school year
states | states | number of US states the student has been to
siblings | siblings | number of siblings the student has
motherAge | years | age of the student's mother
fatherAge | years | age of the student's father
salary | US dollars | annual salary that the student realistically expects to earn upon graduation
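The variable list above can be turned into a small reader that keeps the coded variables as text and converts the rest to numbers. This is only a sketch: it assumes the CSV file has a header row using the variable names listed above and that removed or unanswered values appear as empty cells, neither of which is confirmed here. The three sample rows are invented placeholders, not survey responses, and `parse_survey` is a hypothetical helper name.

```python
import csv
import io

# Variables coded as letters in the list above; everything else is numeric.
CATEGORICAL = {"gender", "financialAid"}


def parse_survey(handle):
    """Read survey rows from a file-like object into a list of dicts.

    Assumes a header row with the variable names and treats empty
    cells as missing values (None). Both are assumptions.
    """
    rows = []
    for raw in csv.DictReader(handle):
        row = {}
        for name, value in raw.items():
            value = (value or "").strip()
            if value == "":
                row[name] = None          # missing or removed value
            elif name in CATEGORICAL:
                row[name] = value         # keep codes such as F/M or N/Y
            else:
                row[name] = float(value)  # numeric measurement
        rows.append(row)
    return rows


# Invented placeholder rows using a few of the listed variables.
SAMPLE = """gender,height,weight,financialAid
F,65,130,Y
M,70,,N
F,,145,Y
"""

rows = parse_survey(io.StringIO(SAMPLE))

# Keep only rows where both model variables are present before fitting.
complete = [r for r in rows
            if r["height"] is not None and r["weight"] is not None]
print(len(rows), "rows read;", len(complete), "complete for height and weight")
```

With the real file, `parse_survey(open(path, newline=""))` would take the place of the `io.StringIO` sample, where `path` points to a downloaded copy of the data set.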
Link to the data set
The full data set in csv format is at: