Project 45
Project description

Suppose that you would like to develop a method of predicting the height of a 16-year-old, using any of the other numerical variables in the data set. Use linear regression to develop such a method, and interpret your results.
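As a starting point, the following is a minimal sketch of a one-predictor least-squares fit in plain Python. It assumes kneeHeight as the single predictor, which is only one of several reasonable choices. The helper names fit_line and r_squared are hypothetical, and the numbers are made up for illustration; they are not taken from the data set.

```python
from math import fsum


def fit_line(xs, ys):
    """Ordinary least squares for the model y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two or more paired observations")
    x_bar = fsum(xs) / n
    y_bar = fsum(ys) / n
    sxx = fsum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has no variation")
    sxy = fsum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Fraction of the variation in ys explained by the fitted line."""
    y_bar = fsum(ys) / len(ys)
    ss_res = fsum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = fsum((y - y_bar) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Made-up illustrative values, NOT taken from the data set.
knee_height_mm = [480.0, 495.0, 510.0, 525.0, 540.0]
height_cm = [160.0, 166.0, 169.0, 175.0, 180.0]

intercept, slope = fit_line(knee_height_mm, height_cm)
fit_quality = r_squared(knee_height_mm, height_cm, intercept, slope)
print(f"height = {intercept:.2f} + {slope:.4f} * kneeHeight (R^2 = {fit_quality:.3f})")
```

The slope is read as the expected change in height, in centimeters, per additional millimeter of knee height. A fit on the real data would also call for a residual plot and, if several predictors are used, a multiple-regression routine from a statistics package.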

Background on the data set

This data set is excerpted from an anthropometric data set that is posted in full at the Anthrokids website. According to the website, this study was the result of a Consumer Product Safety Commission (CPSC) effort in the mid-seventies. The creation of a publicly accessible database was the result of a joint effort between the Information Technology Laboratory (ITL) at the National Institute of Standards and Technology (NIST) and the CPSC. Partial sponsorship comes from the Systems Integration for Manufacturing Applications (SIMA) project at NIST. Prior to this effort, the data existed only on paper.

Again according to the website, the primary goal, to "computerize" the data, has been accomplished. All of the tabular data was originally entered as spreadsheet data. A variety of conversion techniques were subsequently used to create HTML, plain ASCII (PRN), comma-separated values (CSV), and other data tables, all of which are accessible (for free) via the Anthrokids website.

The excerpt here is taken from the 1977 study on the Anthrokids website. The original documentation for the data set (648 pages, about 30 MB) is available as a scanned PDF at the Anthrokids website.

We at the University of Puget Sound Data Hoard are grateful to those individuals and agencies who have made the data publicly available.

Variables in the data set
The variables in the data set are as follows:
Name            Units                    Description
id              (positive integer)       child identification number
mass            kilograms                mass of child
height          centimeters              height of child
waist           centimeters              waist circumference of child
foot            centimeters              child's foot length
sittingHeight   millimeters              sitting height of child
upperLegLength  millimeters              length of child's upper leg
kneeHeight      millimeters              height of child's knee
forearmLength   millimeters              length of child's forearm
age             years                    age of child
gender          F (female) or M (male)   gender of child
handedness      both, left, or right     handedness of child
birthOrder      (number)                 child's numerical ranking by age among siblings (1 being first)
Link to the data set
The full data set in CSV format is at:
http://hoard.projectivespace.com/datasets/anthrokids.csv
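Because the project concerns 16-year-olds only, the rows need to be filtered by age before fitting. The sketch below shows one way to do that with Python's standard csv module. It assumes that the file has a header row using the variable names listed above, and that an age of 16 means 16 <= age < 17; neither assumption is confirmed by the description here. The function name rows_for_age is hypothetical, and the sample rows are made up so that the example runs without a network connection.

```python
import csv
import io


def rows_for_age(lines, age=16):
    """Yield rows (as dicts) whose age satisfies age <= value < age + 1."""
    for row in csv.DictReader(lines):
        try:
            value = float(row["age"])
        except (KeyError, ValueError, TypeError):
            continue  # skip rows with a missing or non-numeric age
        if age <= value < age + 1:
            yield row


# Tiny made-up sample standing in for the real file (assumed column names).
sample = io.StringIO(
    "id,height,kneeHeight,age\n"
    "1,170.2,520,16.4\n"
    "2,150.0,450,12.0\n"
    "3,165.5,,16\n"
)

sixteen = list(rows_for_age(sample))
print([row["id"] for row in sixteen])
```

For the real file, the same function can be given any iterable of text lines, for example a file opened with open(path, newline="") after downloading the CSV. Note that the third sample row has an empty kneeHeight; rows with missing values in the chosen predictors would need to be dropped or otherwise handled before fitting.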