Project 27
In this project, use the model that you set up in Project 13, which you used again in Project 20.
- Perform the standard regression diagnostics on this model to assess its suitability and the validity of your hypothesis tests and/or confidence intervals. For each regression assumption that does not hold, comment on how your analysis should change as a result.
- Do these regression diagnostics help you assess to which population of M&Ms your statistical inferences apply? If so, how? If not, what would you use to assess this?
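Which diagnostics are appropriate depends on the model you built in Project 13, which is not restated here. As a rough sketch only, assuming a simple linear regression with a single numeric predictor, the quantities behind the usual diagnostic plots (residuals, leverages, and studentized residuals) could be computed as follows. The function name `simple_ols_diagnostics` and the synthetic data are hypothetical placeholders, not part of the project or the M&M data set.

```python
import math
import random
from statistics import mean


def simple_ols_diagnostics(x, y):
    """Fit y = b0 + b1 * x by least squares and return basic diagnostic quantities."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    fitted = [b0 + b1 * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]

    # Residual standard error with n - 2 degrees of freedom.
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

    # Leverage of each observation (diagonal of the hat matrix).
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

    # Internally studentized residuals: residual scaled by its estimated
    # standard deviation.
    studentized = [
        e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)
    ]
    return {
        "intercept": b0,
        "slope": b1,
        "fitted": fitted,
        "residuals": residuals,
        "leverage": leverage,
        "studentized": studentized,
    }


# Synthetic placeholder data for illustration only; NOT the M&M measurements.
random.seed(0)
x = [random.uniform(0.0, 10.0) for _ in range(50)]
y = [1.0 + 0.5 * xi + random.gauss(0.0, 0.3) for xi in x]

diag = simple_ols_diagnostics(x, y)

# Observations whose studentized residual exceeds 2 in absolute value
# are common candidates for a closer look.
flagged = [i for i, r in enumerate(diag["studentized"]) if abs(r) > 2]
```

Plotting `residuals` against `fitted` is one way to look for nonlinearity or non-constant variance, a normal quantile plot of `studentized` addresses the normality assumption, and plotting `residuals` against the order of measurement is a simple check on independence. If your Project 13 model has several predictors or categorical terms, the same ideas apply but the formulas above would need the general matrix form.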
This data set was collected in the summer of 2008. Every M & M candy from three Medium Size bags of M & Ms was measured. One bag was of plain M & Ms (14.0 oz. or 396.9 g), one bag was of peanut M & Ms (also 14.0 oz. or 396.9 g), and one bag was of peanut butter M & Ms (12.7 oz. or 360 g). As summarized in the table below, the data set has four variables: type, color, diameter, and mass.
The variable diameter refers to the shortest distance from side to side at the candy's widest height when it is placed flat on the table with the "m" facing up. Put otherwise, when the candy is placed in that position, imagine taking horizontal cross-sections of the candy. They will be roughly elliptical. The diameter of the candy is the length of the minor axis of the largest such cross-sectional ellipse (which will generally be the cross-section at half the total height). As you might expect, this axis can be somewhat difficult to determine and was no doubt a source of measurement error, but this definition of diameter does correspond fairly well to the way that an M & M fits into a caliper.
Diameters were measured with a General Tools Ultratech Fraction+ Digital Fractional Caliper (claimed accurate to within plus or minus 0.02 mm), and masses were measured with a MyWeigh Durascale 50 (claimed accurate to within plus or minus 0.01 g). The candies were measured in the order given in the data set, which, although not entirely random, was not intentionally systematic in any way (other than by type).
The variables in the data set are as follows:
| Name | Units or levels | Description |
| --- | --- | --- |
| type | peanut, peanut butter, or plain | type of M&M |
| color | blue, brown, green, orange, red, or yellow | color of the candy |
| diameter | millimeters | diameter of the candy |
| mass | grams | mass of the candy |
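Before fitting anything, it can be worth checking that a file actually matches the variable table above. The following is a minimal sketch, assuming the CSV has a header row whose column names are exactly `type`, `color`, `diameter`, and `mass`; that layout is an assumption, as is the helper name `validate_rows`. The sample rows are made-up placeholders used only to exercise the checks, not real measurements.

```python
import csv
import io

# Levels taken from the variable table.
ALLOWED_TYPES = {"peanut", "peanut butter", "plain"}
ALLOWED_COLORS = {"blue", "brown", "green", "orange", "red", "yellow"}


def validate_rows(handle):
    """Return (rows, problems) for a CSV with columns type, color, diameter, mass."""
    rows, problems = [], []
    # Line 1 is assumed to be the header, so data rows start at line 2.
    for line_no, rec in enumerate(csv.DictReader(handle), start=2):
        try:
            row = {
                "type": rec["type"].strip(),
                "color": rec["color"].strip(),
                "diameter": float(rec["diameter"]),  # millimeters
                "mass": float(rec["mass"]),          # grams
            }
        except (KeyError, ValueError, TypeError, AttributeError) as exc:
            problems.append((line_no, f"unparseable row: {exc!r}"))
            continue

        if row["type"] not in ALLOWED_TYPES:
            problems.append((line_no, f"unexpected type: {row['type']}"))
        elif row["color"] not in ALLOWED_COLORS:
            problems.append((line_no, f"unexpected color: {row['color']}"))
        elif row["diameter"] <= 0 or row["mass"] <= 0:
            problems.append((line_no, "non-positive diameter or mass"))
        else:
            rows.append(row)
    return rows, problems


# Placeholder rows for illustration only; NOT values from the data set.
sample = io.StringIO(
    "type,color,diameter,mass\n"
    "plain,red,13.0,0.9\n"
    "peanut,purple,14.5,2.5\n"
    "plain,blue,abc,0.9\n"
)
rows, problems = validate_rows(sample)
```

To apply the same check to a downloaded copy, you would pass an open file handle, for example `open("mms.csv", newline="")`, where `mms.csv` is a hypothetical local filename.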
The full data set in CSV format is at: