Regression Inference: Exercises
The data frame "BP" contains data collected by two scientists
in the 19th century, Forbes and Hooker.
Forbes collected this data in the Alps. He was interested in
finding a way to estimate altitude from atmospheric pressure
without having to carry around a mercury barometer. He travelled to
various locations at different altitudes measuring the boiling
point of water and the atmospheric pressure with a mercury barometer.
Hooker repeated the experiment in the Himalayas, at higher altitude.
BP <- data.frame(
  # boiling point of water, degrees F
  temp = c(194.5, 194.3, 197.9, 198.4, 199.4, 199.9, 200.9, 201.1, 201.4,
           201.3, 203.6, 204.6, 209.5, 208.6, 210.7, 211.9, 212.2, 210.8,
           210.2, 208.4, 202.5, 200.6, 200.1, 199.5, 197.0, 196.4, 196.3,
           195.6, 193.4, 193.6, 191.4, 191.1, 190.6, 189.5, 188.8, 188.5,
           185.7, 186.0, 185.6, 184.1, 184.6, 184.1, 183.2, 182.4, 181.9,
           181.9, 181.0, 180.6),
  # atmospheric pressure, inches of mercury
  press = c(20.79, 20.79, 22.4, 22.67, 23.15, 23.35, 23.89, 23.99, 24.02,
            24.01, 25.14, 26.57, 28.49, 27.76, 29.04, 29.88, 30.06, 29.211,
            28.559, 27.972, 24.697, 23.726, 23.369, 23.03, 21.892, 21.928,
            21.654, 21.605, 20.48, 20.212, 19.758, 19.49, 19.386, 18.869,
            18.356, 18.507, 17.267, 17.221, 17.062, 16.959, 16.881, 16.817,
            16.385, 16.235, 16.106, 15.928, 15.919, 15.376),
  # first 17 rows are Forbes's measurements, the remaining 31 are Hooker's
  Source = factor(rep(c("Forbes", "Hooker"), times = c(17, 31)),
                  levels = c("Forbes", "Hooker"))
)
temp: boiling point of water in degrees F.
press: atmospheric pressure in inches of mercury.
Source: the scientist who collected the observation (Forbes or Hooker).
Reference: Forbes, James D. (1857) "Further experiments and remarks on the
measurement of heights by the boiling point of water." Trans Roy. Soc.
Edinburgh, vol 21 p135-143.
For the following, it will be useful either to assign the columns of the data frame to individual variables or to use the attach command,
attach(BP), which gives you direct access to the columns of the data frame as variables.
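The access styles mentioned above can be sketched as follows. This is only an illustration: BP_demo is a hypothetical stand-in holding the first three rows of BP so that the snippet runs on its own, and the with() variant is an addition that the text does not mention.

```r
# Hypothetical stand-in: the first three rows of BP (temp and press only).
BP_demo <- data.frame(temp  = c(194.5, 194.3, 197.9),
                      press = c(20.79, 20.79, 22.4))

# Option 1: copy a column into its own variable.
temp_demo <- BP_demo$temp

# Option 2: attach() places the columns on the search path,
# so they can be used by name; detach() removes them again.
attach(BP_demo)
mean_press <- mean(press)
detach(BP_demo)

# Option 3 (not in the text): with() evaluates a single expression
# inside the data frame without changing the search path.
mean_temp <- with(BP_demo, mean(temp))
```

Calling detach() once you are finished avoids stale copies of the columns lingering on the search path if the data frame is later modified.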
1) Develop a formula to predict atmospheric pressure
using the boiling point of water. Make sure
you validate your model: plot the data, plot residuals, etc.
Plot the predicted pressure vs. temperature, including a confidence interval for the fitted line. You are out hiking and boil some water for your morning coffee. Taking out your handy thermometer, which you always carry, you discover that the water boils at 195 degrees F. Estimate your altitude. (This will require a little internet research on the relationship between elevation and air pressure.) Give a range for your estimate, and explain how you computed it.
2) Which scientist (Forbes or Hooker) collected better data? Explain!
3) For the regression model you selected, compute 95% confidence intervals for the intercept and slope.
4) For the CPS dataset, develop a regression model relating wages to exper (years of experience). Based on that model, what should we conclude about that relationship? Be sure to repeat the analysis process outlined in the examples, and base your conclusions on confidence intervals or hypothesis tests! Do the data seem appropriate for the model?
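The R mechanics behind exercises 1 and 3 can be sketched as follows. This is a sketch under stated assumptions, not a model answer. The log-pressure model is only one candidate, and you should check it against alternatives such as press ~ temp. The BP data are repeated so that the block runs on its own. press_to_alt_m is a hypothetical helper based on a standard-atmosphere approximation with an assumed sea-level pressure of 29.92 inches of mercury; real weather shifts that value, so the altitude is only rough.

```r
# BP data repeated from the listing above so this sketch runs on its own.
BP <- data.frame(
  temp = c(194.5, 194.3, 197.9, 198.4, 199.4, 199.9, 200.9, 201.1, 201.4,
           201.3, 203.6, 204.6, 209.5, 208.6, 210.7, 211.9, 212.2, 210.8,
           210.2, 208.4, 202.5, 200.6, 200.1, 199.5, 197.0, 196.4, 196.3,
           195.6, 193.4, 193.6, 191.4, 191.1, 190.6, 189.5, 188.8, 188.5,
           185.7, 186.0, 185.6, 184.1, 184.6, 184.1, 183.2, 182.4, 181.9,
           181.9, 181.0, 180.6),
  press = c(20.79, 20.79, 22.4, 22.67, 23.15, 23.35, 23.89, 23.99, 24.02,
            24.01, 25.14, 26.57, 28.49, 27.76, 29.04, 29.88, 30.06, 29.211,
            28.559, 27.972, 24.697, 23.726, 23.369, 23.03, 21.892, 21.928,
            21.654, 21.605, 20.48, 20.212, 19.758, 19.49, 19.386, 18.869,
            18.356, 18.507, 17.267, 17.221, 17.062, 16.959, 16.881, 16.817,
            16.385, 16.235, 16.106, 15.928, 15.919, 15.376),
  Source = factor(rep(c("Forbes", "Hooker"), times = c(17, 31)))
)

# Candidate model (an assumption, not the required answer):
# log-pressure as a linear function of boiling temperature.
fit <- lm(log(press) ~ temp, data = BP)

# Validation plots: data with the fitted line, then residuals vs. fitted values.
plot(log(press) ~ temp, data = BP, col = as.integer(BP$Source))
abline(fit)
plot(fitted(fit), resid(fit), xlab = "Fitted", ylab = "Residual")
abline(h = 0, lty = 2)

# 95% confidence intervals for the intercept and slope.
ci <- confint(fit, level = 0.95)

# Confidence band for the fitted line, back-transformed to inches of mercury.
grid <- data.frame(temp = seq(180, 212, by = 0.5))
band <- exp(predict(fit, newdata = grid, interval = "confidence"))
plot(press ~ temp, data = BP)
matlines(grid$temp, band, lty = c(1, 2, 2), col = 1)

# Prediction interval for a single new reading at 195 degrees F.
p195 <- exp(predict(fit, newdata = data.frame(temp = 195),
                    interval = "prediction"))

# Hypothetical helper: standard-atmosphere approximation, altitude in meters.
# The default sea-level pressure of 29.92 inHg is an assumption.
press_to_alt_m <- function(p_inhg, p0_inhg = 29.92) {
  44330 * (1 - (p_inhg / p0_inhg)^(1 / 5.255))
}

# Lower pressure means higher altitude, so the "lwr" column of alt
# holds the upper end of the altitude range and "upr" the lower end.
alt <- press_to_alt_m(p195)
```

Printing ci, p195 and alt shows the intervals. A prediction interval is used for the 195 degrees F reading because it concerns a single new observation rather than the mean response. The same lm, confint and predict calls should carry over to exercise 4 once the CPS data frame is loaded.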