F tests and anova()


The F test is the basic tool for comparing nested models. Fit a full model and a restricted model, then compare them:

   Mf <- lm(y ~ x1 + x2 + x3 + x4)   # full model
   Mr <- lm(y ~ x1 + x2)             # restricted model, nested in Mf
   anova(Mr, Mf)                     # F test of Mr against Mf
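
Note: the two models must be nested (the restricted model uses a subset of the full model's terms) and must be fit to the same observations. Watch for missing values: by default lm() drops any row with an NA in a model variable, so the two fits can end up using different rows.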
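
To see what anova() is doing, here is a small sketch on simulated data. The data and the names rss_r, rss_f, F_stat, and p_val are made up for illustration; they are not part of the assignment. The F statistic is the drop in residual sum of squares per extra parameter, divided by the residual mean square of the full model.

```r
# Simulated data, invented purely for illustration
set.seed(1)
n  <- 100
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n); x4 <- rnorm(n)
y  <- 1 + 2*x1 - x2 + 0.5*x3 + rnorm(n)

Mf <- lm(y ~ x1 + x2 + x3 + x4)   # full model
Mr <- lm(y ~ x1 + x2)             # restricted model, nested in Mf

tab <- anova(Mr, Mf)
print(tab)

# The same F statistic computed by hand from the residual sums of squares
rss_r <- sum(resid(Mr)^2)
rss_f <- sum(resid(Mf)^2)
df_r  <- df.residual(Mr)          # n minus number of coefficients in Mr
df_f  <- df.residual(Mf)          # n minus number of coefficients in Mf

F_stat <- ((rss_r - rss_f) / (df_r - df_f)) / (rss_f / df_f)
p_val  <- pf(F_stat, df_r - df_f, df_f, lower.tail = FALSE)

c(F = F_stat, p = p_val)          # should match the second row of tab
```

A small p-value says the extra terms (here x3 and x4) explain more than chance would; a large one says the restricted model is adequate.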

Assignment



  1. Load the Brain/Body weight data set. Plot Brain.WT vs Body.WT, fit the regression model Brain.WT~Body.WT. Plot the residuals and other diagnostics. What are the problems with this model?

    Now fit the following linear models on the log scale:

    Use the F-test (anova() function) to compare the models. Which is the best model?

    There are three classes: birds, fish, and mammals. Which group is the baseline group represented by the intercept term? Which group has the highest average log(Brain.WT) after controlling for log(Body.WT)? Which species seem unusually "big brained" or "small brained" relative to their groups? It will be helpful to look at plots with different symbols or colors for the different groups!
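
The mechanics of group comparisons can be sketched on toy data. Everything below is invented: the groups A, B, C, the variables lx and ly (standing in for a log predictor and log response), and the models M1 to M3, which are an arbitrary nested sequence chosen for illustration and not necessarily the models this assignment asks for.

```r
# Toy data, invented for illustration; all names are hypothetical
set.seed(2)
n   <- 60
grp <- factor(rep(c("A", "B", "C"), each = n / 3))
lx  <- rnorm(n, mean = 2)
ly  <- 0.5 + 0.7 * lx + c(0, -1, 1)[as.integer(grp)] + rnorm(n, sd = 0.3)

M1 <- lm(ly ~ lx)           # one line for all groups
M2 <- lm(ly ~ lx + grp)     # parallel lines, one intercept per group
M3 <- lm(ly ~ lx * grp)     # separate slope and intercept per group

tab <- anova(M1, M2, M3)    # sequential F tests along the nested sequence
print(tab)

# With R's default contrasts the first factor level is the baseline;
# the other group coefficients are shifts relative to it.
levels(grp)[1]
coef(M2)

# Different colors and symbols per group
plot(lx, ly, col = as.integer(grp), pch = as.integer(grp))
legend("topleft", legend = levels(grp), col = 1:3, pch = 1:3)

# Observations with the largest residuals are unusual relative to their group
r    <- resid(M2)
high <- head(order(r, decreasing = TRUE), 3)   # row indices, largest positive
low  <- head(order(r), 3)                      # row indices, largest negative
```

Factor levels are sorted alphabetically by default, so check levels() on your own grouping variable before interpreting the intercept.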

  2. Load the Florida election dataset with
    FL <- read.csv("http://people.reed.edu/~jones/141/FL.dat")
    
    Read the data description on the 141 website!!

    We are primarily interested in over- and under-voted ballots. Define the new variable NoVote <- over+under. Your mission is to relate this sum of over- and under-vote counts to explanatory factors including voting technology (Tech), ballot layout (Layout), number of ballots cast, and possible socio-economic correlates of voting efficacy: education (PctHS or PctColGrad), percent elderly population, poverty, unemployment, and median household income.
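
Since over and under are columns of a data frame, the new variable is easiest to create as a column too. The sketch below uses a tiny made-up data frame called toy in place of FL: only the column names over, under, and Tech come from the text above; the values and the technology labels T1 and T2 are invented.

```r
# Toy stand-in for FL; the numbers and the labels T1, T2 are invented
toy <- data.frame(
  over  = c(10, 25, 3, 40),
  under = c(5, 12, 8, 20),
  Tech  = c("T1", "T2", "T1", "T2")
)

# Store the sum as a new column so it travels with the data frame
toy$NoVote <- toy$over + toy$under
toy$NoVote                       # 15 37 11 60

# Quick look at the new variable by voting technology
by_tech <- tapply(toy$NoVote, toy$Tech, mean)
by_tech
```

The same line with FL in place of toy should work once the real data are loaded, assuming the columns are in fact named over and under; check names(FL) first.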