Introduction





Perhaps the single most common data analytic tool in use is the method of least squares for fitting lines to data. The least squares criterion is "choose the parameter values that minimize the sum of the squares of the (vertical) distances from the points to the line". We will explore some of the issues that arise in this context using data on 1993 auto models, most of it collected by Consumers Union and made available over the internet on the statlib archive. If you don't have the data, click here to retrieve it.
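To make the criterion concrete: for a line y = a + b*x and points (x_i, y_i), least squares chooses a and b to minimize the sum of (y_i - a - b*x_i)^2. The following is a minimal sketch in Python (not the S used in this course), run on a handful of made-up points rather than the auto data. The function names least_squares_line and sum_of_squares are illustrative only.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the line minimizing the sum of
    squared vertical distances from the points to the line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Closed-form solution for a single predictor.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_of_squares(x, y, intercept, slope):
    """Sum of squared vertical distances from the points to the line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Made-up illustrative points (NOT the 1993 auto data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

a, b = least_squares_line(x, y)
best = sum_of_squares(x, y, a, b)
print("intercept:", a, "slope:", b, "sum of squares:", best)
```

Nudging either the intercept or the slope away from the fitted values and recomputing sum_of_squares gives a larger total, which is exactly what the criterion promises.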

If you print either of the numeric variables, the case numbers will be displayed above the value of the variable for that case; for example, case 1 of Hmpg has the value 31 (miles per gallon). It's good practice to begin by looking at the univariate (or marginal) distributions of the data. You might want to examine stem and leaf diagrams (or histograms) and normal plots before proceeding further.
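For readers who have not met a stem and leaf diagram, here is a rough sketch of how one is built, written in Python rather than S. The values are made up for illustration and are not the Hmpg data, and the function name stem_and_leaf is hypothetical. It assumes non-negative whole-number values, with the tens digit as the stem and the units digit as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the lines of a simple stem and leaf display.

    Assumes non-negative whole numbers: the stem is the value
    divided by ten, the leaf is the units digit.
    """
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(int(v), 10)
        leaves[stem].append(leaf)
    lines = []
    # Include empty stems so gaps in the distribution stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        digits = "".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem} | {digits}")
    return lines


# Made-up mileage-like values (NOT the actual Hmpg variable).
mpg = [31, 25, 26, 26, 30, 31, 28, 25, 27, 33, 36, 41]

for line in stem_and_leaf(mpg):
    print(line)
```

Each row shows one stem followed by its sorted leaves, so the display works like a histogram turned on its side while still showing the individual values.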





Albyn Jones
August 2004