Theoretical Q-Q Plots



We can also compare empirical quantiles (i.e., the sorted values of a sample) to the theoretical quantiles of a distribution such as the standard normal distribution, N(0,1). This is like plotting the sorted dataset against a sorted sample from a standard normal distribution, except we replace the sorted normal sample with the theoretical quantiles.

We can easily get the corresponding quantiles from the standard normal distribution using the qnorm() function:

	qnorm( (1:10)/(11) )
returns the standard normal quantiles at probabilities 1/11, 2/11, ..., 10/11, which are the theoretical quantiles to pair with the sorted values of a sample of size 10 (recall that the expression 1:n means the sequence {1, 2, ..., n}). We could then use the qqplot() function to plot the observed data (sorted) against the theoretical quantiles of the standard normal. This is such a common plot that S has it built in: qqnorm() produces the "normal probability plot" (theoretical Q-Q plot for the N(0,1) distribution).
	qqnorm(A1)
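
To make the connection between qnorm() and qqnorm() concrete, here is a minimal sketch that builds a normal plot by hand. The vector x is simulated, hypothetical data standing in for A1, and the names n and theo are introduced only for this illustration.

```r
set.seed(1)                        # only so the sketch is reproducible
x <- rnorm(10)                     # hypothetical data; any numeric vector (e.g. A1) would do
n <- length(x)
theo <- qnorm((1:n) / (n + 1))     # theoretical N(0,1) quantiles, as in the notes
plot(theo, sort(x),
     xlab = "Theoretical Quantiles", ylab = "Sample Quantiles")
```

In R, qqnorm() chooses its probabilities with ppoints(), which differ slightly from i/(n+1), so the points in this hand-made plot will be close to those from qqnorm(x) but not identical.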

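As with empirical Q-Q plots, the normal plot is not reliable with small sample sizes. S has a function, qqline(), that adds a reference line to a normal plot: the line passes through the first and third quartiles of the data, plotted against the corresponding quartiles of the normal distribution. After making the plot with qqnorm(), issue the qqline() command for the same dataset: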

	qqnorm(A1)
	qqline(A1)
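
As a rough check on what qqline() draws, the sketch below computes a line through the first and third quartiles by hand, which is R's default behavior for qqline(). The sample x is simulated, hypothetical data, and the helper names y, q, slope, and intercept are assumptions made for this example.

```r
set.seed(2)                                   # only so the sketch is reproducible
x <- rnorm(50)                                # hypothetical sample
y <- unname(quantile(x, c(0.25, 0.75)))       # first and third sample quartiles
q <- qnorm(c(0.25, 0.75))                     # matching N(0,1) quartiles
slope <- diff(y) / diff(q)
intercept <- y[1] - slope * q[1]
qqnorm(x)
qqline(x)                                     # built-in reference line
abline(intercept, slope, lty = 2)             # hand-computed line; should overlay qqline(x)
```

Because the line depends only on the quartiles, it is not pulled around by a few extreme points in the tails, which is where departures from normality usually show up.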

It takes a bit of practice to develop your ability to interpret Q-Q plots and normal plots.

Exercises



Exercise 1

Generate 5 samples of each of the sizes 10, 20, and 50 from the standard normal distribution, making a normal plot for each. For example:
	z1 <- rnorm(10)        # generate 10 obs. from the N(0,1) dist.
	qqnorm(z1)
	qqline(z1)
In each case, use the qqline() command to add a line to the plot. You need to get a feel for the natural level of variability in normal plots when the data really have a normal distribution. Don't turn these plots in; just make them and look at them carefully! Then do it again.
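
One possible way to see all of the plots at once is to loop over the sample sizes and draw them on a single page. This is only a sketch: the grid layout, the margin settings, and the names sizes, reps, and z are choices made for this illustration, not part of the exercise.

```r
sizes <- c(10, 20, 50)                              # sample sizes from the exercise
reps  <- 5                                          # samples per size
op <- par(mfrow = c(length(sizes), reps),           # one row per size, one column per sample
          mar = c(2, 2, 2, 1))                      # small margins so 15 panels fit
for (n in sizes) {
  for (i in 1:reps) {
    z <- rnorm(n)                                   # fresh N(0,1) sample
    qqnorm(z, main = paste("n =", n))
    qqline(z)
  }
}
par(op)                                             # restore the previous plot settings
```

Rerunning the loop gives a new set of samples each time, which is a quick way to "do it again".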

Exercise 2

Issue the command below to load the data for this exercise.
	Ex2 <- read.table("http://people.reed.edu/~jones/141/QQdata.txt",
	                  header=TRUE)
The data frame (i.e., table) created has 7 columns labeled x1, x2, ..., x7. For each column, make a density plot with a normal density overlaid (see the notes on "Superimposed Density Plots"), then make a normal quantile plot, adding a line with the qqline() function. Print your plots, commenting briefly on the features of the distributions that are revealed by the plots. You should attempt to relate the features you see in the normal quantile plots to features you see in the density plots.
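
For orientation, here is a sketch of the two plots side by side for a single variable. It uses a simulated right-skewed sample v as a hypothetical stand-in for a column such as Ex2$x1, and the overlay shown is one plausible approach, not necessarily the method described in the "Superimposed Density Plots" notes.

```r
set.seed(3)                                   # only so the sketch is reproducible
v <- rexp(100)                                # hypothetical right-skewed data
m <- mean(v)                                  # computed first: curve() reuses the name x
s <- sd(v)
op <- par(mfrow = c(1, 2))                    # two panels side by side
plot(density(v), main = "Density")
curve(dnorm(x, mean = m, sd = s), add = TRUE, lty = 2)   # normal density with matching mean and sd
qqnorm(v)
qqline(v)
par(op)                                       # restore the previous plot settings
```

With a sample like this, the long right tail in the density plot typically appears in the normal quantile plot as points bending upward, away from the line, at the right-hand end.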

Albyn Jones