## Random Permutations |

A permutation of the integers {1,2,3,4,...,n} is the same numbers rearranged in a (possibly) different order. For example, the set {1,2,3} has 6 possible permutations:

{1,2,3}, {2,3,1}, {3,1,2} {1,3,2}, {3,2,1}, {2,1,3}You can count the number of permutations of a set of n elements in the following way: there are n choices for the first item, then (n-1) choices for the second item, and after choosing those two, (n-2) choices for the third item, and so on. Thus there are n factorial permutations of n items:

n! = n*(n-1)*(n-2)*...*3*2*1.For 3 items this expression gives 3*2*1=6, which is the number of permutations we counted above. As you can see, in a permutation, the order matters. The permutation {1,3,2} is different from the permutation {3,2,1}.

factorial(3)The number of permutations increases rapidly with n. Try a larger value of n, say 20. It will help to display more digits.

factorial(20) # set the digits option to display 18 digits options(digits=18) factorial(20) options(digits=7)

Software using floating point arithmetic has an upper bound for representable numbers. What is the largest integer for which R will return a number (not "Inf")?

Imagine putting the numbers 1,2,3,...,n on ping-pong balls, putting them in a bag, and mixing them thoroughly. Now draw the numbers one at a time, recording the order in which the numbers were selected. You are picking a random permutation!

The sample() function in **R** will do this for you. To pick a random
sample of size n without replacement from the dataset x, use

sample(x,size=n,replace=FALSE)Thus to get a random permutation of the set {1,2,3,4,5}, use

sample(1:5,size=5,replace=FALSE)The "replace = FALSE" is not really necessary, that is the default choice.

Sampling without replacement means each member of the set can be selected only once. This is like dealing from a deck of cards; once you deal the 7 of clubs it isn't in the deck anymore! (Shuffling a deck of cards is just a way of producing a random permutation of the deck.)

## Assignment:There is a classic probability puzzle that involves permutations. Imagine that 20 people go the opera. Each person is wearing a hat. They check their hat at the door as they arrive. The hat-check person gets tipsy during the performance, and returns a random hat to each person as they leave. What is the probability that at least one person gets his or her own hat back? What is the average number of patrons who get their own hats back? |

Note on Lab Assignments
When preparing lab assignments, copy the R commands you used
into a text file or your favorite word processor. Annotate
with comments so that the reader can follow the thread.
Print and submit with your assignment.
######################################################## # All comments embedded in R code should look like this # it will make your R commands much easier to follow ######################################################## |

Hats <- sample(1:20,20) Heads <- sample(1:20,20) sum(Hats==Heads)"Hats == Heads" yields a logical value (TRUE/FALSE), which R codes as 0 for FALSE, 1 for TRUE. Summing logical values is just counting the number which are TRUE. Look at the two datasets and verify the matches, if there were any.

data.frame(Hats,Heads,Match= Heads==Hats)That is one trial. To get an accurate estimate, you will need to repeat this many times, say 10000. You need to store the results in a dataset, which you create first:

n.matches <- rep(0,10000) # create dataset to store results Heads <- 1:20 for(i in 1:10000){ Hats <- sample(1:20,20) n.matches[i] <- sum(Hats==Heads) }Now you can compute the proportion with at least one match, the average number of matches, or any other statistic of interest:

# proportion with at least one match sum(n.matches > 0)/length(n.matches) # average number of matches mean(n.matches) # median number of matches median(n.matches) # standard deviation sd(n.matches) # largest number of matches max(n.matches) hist(n.matches,breaks=seq(-.5,max(n.matches)+.5,1),freq=FALSE)This simulation should run quickly. Repeat it several times to see how reliable your results are.

How do you think the proportion of trials with at least one match will change
if there are 50 people? The average? Test your guess by repeating the
simulation with 50 hats and 50 people.
How about 100 hats and heads? |

Simulation is a powerful tool for attacking probability problems that are too difficult to solve analytically!

Another Simulation Experiment: Runs in Coin Tossing

Albyn Jones

September 2008