Monday, January 28, 2013

The Random-Picker Algorithm

Q: You have a queue of people outside your office door and you want to pick exactly one person at random. However, you do not know the length of the queue. You are allowed to accept a person and later reject that person in favor of another, should you so choose. How do you do it?
Fifty Challenging Problems in Probability with Solutions (Dover Books on Mathematics)

A: At first look, this seems impossible to solve. How do you pick at random when you don't know what the denominator is? The queue could hold 5 people, 10, or hundreds. Surprisingly, the following strategy works beautifully.
  1. Keep a counter of the number of people who have come in; call this counter \(i\).
  2. When the \(i^{th}\) person comes in, pick that person with probability \(\frac{1}{i}\). The first person is therefore always picked.
  3. In general, when the \((i + 1)^{th}\) person comes in, replace the person you currently hold with probability \(\frac{1}{i+1}\); otherwise keep the person you already have.
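The three steps above can be sketched in Python as follows. This is a minimal illustration, not code from the original post: the function name `pick_one` and the optional `rng` parameter are assumptions made for the example, and the queue is modeled as any iterable whose length is never inspected.

```python
import random
from typing import Iterable, Optional, TypeVar

T = TypeVar("T")


def pick_one(queue: Iterable[T], rng: Optional[random.Random] = None) -> Optional[T]:
    """Return one item chosen uniformly at random from a queue of unknown length.

    Hypothetical helper for illustration; returns None for an empty queue.
    """
    rng = rng or random.Random()
    chosen: Optional[T] = None
    # i is the counter from step 1: how many people have come in so far.
    for i, person in enumerate(queue, start=1):
        # randrange(i) is uniform on 0..i-1, so this branch fires with
        # probability 1/i. For i == 1 it always fires, so the first
        # person is always accepted.
        if rng.randrange(i) == 0:
            chosen = person  # replace whoever was held before
    return chosen


if __name__ == "__main__":
    # The generator hides the queue length from pick_one.
    print(pick_one(name for name in ["Ann", "Bob", "Cid", "Dee"]))
```

Only the current pick and the counter are kept in memory, so the sketch works on a stream that is read exactly once.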
The above algorithm works by ensuring that each person ends up selected with probability \(\frac{1}{n}\), where \(n\) is the number of persons in the queue (which we don't know upfront). Here is a proof that works backwards from the end of the queue. Think of the last person coming in. That person has a probability \(\frac{1}{n}\) of being picked. Now move on to the last but one person. He is picked on arrival with probability \(\frac{1}{n-1}\), and he stays picked only if the \(n^{th}\) person is not selected, which happens with probability \(1 - \frac{1}{n}\). The overall probability for the \((n-1)^{th}\) person to be selected is
$$P(n-1) = \frac{1}{n-1} \times \big(1- \frac{1}{n}\big) = \frac{1}{n}$$

The probability does not change! Extending this shows that the probability is \(\frac{1}{n}\) for all persons.
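To spell out that extension, here is a worked version for a general position \(k\) (the symbols \(k\) and \(j\) are introduced here for illustration and do not appear in the original argument). The \(k^{th}\) person is picked on arrival with probability \(\frac{1}{k}\) and must then survive every later arrival \(j = k+1, \dots, n\), each of which replaces him with probability \(\frac{1}{j}\):
$$P(k) = \frac{1}{k} \times \prod_{j=k+1}^{n} \big(1 - \frac{1}{j}\big) = \frac{1}{k} \times \frac{k}{k+1} \times \frac{k+1}{k+2} \times \cdots \times \frac{n-1}{n} = \frac{1}{n}$$

The product telescopes: each numerator cancels the previous denominator, leaving \(\frac{1}{n}\) regardless of \(k\). Setting \(k = n-1\) recovers the formula above.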

If you are looking to buy some books on probability and algorithms, here are some of the best ones to learn from:

Introduction to Algorithms
This is a book on algorithms, some of which are probabilistic. It is a must-have for students, job candidates, and even full-time engineers and data scientists.

Introduction to Probability Theory

An Introduction to Probability Theory and Its Applications, Vol. 1, 3rd Edition

The Probability Tutoring Book: An Intuitive Course for Engineers and Scientists (and Everyone Else!)

Introduction to Probability, 2nd Edition

The Mathematics of Poker
A good read. Overall, Poker/Blackjack-type card games are a good way to get introduced to probability theory.

Let There Be Range!: Crushing SSNL/MSNL No-Limit Hold'em Games
Easily the most expensive book out there. So if the item above piques your interest and you want to go pro, go for it.

Quantum Poker
Well written and easy to read mathematics. For the Poker beginner.

Bundle of Algorithms in Java, Third Edition, Parts 1-5: Fundamentals, Data Structures, Sorting, Searching, and Graph Algorithms (3rd Edition) (Pts. 1-5)
An excellent resource for students, engineers, and entrepreneurs looking for code that they can take and implement directly on the job.

Understanding Probability: Chance Rules in Everyday Life
A bit pricey compared to the first one, but I like the look and feel of the text. It is simple to read and understand, which is vital, especially if you are trying to get into the subject.

Data Mining: Practical Machine Learning Tools and Techniques, Third Edition (The Morgan Kaufmann Series in Data Management Systems)
This one is a must-have if you want to learn machine learning. The book is beautifully written and ideal for the engineer or student who doesn't want to get too deep into the details of a machine learning approach but wants a working knowledge of it. There are some great examples and test data in the textbook too.

Discovering Statistics Using R
This is a good book if you are new to statistics and probability and are getting started with a programming language at the same time. The book uses R and is written in a casual, humorous way, making it an easy read. Great for beginners. Some of the data on the companion website could be missing.


  1. I think it should be P(n−1)=[1/(n−1)]*[1−(1/n)], right? Multiply rather than subtract?
