Friday, May 3, 2013

Polya's Urn

Q: An urn has $$r$$ red balls and $$b$$ blue balls. Someone draws a ball at random, observes its colour and puts it back into the urn. You do not know what was observed. That person then adds $$x$$ more balls of the same colour to the urn. Now, you draw a second ball from this urn. What is the probability that it is red?

The Probability Tutoring Book: An Intuitive Course for Engineers and Scientists (And Everyone Else!)

A: The framing of this puzzle follows directly from the "Polya's Urn" process. It presents yet another surprising result from Bayesian reasoning. Intuitively, it appears that putting in new balls of the same colour would tamper with the probability of drawing a red ball on the second draw. But does it? Let's take a look.

The probability that a red ball is drawn from the urn in the first draw is $$\frac{r}{r+b}$$, and for a blue ball it is $$\frac{b}{r+b}$$. A red ball on the second draw can follow either a red ball or a blue ball on the first draw.

For the second draw, the probability that a red ball is drawn given that a red ball was drawn the first time is $$\frac{r+x}{r + b + x}$$. The probability that a red ball is drawn given that a blue ball was drawn the first time is $$\frac{r}{r+b+x}$$, since the number of red balls is unchanged while the total grows by $$x$$. This layout is shown in the figure below.

The probability that a red ball is drawn on the second draw is
$$P(\text{Red: Draw=2})=\frac{r + x}{r + b + x}\times\frac{r}{r+b} + \frac{r}{r+b+x}\times\frac{b}{r+b}$$
The above simplifies as
$$\frac{(r+b+x)r}{(r+b+x)(r+b)} = \frac{r}{r+b}$$
Note, the probability remains exactly the same!
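The algebra above can be checked exactly for a few concrete urns. The following is a minimal sketch using Python's `fractions` module. The function name `prob_red_second` and the sample values of $$r$$, $$b$$ and $$x$$ are illustrative choices, not part of the original puzzle.

```python
from fractions import Fraction


def prob_red_second(r, b, x):
    """Exact P(second draw is red), summing over both first-draw outcomes."""
    total = r + b
    p_first_red = Fraction(r, total)
    p_first_blue = Fraction(b, total)
    # After the first draw, x extra balls of the observed colour are added.
    p_red_given_red = Fraction(r + x, total + x)
    p_red_given_blue = Fraction(r, total + x)
    return p_red_given_red * p_first_red + p_red_given_blue * p_first_blue


# Illustrative urns (hypothetical values): the result always equals r / (r + b).
for r, b, x in [(3, 5, 2), (1, 9, 100), (7, 2, 0)]:
    assert prob_red_second(r, b, x) == Fraction(r, r + b)
```

Because `Fraction` keeps exact rational values, the equality check is not affected by floating point rounding.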
As with most Bayesian stuff, the result eventually becomes intuitive when we spend more time thinking about the problem. As long as you do not know the outcome of the first draw, adding $$x$$ balls of that colour does not change the probability that the second ball is red!
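For readers who prefer to see the result empirically, here is a rough Monte Carlo sketch of the two-draw experiment. The function name `simulate_red_second`, the trial count and the seed are assumptions made for illustration. The estimate should land close to $$\frac{r}{r+b}$$ but will not match it exactly.

```python
import random


def simulate_red_second(r, b, x, trials=200_000, seed=0):
    """Estimate P(second draw is red) by simulating the two-draw experiment."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    red_count = 0
    for _ in range(trials):
        # First draw: red with probability r / (r + b); the ball is returned.
        first_is_red = rng.random() < r / (r + b)
        # x balls of the observed colour are added to the urn.
        reds = r + x if first_is_red else r
        # Second draw from the enlarged urn of r + b + x balls.
        if rng.random() < reds / (r + b + x):
            red_count += 1
    return red_count / trials


estimate = simulate_red_second(3, 5, 2)
print(f"simulated: {estimate:.4f}, r/(r+b): {3 / 8:.4f}")
```

Trying a large $$x$$ (say 100) is a useful sanity check: the two conditional probabilities become very different, yet the overall estimate stays near $$\frac{r}{r+b}$$.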

Some good books to learn the art of probability
Fifty Challenging Problems in Probability with Solutions (Dover Books on Mathematics)

This book is a great compilation that covers quite a few puzzles. What I like about these puzzles is that they are all tractable and don't require too much advanced mathematics to solve.

Introduction to Algorithms
This is a book on algorithms, some of which are probabilistic. The book is a must-have for students, job candidates and even full-time engineers and data scientists.

An Introduction to Probability Theory and Its Applications, Vol. 1, 3rd Edition

The Probability Tutoring Book: An Intuitive Course for Engineers and Scientists (and Everyone Else!)

Introduction to Probability, 2nd Edition

The Mathematics of Poker
Good read. Overall Poker/Blackjack type card games are a good way to get introduced to probability theory

Bundle of Algorithms in Java, Third Edition, Parts 1-5: Fundamentals, Data Structures, Sorting, Searching, and Graph Algorithms (3rd Edition) (Pts. 1-5)
An excellent resource for students, engineers and entrepreneurs who are looking for some code that they can take and implement directly on the job.

Understanding Probability: Chance Rules in Everyday Life
A bit pricey when compared to the first one, but I like the look and feel of the text used. It is simple to read and understand, which is vital, especially if you are trying to get into the subject.

Data Mining: Practical Machine Learning Tools and Techniques, Third Edition (The Morgan Kaufmann Series in Data Management Systems)
This one is a must-have if you want to learn machine learning. The book is beautifully written and ideal for the engineer/student who doesn't want to get too much into the details of a machine learning approach but wants a working knowledge of it. There are some great examples and test data in the textbook too.

Discovering Statistics Using R
This is a good book if you are new to statistics & probability while simultaneously getting started with a programming language. The book supports R and is written in a casual humorous way making it an easy read. Great for beginners. Some of the data on the companion website could be missing.