Thursday, February 7, 2013

The Case of Two Mariners

Q: Two mariners, A and B, report to the skipper of a ship that they are at distances \(d_1\) and \(d_2\) from the shore. The skipper knows from historical data that the mariners' errors are normally distributed with standard deviations \(s_1\) and \(s_2\), respectively. What should the skipper do to arrive at the best estimate of how far the ship is from the shore?

A: At first glance, the simplest solution appears to be to take the estimate of the mariner with the lower standard deviation: if \( s_1 < s_2\), pick \(d_1\); otherwise pick \(d_2\).

But there is a way to do better than that. Suppose you take a weighted average of the two, with weight \(\omega\) on \(d_1\) and \(1 - \omega\) on \(d_2\).

$$ d_{blended} = \omega\times d_1 + ( 1 - \omega)\times d_2 $$

Assuming the two mariners' errors are independent of each other, the variance of the blended estimate is given by

$$ Var(d_{blended}) = \omega^{2}\times s_{1}^{2} + (1 - \omega)^{2}\times s_{2}^{2} $$
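To get a feel for this formula, here is a minimal Python sketch. The function name `blended_variance` and the example values \(s_1 = 1\), \(s_2 = 2\) are illustrative assumptions, not part of the original problem, and the formula only holds if the two errors are independent.

```python
def blended_variance(w, s1, s2):
    """Variance of w*d1 + (1 - w)*d2, assuming independent errors
    with standard deviations s1 and s2."""
    return w**2 * s1**2 + (1 - w)**2 * s2**2


# Illustrative values (assumed): mariner A is twice as precise as B.
s1, s2 = 1.0, 2.0

# Compare trusting A alone (w = 1), an even split, and a tilt toward A.
for w in (1.0, 0.5, 0.8):
    print(f"w = {w:.1f}: variance = {blended_variance(w, s1, s2):.2f}")
```

With these numbers, trusting only the better mariner gives a variance of 1.00, an even split gives 1.25, and a weight of 0.8 gives 0.80, so a well-chosen blend can beat either mariner alone.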

Next, we find the value of \(\omega\) that minimizes the variance \(Var(d_{blended})\). To do this, we take the derivative of \(Var(d_{blended})\) with respect to \(\omega\) and set it to zero, as shown below

$$\frac{d (Var(d_{blended}))}{d\omega} = 2\omega \times s_{1}^{2} - 2(1 - \omega) \times s_{2}^{2} = 0$$

The above equation yields

$$\omega = \frac{s_{2}^{2}}{s_{1}^{2} + s_{2}^{2}}$$
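As a sanity check on this closed form, the sketch below compares it against a brute-force scan over candidate weights. The helper names `optimal_weight` and `blended_variance`, the grid step, and the values \(s_1 = 1\), \(s_2 = 2\) are all assumptions made for illustration.

```python
def optimal_weight(s1, s2):
    """Weight on d1 that minimizes the blended variance (closed form)."""
    return s2**2 / (s1**2 + s2**2)


def blended_variance(w, s1, s2):
    """Variance of the blend, assuming independent errors."""
    return w**2 * s1**2 + (1 - w)**2 * s2**2


# Illustrative values (assumed, not from the original problem).
s1, s2 = 1.0, 2.0

# Brute-force check: scan w over [0, 1] in steps of 0.001
# and keep the weight with the smallest variance.
grid = [i / 1000 for i in range(1001)]
w_grid = min(grid, key=lambda w: blended_variance(w, s1, s2))

print("closed form:", optimal_weight(s1, s2))
print("grid search:", w_grid)
```

For these values both approaches should land on a weight of 0.8 for \(d_1\): the more precise mariner gets the larger share, but the less precise one is not ignored.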

The above value of \(\omega\) minimizes \(Var(d_{blended})\) because the second derivative, \(2(s_{1}^{2} + s_{2}^{2})\), is always greater than zero. Plugging this value of \(\omega\) into the estimating equation and simplifying yields

$$ d_{blended} = \frac{s_{2}^{2}\times d_1}{s_{1}^{2} + s_{2}^{2}} + \frac{s_{1}^{2}\times d_2}{s_{1}^{2} + s_{2}^{2}} $$
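The formula above translates into a few lines of Python. This is only a sketch: the function name `blend` and the sample reports are hypothetical. The returned standard deviation is not stated in the text; it comes from substituting the optimal \(\omega\) back into the variance expression and assumes independent errors.

```python
import math


def blend(d1, d2, s1, s2):
    """Inverse-variance weighted blend of two independent estimates.

    Returns the blended distance and its standard deviation.
    """
    var1, var2 = s1**2, s2**2
    total = var1 + var2
    # Each report is weighted by the *other* mariner's variance.
    d_blended = (var2 * d1 + var1 * d2) / total
    # Standard deviation of the blend at the optimal weight.
    s_blended = math.sqrt(var1 * var2 / total)
    return d_blended, s_blended


# Hypothetical reports: A says 10 (s1 = 1), B says 12 (s2 = 2).
d, s = blend(10.0, 12.0, 1.0, 2.0)
print(f"blended distance = {d:.2f}, std dev = {s:.3f}")
```

In this example the blended distance is 10.40, closer to A's report because A is the more precise mariner, and the blended standard deviation of about 0.894 is below A's own value of 1.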

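This estimate is guaranteed to be more accurate than each of the individual estimates. More precisely, its variance is lower than both \(s_{1}^{2}\) and \(s_{2}^{2}\), provided the two errors are independent and unbiased and neither standard deviation is zero. Pretty slick!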
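A short derivation backs up the claim above, under the assumptions that the two errors are independent and that both \(s_1\) and \(s_2\) are nonzero. Substituting the optimal \(\omega\) back into the variance expression gives

$$ Var(d_{blended}) = \frac{s_{2}^{4}\times s_{1}^{2} + s_{1}^{4}\times s_{2}^{2}}{(s_{1}^{2} + s_{2}^{2})^{2}} = \frac{s_{1}^{2}\times s_{2}^{2}}{s_{1}^{2} + s_{2}^{2}} $$

Since \(\frac{s_{2}^{2}}{s_{1}^{2} + s_{2}^{2}} < 1\), this is strictly smaller than \(s_{1}^{2}\), and by the same argument with the indices swapped it is strictly smaller than \(s_{2}^{2}\). Equivalently,

$$ \frac{1}{Var(d_{blended})} = \frac{1}{s_{1}^{2}} + \frac{1}{s_{2}^{2}} $$

Note that this is a statement about variance, that is, about accuracy on average. It does not mean the blended value is closer to the true distance than both reports on every single occasion.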

Some must-buy books on probability
Fifty Challenging Problems in Probability with Solutions (Dover Books on Mathematics)
This book is a great compilation that covers quite a few puzzles. What I like about these puzzles is that they are all tractable and don't require much advanced mathematics to solve.

Introduction to Algorithms
This is a book on algorithms, some of which are probabilistic. The book is a must-have for students, job candidates, and even full-time engineers and data scientists.

Introduction to Probability Theory

An Introduction to Probability Theory and Its Applications, Vol. 1, 3rd Edition

The Probability Tutoring Book: An Intuitive Course for Engineers and Scientists (and Everyone Else!)

Introduction to Probability, 2nd Edition

The Mathematics of Poker
Good read. Overall, poker and blackjack-type card games are a good way to get introduced to probability theory.

Let There Be Range!: Crushing SSNL/MSNL No-Limit Hold'em Games
Easily the most expensive book out there. So if the item above piques your interest and you want to go pro, go for it.

Quantum Poker
Well written and easy to read mathematics. For the Poker beginner.

Bundle of Algorithms in Java, Third Edition, Parts 1-5: Fundamentals, Data Structures, Sorting, Searching, and Graph Algorithms (3rd Edition) (Pts. 1-5)
An excellent resource for students, engineers, and entrepreneurs who are looking for code that they can take and implement directly on the job.

Understanding Probability: Chance Rules in Everyday Life
A bit pricey compared to the first one, but I like the look and feel of the text. It is simple to read and understand, which is vital if you are trying to get into the subject.

Data Mining: Practical Machine Learning Tools and Techniques, Third Edition (The Morgan Kaufmann Series in Data Management Systems)
This one is a must-have if you want to learn machine learning. The book is beautifully written and ideal for the engineer or student who doesn't want to get too deep into the details of a machine-learning approach but wants a working knowledge of it. There are some great examples and test data in the book too.

Discovering Statistics Using R
This is a good book if you are new to statistics and probability and are getting started with a programming language at the same time. The book uses R and is written in a casual, humorous way, making it an easy read. Great for beginners. Some of the data on the companion website could be missing.

1 comment:

  1. "This estimate is guaranteed to be more accurate than each of the individual estimates."

    You haven't proved that. You assumed it.