Posts

Showing posts from March, 2014

Linear Regression, Transforms and Regularization

This write up is about the simple linear regression and ways to make it robust to outliers and non linearity. The linear regression method is a simple and powerful method. It is powerful because it helps compress a lot of information through a simple straight line. The complexity of the problem is vastly simplified. However being so simple comes with its set of limitations. For example, the method assumes that after a fit is made, the differences between the predicted and actual values are normally distributed. In reality, we rarely run into such ideal conditions. Almost always there is non-normality and outliers in the data that makes fitting a straight line insufficient. However there are some tricks you could do to make it better.

Statistics: A good book to learn statistics

As an example data set consider some dummy data shown in the table/chart below. Notice, value 33 is an outlier. When charted. you can see there is some non-linearity in the data too, for hig…

The Lazy Apprentice

Q: A shopkeeper hires an apprentice for his store which gets one customer per minute on average uniformly randomly. The apprentice is expected to leave the shop open until at least 6 minutes have passed when no customer arrives. The shop keeper suspects that the apprentice is lazy and wants to close the shop at a shorter notice. The apprentice claims (and the shopkeeper verifies), that the shop is open for about 2.5hrs on average. How could the shopkeeper back his claim?

Statistics: A good book to learn statistics

A: Per the contract, at least 6 minutes should pass without a single customer showing up before the apprentice can close the shop. To solve this lets tackle a different problem first. Assume you have a biased coin with a probability $$p$$ of landing heads. What is the expected number of tosses before you get $$n$$ heads in a row. The expected number of tosses to get to the first head is simple enough to calculate, its $$\frac{1}{p}$$. How about two head…