The Forgotten Geometric Mean.

People who work with data are often asked to build an index: a single number that captures a set of key business metrics. A site or an app, for example, might want an engagement index that signals good news when it trends up and bad news when it trends down. The creators of such metrics (think analysts) tend to prefer a weighted arithmetic mean of the influencing factors. If the influencing factors are f1, f2, f3 (say) with weights w1, w2, w3, then the index would be computed as

$$I = w_1 f_1 + w_2 f_2 + w_3 f_3$$
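As a minimal sketch of such a weighted arithmetic index, the snippet below uses made-up factor values and weights. The names `weighted_index`, `factors` and `weights` are illustrative only and do not come from any library.

```python
def weighted_index(factors, weights):
    """Weighted arithmetic mean index: the sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


# Hypothetical factor values (f1, f2, f3) and hand-chosen weights (w1, w2, w3).
factors = [120.0, 45.0, 300.0]
weights = [0.5, 0.3, 0.2]

index = weighted_index(factors, weights)
print(f"index = {index:.2f}")
```

The weights here are an arbitrary choice, which is exactly the part of the design that tends to get questioned later.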

However, what does not get factored in are the final consumers of the index (think product managers), and there could be many of them. They will invariably check it against something else they have handy. For example, clicks on a site may have gone up 20% while the index is up by just 5%, or vice versa. If resources are allocated based on the movement of such an index, this invariably leads to contention over the right weight to give each factor.

This is meant to be a short write-up on some really cool features of the geometric mean. The geometric mean is not meant to replace an index based on a simple arithmetic mean, but it is definitely worth a thought. To illustrate, let's take a look at a simple two-feature index. If the features are X and Y, with weights w1 and w2, the arithmetic mean index can be represented as

$$I_A = w_1 X + w_2 Y$$

To see how it responds to a change in one feature, say X, let's take the derivative with respect to X.

$$\frac{\partial I_A}{\partial X} = w_1$$

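Clearly the derivative depends on the chosen weight. Let's see what happens when we choose the geometric mean.

$$I_G = \sqrt{XY}$$

Again, to see how it responds to a change in X (holding Y fixed), let's take the derivative.

$$\frac{\partial I_G}{\partial X} = \frac{1}{2}\sqrt{\frac{Y}{X}}$$

which can be further simplified to

$$\frac{\partial I_G}{\partial X} = \frac{1}{2}\,\frac{I_G}{X}$$

The result is a useful condition

$$\frac{dI_G}{I_G} = \frac{1}{2}\,\frac{dX}{X}$$

i.e. the percentage change in the index is directly proportional to the percentage change in the feature.
Note that there are no hand-chosen weights here: the constant of proportionality is fixed by the number of features (1/2 for two features). A five percent change in one of the influencing factors results in a proportional percentage change in the index, roughly 2.5% in this two-feature case. Extremely useful!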
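The contrast between the two indices can be checked numerically. The following is a sketch under stated assumptions: the feature values, the weight pairs and the helper names (`arithmetic_index`, `geometric_index`, `pct_change`) are all hypothetical, chosen only to illustrate a 5% increase in X with Y held fixed.

```python
import math


def arithmetic_index(x, y, w1, w2):
    """Weighted arithmetic mean of two features."""
    return w1 * x + w2 * y


def geometric_index(x, y):
    """Geometric mean of two positive features."""
    return math.sqrt(x * y)


def pct_change(before, after):
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


x, y = 200.0, 50.0  # hypothetical feature values
bump = 1.05         # a 5% increase in X, with Y held fixed

# The arithmetic index responds differently for each choice of weights.
arith_changes = {}
for w1, w2 in [(0.2, 0.8), (0.5, 0.5), (0.8, 0.2)]:
    before = arithmetic_index(x, y, w1, w2)
    after = arithmetic_index(x * bump, y, w1, w2)
    arith_changes[(w1, w2)] = pct_change(before, after)
    print(f"arithmetic, w1={w1}, w2={w2}: {arith_changes[(w1, w2)]:.2f}%")

# The geometric index has no weights to choose.
geo_change = pct_change(geometric_index(x, y), geometric_index(x * bump, y))
print(f"geometric: {geo_change:.2f}%")
```

With these numbers the arithmetic index moves by anywhere between 2.5% and about 4.7%, depending on the weights. The geometric index moves by about 2.47% whatever the feature levels are. The exact figure is sqrt(1.05) - 1, which the first-order approximation rounds to 2.5%.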

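Yet another aspect consumers like to quantify is growth. If the index grew at rates x1 and x2 in two consecutive periods (say, years), what is the average growth per period? If we took it as the average of x1 and x2, then the growth after two periods would be estimated as

$$\left(1 + \frac{x_1 + x_2}{2}\right)^2 = 1 + (x_1 + x_2) + \left(\frac{x_1 + x_2}{2}\right)^2$$

Contrast that with the actual growth

$$(1 + x_1)(1 + x_2) = 1 + (x_1 + x_2) + x_1 x_2$$

Clearly some terms cancel out. We are left comparing

$$\left(\frac{x_1 + x_2}{2}\right)^2 \;\text{and}\; x_1 x_2, \qquad \text{i.e.} \qquad \frac{x_1 + x_2}{2} \;\text{and}\; \sqrt{x_1 x_2}$$

Notice that one of them is the arithmetic mean and the other is the geometric mean (for non-negative x1 and x2). We also know from a well-established theorem, the AM-GM inequality, that the arithmetic mean is never smaller than the geometric mean, with equality only when x1 = x2. So unless the two growth rates are equal, we would always end up overestimating the growth!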
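A quick numerical check of the overestimate, as a sketch with hypothetical growth rates of 10% and 30% (the helper `compounded` is an illustrative name, not a library function):

```python
def compounded(rates):
    """Actual growth factor after applying each period's rate in turn."""
    total = 1.0
    for r in rates:
        total *= 1.0 + r
    return total


x1, x2 = 0.10, 0.30  # hypothetical growth rates for two consecutive periods

# Naive projection: compound the arithmetic mean of the rates over two periods.
naive_rate = (x1 + x2) / 2
naive_factor = (1 + naive_rate) ** 2

# Actual growth: compound the two observed rates.
actual_factor = compounded([x1, x2])

print(f"naive projection: {naive_factor:.4f}")
print(f"actual growth:    {actual_factor:.4f}")
print(f"overestimate:     {naive_factor - actual_factor:.4f}")
```

Here the naive projection is a factor of 1.44 against an actual 1.43. The gap equals ((x1 - x2) / 2)^2, so it widens as the two rates move further apart.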

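So how would we choose a value to project as an average growth rate? We are looking for a β that satisfies the equation below

$$(1 + \beta)^2 = (1 + x_1)(1 + x_2)$$

Solving gives $\beta = \sqrt{(1 + x_1)(1 + x_2)} - 1$, so one plus the average growth rate is the geometric mean of the growth factors $1 + x_1$ and $1 + x_2$. Yet again, stating the average growth as a geometric mean gives the end user a handy metric to work with.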
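The same idea extends to any number of periods. The sketch below assumes per-period growth rates given as fractions; `average_growth_rate` is a hypothetical helper name, and `math.prod` requires Python 3.8 or later.

```python
import math


def average_growth_rate(rates):
    """Constant per-period rate that compounds to the same total growth.

    This is the geometric mean of the growth factors (1 + r), minus one.
    """
    if not rates:
        raise ValueError("rates must not be empty")
    factor = math.prod(1.0 + r for r in rates)
    return factor ** (1.0 / len(rates)) - 1.0


rates = [0.10, 0.30]  # hypothetical growth rates x1, x2
beta = average_growth_rate(rates)

# Compounding beta over the same number of periods recovers the actual growth.
recovered = (1.0 + beta) ** len(rates)
actual = math.prod(1.0 + r for r in rates)

print(f"beta = {beta:.4%}")
print(f"recovered = {recovered:.4f}, actual = {actual:.4f}")
```

For these rates β comes out a little under the 20% arithmetic average, and compounding it over two periods reproduces the actual growth rather than overshooting it.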
