
The Forgotten Geometric Mean.

People who work with data are often asked to create an index of some sort: a single number that captures a set of key business metrics. If you run a site (or an app), you might want an engagement index that signals good things when it trends up and bad things when it trends down. The creators of such metrics (think analysts) tend to prefer a weighted arithmetic mean of the influencing factors. If the influencing factors are f1, f2, f3 (say) with weights w1, w2, w3 (normalized to sum to one), then the index would be computed as

$$I = w_1 f_1 + w_2 f_2 + w_3 f_3$$

However, what does not get factored in is the final consumers of the index (think product managers), and there could be many of them. They will invariably try to reconcile it with whatever other numbers they have handy. For example, if clicks on a site went up 20%, the index may be up by just 5% (say), or vice versa. If resources are being allocated based on the movement of such an index, this will invariably lead to contention over the right weighting for each factor.
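A minimal sketch of the mismatch described above. The factor names, levels, and weights are hypothetical, chosen only so the arithmetic is easy to follow; a different choice of weights would give a different index movement for the same 20% rise in clicks.

```python
def weighted_index(factors, weights):
    """Weighted arithmetic mean of the factors (weights are normalized here)."""
    total_weight = sum(weights.values())
    return sum(weights[name] * value for name, value in factors.items()) / total_weight


# Hypothetical factors and weights, for illustration only.
weights = {"clicks": 0.25, "sessions": 0.50, "signups": 0.25}
before = {"clicks": 100.0, "sessions": 100.0, "signups": 100.0}
after = dict(before, clicks=120.0)  # clicks up 20%, everything else flat

index_change = weighted_index(after, weights) / weighted_index(before, weights) - 1
print(f"clicks change: 20.0%, index change: {index_change:.1%}")
```

With these assumed numbers the index moves by a quarter of the change in clicks, purely because of the weight that was picked for clicks.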

This is meant to be a short write-up on some really cool features of the geometric mean. The geometric mean is not meant to replace a simple index based on the arithmetic mean, but it is definitely worth a thought. To illustrate, let's take a look at a simple two-feature index. If the features are X and Y, with weights w1 and w2, the arithmetic mean index can be represented as

$$I = w_1 X + w_2 Y$$

To see how it responds to changes, let's take the derivative with respect to X.

$$\frac{\partial I}{\partial X} = w_1$$

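Clearly the derivative depends on the chosen weight. Let's see what happens when we choose the geometric mean instead.

$$I = \sqrt{XY}$$

Again, to see how it responds to change, let's take the derivative with respect to X.

$$\frac{\partial I}{\partial X} = \frac{1}{2}\sqrt{\frac{Y}{X}}$$

which can be further simplified to

$$\frac{\partial I}{\partial X} = \frac{1}{2}\,\frac{\sqrt{XY}}{X} = \frac{I}{2X}$$

The result is a useful derived condition (holding Y fixed)

$$\frac{dI}{I} = \frac{1}{2}\,\frac{dX}{X}$$

i.e. the percentage change in the index is directly proportional to the percentage change in the feature.
Note that there are no hand-chosen weights here. A five percent change in one of the influencing factors results in a proportional percentage change in the index (about half as large for this two-feature index, to first order), regardless of the current levels of X and Y. Extremely useful!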
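As a rough numerical check of this property, the sketch below bumps X by 5% at a few hypothetical (X, Y) levels and compares how a two-feature arithmetic index and a geometric index respond. The equal weights of 0.5 and the scenario values are assumptions made for illustration.

```python
import math


def arithmetic_index(x, y, w1=0.5, w2=0.5):
    """Weighted arithmetic mean of two features (weights are assumed)."""
    return w1 * x + w2 * y


def geometric_index(x, y):
    """Geometric mean of two features."""
    return math.sqrt(x * y)


def pct_change(index_fn, x, y, bump=0.05):
    """Relative change in the index when x rises by `bump` and y stays fixed."""
    return index_fn(x * (1 + bump), y) / index_fn(x, y) - 1


# Hypothetical feature levels: X small, X equal to Y, X large.
scenarios = [(10.0, 1000.0), (100.0, 100.0), (1000.0, 10.0)]
arithmetic_changes = [pct_change(arithmetic_index, x, y) for x, y in scenarios]
geometric_changes = [pct_change(geometric_index, x, y) for x, y in scenarios]

for (x, y), a, g in zip(scenarios, arithmetic_changes, geometric_changes):
    print(f"X={x:>6}, Y={y:>6}: arithmetic {a:.2%}, geometric {g:.2%}")
```

The arithmetic index responds very differently depending on how large X is relative to Y, while the geometric index moves by the same percentage in every scenario.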

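Yet another aspect consumers like to quantify is growth. If the index grew at rates x1 and x2 in two consecutive periods (quarters or years, say), what is the average growth per period? If we took it as the average of x1 and x2, then the growth after two periods would be estimated as

$$\left(1 + \frac{x_1 + x_2}{2}\right)^2 = 1 + (x_1 + x_2) + \left(\frac{x_1 + x_2}{2}\right)^2$$

Contrast that to the actual growth

$$(1 + x_1)(1 + x_2) = 1 + (x_1 + x_2) + x_1 x_2$$

Clearly some terms cancel out. We are left comparing

$$\left(\frac{x_1 + x_2}{2}\right)^2 \ \text{and} \ x_1 x_2, \qquad \text{i.e.} \qquad \frac{x_1 + x_2}{2} \ \text{and} \ \sqrt{x_1 x_2}$$

Notice one of them is the arithmetic mean and the other is the geometric mean. We also know from a well-established theorem (the AM-GM inequality) that the arithmetic mean of non-negative numbers is never smaller than their geometric mean, with equality only when x1 = x2. So unless the two growth rates are equal, we would end up overestimating the growth!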
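A small sketch of the overestimate, using hypothetical growth rates of 10% and 50%. The function names are made up for this example.

```python
def projected_growth(x1, x2):
    """Two-period growth implied by compounding the arithmetic mean of the rates."""
    average_rate = (x1 + x2) / 2
    return (1 + average_rate) ** 2 - 1


def actual_growth(x1, x2):
    """Two-period growth obtained by compounding the actual rates."""
    return (1 + x1) * (1 + x2) - 1


x1, x2 = 0.10, 0.50  # hypothetical growth rates
gap = projected_growth(x1, x2) - actual_growth(x1, x2)

print(f"projected: {projected_growth(x1, x2):.2%}")
print(f"actual:    {actual_growth(x1, x2):.2%}")
print(f"overestimate: {gap:.2%}")
```

Algebraically the gap equals ((x1 - x2) / 2)^2, so it grows with the spread between the two rates and vanishes only when they are equal.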

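So how would we choose a value to project as an average growth rate? We are looking for a β in the equation below

$$(1 + \beta)^2 = (1 + x_1)(1 + x_2) \quad \Rightarrow \quad \beta = \sqrt{(1 + x_1)(1 + x_2)} - 1$$

Yet again, stating the average growth as a geometric mean (here, of the growth factors 1 + x1 and 1 + x2) gives the end user a handy metric to work with.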
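One way to compute such a β is sketched below. The function name is hypothetical, and it generalizes the two-period equation to any number of periods, which is an extension of what the text states. It relies on `math.prod`, available in Python 3.8 and later.

```python
import math


def average_growth_rate(rates):
    """Return beta such that (1 + beta) ** n equals the product of (1 + r)."""
    compounded = math.prod(1 + r for r in rates)
    return compounded ** (1 / len(rates)) - 1


rates = [0.10, 0.50]  # hypothetical growth rates for two periods
beta = average_growth_rate(rates)
recompounded = (1 + beta) ** len(rates)

print(f"average growth rate: {beta:.2%}")
print(f"compounding it back: {recompounded - 1:.2%}")
```

Compounding β over the same number of periods reproduces the actual total growth, which the arithmetic mean of the rates does not.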

