Tuesday, February 21, 2017

What is up with all of the country ranking "Indexes"?

Link to Video related to this Blog Post

A fairly recent, and continuing, rage among data wonks is the creation of ranking indices (or indexes). Of course, we have had indexes for many, many years; an old historical example is the Dow Jones Industrial Average. And of course, we use similar techniques to compute FIFA World Rankings, university rankings, and more. What is more recent is the creation of myriad indices for countries around the world (as well as US states), which are used to rank them on every conceivable topic. Here is a random selection:

The Economic Freedom Index
The Global Peace Index
The Good Country Index
The Global Competitiveness Index
Rule of Law Index
World Press Freedom Index
Happiness Index
Global Innovation Index
Global Entrepreneurship Index
Social Progress Index
Corruption Perceptions Index
Animal Protection Index
The Big Mac Index
Global Hunger Index
Quality of Life Index
Quality of Death Index

I could go on listing these for days! Basically, if you want to create an index, sit down and decide on some factors that you think would go into explaining an idea such as "Happiness". Collect some data on these factors, adjust them to a similar scale, and then figure out a way to create a sort of weighted average of them. A BurkeyAcademy subscriber wrote in to ask me about the United Nations Development Programme's "Human Development Index". He wanted to know how it was calculated. Let's have a look.
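Here is a minimal sketch of that recipe in Python, just to make the steps concrete. Everything in it is an assumption for illustration: the factor names, the numbers, the equal weights, and the choice of min-max scaling to put the factors on a 0-to-1 scale. Real indices each make their own choices here.

```python
def min_max_scale(values):
    """Rescale a list of numbers so the smallest is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale, so return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def composite_index(factors, weights):
    """Weighted average of min-max scaled factors.

    factors: dict mapping factor name -> list of values, one per country
    weights: dict mapping factor name -> weight (need not sum to 1)
    """
    total_weight = sum(weights.values())
    scaled = {name: min_max_scale(vals) for name, vals in factors.items()}
    n_countries = len(next(iter(factors.values())))
    return [
        sum(weights[name] * scaled[name][i] for name in factors) / total_weight
        for i in range(n_countries)
    ]


# Made-up data for three hypothetical countries (not from any real index).
factors = {
    "income": [10.0, 20.0, 30.0],
    "leisure_hours": [40.0, 20.0, 30.0],
}
weights = {"income": 0.5, "leisure_hours": 0.5}  # subjective, chosen by the index creator

scores = composite_index(factors, weights)
print(scores)
```

Changing the weights changes the ranking, which is exactly why the choice of weights matters so much in any of the indices listed above.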

In their 2015 report, they list the variables used to construct their index: life expectancy, expected years of schooling (for young children now), mean years of schooling for all in the country, and GNI per capita. The table also shows the difference between each country's GNI ranking and its Human Development ranking, though it is unclear if (or how) they use this in their ranking (they could include this difference in a recursive fashion using several steps).

Excerpt from Table 1, page 208, Human Development Report 2015, UN Development Programme

To figure out how this was calculated, my go-to tool is a regression. Regression attempts to discover the formula for a relationship once you assume the function's basic form (here, a linear one). Using a subset of 34 countries gives the following results:

For the uninitiated, the important things are these: the "Multiple R-squared" of 0.9966 tells us that our equation is an almost perfect fit, and the "Estimates" give us the following formula:

HUMDEV = -0.154 + 0.00767*LE + 0.0114*EYSB + 0.0187*MYS + 0.0000005936*GNI - 0.001154*GNIHDI
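To see what the fitted equation above does with actual numbers, here is a small Python sketch that simply evaluates it. The coefficients are the ones from the regression; the function name, the parameter names, and the example country are hypothetical. I am also assuming that LE is life expectancy, EYSB is expected years of schooling, MYS is mean years of schooling, GNI is GNI per capita, and GNIHDI is the "GNI rank minus HDI rank" column.

```python
def predicted_hdi(le, eys, mys, gni, gni_minus_hdi_rank):
    """Evaluate the fitted linear equation from the regression in this post.

    This is the regression approximation, not the UNDP's own formula.
    """
    return (
        -0.154
        + 0.00767 * le
        + 0.0114 * eys
        + 0.0187 * mys
        + 0.0000005936 * gni
        - 0.001154 * gni_minus_hdi_rank
    )


# A hypothetical country, NOT taken from the report.
example = predicted_hdi(le=80, eys=16, mys=12, gni=40000, gni_minus_hdi_rank=0)
print(round(example, 3))
```

For these made-up inputs the prediction comes out at about 0.89. Notice how little the GNI term contributes compared with life expectancy: its coefficient is tiny because GNI is measured in tens of thousands of dollars.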

It appears that the "GNI rank - HDI rank" difference does help predict their index. Exactly how they used it, I don't know; they might explain it in the document, but I did not read it fully. So now we understand, at least approximately, how they calculated it. What we do not know is exactly why they chose these particular numbers as weights. Again, perhaps they mention it in the 250-page report, but generally these weights are chosen subjectively, based on how important the creator believes each factor to be, and the result is then normalized to be an index between zero and one or zero and 100.
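The reverse-engineering idea used in this post (regress a published index on its inputs to recover the weights) can be sketched in plain Python. This is only an illustration under loud assumptions: the data are made up, the "true" weights are ones I invented, and the index is assumed to be exactly linear in its inputs. It solves the least-squares normal equations directly, which is fine for a toy example; for real data a statistics package is the better tool.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix copy
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares with an intercept.

    Returns [intercept, b1, b2, ...] by solving (X'X) b = X'y.
    """
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical index: intercept 0.2, weights 0.5 and 0.3 (invented for this sketch).
TRUE_WEIGHTS = [0.2, 0.5, 0.3]
rows = [(0.1, 0.9), (0.4, 0.2), (0.7, 0.5), (0.3, 0.8), (0.9, 0.1), (0.6, 0.6)]
index_values = [
    TRUE_WEIGHTS[0] + TRUE_WEIGHTS[1] * a + TRUE_WEIGHTS[2] * b for a, b in rows
]

estimated = ols(rows, index_values)
print([round(w, 4) for w in estimated])
```

Because the made-up index really is linear, the regression hands back the weights that went in. When the published index is built some other way, a linear regression can still fit very well over a limited range of countries without revealing the creator's actual formula.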

Go and read about some other indices, and have a go at creating your own! Here is a link to an excellent article on indexes from The Economist magazine.