Complexity is not a cause of confusion. It is a result of it.

Statistics: When Numbers Lie

Mark Twain popularized the saying,

There are three kinds of lies: lies, damned lies, and statistics.

Another quote, attributed to Steven Wright, goes like this,

47.3% of all statistics are made up on the spot.

There is also a detailed account of how to lie with statistics. We all love numbers. Any talk that refers to numbers tends to be taken as correct and appropriate.

How to Understand Numbers

Most of the time, the trouble lies in how people read the numbers presented to them, in particular in understanding what those numbers actually mean. Consider the following cases.

A corporation was able to announce the following statistics: total number of shareholders, 3003; average shares per shareholder, 660. That looks nice, almost like a democracy in which everyone has an equal say in the proceedings. In reality, just three people hold three-quarters of the shares and the remaining 3000 people hold the other quarter. What looked like a happy number is actually not so happy.

Shareholding pattern - the average cheats
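The shareholder arithmetic can be checked with a few lines of Python. This is only a sketch: the text says three people hold three-quarters of the shares, and the code additionally assumes that those three hold equal amounts and that the remaining 3000 also hold equal amounts. The variable names are made up for illustration.

```python
from statistics import mean, median

TOTAL_HOLDERS = 3003
AVERAGE_SHARES = 660
total_shares = TOTAL_HOLDERS * AVERAGE_SHARES

BIG_HOLDERS = 3
small_holders = TOTAL_HOLDERS - BIG_HOLDERS

# Assumption: each group splits its portion equally among its members.
big_each = total_shares * 3 / 4 / BIG_HOLDERS
small_each = total_shares / 4 / small_holders

holdings = [big_each] * BIG_HOLDERS + [small_each] * small_holders

mean_holding = mean(holdings)      # the number the corporation announced
median_holding = median(holdings)  # what a typical shareholder really owns

print(f"mean   = {mean_holding:.1f} shares")
print(f"median = {median_holding:.3f} shares")
print(f"each big holder = {big_each:.0f} shares")
```

Under these assumptions the mean stays at 660 while the median, which describes the typical shareholder, is only about a quarter of that.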

Here is another example. Is this an easy way of becoming rich? If the average per capita income is $1,500, can you make 22K just by adding more people to your family?

Per capita income - average income fails

Now, if you do that, it is like shooting yourself in the head: the per capita figure is an average over everyone, not an amount paid out for each family member. One can always draw a parallel with the batting average of the Indian team. It has always been the case that we rarely get a decent score, barring a few occasions. With cricket it is obvious: one huge not-out score can skew the average in a big way.

Infinite Monkeys

The most fascinating argument involves infinite monkeys and a typewriter. There is a theorem which says:

Given enough time, a hypothetical monkey typing at random would, as part of its output, almost surely produce all of Shakespeare’s plays.

“All of Shakespeare’s?”, you could be screaming. There is an even stronger argument which says that an army of monkeys typing forever could type almost all of the British Library. The thing to note here is “infinite time”. Anything could happen, given enough time.

Another point to ponder: when you toss a fair coin, you are supposed to get heads and tails equally often, but it hardly ever happens like that. If you have tossed it 10 times, an uneven split such as 7-3 is entirely plausible. What does that mean? There is something known as the law of large numbers. It says:

The average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.
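A small simulation makes the statement concrete. The sketch below assumes a fair coin and uses Python's pseudo-random generator with a fixed seed; `heads_fraction` is a name invented here for illustration. It also computes the exact chance that 10 tosses split 5-5, which is 252 out of 1024, or roughly one in four.

```python
import random
from math import comb

def heads_fraction(n_tosses, seed=0):
    """Toss a simulated fair coin n_tosses times and return the share of heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# Exact probability that 10 fair tosses give exactly 5 heads and 5 tails.
p_even_split = comb(10, 5) / 2**10

print(f"P(5-5 split in 10 tosses) = {p_even_split:.3f}")
for n in (10, 100, 10_000, 100_000):
    print(f"{n:>7} tosses: fraction of heads = {heads_fraction(n):.4f}")
```

With only a handful of tosses the fraction of heads can sit well away from one half; it is only as the number of tosses grows that it settles near the expected value.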

Now, even if by fluke the monkeys typed Shakespeare, it means there would also be a great many sequences of strings which are meaningless. Again, the point to note is “large number of trials”.

Catching A Bus or Train

Say that at the bus stop near your home the average time between two buses is 10 minutes; assume one is due at every tenth minute of the hour. Your experience, though, could be telling you a different story. Suppose you reach the bus stop at the 3rd minute. By the average, you need to wait 7 minutes for the next bus. How many times have you actually got a bus after 7 minutes? In reality, the bus arrives late.

The phenomenon is explained as follows.

An event which has already started (before you started observing it) will take more time to finish than expected.

In our case, the event is the arrival of the bus. The interval between two buses has started even before you arrive at the bus stop. To check the average waiting time, you would need to be at the bus stop when one bus leaves and wait for the next one to arrive. Always remember that the figure is only an average, and that our intuition does not readily agree with this. The next section gives a better example.
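The bus-stop effect can be tried out in a simulation. The sketch below rests on an assumption that is not in the text: the gaps between buses are random (exponentially distributed) with a mean of 10 minutes, rather than exactly 10 minutes each. The function name and its parameters are invented for illustration. A rider who turns up at a random moment is more likely to land inside a long gap than a short one, which is what pushes the average wait up.

```python
import bisect
import random

def average_wait(random_gaps, mean_gap=10.0, n_buses=100_000,
                 n_riders=50_000, seed=0):
    """Average wait of riders who show up at uniformly random times."""
    rng = random.Random(seed)
    arrivals = []
    t = 0.0
    for _ in range(n_buses):
        # Assumption: random gaps are exponential with the given mean.
        gap = rng.expovariate(1.0 / mean_gap) if random_gaps else mean_gap
        t += gap
        arrivals.append(t)

    total_wait = 0.0
    for _ in range(n_riders):
        you = rng.uniform(0.0, arrivals[-1])
        # Index of the first bus arriving at or after your arrival time.
        nxt = bisect.bisect_left(arrivals, you)
        total_wait += arrivals[nxt] - you
    return total_wait / n_riders

regular_wait = average_wait(random_gaps=False)
irregular_wait = average_wait(random_gaps=True)
print(f"buses exactly 10 min apart:     wait ~ {regular_wait:.2f} min")
print(f"buses 10 min apart on average:  wait ~ {irregular_wait:.2f} min")
```

With a perfectly regular schedule the average wait comes out near 5 minutes, half the gap. With irregular buses that are still 10 minutes apart on average, the average wait comes out near 10 minutes.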

America: Land of Fertility, Is it?

In the United States, the average life expectancy is high and is often cited as proof of a better lifestyle. In America’s case, life expectancy is 78.4 years. How should we interpret this?

Consider this: in any country with huge immigration, the life expectancy will look high. Suppose the US had a law saying only people above 50 can migrate; then the average life expectancy of those migrants will for sure be more than 50. The reason? Anyone who migrates at age 50 is certainly not going to die before 50. The chance of someone dying at 10, 23, 37 or 48 is ruled out, so the average age is skewed.
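The same reasoning can be shown with a toy calculation. The ages at death below are invented purely for illustration (they are not real data) and the age-50 rule is the hypothetical law from the paragraph above.

```python
from statistics import mean

# Made-up ages at death, for illustration only (not real data).
ages_at_death = [10, 23, 37, 48, 62, 75, 81, 90]
MIGRATION_AGE = 50  # the hypothetical "only people above 50 can migrate" rule

# Average over everyone, including those who died young.
everyone = mean(ages_at_death)

# Average over those who lived long enough to migrate.
survivors = [age for age in ages_at_death if age > MIGRATION_AGE]
migrants_only = mean(survivors)

print(f"average age at death, everyone:      {everyone}")
print(f"average age at death, migrants only: {migrants_only}")
```

Nobody lives longer because of the rule; the early deaths are simply left out of the group being averaged.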

The same applies to the bus arrival. If a bus has already left, the next arrival tends to take longer than you expect. In general, an event which had already started before you began observing it will take more time to complete than expected. Confusing? Forget it. 🙂

