Statistics

“If you cannot dazzle them with brilliance, baffle them with bullshit.” W.C. Fields

 

One of my mother’s favorite quotes is “There are lies, there are damn lies and there are statistics.” Mark Twain made this quote popular, but it did not originate with him. According to Wikipedia, the origin of the quote is not known.

 

The implication of this quote is obvious: statistics are worse than damn lies. Darrell Huff wrote a book titled “How to Lie with Statistics” in 1954. Although it appears to be a book on how to recognize scams based on number manipulation, it could also be seen as a how-to book.

 

In which case the title could be Liar Liar, Pants on Fire. If I may paraphrase “Guns do not kill people, people kill people”: statistics do not lie; statisticians, or more likely the people who employ them, do. You can also substitute a politician for the statistician if you wish, as the media is full of examples during the romancing of voters.

 

I never did know the difference between lies and damn lies, or where other quotes about lies fall in the pantheon of lies. Another of her favorites is “Everyone will forgive a little white lie but no one will forgive the dark truth.” I could not find the origin of this quote. Maybe my mother made it up because the topic under discussion could create a worse situation than if it was passed over. But this did create some confusion in my mind.

 

Where does the little white lie stand relative to other lies? Does it come before lies and damn lies?

 

And what about the dark truth? It must not be good. Maybe it is worse than statistics. This means statistics are OK if they hide the dark truth.

 

But then we are confronted with “Know the truth and the truth will set you free,” John 8:32 in the New Testament of the Bible. Now I need to know the difference between the truth and the dark truth.

 

If I tell the truth and it turns out to be the dark truth, I will never be forgiven. If I do not tell the dark truth, I will not have to worry about forgiveness, but I will remain enslaved because I can never be set free.

 

I have decided that statistics is the answer to the above conundrum because statistics provides me with the tools to “dazzle them with brilliance or baffle them with bullshit.” Since most people cannot tell the difference, they always assume it is the truth. After all, statistics do not lie.

 

Statistics applies to any system that has data. I have used statistical modeling in my evaluation of loan applicants and in creating several financial products.

 

Probability is the component of statistics that most people know. The word probability probably comes from the word probable, which probably means likely to happen.

 

There are two types of likely to happen. In one, an event has a definite chance of happening, such as rolling a 7 with two dice or picking a card out of a deck. The probabilities are fixed for each set of numbers. The other is based on historical events, which turn the prediction from random guessing into a statistical science, like the weather forecast or who will win a football contest.

 

The most familiar fixed-probability system is the coin toss. It is also the only event in a football game with a fixed-probability outcome. “Heads or tails” are the two expected outcomes when you flip a coin. The statistic is a 50/50 probability of either heads or tails. These are exclusive outcomes because you cannot have both heads and tails. You could theoretically get the coin to land and stay on its edge, but I cannot ever remember someone calling “Edges.”

 

In the football game, the winner of the toss now engages in the other type of statistical model: the choice between kicking and receiving. This choice is based on historical precedents and the coach's desire to gain knowledge of the other team’s offense or defense during the first quarter.

 

I am guessing this is an extremely complicated analysis, or maybe it is just a wild guess. One could argue that if it is a wild guess, it is the same as the coin toss. The guess is indeed between two choices where one excludes the other, but unlike the coin toss, where the odds are known, in this case the odds are not known. If the odds were known, the winner of the toss would always pick the most likely outcome.

 

 Below are some examples of each.

 

Many games of chance are determined by the number of possible outcomes. A coin flip has two if you do not count edges. A single roll of one die has 6 possible outcomes, and two dice have 36 possible combinations. There are 52 cards in a deck, so there is one chance in 52 of drawing a specific card and one chance in 4 of drawing a specific suit, because there are 13 cards in each of 4 suits.
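The counts above can be checked by listing every outcome. This is only a minimal sketch in Python; the variable names are my own, and it assumes fair dice and a standard 52-card deck with no jokers.

```python
from fractions import Fraction
from itertools import product

# Every equally likely (first die, second die) pair: 6 x 6 = 36 of them.
rolls = list(product(range(1, 7), repeat=2))
sevens = [roll for roll in rolls if sum(roll) == 7]
p_seven = Fraction(len(sevens), len(rolls))

# A standard deck: 13 ranks in each of 4 suits (assumed, no jokers).
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))

p_specific_card = Fraction(1, len(deck))
p_specific_suit = Fraction(sum(1 for _, suit in deck if suit == "spades"), len(deck))

print(len(rolls), p_seven)               # 36 1/6
print(p_specific_card, p_specific_suit)  # 1/52 1/4
```

Six of the 36 dice combinations add up to 7, which is why 7 is the most likely total with two dice.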

 

There can be a difference in outcomes in cards, however, if a card drawn is not replaced before a second is drawn. For this example, instead of 52 cards, let's use a jar with 5 red and 5 green marbles. This will make the math easier for me to calculate, but the concept is the same. If I reach into the jar and pick one marble without seeing it, there is an equal chance of getting a green or a red. If I put the marble back and you draw, you also have an equal chance of getting a red or a green. However, if I draw a red and do not put it back, then the odds of you drawing a red are lower than the odds of drawing a green. Since there are now only 9 marbles in the jar, 5 green and 4 red, there are 4 chances in 9 of drawing a red and 5 chances in 9 of drawing a green. That is 44.4% for red versus 55.6% for green (I rounded up for green). (Counting cards here).
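The marble arithmetic can be written out as a short Python sketch. The function name draw_odds is made up for this illustration, and the sketch assumes every marble in the jar is equally likely to be picked.

```python
from fractions import Fraction

def draw_odds(red, green):
    """Return (chance of red, chance of green) for one blind draw from the jar."""
    total = red + green
    return Fraction(red, total), Fraction(green, total)

# First draw, or any draw when the marble is put back: 5 red and 5 green.
p_red_first, p_green_first = draw_odds(5, 5)

# Second draw after a red was taken out and NOT put back: 4 red and 5 green.
p_red_second, p_green_second = draw_odds(4, 5)

print(p_red_first, p_green_first)    # 1/2 1/2
print(p_red_second, p_green_second)  # 4/9 5/9
print(f"{float(p_red_second):.1%} {float(p_green_second):.1%}")  # 44.4% 55.6%
```

The same function works for any jar, so the effect of removing a second or third marble can be checked the same way.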

 

Some games do not have set statistical outcomes. Chess, for example, is based on a set of rules, and winning is based on strategy. Every move of a chess game changes the potential outcome. However, it is widely believed that if two players always make the best move the game will end in a draw. If there is a set of moves that always wins no matter what the other player does, I do not know it. Winning is based on taking advantage of mistakes. Tic-tac-toe analysis. The player with the first move does have an initial advantage; however, there are so many variables that the second to move can win or, more often, force a draw. (Get some data here)

 

I am not a good chess player. Against my computer program, which has 10 levels, I beat level four 50% of the time when I am white, and rarely draw. When I am black, I lose more often than I win, but I can force a draw about 30% of the time.

 

When I move up to level 5 my win rate falls to 20%. For those of you who know chess, I have never beaten the computer when black employs the Sicilian Defense.

 

However, when my wife and I were in Italy we had to share a room with a tall Italian who claimed to be one of the true Romans. He was a bit arrogant. He asked me if I played chess and I said I know the rules but I have not played much.

 

He challenged me to a game and said he would be black since I was a novice. After the first 5 moves I had no idea what to do so I let my subconscious mind direct me. It may have been my first experience with Zen.

 

I could tell he was very methodical while I just looked at the board and sort of guessed what to do. I never forgot his face when I made a move that baffled him. It was a random move, but he must have thought I had a strategy that he had never seen before. He may have even thought that I lied about my level of play.

 

When I made a second non-rational move he got rattled and made an impulsive move that left his queen exposed to my bishop hiding about 7 squares away. I sacrificed my bishop for his queen.

 

Because I now had a material advantage, I started a blitzkrieg where I would sacrifice my pieces for his to reduce the options he had.

 

I set up the win by sacrificing my queen for both of his castles. I won the game with two castles, a knight and 2 pawns to his bishop, a knight and one pawn.

 

I did not win because I was a good player. I won because he was an arrogant player who assumed that I was a better player than I let on to be and that I had tricked him. Or he lied and did not know the game at all. This was my wife’s theory.

 

So either I won on a bluff and a lot of luck or he was not that good. But he did say he was a ranked player and that for me to beat him was not rational (to him). I never saw him again.

 

I doubt that my strategy of random moves would have worked again. However, the laws of probability only applied to my chance of winning against a ranked player. This assumes of course that he was a ranked player, which no doubt “rankled” him.

 

Between these two extremes is the game of poker. Poker is a game that has both strategy and preset probabilities. Initially you can determine the probabilities of winning based on what you have, which is also what your opponent cannot have (your cards). You do not know what he has.

 

You can calculate the probability that your initial hand is stronger or weaker than his hand because, like dice, these probabilities are known.

 

However, as cards are drawn the probabilities change. Each card, like each chess move, changes the odds. This is a random event, and luck is more a factor than strategy.

 

Assume you have a 6 of spades and a King of diamonds. The flop shows a 7, 8 and 9 of spades.

 

Your opponent has two Jacks, the Jack of spades and the Jack of diamonds.

 

You have two chances of drawing a straight flush, if the river card is a 5 or 10 of spades, and you have less than one chance in four of drawing another spade because you know 4 spades have been removed from the deck. You do not know if the other player has any spades.
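The odds from your side of the table can be counted directly. The sketch below is illustrative only: the two-character card codes (such as "Ts" for the 10 of spades) are my own shorthand, and it assumes exactly one more card will be dealt and that every card you cannot see is equally likely, since you do not know what your opponent holds.

```python
from fractions import Fraction

RANKS = "23456789TJQKA"   # T stands for 10
SUITS = "shdc"            # spades, hearts, diamonds, clubs
deck = {rank + suit for rank in RANKS for suit in SUITS}

hole = {"6s", "Kd"}        # your two cards
flop = {"7s", "8s", "9s"}  # the three cards showing
unseen = deck - hole - flop  # the 47 cards you cannot see

# Cards that complete your straight flush, and all spades still unseen.
straight_flush_outs = {"5s", "Ts"} & unseen
spade_outs = {card for card in unseen if card.endswith("s")}

p_straight_flush = Fraction(len(straight_flush_outs), len(unseen))
p_any_spade = Fraction(len(spade_outs), len(unseen))

print(len(unseen))                    # 47
print(p_straight_flush, p_any_spade)  # 2/47 9/47
```

Nine spades in 47 unseen cards is a little under one chance in five, which fits the "less than one chance in four" estimate.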

 

The river card is a 10 of spades, giving you a 10-high spade straight flush. Usually a winning hand. But this card, which you both needed, gives him a Jack-high straight flush. Good luck for him, bad luck for you.

 

It is important to know that for a statistic to be useful it cannot be corrupted by cheating. Loaded dice, marked cards and spit balls are the most familiar forms of cheating. False claims based on biased marketing studies are less obvious.

 

Why the rich get richer. If everybody in a game of chance has unlimited resources, eventually they will all end up where they started. This also assumes that the game cannot end until everyone agrees it is over.

 

However, if one person has limited resources and the other has unlimited resources, the one with unlimited resources will eventually have all the resources, and the game will end because the limited resource player cannot play anymore.
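A small simulation shows the effect. This is a rough sketch, not a proof: the function names are invented for the illustration, the bankrolls of 5 and 45 units are hypothetical stand-ins for "limited" and "nearly unlimited", and it assumes a fair coin with one unit bet on every flip.

```python
import random
from fractions import Fraction

def ruin_probability(small, large):
    """Exact chance the smaller bankroll goes broke first in a fair, equal-stakes game."""
    return Fraction(large, small + large)

def small_player_goes_broke(small, large, rng):
    """Flip a fair coin for 1 unit per flip until one side has nothing left."""
    total = small + large
    while 0 < small < total:
        small += 1 if rng.random() < 0.5 else -1
    return small == 0

def estimate_ruin(small, large, trials=2000, seed=1):
    """Play the whole game many times and report how often the small side is wiped out."""
    rng = random.Random(seed)
    losses = sum(small_player_goes_broke(small, large, rng) for _ in range(trials))
    return losses / trials

exact = ruin_probability(5, 45)
estimate = estimate_ruin(5, 45)
print(exact)     # 9/10
print(estimate)  # should land close to 0.9
```

The bigger the second bankroll is made, the closer the small player's chance of going broke gets to certainty, even though every single flip is fair.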

 

For example, farming laws in Switzerland were designed to prevent one owner from accumulating resources at the expense of another.

 

When we were in Europe, I was able to locate the history of my mother’s grandmother in the Canton of Glarus. I obtained a genealogy document from the hall of records in the town of Ennetbuhl, where my great grandmother was born. The family tree went back to 1550. The person in charge of maintaining the records was so happy to meet me that he introduced me to relatives still living there. It was through meeting my relatives that I was able to obtain the information on the farming laws.

 

They gave me the family history of the struggle to maintain the farm and the emigration to the US. When I asked about the farm, they said the family owned three parcels. None of the parcels were connected.

 

When I asked why they were so far apart, this was their answer: local laws prevented any owner from farming contiguous properties, to prevent economies of scale.

Economies of scale are a primary requirement to create wealth in a competitive economy. They are discussed fully in Tales from the Vault Book Two. For here, all you need to know is that the farm law prevented a farmer from underpricing everyone in the market, which would lead to a monopoly. In the US this is the foundation of all great wealth.

 

One reason a strong corporation can win even when it is in the wrong is that it can use relatively unlimited legal resources to eliminate a lawsuit from a complainant with limited resources. This practice is called gutting. Such corporations have accumulated great wealth through economies of scale. You notice I said win and not win the lawsuit. It rarely goes to court because the plaintiff is drained of the resources to fight before a jury can nail the corporation. Many lawyers will refuse to represent a client who has limited resources.

 

The Erin Brockovich story is good storytelling and probably true. But it is false advertising for those who think they will win because justice prevails. It does not. Most of the time money prevails, unless a large supporter comes to the aid of the underdog for a percentage of the win. Then it becomes an even fight, usually ending in a compromise.

 

Let’s play a game. We have two players and one non-player. The non-player is the coin flipper, and each player takes a turn calling heads or tails. But first the person who is going to make the call must place a bet, say $5. If he or she calls the right side of the coin, the other player pays $5. If not, the caller loses the $5 to the other player. Now it is the second person’s turn, and the same scenario follows.

 

Although the odds are always 50/50 every time the coin is tossed, the probability that the coin will land on a given side for two flips in a row is 1 chance in four: 1 chance in 2 on the first flip and 1 chance in 2 on the second flip. The probability map looks like this. For 2 flips of the coin you can get 2 heads, 2 tails, or 1 head and 1 tail in either of two orders. Since there are four possible outcomes, it is 1 chance in four of getting heads twice, 1 chance in four of getting tails twice, and 2 chances in four of getting one head and one tail. To get the aggregate odds of event one happening and then event two happening, the odds are multiplied. This is an important concept when we visit the decision tree.
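The probability map for two flips can be listed out and compared with the multiplication rule. This is a minimal sketch with names of my own choosing, and it assumes a fair coin that never lands on its edge.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The four equally likely results of two flips: HH, HT, TH, TT.
flips = list(product("HT", repeat=2))

# Group the results by how many heads came up.
heads_count = Counter(result.count("H") for result in flips)
odds = {heads: Fraction(count, len(flips)) for heads, count in heads_count.items()}

# Multiplication rule: chance of heads on flip one times chance of heads on flip two.
p_two_heads = Fraction(1, 2) * Fraction(1, 2)

print(odds[2], odds[0], odds[1])  # 1/4 1/4 1/2
print(p_two_heads)                # 1/4
```

Counting the outcomes and multiplying the odds give the same answer for two heads, which is the point of the rule.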

 

 

Now to finish the game: for a player to win 5 times in a row would require beating odds of 1 chance in 32. This is perfectly doable, but his chance of winning ten times in a row is one in 1,024. If the other player continues to double his bet every time he loses and keeps the earnings every time he wins, he will eventually recover all his losses and win all of the limited player's resources.
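Both ideas in this paragraph, the shrinking odds of a streak and the doubling bet, can be sketched in a few lines. The function names are invented for the illustration; the $5 starting bet comes from the game above, and the run of four losses is just an example.

```python
from fractions import Fraction

def streak_odds(wins, p_win=Fraction(1, 2)):
    """Chance of winning every one of `wins` independent fair flips."""
    return p_win ** wins

def doubling_bets(first_bet, losses):
    """Bets placed when the stake doubles after each of `losses` losses in a row."""
    return [first_bet * 2 ** i for i in range(losses + 1)]

print(streak_odds(5), streak_odds(10))  # 1/32 1/1024

bets = doubling_bets(5, 4)       # four losses, then a fifth bet
lost_so_far = sum(bets[:-1])     # money lost on the first four bets
net_if_next_bet_wins = bets[-1] - lost_so_far

print(bets)                               # [5, 10, 20, 40, 80]
print(lost_so_far, net_if_next_bet_wins)  # 75 5
```

One win on the doubled bet wipes out the whole losing streak and leaves a profit equal to the first bet, but only a player with very deep pockets can keep doubling long enough.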

 

 

This could take a while if the limited resource player makes small bets, so let's assume he is all in on every bet. The odds of winning and losing do not change with the size of the bet.

 

Example: The limited resource player has $1,000. The unlimited resource player has 5 million dollars. Betting his entire holdings every time, with the unlimited player matching the doubled bet after every loss, the limited player would have to win 13 times in a row to take all 5 million, because $1,000 doubled 12 times is only $4,096,000. The unlimited player would only have to win once. This is why gambling casinos have table limits. Harrah’s story here.

 

 

Other statistical information

When I was 10 my neighbor gave me a baseball card collection, which was lost when my folks moved to San Diego. There were about 500 cards, mostly from the 30s and 40s, and on the back they had a lot of data that I never read, except the batting average. I did not understand ERA. I only knew that a high number for batting average and a low number for pitchers were good.

 

I do not think there is another sport that is more focused on statistics than baseball.  Batting averages for hitters and ERA for pitchers are the most common.  Baseball coaches are always looking for a statistical edge during a game.

 

Decisions made by coaches during baseball games based on statistics are more akin to business than to games of chance. In games of chance you have known probabilities. In baseball the statistics are based on prior data and only approximate the true statistic.

 

 

One decision that will be debated for years is the decision by the coach of Alabama to kick a field goal rather than try for a touchdown on the last play of regulation in a tie game. We will visit this again when I talk about decision trees. You do a decision tree every time you decide where to go, how to go, what to buy, who to invite, etc. I used a decision tree every time I evaluated a loan. Talk about risk here.

 

One of the most difficult problems that a banker encounters is evaluating character. No matter how a loan is structured to protect it from a loss, character is usually the deciding factor when trouble occurs. I will relate several stories regarding character, both good and bad. A person is not issued a character license. There is no statistical number that can be assigned to an individual. Today credit scores are used as a measure of a person’s history of credit performance. A credit score implies character but is really a system of statistics that can be manipulated by someone who knows how to game the system. It is not character. Actually, gaming the system says something about a person's character, but the system doesn’t know it is being gamed.

A high credit score is a measure of your credit behavior against others. You can have the highest character and a low credit score because of bad luck, e.g., losing your job right after buying a new car and having a baby. And you can have a high credit score by making sure all your payments are on time as you pyramid debt. You put all the borrowings in several banks, and when you have $900,000 in the bank and $1,000,000 in loans you fall in love with a Persian beauty and escape with her to a remote island paradise. This is an extreme hypothetical example, but as you will see it is tame in relation to some real bad actors I encountered.

 

 

 

 

 
