This site page contains selected extracts from the book. The bold text in the quotes is emphasis added by me. I have also added commentary in italics in some places where I attempt to summarise key points or to call out parallels with other books and papers I am reading. The primary purpose is as a tool to help me digest the content but it also offers a quick introduction to a book which I think is well worth reading.
Introduction: Sorting the Sources of Success
The Boundaries of Skill and Luck
“Much of what we experience in life results from a combination of skill and luck”
“And yet we aren’t very good at distinguishing the two. Part of the reason is that few of us are well versed in statistics. But psychology exerts the most profound influence on our failure to identify what is due to skill and what is just luck. The mechanisms that our minds use to make sense of the world are not well suited to accounting for the relative roles that skill and luck play in the events we see taking shape around us.”
The Sources of Success
“The purpose of this book is to show you how you can understand the relative contributions of skill and luck and how to use that understanding in interpreting past results as well as making better decisions in the future. Ultimately, untangling skill and luck helps with the challenging task of prediction, and better predictions lead to greater success.”
Skill, Luck and Prediction
A paper titled “On the Psychology of Prediction” co-written by Kahneman and Tversky argues that “… intuitive judgements are often unreliable because people base predictions on how well an event seems to fit a story. They fail to consider either how reliable the story is or what happened before in similar situations. More formally, Kahneman and Tversky argue that three types of information are relevant to statistical prediction. The first is prior information, or the base rate… The second … is specific evidence about an individual case. The third … is the expected accuracy of the prediction, or how precise you expect it to be given the information you have.”
Summary: People seem to be pre-programmed to fit events into a narrative based on cause and effect. Some of the evidence we use to create these narratives will be what happened in specific examples of the activity, while we may also have access to data averaged over a larger sample of events. The specific evidence often weighs more heavily in our intuitive judgement than the base rate averaged over many events. Consequently, we need tools and processes to help manage the tendency for our intuitive judgements to lead us astray.
One such process is to
- first form a generic judgement on what the expected accuracy of our prediction is likely to be (i.e. make a judgement on where the activity sits on the skill-luck continuum),
- then distinguish between the base rate for this type of activity and any specific evidence to hand,
- then apply the following rule: if the expected accuracy of the prediction is low, you should place most of the weight on the base rate;
- if the expected accuracy is high, you can rely more on the specific case.
- Finally, use the data to test whether the activity conforms to your original judgement of how skill and luck combine to generate the outcomes.
Note the role of skill and luck in the ability to make accurate predictions:
- when skill plays the prime role in determining the outcome of the event you are attempting to predict, you can rely on specific evidence, but
- where luck is more important, the base rate for that activity should be your guide.
However, one problem with this rule of thumb is that a reliable base rate may not always be available, especially where the activity is one that does not have simple linear relationships between cause and effect (which unfortunately is likely to be the part of the skill-luck continuum where the rule says you need the base rate to guide you).
Quantifying Luck’s Role in the Success Equation
“The starting place for this book is to go beyond grasping the general idea that luck is important. Then we can begin to figure out the extent to which luck contributes to our achievements, successes, and failures. The ultimate goal is to determine how to deal with luck in making decisions.”
“The argument here is not that you can precisely measure the contributions of skill and luck to any success or failure. But if you take concrete steps toward attempting to measure those relative contributions, you will make better decisions than people who think improperly about those issues or who don’t think about them at all.”
- Chapters 1 through 3 set up the foundations for thinking about the problem.
- Chapters 4 through 7 develop the analytical tools necessary to understand luck and skill.
- Chapters 8 through 11 offer concrete suggestions about how to take the findings from the first two parts of this book and put them to work.
Chapter 1: Skill, Luck, and Three Easy Lessons
Defining Skill and Luck
“Luck is a chance occurrence that affects a person or a group … [and] can be good or bad [it] is out of one’s control and unpredictable”
Skill is defined as the “ability to use one’s knowledge effectively and readily in execution or performance.”
Mauboussin adds some qualifying detail to these definitions:
- The concept of luck is related to randomness but you can think of luck operating at the level of the individual while randomness operates at the level of the system in which the individual operates.
- It is often assumed that doing something for a long time is sufficient to be an expert. Mauboussin argues that in activities that depend on skill, real expertise only comes about via deliberate practice based on improving performance in response to feedback on the ways in which the input generates the predicted outcome.
The Luck-Skill Continuum and Three Lessons
The definitions of skill and luck introduced by Mauboussin define two ends of a continuum. Applying the process that Mauboussin proposes requires that we first distinguish where a specific activity or prediction fits on the continuum. One of the challenges is to figure out how large a sample size you need to determine if there is a reliable relationship between actions and the outcome that evidences skill.
Take Sample Size into Account
To assess past events properly, consider the relationship between where the activity is on the luck-skill continuum and the size of the sample you are measuring. One common mistake is to read more into an outcome than is justified.
Here’s the main point: if you have an activity where the results are nearly all skill, you don’t need a large sample to draw reasonable conclusions. A world-class sprinter will beat an amateur every time, and it doesn’t take a long time to figure that out. But as you move left on the continuum between skill and luck, you need an ever-larger sample to understand the contributions of skill (the causal factors) and luck.
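The sample-size point can be made concrete with a back-of-the-envelope calculation. This is a minimal sketch rather than anything taken from the book: it assumes each result is a fixed skill level plus independent luck with standard deviation `luck_sd`, so the noise in an average of n results shrinks in proportion to 1/sqrt(n). The function name and the default threshold of two standard errors are illustrative assumptions.

```python
import math


def observations_needed(skill_gap, luck_sd, z=2.0):
    """Smallest sample size at which a skill gap between two performers stands out from luck.

    Assumption (not from the book): each result = skill + independent luck with
    standard deviation luck_sd. The difference between two n-result averages then
    has standard error luck_sd * sqrt(2 / n); we require the gap to be at least
    z of those standard errors.
    """
    if skill_gap <= 0:
        raise ValueError("skill_gap must be positive")
    if luck_sd == 0:
        return 1  # pure skill: a single observation settles it
    return max(1, math.ceil(2 * (z * luck_sd / skill_gap) ** 2))


# Same skill gap, increasing amounts of luck
for luck_sd in (0.0, 1.0, 5.0):
    print(luck_sd, observations_needed(skill_gap=1.0, luck_sd=luck_sd))
```

Under these assumptions the required sample grows with the square of the luck-to-skill ratio, which is one way to read the advice that more luck demands an ever-larger sample.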
Where skill is the dominant force, history is a useful teacher… Where luck is the dominant force … history is a poor teacher.
At the heart of making this distinction lies the issue of feedback. On the skill side of the continuum, feedback is clear and accurate, because there is a close relationship between cause and effect. Feedback on the luck side is often misleading because cause and effect are poorly correlated in the short run.
“An understanding of where an activity is on the luck-skill continuum also allows you to estimate the likely rate of reversion to the mean. Any activity that combines skill and luck will eventually revert to the mean.”
“The important point is that the expected rate of reversion to the mean is a function of the relative contributions of skill and luck to a particular event. If what happens is mostly the result of skill, then reversion toward the mean is scant and slow.”
Interactions Vary, but the Lessons Remain
Limits of the Methods
Mauboussin uses Nassim Taleb’s four quadrant approach to determine where statistical tools are useful and where they are likely to fail.
Taleb’s four quadrants
- Extremely Safe – Simple payoff / Narrow outcomes
- Safe – Simple Payoff / Extreme outcomes
- Sort of Safe – Complex Payoff / Narrow Outcomes
- Black Swan Domain – Complex Payoff / Extreme Outcomes
Statistical methods tend to work well in quadrants one through three, and most of what we deal with in life falls into one of those quadrants. Dealing with quadrant four is far more difficult, and there is a natural and frequently disastrous tendency to apply naively the methods of the first three quadrants to the last.
Chapter 2: Why We’re So Bad at Distinguishing Skill from Luck
“Our minds have an amazing ability to create a narrative that explains the world around us, an ability that works particularly well when we already know the answer.”
Stories, Causality, and the Post Hoc Fallacy
The need to connect cause and effect is deeply ingrained in the human mind.
We often assume that if event A preceded event B, then A caused B. … That faulty association is known as the post hoc fallacy. The name comes from the Latin, post hoc ergo propter hoc, “after this, therefore because of this.”
“Knowing the end of the story also leads to another tendency, one that Baruch Fischhoff, a professor of psychology at Carnegie Mellon University, calls creeping determinism. This is the propensity of individuals to “perceive reported outcomes as having been relatively inevitable.””
Mauboussin highlights the way in which psychology works against our ability to distinguish skill and luck even when we have taken care to compensate for the tricks we know our mind will play on us:
- The desire to fit events into a narrative based on cause and effect means that, once we know how things turned out, we have a tendency to downplay or ignore the role played by luck.
- It is especially tempting to believe that what happened was pre-ordained by the existence of our own skill.
Some companies try to institutionalise guards against this kind of thinking. Pixar is one that comes to mind.
Undersampling and Sony’s Miraculous Failure
“The performance of a company always depends on both skill and luck, which means that a given strategy will only succeed part of the time. So attributing success to any strategy may be wrong simply because you’re sampling only the winners. The more important question is: How many of the companies that tried that strategy actually succeeded?
Jerker Denrell, a professor of strategy at Oxford, calls this the undersampling of failure. … He says you need to consider a full sample of strategies and the results of those strategies in order to learn from the experiences of other organizations. When luck plays a part in determining the consequences of your actions, you don’t want to study success to learn what strategy was used but rather study strategy to see whether it consistently led to success.”
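The difference between studying success and studying strategy can be shown with a few lines of arithmetic. The sketch below uses made-up data and hypothetical strategy names ("bold" and "steady"); it is only meant to show how a strategy can dominate the list of winners while succeeding less often than the alternative.

```python
def success_rate_by_strategy(records):
    """Share of firms that succeeded, per strategy, using winners AND losers."""
    totals, wins = {}, {}
    for strategy, succeeded in records:
        totals[strategy] = totals.get(strategy, 0) + 1
        wins[strategy] = wins.get(strategy, 0) + (1 if succeeded else 0)
    return {s: wins[s] / totals[s] for s in totals}


def share_among_winners(records):
    """What you see if you sample only the successes."""
    winners = [strategy for strategy, succeeded in records if succeeded]
    if not winners:
        return {}
    return {s: winners.count(s) / len(winners) for s in set(winners)}


# Hypothetical data: 20 firms try "bold" and 4 succeed; 5 try "steady" and 2 succeed
records = (
    [("bold", True)] * 4 + [("bold", False)] * 16
    + [("steady", True)] * 2 + [("steady", False)] * 3
)

print(share_among_winners(records))       # "bold" accounts for most of the winners
print(success_rate_by_strategy(records))  # yet "steady" succeeds more often
```

Looking only at winners suggests "bold" is the strategy to copy; looking at everyone who tried each strategy shows it succeeded 20 percent of the time against 40 percent for "steady".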
Mauboussin argues that a successful outcome does not prove that the strategy behind it was skilful, and a failure does not prove that the strategy was poor. He cites the example of Sony’s MiniDisc as a strategy that had everything going for it but was undone by a random technology change (a significant improvement in the availability of computer memory) that made the MiniDisc obsolete before it started.
“One of the main reasons we are poor at untangling skill and luck is that we have a natural tendency to assume that success and failure are caused by skill on the one hand and a lack of skill on the other. But in activities where luck plays a role, such thinking is deeply misguided and leads to faulty conclusions.”
Most Research Is False
Mauboussin also argues that research results can be an unreliable guide, in part because of the researcher’s bias and in part because of the common problem of mistaking correlation for causation.
“In 2005, Dr John Ioannidis published a paper titled “Why Most Published Research Findings Are False” that shook the foundations of the medical research community. Ioannidis … argues that the conclusions drawn from most research suffer from the fallacies of bias, such as researchers wanting to come to certain conclusions or from doing too much testing”
“His analysis showed a stark difference between randomized trials and observational studies. In a randomized trial, subjects are assigned at random to one treatment or another (or none). These studies are considered the gold standard of research, because they do an effective job of finding genuine causes rather than simple correlations. They also eliminate bias in many cases because the people running the experiment don’t know who is getting which treatment. In an observational study, subjects volunteer for one treatment or another and researchers have to take what is available. Ioannidis found that more than 80 percent of the results from observational studies were either wrong or significantly exaggerated, while about three-quarters of the conclusions drawn from randomized studies proved to be true.”
“The dual problems of bias and conducting too much testing are substantial… While scientists generally believe themselves to be objective, research in psychology shows that bias is most often subconscious and nearly unavoidable.
Doing too much testing can cause just as much trouble. … scientists lean heavily on tests of statistical significance [which] are supposed to indicate the probability of getting a result by chance (more formally, when the null hypothesis is true). There is a standard threshold that allows a researcher to claim that a result is significant. Here is where the trouble starts: if you test enough relationships, you will eventually find a few that pass the test but are not really related as cause and effect”
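The arithmetic behind "too much testing" is short enough to write down. This sketch assumes every relationship tested is genuinely unrelated (the null hypothesis is true each time), that the tests are independent, and that the threshold is 0.05, which is a common convention rather than a figure given in the extract.

```python
def prob_at_least_one_false_positive(num_tests, alpha=0.05):
    """Chance that at least one of num_tests independent tests of unrelated
    variables clears the significance threshold alpha purely by luck."""
    return 1 - (1 - alpha) ** num_tests


def expected_false_positives(num_tests, alpha=0.05):
    """Average number of spurious 'significant' results among num_tests tests."""
    return num_tests * alpha


for n in (1, 20, 100):
    print(n, round(prob_at_least_one_false_positive(n), 3), expected_false_positives(n))
```

With one test the chance of a spurious result is just the threshold itself; with twenty tests it is already more likely than not that something passes by luck alone.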
Where Is the Skill? It’s Easier to Trade for Punters Than Receivers
Many organizations, including businesses and sports teams, try to improve their performance by hiring a star from another organization. They often pay a high price to do so. The premise is that the star has skill that is readily transferable to the new organization. But the people who do this type of hiring rarely consider the degree to which the star’s success was the result of either good luck or the structure and support of the organization where he or she worked before. Attributing success to an individual makes for good narrative, but it fails to take into account how much of the skill is unique to the star and is therefore portable.
Stories Can Obscure Skills
We re-create events in the world by creating a narrative that is based on our own beliefs and goals. As a consequence, we often struggle to understand cause and effect, and especially the relative contributions of skill and luck in shaping the events we observe.
Chapter 3: The Luck-Skill Continuum
In this chapter,
- Mauboussin develops a simple (“two jar”) model that offers a more in-depth look at the relative contributions of luck and skill.
- and provides a framework for thinking about extreme outcomes and how to anticipate the rate of reversion to the mean.
Sample Size, Not Time
“Visualizing the continuum between luck and skill can help us to see where an activity lies between the two extremes, with pure luck on one side and pure skill on the other. In most cases, characterizing what’s going on at the extremes is not too hard.”
“Most of the action is in the middle, and having a sense of where an activity lies will provide you with an important context for making decisions.”
“As you move from right to left on the continuum, luck exerts a larger influence. It doesn’t mean that skill doesn’t exist in those activities. It does. It means that we need a large number of observations to make sure that skill can overcome the influence of luck.”
“We’re naturally inclined to believe that a small sample is representative of a larger sample… This fallacy can run in two directions. In one direction, we observe a small sample and believe, falsely, that we know what all of the possibilities look like. This is the classic problem of induction, drawing general conclusions from specific observations.”
“The greater the influence luck has on an activity, the greater the risk of using induction to draw false conclusions”
“We can err in the opposite direction as well, unconsciously assuming that there is some sort of cosmic justice, or scorekeeper in the sky who will make things even out in the end. This is known as the gambler’s fallacy”
“… many things in nature do even out, which is why we have evolved to think that all things balance out… But in cases … where outcomes are independent of one another, or close to being so, the gambler’s fallacy is alive and well”
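For independent events the gambler’s fallacy can be checked exactly rather than argued about. The sketch below assumes a fair coin and enumerates every equally likely sequence of tosses, then looks only at those that begin with a streak of heads. The function name is illustrative.

```python
from itertools import product


def prob_heads_after_streak(streak_length):
    """Exact P(next toss is heads | the previous streak_length tosses were all heads)
    for a fair coin, found by enumerating every equally likely sequence."""
    sequences = product("HT", repeat=streak_length + 1)
    after_streak = [
        seq for seq in sequences
        if all(toss == "H" for toss in seq[:streak_length])
    ]
    heads_next = sum(1 for seq in after_streak if seq[-1] == "H")
    return heads_next / len(after_streak)


for streak in (0, 3, 5):
    print(streak, prob_heads_after_streak(streak))
```

However long the streak, the answer stays at one half: no scorekeeper is making tails "due".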
Two-Jar Model
Mauboussin uses two imaginary jars filled with balls to illustrate the difference between skill and luck
- Balls in Jar 1 = luck
- Balls in Jar 2 = skill
- A higher number represents more of the quality – can be positive or negative
- The outcome of each event represents the combined value of a draw from both jars
This modelling of the interaction of luck and skill shows that, if the variance of the distribution of luck is larger than the variance of the distribution of skill, then luck can overwhelm skill in the short term.
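A small simulation gives a feel for how luck can swamp skill over a few events but not over many. It simplifies the two-jar idea and should be read as a sketch under stated assumptions: each player’s skill is a fixed number rather than a fresh draw from a skill jar, luck is drawn from a normal distribution with mean zero, and all function names and parameters are illustrative.

```python
import random


def draw_outcomes(skills, luck_sd, num_events, rng):
    """Total score per player: each event adds the player's fixed skill
    plus a fresh, independent draw from the luck jar."""
    return [
        sum(skill + rng.gauss(0.0, luck_sd) for _ in range(num_events))
        for skill in skills
    ]


def best_player_wins(skills, luck_sd, num_events, rng):
    """True if the most skilful player also has the highest total score."""
    totals = draw_outcomes(skills, luck_sd, num_events, rng)
    return totals.index(max(totals)) == skills.index(max(skills))


def win_rate(skills, luck_sd, num_events, trials=2000, seed=0):
    """Fraction of simulated contests won by the most skilful player."""
    rng = random.Random(seed)
    wins = sum(best_player_wins(skills, luck_sd, num_events, rng) for _ in range(trials))
    return wins / trials


skills = [1.0, 2.0, 3.0]
# Luck spread much wider than the skill spread: compare one event with many
print(win_rate(skills, luck_sd=5.0, num_events=1))
print(win_rate(skills, luck_sd=5.0, num_events=400))
```

Setting `luck_sd` to zero makes the best player win every time, while a wide luck jar makes single events close to a lottery; adding events lets the skill difference accumulate faster than the luck does.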
Paradox of Skill – More Skill Means Luck is More Important
All other things being equal, as the overall level of skill increases, luck becomes more important, but this is not a universal law or iron rule.
“When everyone in business, sports, and investing copies the best practices of others, luck plays a greater role in how well they do.
For activities where little or no luck is involved, the paradox of skill leads to a specific and testable prediction: over time, absolute performance will steadily approach the point of physical limits, such as the speed with which one can run a mile. And as the best competitors confront those limits, the relative performance of the participants will converge.”
The Ingredients of an Outlier
“… great success combines skill with a lot of luck. You can’t get there by relying on either skill or luck alone. You need both. This is one of the central themes in Malcolm Gladwell’s book, Outliers.”
“Gladwell argues that the lore of success too often dwells on an individual’s personal qualities, focusing on how grit and talent paved the way to the top. But a closer examination always reveals the substantial role that luck played. [However] Luck is boring as the driving force in a story. So when talking about success, we tend to place too much emphasis on skill and not enough on luck.”
Reversion to the Mean and the James-Stein Estimator
“The position of the activity on the continuum defines how rapidly your score goes toward an average value, that is, the rate of reversion to the mean.”
“In activities that are all luck, there is complete reversion to the mean”
“In real life, we don’t know for sure how skill and luck contribute to the results when we make decisions. We can only observe what happens. But we can be more formal in specifying the rate of reversion to the mean by introducing the James-Stein estimator with a focus on what is called the shrinking factor.”
“For activities that are all skill, the shrinking factor is 1.0, which means that the best estimate of the next outcome is the prior outcome.”
“For activities that are all luck, the shrinking factor is 0, which means that the expected value of the next outcome is the mean of the distribution of luck.”
“If skill and luck play an equal role, then the shrinking factor is 0.5, halfway between the two. So we can assign a shrinking factor to a given activity according to where that activity lies on the continuum.”
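The three cases quoted above fit a single formula in which the observed result is pulled toward the mean by an amount set by the shrinking factor. The sketch below is a plain reading of those quotes, not a full James-Stein derivation; the function and parameter names are assumptions made for illustration.

```python
def shrunk_estimate(observed, mean, shrinking_factor):
    """Best guess for the next outcome: pull the observed result toward the mean.

    shrinking_factor = 1.0 -> all skill: expect the prior outcome to repeat
    shrinking_factor = 0.0 -> all luck: expect the mean
    """
    if not 0.0 <= shrinking_factor <= 1.0:
        raise ValueError("shrinking_factor must lie between 0 and 1")
    return mean + shrinking_factor * (observed - mean)


# Illustrative numbers: an observed score of 10 in an activity whose mean is 4
for c in (1.0, 0.5, 0.0):
    print(c, shrunk_estimate(observed=10, mean=4, shrinking_factor=c))
```

With a shrinking factor of 0.5 the estimate lands at 7, halfway between the observed 10 and the mean of 4, which matches the equal-roles case described in the quote.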
“So far, I have assumed that the jars contain numbers that follow a normal distribution, but in fact, distributions are rarely normal. Furthermore, the level of skill changes over time, whether you’re talking about an athlete, a company, or an investor. But using jars to create a model is a method that can accommodate those different distributions.”
Chapter 4: Placing Activities on the Luck-Skill Continuum
“When an activity is mostly skill, we need not worry much about the size of the sample unless the level of skill is changing quickly. For activities with a good dose of luck, skill is very difficult to detect with small samples. As the sample increases in size, the influence of skill becomes clearer. So you can actually place the same activity at different points along the continuum based on the size of the sample alone. Larger samples do a better job of revealing the true contributions of skill and luck.”
In this chapter, Mauboussin explores three methods for placing activities on the continuum between luck and skill.
- He starts with some basic questions about the activity
- Then uses simulation to make the placement on the luck-skill continuum more precise
- He concludes by reviewing the method that is popular among sports statisticians that estimates skill by subtracting luck from the observed outcomes
Placing Activities by Answering Three Questions
Cause and Effect
“First, ask if you can easily assign a cause to the effect you see. In some activities, the relationship of cause and effect is clear. You can repeat the behavior and get the same result. These are activities that are generally stable and linear. Stable means that the basic structure of the activity doesn’t change over time, and linear means that a particular action leads to the same reaction every time. If you can easily identify the cause of a given effect, you’re most likely on the skill side of the continuum. If it’s hard to tell, you’re on the luck side.”
The Rate of Reversion
“The second question relates to a topic this book has already discussed in some detail: What is the rate of reversion to the mean? To answer this question you need some way to measure performance.”
“Slow reversion is consistent with activities dominated by skill, while rapid reversion comes from luck being the more dominant influence.”
Where Prediction Is Useful
The third and final question is: Where can we predict well? In other words, where are experts useful?
When the predictions of experts tend to be uniform and accurate, skill is the driving factor. When experts have wide disagreement and predict poorly, lots of luck is generally involved.
Simulation: Blending Distributions to Match the Results
In the next method for estimating the relative contributions of skill and luck, the first step is to specify what would happen if only luck were involved and then ask what would happen if what we’re seeing were completely the result of skill. We can then observe the real data that has been gathered and see where it ought to fall on the continuum.
Skill = Observed Outcome − Luck
The final method for placing activities on the continuum is based on what is known as true score theory. That theory provides a method for measuring the relative contributions of skill and luck.
“The equation for true score theory is as follows: Variance (observed) = Variance (skill) + Variance (luck)”
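Rearranging the equation gives the skill variance as what is left after subtracting the luck variance from the observed variance. The sketch below assumes the luck variance can be approximated by treating every game as a coin flip (a binomial model), and the observed variance of 0.01 is a made-up figure for illustration, not data from the book.

```python
def coin_flip_luck_variance(num_games, p=0.5):
    """Variance of a win percentage if every game were decided by a coin
    with win probability p (an assumed binomial model of pure luck)."""
    return p * (1 - p) / num_games


def luck_share_of_variance(var_observed, var_luck):
    """Fraction of the observed variance attributable to luck, given
    Variance(observed) = Variance(skill) + Variance(luck)."""
    if var_observed <= 0:
        raise ValueError("var_observed must be positive")
    if var_luck > var_observed:
        raise ValueError("luck variance exceeds observed variance; check the inputs")
    return var_luck / var_observed


# Hypothetical league: 100-game season, observed variance of win percentage 0.01
var_luck = coin_flip_luck_variance(100)
var_skill = 0.01 - var_luck
print(var_luck, var_skill, luck_share_of_variance(0.01, var_luck))
```

In this made-up league luck accounts for a quarter of the variance in results and skill for the remaining three quarters; a shorter season would raise the luck variance and move the activity toward the luck end of the continuum.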
More People Help Explain the Paradox—Unless They Need to Be Really Tall
This section deals with an exception to the paradox of skill rule in basketball where there has been an expansion of the people playing the game but they all tend to be very tall. So the variations in skill are drawn from a narrow pool. This distinguishes the game from others where a variety of body types can be successful.
The Luck-Skill Continuum in Business and Investing
The paradox of skill is an effective way to explain why markets are so hard to beat consistently.
Over time, “… investing went from being dominated by individuals to being dominated by institutions. As the population of skilled investors increased, the variation in skill narrowed, and luck became more important.”
There is a big difference between saying that the short-term results of investment managers are mostly luck and saying they are all luck. Research shows that most active managers generate returns above their benchmark on a gross basis, but that those excess returns are offset by fees, leaving investors with net returns below those of the benchmark. Considering the evidence on balance, it is reasonable to conclude that there is evidence of skill in investing. However, only a small percentage of investors possess enough skill to offset fees. As a result, investing, especially over relatively short periods of time, is more a matter of luck than of skill.
Chapter 5: The Arc of Skill
So far, Mauboussin has treated skill as largely static. In this chapter he looks at how skill changes over time. Specifically, he examines trends in performance in athletics, cognitive tasks, and business.
“The general pattern in all three areas is the same and is easy to summarize: old age is not your friend.”
Cognitive Performance: The Battle Between Fluid and Crystallized Intelligence
Fluid intelligence refers to the ability to solve problems that you’ve never seen before.
Crystallised intelligence is the ability to use the knowledge accumulated through learning.
“… fluid intelligence peaks around the age of twenty and declines consistently and steadily throughout life”
“… crystallised intelligence tends to improve with age”
Intelligence Quotient Versus Rationality Quotient: Why Smart People Do Dumb Things
Intelligence tests measure certain cognitive capabilities, but it is also clear that they fail to assess other important mental faculties that reflect cognitive skill. Perhaps the most important among these is the ability to make good decisions. Keith Stanovich … distinguishes between an individual’s intelligence quotient (IQ) and rationality quotient (RQ).
“The attributes of RQ, as Stanovich lists them, include “adaptive behavioral acts, judicious decision making, efficient behavioral regulation, sensible goal prioritization, reflectivity, [and] the proper calibration of evidence.””
The natural tendency when solving problems is to rely on cognitive mechanisms that are fast, low in computational power, and require little concentration, rather than recruiting those mechanisms of the mind that are slow, computationally intensive, and that require effort. To use Stanovich’s phrase, we are “cognitive misers.”
A low RQ also stems from the issue of what we don’t know. Most of us ought to learn how to think properly about problems related to probability, statistics, and the best ways to test hypotheses.
We do know that older adults rely more on rules of thumb, which would suggest that the cognitive processes behind RQ decline with age.
The Aging of Organizations
People lose skill with age, but so do organizations.
“Managerial skill—the ability to allocate financial, human, and organizational capital—explains some fraction of corporate performance. Luck also plays a prominent role. Outcomes of strategic decisions are probabilistic by nature, and companies (like people) don’t live a long time without enjoying some good luck.”
“Probably the best explanation for why companies decline is that they fall prey to organizational rigidities.”
Chapter 6: The Many Shapes of Luck
“… we vastly underestimate the role of luck in what we see happening around us”
Gauging Luck: Independent and Dependent Outcomes
One way to learn about the distribution of luck in the real world is by asking whether events are dependent on or independent of one another. Independent means that what happened before doesn’t affect what happens next; dependent means the first event influences the next one.
- If events are independent, a simple model, such as tossing a coin or picking numbers from a jar, will work.
- If they are dependent, as many social interactions are, then the distribution of luck is skewed.
- In a skewed distribution, good and bad luck are not balanced, on average. Rather, a few benefit from extremely good luck.
- This means that skill and success are only loosely connected. Events in such systems are not random, but they are nonetheless unpredictable.
Distribution of Luck
Statisticians have a name for the normal ups and downs that you should expect when the distribution of luck is known: common-cause variation.
“In economics, common-cause variation is akin to risk. Frank Knight defined risk in its economic sense as a case where “the distribution of the outcome in a group of instances is known.” You don’t know what the outcome will be, but you do know all of the possible outcomes.”
Power Laws and the Mechanisms That Generate Them
“In some realms, independence and bell-shaped distributions of luck can explain much of what we see. But in activities such as the entertainment industry, success depends on social interaction. Whenever people can judge the quality of an item by several different criteria and are allowed to influence one another’s choices, luck will play a huge role in determining success or failure.”
“For example, if one song happens to be slightly more popular than another at just the right time, it will tend to become even more popular as people influence one another. Because of that effect, known as cumulative advantage, two songs of equal quality, or skill, will sell in substantially different numbers. … skill does play a role in success and failure, but it can be overwhelmed by the influence of luck. In the jar model, the range of numbers in the luck jar is vastly greater than the range of numbers in the skill jar.”
The process of social influence and cumulative advantage frequently generates a distribution that is best described by a power law.
The term power law comes from the fact that an exponent (or power) determines the slope of the line. An astonishingly diverse range of socially driven phenomena follow power laws.
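The "slope of the line" refers to a plot on logarithmic axes: if y is proportional to x raised to some power, then log y is a straight line in log x and its slope equals that power. The sketch below checks this on made-up data that follows a power law exactly; the exponent of -2 and the variable names are illustrative assumptions.

```python
import math


def log_log_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    denominator = sum((a - mean_x) ** 2 for a in log_x)
    return numerator / denominator


# Hypothetical data following y = 1000 * x**-2 exactly
ranks = [1, 2, 4, 8, 16]
sizes = [1000 * rank ** -2 for rank in ranks]

print(log_log_slope(ranks, sizes))  # recovers the exponent, up to rounding
```

Real data will scatter around the line, and a straight-looking log-log plot is not by itself proof of a power law, so the fitted slope is best treated as a rough description.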
One of the key features of distributions that follow a power law is that there are very few large values and lots of small values. As a result, the idea of an “average” has no meaning.
TR Comment: Mauboussin does not offer this example but the idea that the average is meaningless is also true of loan losses when you are trying to measure expected loss over a full loan loss cycle. What we will observe is lots of relatively small values when economic conditions are benign or strong and a few very large losses when the cycle turns down. It would be interesting to explore the extent to which power laws have been applied in loan loss and economic capital. My guess is not very much.
Since skill alone clearly cannot explain power laws, we need to turn to the mechanisms that generate those lopsided outcomes.
The distinction between independent and dependent outcomes is crucial.
A path-dependent process is one in which what happens next depends on what happened before. It’s a process that, in effect, has memory.
In these kinds of systems, initial conditions matter. And as time goes on, they matter more and more.
A number of mechanisms are responsible for this phenomenon. A simple one is known as preferential attachment.
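Preferential attachment can be imitated with a simple urn of coloured marbles: draw one at random, put it back together with an extra marble of the same colour, and repeat. This is a standard textbook device (often called a Pólya urn) used here as a stand-in, not a reproduction of any model in the book; the colours, counts and function name are illustrative.

```python
import random


def polya_urn(initial_counts, draws, seed=None):
    """Repeatedly pick a marble with probability proportional to its colour's
    current count, then add one more marble of that colour."""
    rng = random.Random(seed)
    counts = dict(initial_counts)
    colours = list(counts)
    for _ in range(draws):
        weights = [counts[colour] for colour in colours]
        picked = rng.choices(colours, weights=weights)[0]
        counts[picked] += 1
    return counts


# Red starts with a small edge; rerun with different seeds to see how far
# the final shares can drift from the starting ones
for seed in range(5):
    result = polya_urn({"red": 6, "blue": 4}, draws=1000, seed=seed)
    print(seed, result["red"] / sum(result.values()))
```

Because every draw changes the odds of the next one, early draws carry lasting weight: the process has memory, and an initial edge tilts the odds without settling the outcome.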
Critical points and phase transitions are also crucial … . A phase transition occurs when a small incremental change leads to a large-scale effect. This is known colloquially as a tipping point.
Once an innovation reaches a certain level of popularity, its success is virtually assured. By the same token, great innovations can fail because the domino effect doesn’t kick in.
In economics, the lopsided outcomes are frequently the result of increasing returns and network effects. Much of conventional economic theory is based on diminishing returns. If the demand for a commodity exceeds supply, prices will rise, and whoever makes that product will earn more money. Those high profits will attract competitors, who will increase production and effectively push prices back down. This is called negative feedback, a mechanism that promotes stability. The strong get weaker and the weak get stronger.
In some sectors of the economy, this framework does a poor job of explaining outcomes because positive feedback, which promotes change, takes hold.
The effect of increasing returns is especially pronounced when it is accompanied by high up-front costs followed by low incremental costs and by network effects, where the value of a product or service increases as more people use it.
Inequality and Unpredictability
We have seen that path dependence and social interaction lead to inequality. Technology and competition also contribute to this phenomenon. But there is a crucial assumption underlying all of these models of superstardom: that we know exactly who is most skillful. That assumption, as we will see, is false. Social influence leads not just to inequality, but to a fundamental lack of predictability as well. More skill gives people an edge in attaining success, but like the red marbles that started out in the majority, that edge offers no assurance that they will end up on top.
There are a variety of ways to assess skill or quality. … But no matter how we assess someone’s skill, luck will also help to shape our opinion through social influence.
So luck is not only behind the inequality of outcomes, it determines what we perceive to be skill. If a product is judged through social processes, there is an inherent lack of predictability. The process doesn’t generate random results, but the specific results are unknowable before the fact.
This is essential to bear in mind any time you see a ranking, or ordering, of anyone or anything that can be evaluated in different ways. The ranks reflect the methods. Unless the method the researchers use exactly matches your own criteria, which is highly unlikely, you should take the ranking with a grain of salt.
The second defect is that some of the variables themselves are based on perception.
At this point, it should be clear that when the forces of social influence are at work, we get positive feedback that makes the strong get stronger and the weak get weaker. Who benefits from this process of amplification has a lot to do with luck.
But what is more unsettling is that if you can evaluate skill in many different ways (think of art, music, literature, movies, etc.), luck also manipulates our perception of quality or skill. Better products do have a higher probability of succeeding. But there is an enormous amount of latitude in sorting out who or what will do well. …. It’s as if the luck in the luck jar reaches across and changes the distribution in the skill jar.
Why It’s So Hard to Live with the Many Shapes of Luck
At this point, you probably accept the intellectual case that it is possible for skill, or quality, to play only a minor role in commercial success.
We are very good at fooling ourselves about our own success, a phenomenon that psychologists call the self-serving attribution bias. It is common for us to attribute success to our own terrific skill, even in endeavors that are determined mostly by luck. Part of the explanation is that we see ourselves as capable agents. We can do things. We can make things happen. So we assume that our skill caused the success we experience. On the other hand, we readily attribute failure to external causes, including bad luck.
Likewise, when we observe the success of others, we fall victim to the fundamental attribution error. In this context, the error is the tendency to base our explanation of what happens on an individual’s skill rather than the situation. Once we create a narrative that explains success, we tend to suppress other explanations and see what happened as inevitable.
Chapter 7: What Makes for a Useful Statistic?
Few people take the time to distinguish between statistics that are truly helpful and those that are not. A sense of the relative contributions of luck and skill in shaping the outcome of our efforts is essential to understanding how valuable a statistic is likely to be.
Useful statistics have two features. First, they are persistent, which means what happens in the present is similar to what happened in the past. … In statistics, this persistence is called reliability…. Good statistics are also predictive of the goal you seek. Statisticians call this validity.
Statisticians assess persistence and predictive value by examining the coefficient of correlation, a measure of the degree of linear relationship between two variables in a pair of distributions.
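Comment: A small sketch of how the two tests could be run with the coefficient of correlation. The helper name pearson_r and all of the numbers are hypothetical. Persistence is taken here as the correlation of a statistic with itself in the next period, and predictive value as its correlation with the objective.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson coefficient of correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical data for five firms: one statistic measured in two
# consecutive periods, plus the outcome we actually care about.
stat_period_1 = [0.10, 0.25, 0.05, 0.30, 0.15]
stat_period_2 = [0.12, 0.22, 0.08, 0.27, 0.18]
objective = [0.04, 0.09, 0.01, 0.12, 0.03]

persistence = pearson_r(stat_period_1, stat_period_2)   # reliability
predictive_value = pearson_r(stat_period_1, objective)  # validity

print(f"persistence: {persistence:.2f}, predictive value: {predictive_value:.2f}")
```

A statistic is worth tracking only if both numbers are reasonably high. Five data points are far too few to draw conclusions from, so treat this purely as a template.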
The process of determining which statistics are useful begins with a definition of your objective: What do you want to use the statistics for?
Next, you have to determine what factors contribute to achieving your objective.
To do so, you have to translate a theory of cause and effect into quantities that you can observe and measure. This allows you to assess how skill, measured as high persistence, translates into your objective, measured as high predictive value.
Business Statistics: Going with the Crowd and Making It Up
The most widely accepted objective [of a business] is to maximize the value of the company’s shares. Practically speaking, this means that each dollar that a company invests should create more than one dollar in value. With that in mind, the next step is figuring out what to measure to find the actual causes of success.
“Companies generally measure both financial and nonfinancial values. We can figure out which financial measures are most popular by finding out how executives are paid and listening to what executives say are the most important measures. As it turns out, the answer is the same in either case: earnings per share (EPS).”
The immediate question to ask is whether the growth of EPS serves the objective of creating value for shareholders. The answer is: it depends. The growth of earnings and the creation of value can go together, but it is also possible to deliver higher EPS while destroying value … Theory tells us that the causal relationship between the growth of EPS and the creation of value is tenuous at best.
“… if you can forecast the rate of earnings growth, you can earn relatively attractive returns, even considering the important caveat that not all growth in earnings creates value. The problem is that forecasting earnings is difficult because of the lack of persistence.”
The growth of sales involves a different trade-off. While more persistent than the growth of EPS, the growth of sales has a weaker correlation with relative total returns to shareholders (r = .27). So the two most popular measures of performance have limited value. This should come as no surprise, given that the movement of stock prices over time reflects changes in expectations. Naturally, a company’s performance helps shape those expectations, but fundamentals and expectations can get out of step with each other. For that reason, thoughtful executives and investors strive to understand the expectations that are reflected in the price of a stock and seek to determine whether those expectations are reasonable.
Sadly, the most popular measurements companies use to track and communicate their own performance don’t exert much influence in how much value those companies create for shareholders. Carelessly chosen nonfinancial measures are even worse. Too many companies select statistics for their ubiquity rather than their utility. Thoughtful executives clearly state their governing objective. They then try to identify the causes that reliably lead to that objective. That persistence suggests that skill is exerting more influence than luck. Enlightened executives will actually draw maps that allow them to understand, track, and manage the cause-and-effect relationships that determine the value of the company and its stock.
Investing: Past Returns Are No Guarantee of Future Returns
The business of investing is filled with institutional practitioners who are smart and motivated. As we have already seen, the industry is highly competitive, and reversion to the mean is a strong force. That it is difficult for fund managers to deliver returns in excess of the market, adjusted for risk, is a testament to the efficiency of the market. The idea behind efficiency is that the prices of assets reveal all known information. This is not strictly true, and markets are notorious for going to extremes. But only a small percentage of investors have shown an ability to systematically beat the market over time. It’s not that investors lack skill, it’s the paradox of skill: as investors have become more sophisticated and the dissemination of information has gotten cheaper and quicker over time, the variation in skill has narrowed and luck has become more important.
Because investing involves so much luck in the short term, it would stand to reason that short-term success or failure is not a reliable test of skill. But all of us effortlessly find causes for the effects we see, and making money appears to be clear evidence that the investment manager knew what he was doing. Investing is a field where this fallacy is very costly.
Individual investors who do not have access to the sophisticated analytical tools that professionals use tend to rely on rating agencies to guide their decisions. The best known of these agencies … rates mutual funds on a five-star scale.
The star rating system is a forced normal distribution based on prior, risk-adjusted returns. For example, the top 10 percent of funds earn a rating of five stars …
Morningstar weights the results according to the longevity of the fund, so the full track record of a fund with a long history has a greater weight than its recent performance. The star system is not really a measure that leads to making money; it tells you what the past performance of the fund looks like.
The primary reason individuals and institutions invest in a fund is that they like the way it performed in the past. But those figures give little information about what the fund will do in the next three years.
The search for a satisfactory statistic in investing may appear futile; indeed, I will make the case later in chapter 8 that a focus on an investor’s process is more useful than dwelling on past outcomes. But there is a statistic in investing, called active share, that is worth considering. Developed by two economists, Martijn Cremers and Antti Petajisto, active share measures the fraction of a portfolio that is different from the benchmark index. The measure has a range from 0 percent, which means the fund is identical to the benchmark, to 100 percent, meaning the fund is completely different from the benchmark.
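Comment: A minimal sketch of how active share could be computed. The extract gives only the verbal definition. The formula used here (half the sum of absolute differences between fund and benchmark weights) is the form I understand Cremers and Petajisto to use, and it assumes long-only weights that each sum to 1. The tickers and weights are made up.

```python
def active_share(fund_weights, index_weights):
    """Half the sum of absolute weight differences across every holding."""
    names = set(fund_weights) | set(index_weights)
    return 0.5 * sum(
        abs(fund_weights.get(name, 0.0) - index_weights.get(name, 0.0))
        for name in names
    )


# Hypothetical benchmark and three hypothetical funds.
index = {"A": 0.5, "B": 0.3, "C": 0.2}
closet_indexer = {"A": 0.5, "B": 0.3, "C": 0.2}  # identical to the benchmark
tilted = {"A": 0.7, "B": 0.3}                     # overweights A, drops C
stock_picker = {"D": 0.6, "E": 0.4}               # no overlap with the benchmark

for label, fund in [("closet indexer", closet_indexer),
                    ("tilted", tilted),
                    ("stock picker", stock_picker)]:
    print(f"{label}: {active_share(fund, index):.0%}")
```

The two end points match the range described above: a fund identical to the benchmark scores 0 percent and a fund with no holdings in common scores 100 percent.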
Comparing Across Domains
Statistics are widely used in a range of fields. But rarely do the people who use statistics stop and ask how useful they really are. The simple test of persistence and predictive value goes a long way in judging how practical a measure is likely to be.
Chapter 8: Building Skill
“The approach you take to developing skill depends on where an activity lies on the luck-skill continuum. For activities that take place in environments that are stable and in which luck plays a small role, deliberate practice improves skill. Under those conditions, people can develop true expertise.” …
“When activities are more influenced by luck, you won’t get that kind of feedback, at least in the short term. What you do is not connected strongly to the result. So the best approach is to focus on the process you’re using.”
The topic of skill and expertise has received a lot of attention in the popular press in recent years, but it’s not clear that the attention has improved the quality of our thinking regarding how we gain skill. Few authors have been careful to specify the conditions under which deliberate practice is useful, and many seem to accept the notion that hard work can overcome any innate differences between individuals. But the main problem remains that people use their intuition in situations where they shouldn’t.
Deliberate Practice: Structure, Hard Work, and All About Feedback
Kahneman suggests that it is useful to consider two systems of decision making.
System 1, the experiential system, “operates automatically and quickly, with little or no effort and no sense of voluntary control.” System 2, the analytical system, “allocates attention to the effortful mental activities that demand it, including complex computations.” System 1 is difficult to modify or direct, while you can deliberately engage System 2. The distinction between these two systems of thinking is useful in considering how deliberate practice can shape performance.
You can become an expert by using deliberate practice to train your System 1.
Deliberate practice and the concept of expertise apply only near the skill side of the luck-skill continuum. You can train System 1 only for activities that are stable and linear.
In a linear system, a particular cause always has the same effect.
Deliberate practice is powerful in domains where it applies, including chess, music, and sports. But acknowledging its limits is crucial.
Feedback is the glue that holds together the elements of deliberate practice.
Your performance improves only if you receive accurate and timely feedback. Elite performers often use coaches for that purpose. A strong link between cause and effect is essential, of course, and when it is difficult to get quality feedback, deliberate practice is less effective.
Another misleading idea is that innate talent plays no role in determining performance. This view holds that expert performance is a function only of the amount of deliberate practice. Recent research does not support a claim that strong.
Hard work explains a great deal, but not everything. … In activities that are largely a matter of skill, basic ability counts.
Checklists: A Structured Way to Manage Attention
A checklist is a series of steps that must be carried out accurately and on time. Where cause and effect can be clearly established, checklists have been widely embraced. Examples include aviation and construction.
Checklists are highly effective but underutilized in jobs that combine probabilistic tasks with tasks that follow a set of rules or set procedures. Here’s the reason: professionals in these fields think of themselves as practicing a craft and actually find it demeaning to resort to a checklist. They think they have the knowledge to do the job and do not need any aids. They are wrong, and their attitude is costly.
In The Checklist Manifesto, Atul Gawande looks at the use of checklists in various fields and provides some guidelines on how to write ones that are effective. Specifically, it is essential to involve the people who are doing the work.
“… checklists have been used the longest and most reliably in aviation … Daniel Boorman, an engineer at Boeing … advises that a checklist should be short. The rule of thumb is that it should be five to nine items and fit on one page, although the exact length depends on the context. Further, the language should be simple, exact, and familiar to the users.
The checklist must also be free of distracting colors and graphics, and use a typeface that is easy to read. A good checklist also prompts communication among coworkers. These items encourage team members to try to identify, prevent, or solve problems. Finally, those who are going to use the checklist should test and refine it as necessary. Checklists are living documents that evolve.”
Boorman describes two types of checklists: DO-CONFIRM and READ-DO.
With DO-CONFIRM checklists, pilots do their jobs from memory but pause from time to time to ensure that everything is complete and has been done properly. …. READ-DO checklists typically deal with an emergency or an abnormal situation. In these cases, the pilot is likely to be unfamiliar with the situation, so the READ-DO checklist offers a recipe for action. The main virtue of a READ-DO checklist is that it allows the pilot to focus on concrete steps to address the problem. … READ-DO checklists prescribe correct actions under stressful conditions when it’s easy to forget things and make mistakes.
The final element is collecting and analyzing data properly … reliable data collection is what separates mumbo jumbo from science, hope from reality … Accurate data forms the foundation for high quality feedback.
When Success Is Probabilistic, Focus on Process
“… when your undertaking involves a dose of luck, the link between cause and effect is broken. In the short term, even when you do everything right, the outcome of your effort can be bad. Moreover, you can succeed even when you do everything wrong.”
Whether you’re managing a sports team, running a business, or investing in stocks, a skillful process will tend to have three parts: analysis, psychology, and the influences exerted by your organization. Coming up with a process that accommodates any one of these isn’t easy. Being good at all three is rare. We’ll look at each part and use the process of investing as an example throughout.
To begin your analysis, you need to identify the real causes of success in investing, such as supply and demand, economic profit, and sustainable competitive advantage. For this purpose, markets can provide a window on the world. So if you see a discrepancy between what a stock ought to be worth and what people are paying, you need to develop a theory about why price and value have diverged. What’s going on in the world that’s making people pay more or less for the stock? Your analytical edge is embodied in your theory of what determines the fundamentals and why the price is wrong.
Your edge should also include what Benjamin Graham, the father of security analysis, called a margin of safety.
Finding a discrepancy between value and price is only part of your analysis. Next, you have to build a portfolio that takes advantage of the opportunities. There are two common mistakes you can make when building a portfolio. The first is a failure to match how much of each stock you buy with the attractiveness of the opportunity. In theory, you should allocate more of the portfolio to the most attractive stocks.
At the opposite end of the spectrum is a mistake called overbetting. An investment manager who’s watching her edge dwindle may attempt to boost her returns by using debt. … Overbetting occurs when the size of an investment is too large for the opportunity that it represents and therefore introduces the possibility of failure.
The second part of a skillful process is psychological. This part deals with Kahneman and Tversky’s work on biases. These include overconfidence, anchoring, confirmation, and relying on what is most recent. Kahneman and Tversky emphasize that these biases arise automatically and are therefore very difficult to overcome.
Kahneman and Tversky also developed the idea of prospect theory, or how people make decisions when they are uncertain about gains and losses. Prospect theory reveals behavior that is at odds with classical economic theory.
Loss aversion is another feature of prospect theory. We suffer roughly two times more from a loss than we enjoy a gain of the same size.
The third part of the process of skill addresses organizational and institutional constraints. In the case of investing, the most important job is to manage agency costs, or the costs that arise because the money manager (the agent) may have interests that differ from those of the investor (the principal).
Because all three parts of a skillful process are difficult, they stand in the way of achieving good performance.
Match the Technique to the Situation
Whether or not you can improve your skill depends a great deal on where the activity lies on the luck-skill continuum. In cases where there is a clear relationship between cause and effect, and in activities that are stable and linear, deliberate practice is the only path to improvement….
If an activity involves luck, then how well you do in the short run doesn’t tell you much about your skill … For activities near the luck side of the continuum, a good process is the surest path to success in the long run.
Accurate feedback is essential no matter where you are on the continuum.
Chapter 9: Dealing with Luck
This chapter is about coping with luck. The first approach deals with reducing the advantage of a skilled opponent if you are the underdog and improving your advantage if you are the favorite. The second approach involves reducing the influence of luck by more effectively tying cause to effect. Finally, it is critical to understand the limits of what you can understand. The strategy here is to learn how to define your limits and cope with events that have small and incomputable probabilities, but very large consequences.
What Colonel Blotto Can Teach You About Coping with Luck
When competing one-on-one, follow two simple rules: If you are the favorite, simplify the game. If you are the underdog, make it more complicated.
Mauboussin uses the game called Colonel Blotto to illustrate how these rules can be applied.
Colonel Blotto also has parallels to business. One illustration is the theory of disruptive innovation developed by Clayton Christensen at Harvard Business School. Christensen studies why great companies with smart managements and substantial resources consistently lose to companies with simpler, cheaper, and inferior products. He calls these upstarts “disruptors” and distinguishes between what he calls sustaining and disruptive innovation.
Sustaining innovation involves steadily improving a product that already exists. Those improvements can be substantial, but … they build on the existing business model. … You can think of sustaining innovations as putting more soldiers on the same set of battlefields. Christensen’s work shows that when a new company comes along and tries to beat the leading company at its own game by introducing another version of the same product, the attempt will fail.
When disruptors succeed, they approach the market using a completely new business model. A disruptor can introduce a product at the low end of the market that is neither profitable for the big guys nor in demand from their customers.
Strong and Weak Nations
The two rules that emerge from playing Colonel Blotto make sense when you think about them, but surprisingly they are often ignored. While there is no way to change luck, the main goal is to increase or decrease the relative importance of skill in a toe-to-toe conflict, depending on whether you’re the stronger or weaker opponent.
Teasing Out Causality Through Little Bets
In his book Everything is Obvious, Duncan Watts suggests that we shift our way of thinking. Tradition in the ad business dictates that we “predict and control”; that is, we try to predict how people will respond to an advertisement or product. Watts suggests that instead, we “measure and react”; that is, we do carefully controlled experiments and let our actions be guided by the results.
Randomness and luck are the result of insufficient information—an inability to pinpoint cause and effect. Controlled experiments can be a quick and effective way to improve our understanding of cause and effect. This approach reiterates a broader point: whenever you evaluate results, ask if you have considered the outcomes that would come from a proper null model, the simplest model that might explain the result. … This is why randomized trials lead to much more reliable findings than observational studies do.
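Comment: One way to put a null model to work on a controlled experiment is a permutation test. The sketch below is my own illustration, not something Mauboussin or Watts spells out. The null model is that the advertisement makes no difference, so the group labels are arbitrary and can be shuffled. The function name and the response counts are hypothetical.

```python
import random


def permutation_p_value(treated, control, n_shuffles=5000, seed=0):
    """Return the observed difference in means and the share of label
    shuffles that produce a difference at least as large in size."""
    rng = random.Random(seed)
    observed = sum(treated) / len(treated) - sum(control) / len(control)
    pooled = list(treated) + list(control)
    k = len(treated)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = sum(pooled[:k]) / k - sum(pooled[k:]) / (len(pooled) - k)
        # small tolerance guards against floating point ties
        if abs(diff) >= abs(observed) - 1e-12:
            at_least_as_large += 1
    return observed, at_least_as_large / n_shuffles


# Hypothetical experiment: 1 = customer responded, 0 = did not.
new_ad = [1] * 30 + [0] * 70  # 30 of 100 responded
old_ad = [1] * 20 + [0] * 80  # 20 of 100 responded

lift, p_value = permutation_p_value(new_ad, old_ad)
print(f"observed lift: {lift:.2f}, share of shuffles at least as extreme: {p_value:.3f}")
```

If a large share of shuffles matches or beats the observed lift, then luck alone is a sufficient explanation and there is nothing further to explain.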
How to Live in a World That We Do Not Understand
One of my main points is that most people have trouble untangling skill and luck even in cases where it is possible to do so. But it is still important to acknowledge the limits of the methods at our disposal, and Taleb’s work effectively illustrates that point.
The fourth quadrant is the world of black swans. Taleb argues that having no theory or model of events in the fourth quadrant is preferable to having a theory or model, because the errors we make are huge and often lead to bad results. In practice, we try to use policy and models to manage a part of the world that defies control and understanding. In a paper coauthored with Mark Blyth, Taleb argues that efforts by political leaders and economic policy makers to stabilize the economic system by inhibiting fluctuations actually create fragility and even greater vulnerability to extreme events.
Comment: I think this point about efforts to stabilise systems being counterproductive is really important for the financial stability and bank capital adequacy debate.
Moral hazard refers to a person or organization taking an action on behalf of others without suffering the consequences if the outcome is bad.
In dealing with events in the fourth quadrant, Taleb recommends that we avoid optimization and allow for redundancy. You can optimize your behavior to serve a specific goal in a stable system. … But optimization in a changing system is a setup for failure.
Comment: Similar to the point above but bank risk managers are naturally inclined to look for optimal solutions. This may also be a path that mathematically trained risk managers gravitate to.
Because optimization works well in some systems, there is a temptation to optimize in the fourth quadrant during times of relative stability. In building a portfolio of investments, for example, optimization would mean getting the highest return you can for a given level of risk. … Where expert predictions are poor, optimization is a bad idea, because you can get locked in to one approach and can’t adjust to change.
Comment: Would be interesting to examine how clearly the Risk Appetite Statements of banks identify and engage with 4th quadrant concerns.
There are two kinds of payoffs in the fourth quadrant. In the first kind, shocks and mistakes lead to painful losses. This is what Taleb suggests avoiding, because the small gains over time do not compensate for the infrequent but sizeable losses ….
Taleb’s message is that if you are dealing with events in the fourth quadrant, you should seek payoffs that are equivalent to buying options, even though they have a small but steady cost and you don’t know if or when you will get a big payoff.
When it comes to managing luck, Taleb has two very useful messages. The first is to understand the limits of your knowledge about events that have probabilities that you can’t compute and consequences that are significant. In other words, know what you don’t know. The second is to take steps to make sure that whatever exposure you do have in the fourth quadrant is the result of buying, or acquiring, options and not the result of selling options… you do not define your ultimate success by the frequency of gains, but rather by how much you make when you are right versus how much you lose when you are wrong.
Learning to Live with Luck
By definition, luck is something that no one can control. But there are ways that you can manage it more effectively. The main lesson from Colonel Blotto is that in competitive interactions, the strong should seek to simplify to emphasize their advantage in skill and the weak should try to add randomness to dilute the stronger player’s advantage.
In some cases, it is helpful … to consider luck as equivalent to a lack of knowledge. In particular, in an activity dominated by luck, the relation between cause and effect is difficult to discern. But organisations can get a clearer sense of cause and effect … through a combination of applied scientific methods and technology, thereby reducing the role of luck.
Statistical methods are of great value in measuring a wide range of activities. But understanding the limitations of models is as important as applying them properly. Nassim Taleb has developed a useful map that shows the limits within which we can apply statistics usefully. Specifically, he suggests that in the fourth quadrant, an area characterized by complex payoffs and extreme outcomes, we’re better off using no model than using a faulty one. The key is to sidestep activities that have small gains and large losses and to gain exposure to payoffs that have small costs but large gains.
Comment: The point made here is I think the same as the concept of a “zone of validity” in The Money Formula by Wilmott and Orrell. It is also worth reiterating the point that your ability to assess skill rises as you gather more information. Small samples are notoriously unreliable. So make sure your sample is large enough so that you can draw reliable conclusions. Mauboussin also emphasises the point that, when you estimate true skill, you must keep in mind that you are doing just that: estimating. You can’t calculate an exact and objective answer. This is another reminder to be aware of the limitations of the tools you are using.
Chapter 10: Reversion to the Mean
Reversion to the mean is an idea that most people believe they understand …. Yet the concept is actually very hard to grasp and even harder to employ in making decisions. Specifically, reversion to the mean creates three illusions. The first is the illusion of cause and effect. Our natural inclination is to look for what is causing a given measurement to regress toward the mean, an exercise that is frequently fruitless. There is also the illusion of feedback, which makes it seem like favorable feedback leads to worse results and unfavorable feedback leads to better results. Finally, there’s the illusion of declining variance, the idea that reversion to the mean implies that everything we can measure converges on the same average value over time.
Mean Reversion Mistakes
One of the themes of this book is that we have a difficult time untangling skill and luck because of our basic desire to find cause and effect in every situation, whether or not that view represents reality. Reversion to the mean is a statistical artifact that produces an itch that our causal minds yearn to scratch. When skill remains constant, we see reversion to the mean because of the randomness (luck from an individual’s viewpoint) in the activity. There is no cause, so there is nothing to explain.
One principle of competitive markets is that excess returns get competed away.
A company will attract competition if it earns a 20 percent return on invested capital and has a 10 percent opportunity cost of capital (a measure of the minimum required return, or what you’d expect to earn if you were doing something else with the money). A competitor with the same opportunity cost may be willing to sell the product or service at a lower price and accept a 15 percent return on invested capital. Since that return is still above the opportunity cost, another competitor may be willing to lower its price and accept a 12 percent return. And so on. Eventually, in theory, all of the competitors will earn nothing more than the opportunity cost of capital. This process sounds a lot like reversion to the mean, with the mean being the opportunity cost of capital.
In 1933, a statistician … named Horace Secrist wrote a book called The Triumph of Mediocrity in Business.
Secrist’s conclusion that results converge to the mean is a famous example of falling for the illusion of declining variance. As we have already seen, reversion to the mean does not imply that results will cluster closer to the average. As long as return on invested capital is not perfectly correlated with itself from year to year, you will see reversion to the mean. The coefficient of variation—the standard deviation divided by the mean—shows that the distribution of corporate performance, measured by return on invested capital, has been fairly consistent over time. This mistake is sufficiently common among prominent economists that it prompted Milton Friedman, who won the Nobel Memorial Prize in Economics in 1976, to write an essay on the topic.
None of this is to say that results cannot exhibit a decline in variance over time.
A declining variance is the key idea behind the paradox of skill. But just because you observe reversion to the mean, that doesn’t suggest that outcomes are converging toward the average. You must be careful to distinguish between a change of the system and change within the system. Those are easy to conflate.
Francis Galton figured out that correlation and reversion to the mean are two takes on the same concept. This is an essential insight for sorting out skill and luck. The coefficient of correlation between two variables determines the rate of reversion to the mean and provides valuable guidance for forecasting.
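Comment: A simulation sketch of Galton's point, under my own assumptions: each outcome is a fixed skill plus fresh luck, both drawn from a standard normal distribution, so the correlation between two periods should be close to 0.5. None of these numbers come from the book.

```python
import random
from math import sqrt
from statistics import mean, pstdev


def pearson_r(xs, ys):
    """Pearson coefficient of correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


rng = random.Random(1)
n = 10_000
skill = [rng.gauss(0, 1) for _ in range(n)]       # constant across periods
period_1 = [s + rng.gauss(0, 1) for s in skill]   # skill plus luck
period_2 = [s + rng.gauss(0, 1) for s in skill]   # same skill, new luck

r = pearson_r(period_1, period_2)

# Follow the top 10 percent of period-1 performers into period 2.
top = sorted(range(n), key=lambda i: period_1[i], reverse=True)[: n // 10]
top_then = mean(period_1[i] for i in top)
top_next = mean(period_2[i] for i in top)

print(f"correlation between periods: {r:.2f}")
print(f"top decile average: {top_then:.2f} then {top_next:.2f}")
print(f"r x first-period average: {r * top_then:.2f}")
print(f"spread of outcomes: {pstdev(period_1):.2f} then {pstdev(period_2):.2f}")
```

The top group's average falls back toward the mean by roughly the amount the correlation implies, while the spread of the whole population stays about the same. That is reversion to the mean without any decline in variance.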
Base Rates, Persistence, and c
We are now ready to bring together a few of the ideas we’ve been considering so that we can see how to make practical use of reversion to the mean. The first idea comes from the paper, “On the Psychology of Prediction,” published in 1973 by Daniel Kahneman and Amos Tversky. They said that there are three types of information that are relevant for a statistical prediction: the prior information, or base rate; the specific evidence about the individual case; and the expected accuracy of the prediction. The trick is determining how to weight the information.
Avoiding the Trap
A final note. For reversion to the mean to be relevant, there has to be some sense of the mean, or average. For distributions that follow a power law, where there are a few extremely large values and lots of small values, there is no meaning in the idea of an average.
Comment: I think this point is relevant to understanding bank loan losses.
The lesson of this chapter is that effective forecasts require you to consider carefully where you are on the luck-skill continuum, estimate an appropriate shrinkage factor, and incorporate reversion to the mean into your decision. Even the simple and approximate equation c ≈ r, which says that the appropriate shrinkage factor is roughly equal to the correlation between two outcomes, can get you into the right frame of mind.
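Comment: a minimal sketch of how a shrinkage factor could be applied, assuming the usual form of a shrinkage estimate (the mean plus c times the gap between the observation and the mean) and using c ≈ r. The function name and the example numbers are hypothetical.

```python
def shrink_toward_mean(observed, mean, c):
    """Return a forecast that pulls an observed result toward the mean.

    c is the shrinkage factor: c = 1 keeps the observation as is
    (all skill), c = 0 falls back to the mean (all luck). Following
    c ≈ r, the correlation between two outcomes is used as c.
    """
    if not 0.0 <= c <= 1.0:
        raise ValueError("shrinkage factor c must be between 0 and 1")
    return mean + c * (observed - mean)


# Hypothetical case: a firm earned 20 percent, the average is 10 percent,
# and the year-to-year correlation of returns is assumed to be 0.5.
forecast = shrink_toward_mean(observed=0.20, mean=0.10, c=0.5)
print(f"forecast return: {forecast:.2%}")
```

The lower the correlation, the more weight the base rate gets and the less the specific result counts.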
Chapter 11: The Art of Good Guesswork
Mauboussin’s suggestions for untangling and navigating the divide between luck and skill
- Understand where you are on the luck-skill continuum
- Assess sample size, significance and swans
- Always consider a null hypothesis
- Think carefully about feedback and rewards; High-quality feedback is key to high performance. Where skill is more important, deliberate practice is essential to improving performance. Where luck plays a strong role, the focus must be on process
- Make use of counterfactuals; To maintain an open mind about the future, it is very useful to keep an open mind about the past. History is a narrative of cause and effect but it is useful to reflect on how outcomes might have been different.
- Develop aids to guide and improve your skill; On the luck side of the continuum, skill is still relevant but luck makes the outcomes more probabilistic. So the focus must be on good process – especially one that takes account of behavioural biases. In the middle of the spectrum, the procedural is combined with the novel. Checklists can be useful here – especially when decisions must be made under stress. Where skill matters, the key is deliberate practice and being open to feedback
- Have a plan for strategic interactions. Where your opponent is more skilful, try to inject more luck into the interaction
- Make reversion to the mean work for you; Understand why reversion to the mean happens, to what degree it happens, and what exactly the mean is. Note that extreme events are unlikely to be repeated and, most importantly, recognise that the rate of reversion to the mean relates to the coefficient of correlation
- Develop useful statistics (i.e. stats that are persistent and predictive)
- Know your limitations; we can do better at untangling skill and luck but also must recognise how much we don’t know. We must recognise that the realm may change such that old rules don’t apply and there are places where statistics don’t apply
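Comment: a minimal sketch of how the "persistent and predictive" test in the list above might be checked. Persistence is taken as the correlation of a statistic with itself in the next period, and predictive value as its correlation with the outcome you care about. The function names and the data are hypothetical and purely illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def assess_statistic(stat_period1, stat_period2, outcome):
    """Score a candidate statistic on the two criteria."""
    return {
        # Does the statistic hold up from one period to the next?
        "persistence": pearson(stat_period1, stat_period2),
        # Does the statistic move with the outcome we want?
        "predictive": pearson(stat_period1, outcome),
    }


# Made-up data for five entities (not from the book).
stat_p1 = [1.0, 2.0, 3.0, 4.0, 5.0]
stat_p2 = [1.2, 1.9, 3.2, 3.8, 5.1]
outcome = [10.0, 9.0, 14.0, 13.0, 18.0]

scores = assess_statistic(stat_p1, stat_p2, outcome)
print(scores)
```

A statistic that scores high on both counts is a candidate for being a useful one. A high score on only one of the two suggests it is either stable but irrelevant, or relevant but mostly noise.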