Recently in a conversation with Walter Tackett (whom I used to work with at
NativeMinds) he clued me in to the fact that no less a personage than William
Poundstone had quoted this article of mine in one of his books! The book is:
For the sake of full disclosure, that’s an Amazon Affiliate Link. So if you
follow the link and buy it, I might get paid a little.
This is one of the main precipitating events that made me realize I needed to
start a blog to keep my writing and stuff on. This essay is just one of many
things I’d written (in this case about seven years ago) and then more-or-less
forgotten about. I didn’t even have a saved copy of it. Walter had put it up
on his site for a while; I just dug it out of the Wayback Machine.
What Poundstone quoted me on directly was the line about the Kelly Criterion
being “the bright clear line between Aggressive investing and Insane investing.”
Which, assuming you’re out to make money, it is.
Analysis of Multiple Simultaneous Non-Independent Investment Opportunities with Multiple Possible Outcomes
By Ray Dillinger ( firstname.lastname@example.org )
There are a lot of papers that have been published about figuring out how much money to invest in a business opportunity given your level of wealth, its odds of success, and its estimated returns in the events of success and failure. Usually the further assumption is made that your decision to invest or not doesn’t affect the price of the issue or the odds of success.
Frequently I’ve seen papers analyzing two or more such opportunities in terms of finding an optimal strategy for investing in both, and these are good work too as far as they go. They’re accurate in the way that physicists’ models involving frictionless surfaces and planets treated as point masses are accurate: within the limitations of their simplifying assumptions.
The problem is, that’s not how business opportunities look in the real world.
In this paper I’m going to review the math that governs the simple cases and then move on to introduce the math that governs the more complicated cases.
Everybody knows, or everybody should know, the Kelly Criterion for making optimal investments in a single two-valued opportunity. It’s very simple; you invest your wealth, times the edge, divided by the odds of the bet.
“The edge”, in this case, is the return you’d get per dollar invested in the average case. “The odds” are what you win per dollar invested when the bet pays off.
So let’s say I offer you a simple cointoss where I offer to double the money you bet on it if you win the toss. Since you win or lose exactly the amount bet, the edge is zero. The Kelly Criterion says you don’t take this bet because there is no long-term edge to justify the risk of losing your money.
So I’ll get a little stupid in order to make a point, and make it sweeter. Let’s say I triple your money if you win and take your money if you lose. Now you’ll win one bet of every two and lose one bet of every two, winning twice the amount bet and losing the amount bet. So, you come out ahead by the amount bet every two bets, and the edge is 1/2. The odds are 2, since a win pays twice the amount bet. Since 1/2 divided by 2 is 1/4, the Kelly Criterion says you bet one-quarter of your money on this bet.
Let’s make it even sweeter. Now I offer to quadruple your money if you win (I must be really amazingly stupid to make you such an offer) and take your money if you lose. Now your edge is 1 since, over every two bets, you win three times the amount bet once and lose the amount bet once. The odds are 3. Since 1 divided by 3 is 1/3, the Kelly Criterion says you bet one-third of your money on this bet.
Now let’s get ridiculous. Let’s say I offer you a hundred times your money back if you win a coin toss, and take your money if you lose. Now how much should you risk? Well, your edge is now 49 and the odds are 99. So you wind up betting 49/99 of your money on this bet, which is just under half.
By now you’ve probably noticed the point I’m heading for: No matter how big your edge gets, the Kelly Criterion says you never *EVER* invest a fraction of your wealth greater than the probability of winning. Even if you are fifty percent likely to win a billion dollars for every dollar bet, you don’t bet more than half your money. State lotteries that cost a dollar and have chances of winning of one in ten million are bad investments for anyone who has less than ten million dollars, no matter how many billions large the jackpot may be.
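For readers who want to check two-outcome bets like these themselves, here is a minimal Python sketch. It assumes the standard closed form for a bet that is lost entirely on a loss: the fraction is p - (1 - p) / b, where p is the probability of winning and b is the net amount won per unit staked. The function names are mine, chosen for illustration.

```python
import math

def kelly_fraction(p_win: float, net_odds: float) -> float:
    """Growth-maximizing fraction of wealth for a bet that wins `net_odds`
    per unit staked with probability `p_win` and loses the stake otherwise."""
    return p_win - (1.0 - p_win) / net_odds

def expected_log_growth(fraction: float, p_win: float, net_odds: float) -> float:
    """Expected logarithm of the wealth multiplier after one such bet."""
    return (p_win * math.log(1.0 + fraction * net_odds)
            + (1.0 - p_win) * math.log(1.0 - fraction))

if __name__ == "__main__":
    # Coin tosses paying back 2x, 3x, 4x and 100x the stake (net odds 1, 2, 3, 99).
    for label, net_odds in [("double", 1), ("triple", 2),
                            ("quadruple", 3), ("hundredfold", 99)]:
        f = kelly_fraction(0.5, net_odds)
        print(f"{label:12s} net odds {net_odds:3d}  fraction {f:.4f}")
```

On a coin toss the fraction creeps toward one half as the payout grows but never reaches it. The second function is there so the closed form can be checked by brute force: score a grid of fractions and see where the expected log growth peaks.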
It turns out that the Kelly Criterion tells you EXACTLY how much of your wealth you should invest in order to maximize long-term growth of your money. If you bet more, you have more risk but don’t make as much money. If you bet less, you have less risk and don’t make as much money. Now, if you need to actually take income out of your investing wealth every so often, then you should be investing less than the Kelly Criterion says; the money you take out isn’t contributing to longer term growth, so it doesn’t justify as much risk as the Kelly Criterion accepts. But there is never, under any circumstances, any reason to bet more than the Kelly Criterion suggests; in terms of money management, the Kelly Criterion is the bright clear line between aggressive long-term investing that undertakes exactly as much risk as necessary to absolutely maximize growth, and insane investing that accepts more risk than needed and by doing so impairs the long-term growth of funds.
All this is very simple when there are just two possible outcomes. More gamblers than investors know the Kelly Criterion, because they’re more familiar with the simplified, two-valued kind of investment opportunities where it’s easy to calculate.
But real business opportunities don’t look like that.
How do you calculate the Kelly Criterion when you’re looking at Acme Widgets and the government is seeking bids on a big widget contract, and you figure they have about:
a 15% chance of landing the contract and making a 50% return,
a 20% chance of being a supplier to the company that lands the contract and making a 30% return,
a 55% chance of having no contract awarded and doing business as usual making a 10% return, and
a 10% chance of having a competitor get the contract and losing 70% of the money invested?
What’s the most you should put into this company? The math is a bit more complicated now, and there isn’t a straightforward way to find an answer. But there is a straightforward way (well, only mildly complicated) to check how good a possible answer is.
What the Kelly Criterion does is to maximize the expected logarithm of wealth, that is, the probability-weighted average of the logarithms of the possible outcomes. (This is not the same thing as the logarithm of the expected wealth.) By maximizing it repeatedly and compounding your earnings, maximum growth of wealth is achieved. So, while it’s no longer straightforward to directly calculate the Kelly threshold for this more complicated situation, you can still iteratively maximize the expected logarithm of wealth to find the optimal Kelly-Criterion investment. Here’s an example.
Let’s say you have a million dollars to manage.
The natural logarithm of 1000000 is 13.8155, so that’s the benchmark for making no investment at all.
Now, if you contemplate putting all of your money into Acme Widgets, then you have to figure the different outcomes and likelihoods and take the weighted average of their logarithms. So….
0.15 * ln(1000000 * 1.50) +
0.20 * ln(1000000 * 1.30) +
0.55 * ln(1000000 * 1.10) +
0.10 * ln(1000000 * 0.30) = 13.8608.
Since making no investment at all gave a logarithm of 13.8155, investing all your money in Acme Widgets is seen as being better than investing none of it.
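The weighted-logarithm calculation above is easy to wrap in a function so that other investment levels can be scored without redoing the arithmetic by hand. This is only a sketch; the names `ACME_OUTCOMES` and `expected_log_wealth` are mine, and the numbers are the Acme Widgets estimates given earlier.

```python
import math

# (probability, multiplier applied to the money invested), from the Acme example.
ACME_OUTCOMES = [(0.15, 1.50), (0.20, 1.30), (0.55, 1.10), (0.10, 0.30)]

def expected_log_wealth(invested_fraction, wealth=1_000_000, outcomes=ACME_OUTCOMES):
    """Probability-weighted average of ln(ending wealth).

    Money not invested is assumed to sit idle, earning and losing nothing.
    """
    invested = wealth * invested_fraction
    held_back = wealth - invested
    return sum(p * math.log(held_back + invested * m) for p, m in outcomes)

if __name__ == "__main__":
    for fraction in (0.0, 1.0):
        print(f"{fraction:.2f} -> {expected_log_wealth(fraction):.4f}")
```

Passing any fraction between 0 and 1 scores that investment level on the same scale as the two benchmarks.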
But is that all there is to the story? What if you only invest half your money?
0.15 * ln (500000 + 500000 * 1.50) +
0.20 * ln (500000 + 500000 * 1.30) +
0.55 * ln (500000 + 500000 * 1.10) +
0.10 * ln (500000 + 500000 * 0.30) = 13.8607.
This is very close to being as good as investing all your money. Let’s try three-quarters:
0.15 * ln (250000 + 750000 * 1.50) +
0.20 * ln (250000 + 750000 * 1.30) +
0.55 * ln (250000 + 750000 * 1.10) +
0.10 * ln (250000 + 750000 * 0.30) = 13.8692.
That’s better than either all or half, so let’s see what happens if we invest seven-eighths of our wealth, which is halfway between the two best scores we’ve seen so far:
0.15 * ln (125000 + 875000 * 1.50) +
0.20 * ln (125000 + 875000 * 1.30) +
0.55 * ln (125000 + 875000 * 1.10) +
0.10 * ln (125000 + 875000 * 0.30) = 13.8679.
That’s not as good as investing three-quarters of our wealth, so it’s too much. We can back off a little bit and try investing 13/16 of our wealth:
0.15 * ln (187500 + 812500 * 1.50) +
0.20 * ln (187500 + 812500 * 1.30) +
0.55 * ln (187500 + 812500 * 1.10) +
0.10 * ln (187500 + 812500 * 0.30) = 13.8691.
This is better than seven-eighths and only a hair below the score for three-quarters. The optimal amount to invest is going to be right around here somewhere, a little above three-quarters; we could carry this search out ten more steps and have it correct to within one part in about 20000. But we don’t have to; I’ve just shown the first few steps to illustrate the process.
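The hand search above can be automated. The sketch below uses a ternary search, which is my substitution for the halving steps shown by hand. It is valid here because, with a single investment and fixed probabilities and returns, the score is a concave function of the fraction invested. Wealth is set to 1, so scores come out lower than the figures above by ln(1000000), which does not move the optimum.

```python
import math

# (probability, multiplier applied to the money invested), from the Acme example.
ACME_OUTCOMES = [(0.15, 1.50), (0.20, 1.30), (0.55, 1.10), (0.10, 0.30)]

def score(fraction, outcomes=ACME_OUTCOMES):
    """Expected log of the wealth multiplier when `fraction` is invested."""
    return sum(p * math.log((1.0 - fraction) + fraction * m) for p, m in outcomes)

def best_fraction(lo=0.0, hi=1.0, steps=60):
    """Ternary search: discard the third of the interval that cannot hold the peak."""
    for _ in range(steps):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if score(a) < score(b):
            lo = a      # the peak cannot lie left of a
        else:
            hi = b      # the peak cannot lie right of b
    return (lo + hi) / 2.0

if __name__ == "__main__":
    f = best_fraction()
    print(f"fraction {f:.4f}, score {score(f):.6f}")
```

This shortcut only works while the score has a single peak. Once probabilities or returns depend on how much you invest, that guarantee is gone.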
The problem is real business opportunities don’t look like that either.
In the real world, you’re never looking at a situation where you’re deciding how much money to put into your only investment opportunity. At the very least, the money you don’t invest in that opportunity may usefully be placed in a risk-free investment like gold or a low-risk investment like Treasury Bills. You should also be considering Acme Widgets’ competitor, the Klein Brush & Bottle Company, because if Acme doesn’t get that contract, Klein is a whole lot more likely than otherwise to get it. You know they won’t both get the contract, although they may both be suppliers if someone else gets it. And you also know that if you invest in both companies, you run less risk because the downside risk at Acme is coincident with a much higher probability of a high return at Klein and vice versa. And finally, since Acme is a fairly small company and you’re looking at a medium-sized fund, the amount you invest may drive up the price you have to pay for its stock, which will drive down your effective return.
But, the technique above turns out to be something you can generalize:
The General Form of the Kelly Criterion is:
Sum for all X of
(probability of X * ln (ending wealth if X happens))
This is how you can calculate the degree to which your growth opportunities are being maximized. Now, if you’ve done a lot of math, you’re already looking at the generalized form of the Kelly Criterion, above, and setting up integrals in your head to deal with continuous probability distribution functions and reward levels, and differentials to help find the optimum points. But it turns out that, depending on what the probability and return formulas are like, it may not be generally or easily integrable or differentiable. In fact it’s usually not.
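As a concrete instance of the general form, here is a sketch that scores an allocation across several investments whose outcomes are tied together. The scenario table is hypothetical: the Acme column reuses the estimates from the example above, while the Klein and cash figures are invented purely for illustration.

```python
import math

# Hypothetical joint scenarios: (probability, {asset: multiplier on money placed there}).
SCENARIOS = [
    (0.15, {"acme": 1.50, "klein": 0.30, "cash": 1.02}),  # Acme lands the contract
    (0.20, {"acme": 1.30, "klein": 1.30, "cash": 1.02}),  # both end up as suppliers
    (0.55, {"acme": 1.10, "klein": 1.10, "cash": 1.02}),  # no contract awarded
    (0.10, {"acme": 0.30, "klein": 1.50, "cash": 1.02}),  # Klein lands the contract
]

def expected_log_wealth(allocation, scenarios=SCENARIOS, wealth=1.0):
    """Sum over scenarios of probability * ln(ending wealth).

    `allocation` maps asset name to the fraction of wealth placed in it;
    the fractions are assumed to be non-negative and to sum to 1.
    """
    total = 0.0
    for probability, multipliers in scenarios:
        ending = wealth * sum(frac * multipliers[asset]
                              for asset, frac in allocation.items())
        total += probability * math.log(ending)
    return total

if __name__ == "__main__":
    for name, alloc in [("all Acme", {"acme": 1.0}),
                        ("half Acme, half Klein", {"acme": 0.5, "klein": 0.5}),
                        ("all cash", {"cash": 1.0})]:
        print(f"{name:22s} {expected_log_wealth(alloc):.4f}")
```

Because each scenario lists every asset's result together, the dependence between Acme and Klein is captured without any separate correlation term.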
This is a most excellent formula, because you can use it to evaluate investment strategies that involve making lots of different investments simultaneously. For example, you might try different investment levels in Acme and Klein and Gold, and account for such things as differences in the tax bite that depend on how well you do.
But this complicates things, because if we substitute in complicated formulas that depend on our investment for the probability of X, and we substitute in complicated formulas that depend on our investment for the rate of return if X happens, we wind up with nonmonotonic functions in multiple variables.
And with nonmonotonic functions in multiple variables, you can’t easily just “home in on it” the way I did above, because the function may have several local maxima, local minima, and discontinuities.
Optimization as Search and Simplifying Assumptions
In this case, optimization becomes a search, and the more complex the set of outcomes you’re looking at and the greater the number of investments you’re trying to optimize the distribution of money between, the harder the search becomes. Here is where you make a lot of simplifying assumptions, aggregating companies into industries and risk profiles to try to reduce the number of variables you have to work with. Here is where you assume things operate independently, even though sometimes they may not, because analysis of independent variables can be carried out separately from each other.
But very complex search spaces are, in fact, what genetic algorithms, stochastic searches, and multivariate regressions are for, and computer code can be your tool to cut through a whole lot of fog here, seeking the best investment levels in all these different opportunities.
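To give a flavor of what such a search looks like, here is a deliberately simple stochastic hill climb over allocations. It is a toy under stated assumptions, not a production optimizer. The scenario table is hypothetical (only the Acme column comes from the earlier example), and a search space with many local maxima would need restarts or a population-based method on top of this.

```python
import math
import random

ASSETS = ["acme", "klein", "cash"]

# Hypothetical joint scenarios: (probability, {asset: multiplier}).
SCENARIOS = [
    (0.15, {"acme": 1.50, "klein": 0.30, "cash": 1.02}),
    (0.20, {"acme": 1.30, "klein": 1.30, "cash": 1.02}),
    (0.55, {"acme": 1.10, "klein": 1.10, "cash": 1.02}),
    (0.10, {"acme": 0.30, "klein": 1.50, "cash": 1.02}),
]

def score(weights):
    """Expected log of ending wealth for weights listed in ASSETS order."""
    return sum(p * math.log(sum(w * m[a] for a, w in zip(ASSETS, weights)))
               for p, m in SCENARIOS)

def random_search(tries=20000, step=0.05, seed=0):
    """Shift small random amounts between two assets; keep only improvements."""
    rng = random.Random(seed)
    best = [0.0, 0.0, 1.0]                 # start fully in cash
    best_score = score(best)
    for _ in range(tries):
        src, dst = rng.sample(range(len(ASSETS)), 2)
        amount = min(best[src], rng.uniform(0.0, step))
        candidate = best[:]
        candidate[src] -= amount           # weights stay non-negative
        candidate[dst] += amount           # and keep summing to 1
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return dict(zip(ASSETS, best)), best_score

if __name__ == "__main__":
    allocation, value = random_search()
    print({k: round(v, 3) for k, v in allocation.items()}, round(value, 4))
```

The only thing the search needs from the problem is the `score` function, so swapping in formulas where probabilities or returns depend on the amount invested changes nothing in the search loop itself.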
Usually you have to pick and choose which simplifying assumptions you’re making and which you’re throwing out. Analyzing the situation of Acme and Klein and that big contract, clearly you shouldn’t assume that they’re independent. But analyzing, say, an oil-rig firefighting company and a boot and shoe dealer, you can be pretty comfortable assuming that their relative performances have nothing to do with each other. It may turn out that the boot and shoe company makes a lot of its money manufacturing protective boots for the rig firefighters so it may not be true — but it’s a pretty comfortable assumption, and I’d make it in a heartbeat. There’s really no way you can capture every detail of every possible interdependence; you just have to wind up ignoring some of them.
The Value of Conservative Assumptions
Remember what I said earlier about the Kelly Criterion being the clear bright line between aggressive investing and insane investing?
If you overestimate the amount you should invest, you expose yourself to more risk, and simultaneously reduce the long-term growth of your wealth. That is insane. If you underestimate the amount you should invest, you make less money, which is bad, but you also expose yourself to less risk and adjust for eventually taking some income from your wealth, which is good. Investing more than the Kelly Criterion says is clearly insane, but there are good reasons why most people should want to invest less.
That is why conservative assumptions — those which would lead you to invest less — are generally better than assumptions which would lead you to invest more. It’s clear that accurate assumptions are the best assumptions of all, but when you are forced to deviate from accuracy, it’s best to deviate in a conservative direction.
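A small numerical check makes the asymmetry visible. The sketch below takes a coin toss that pays twice the stake on a win and compares the expected log growth per bet at half, one, one and a half, and two times the growth-maximizing fraction. The closed form p - (1 - p) / b for that fraction is the standard two-outcome result; the function name is mine.

```python
import math

def growth(fraction, p_win=0.5, net_odds=2.0):
    """Expected log growth per bet when staking `fraction` of wealth."""
    return (p_win * math.log(1.0 + fraction * net_odds)
            + (1.0 - p_win) * math.log(1.0 - fraction))

# Growth-maximizing fraction for this bet: p - (1 - p) / b = 0.25.
KELLY = 0.5 - 0.5 / 2.0

if __name__ == "__main__":
    for multiple in (0.5, 1.0, 1.5, 2.0):
        f = multiple * KELLY
        print(f"{multiple:.1f} x Kelly (stake {f:.3f}): growth {growth(f):.4f}")
```

For this particular bet, staking half the optimal fraction and staking one and a half times it produce the same growth, but the larger stake swings much harder along the way. At twice the optimal fraction the growth drops to zero.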
The Art of Picking Your Assumptions
And this is where hard math meets art, science, and experience. We have a tool we can use to analyze any investment strategy, under a given set of assumptions. But we have to make assumptions which we know aren’t completely true all the time to control the complexity of the analysis, and the assumptions we make and don’t make govern the accuracy of our analysis. And this, traditionally, is the part of securities analysis you can’t automate; there’s no way an automaton can adjust for things it just plain doesn’t know.
Closing The Loop
Or is there? We have been talking about optimizing systems based on predictions; and we’re already used to the idea of optimizing ex ante prediction systems based on ex post performance. What if we make a dozen different systems that use a dozen different sets of assumptions, and turn them all loose trying to figure out optimal investment strategies in hundreds or thousands of ex ante scenarios drawn from real life? Then we could come back with the ex post performance numbers from those scenarios and figure out how well each of the robot portfolio managers would have done.
Clearly, the robot whose portfolio did the best must, ipso facto, have been the one whose predictions were most useful on the scenarios presented.
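Here is a toy version of that comparison, just to make the bookkeeping concrete. Everything in it is hypothetical: the history of bets, the three robots, and the simplification that each robot’s “set of assumptions” is a single assumed win probability. Each robot stakes the two-outcome Kelly fraction implied by its own assumption, and is then scored on what actually happened.

```python
import math

# Hypothetical history: (net odds offered, whether the bet actually won).
HISTORY = [(2.0, True), (2.0, False), (1.5, True), (3.0, False),
           (2.0, True), (1.0, False), (2.5, True), (2.0, False)]

# Hypothetical robots, differing only in the win probability they assume.
ROBOTS = {"pessimist": 0.40, "neutral": 0.50, "optimist": 0.70}

def stake(p_win, net_odds):
    """Two-outcome Kelly fraction under the robot's assumption, never negative."""
    return max(0.0, p_win - (1.0 - p_win) / net_odds)

def realized_log_growth(p_win, history=HISTORY):
    """Ex post score: total log growth of wealth over the realized outcomes."""
    total = 0.0
    for net_odds, won in history:
        f = stake(p_win, net_odds)
        total += math.log(1.0 + f * net_odds) if won else math.log(1.0 - f)
    return total

def rank_robots(robots=ROBOTS):
    """Robots sorted from best to worst realized log growth."""
    scores = {name: realized_log_growth(p) for name, p in robots.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    for name, total in rank_robots():
        print(f"{name:10s} {total:+.4f}")
```

Scoring on realized log growth keeps the ex post yardstick the same as the one the robots were optimizing ex ante.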
Now, what if we do it iteratively, performing a cluster analysis on the ex-ante information and ex-post performance pairs to find out which set of assumptions tends to do best in what clusters? As a cluster analysis on regression criteria, it’s going to be an expensive computation; it could tie up a good workstation for several weeks. But the results would continue to be useful for years.
But anyway, that is a topic for a different paper. In this paper I have introduced the Kelly Criterion itself and how to apply it in complicated situations. This creates a need to make simplifying assumptions, but they don’t have to be the same assumptions that the authors of so many other papers have made. My point is that you have to pick which simplifying assumptions you make based on the business situations you’re presented with, and you can frequently do better in some situations by using different sets of assumptions more appropriate to those situations.
But picking the assumptions is not, as usually presented, a problem that completely defies analysis either. There is a possibility of “closing the loop” and creating a system that picks and chooses its assumptions without human input, based on the business situations presented.