HOW CAN THE PERFORMANCE & RISK OF EARLYBIRD BE MEASURED?
There are many ways to measure the performance of a system. Much of how a person sees this depends on his or her risk philosophy. One common way is to look at the profits generated in relation to the maximum drawdown, because it is this maximum which should at least partly determine your capitalization. If one wants no more than a potential 50% drawdown, then one way to capitalize is at 2 times the maximum drawdown. This would require an account of around $30,000 per S&P contract (approx. $6,000 for the e-mini). This method of capitalization assumes a
trader starts trading the system at the worst possible moment, immediately
incurring a maximum historical drawdown. Considering where one is in
the equity cycle is also helpful. If you are starting during a drawdown, then
it is possible to capitalize at a lower amount. Historic performance could
then be measured in relation to this capitalization.
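The capitalization rule above is simple arithmetic, and a minimal Python sketch may make it concrete. The function name and the $15,000 and $3,000 drawdown figures are assumptions: the drawdowns are only inferred from the $30,000 and $6,000 accounts quoted above at 2 times drawdown, not stated directly.

```python
def required_capital(max_drawdown: float, tolerated_fraction: float) -> float:
    """Account size at which max_drawdown equals tolerated_fraction of the account."""
    if not 0 < tolerated_fraction <= 1:
        raise ValueError("tolerated_fraction must be in (0, 1]")
    return max_drawdown / tolerated_fraction


# Assumed drawdowns, inferred from the account sizes quoted in the text.
sp_capital = required_capital(15_000, 0.50)     # full-size S&P contract
emini_capital = required_capital(3_000, 0.50)   # e-mini contract
print(sp_capital, emini_capital)
```

A trader willing to tolerate a smaller drawdown would pass a smaller fraction and get a proportionally larger account size.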
Another consideration regarding risk is the amount of time one is
exposed to uncertainty by being in a trade. The less time one is in the
market, the less chance there is that an adverse event will occur while a
position is open. EarlyBird is in the market approximately 10% of the total
time the S&P day session is open, a small amount of time considering the
reward potential.
Certainly, when estimating the health and expected profitability
of a system, it is most telling to compare its real-time* out-of-sample
performance to hypothetical in-sample results. It is an easy matter to
come up with a purely hypothetical track record that looks good on paper,
but the challenge, of course, is to do well in real-time. As mentioned on
the home page, since EarlyBird was developed and first traded in April 1999,
monthly hypothetical profit has been over 50% of that of the study period (about $1670), with a drawdown about 3 times as great. This compares reasonably well with the study period.
Other methods of measuring performance and risk involve statistical analysis. One measure is a number called the K-ratio, developed by Lars Kestner and discussed in his article in the March 1996 issue of Technical Analysis of Stocks & Commodities (pgs. 46-50). He designed it to answer the question, "How will the system perform in the future," and to remedy flaws in the popular Sharpe Ratio. Kestner states that "The K-ratio detects inconsistency in returns." Essentially, the ratio quantifies the "swinginess" or volatility of the equity curve in relation to a regression line: "The K-ratio uses linear regression techniques to measure the consistency of results through time." It is therefore a measure of return compared to risk. The higher the K-ratio, the better the reward for the risk. Kestner states that typical values of the ratio fall between -5 and +5, and that he looks "for systems with an average K-ratio of 1.0 or better for individual commodities..." The K-ratio of EarlyBird's real-time performance is currently a very respectable 2.7.
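To illustrate the idea of measuring an equity curve against its regression line, here is a rough Python sketch. It assumes one commonly cited form of the ratio: the slope of the regression line divided by the standard error of that slope times the square root of the number of observations. Other published variants normalize differently, so this is not necessarily the calculation behind the 2.7 figure above. The function name and sample curves are hypothetical.

```python
import math


def k_ratio(equity: list[float]) -> float:
    """Regression slope of a cumulative equity curve over time, divided by
    (standard error of the slope * sqrt(n)). The normalization is an assumption."""
    n = len(equity)
    if n < 3:
        raise ValueError("need at least 3 equity points")
    xs = range(n)
    x_mean = (n - 1) / 2
    y_mean = sum(equity) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, equity))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Sum of squared residuals around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, equity))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    if std_err == 0:
        raise ValueError("equity curve is perfectly linear; ratio is undefined")
    return slope / (std_err * math.sqrt(n))


# Hypothetical curves with the same start and end points.
smooth = [0, 1.1, 1.9, 3.0, 4.1, 5.0, 5.9, 7.0]
swingy = [0, 3, 1, 4, 2, 6, 3, 7]
print(k_ratio(smooth), k_ratio(swingy))
```

Both curves gain the same amount overall, but the smoother one scores higher because its points sit closer to the regression line.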
Another method of looking at risk is the "risk of ruin" calculation detailed by Perry Kaufman in Smarter Trading (from Ralph Vince). This formula looks at average win/loss and percent win/loss to determine the chances that a certain loss ("ruin") will be incurred. Assuming a $30,000 account and using $200 slippage/commission for the period of real-time S&P results yields the following:

Drawdown                   Chance of this drawdown occurring
$3,000 (10% of account)    47%
$18,000 (60%)              1%

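For readers who want a feel for how a risk-of-ruin figure behaves, the following Python sketch uses the classic gambler's-ruin simplification, which assumes every win and loss is the same size. That is a deliberate simplification and not the formula used for the figures above, which also accounts for average win and loss size. The function name and inputs are hypothetical.

```python
def risk_of_ruin(win_probability: float, units: int) -> float:
    """Chance of eventually losing `units` equal-sized bets (gambler's ruin):
    ((1 - A) / (1 + A)) ** units, where A = P(win) - P(loss) is the edge.
    Assumes equal-sized wins and losses, unlike the calculation in the text."""
    if not 0 < win_probability < 1:
        raise ValueError("win_probability must be between 0 and 1")
    edge = win_probability - (1 - win_probability)
    if edge <= 0:
        return 1.0  # with no edge, ruin is eventually certain
    return ((1 - edge) / (1 + edge)) ** units


# Hypothetical inputs: a 60% win rate and a loss limit of 5 or 10 units.
print(risk_of_ruin(0.60, 5), risk_of_ruin(0.60, 10))
```

The chance of reaching a given loss falls quickly as the loss gets larger relative to the size of a single trade, which is the same pattern the table shows.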
A final method has to do with an analysis of losing trades, and of losing trade runs or drawdowns. I am not trained in statistics, but one method which a statistician taught me, and which makes sense, is to make several lists of losing trades: first single losers, then drawdowns over 2 trades, 3 trades, and so on. Each list is analyzed for its average loser and standard deviation. Then the standard deviation of each list is doubled and added to its average loser. If the losses are roughly normally distributed, this gives a number that statistically should encompass about 95% of the occurrences (2 standard deviations); put another way, we have a 95% confidence level that this number will not be surpassed. Also included are the 3 standard deviation numbers, which give a confidence level of roughly 99.7%. This approach overcomes the limitation of other measures (e.g. Sterling, Sharpe, and K-ratio) which include drawups, rather than just drawdowns; these other methods penalize a system for having sudden upswings in equity. The current results (using $200 slippage/commission, and again using only "real-time" S&P trades) are as follows:


# of trades in drawdown run    Avg. loss    2 std dev. drawdown    3 std dev. drawdown
These numbers indicate that it is reasonable to expect that a drawdown would rarely exceed $13,463 on closed trades (5% chance). (Taking all drawdowns from 6 to 28 trades long, the average loss is $3,619, the 2 sd figure is $10,721, and the 3 sd figure is $14,272.) They are also very close to the results of the Kaufman numbers (5% chance of a $12,000 drawdown on a $30,000 account).
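The average-plus-standard-deviations calculation can be sketched in a few lines of Python. The function names and sample losses are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which form was used. The standard deviation of $3,551 is not stated in the text; it is implied by the average and the 2 sd figure quoted for drawdowns of 6 to 28 trades.

```python
from statistics import mean, stdev


def bound(avg_loss: float, sd: float, k: float) -> float:
    """Average loss plus k standard deviations."""
    return avg_loss + k * sd


def analyze_runs(runs: dict[int, list[float]]) -> dict[int, tuple[float, float, float]]:
    """For each run length, return (average loss, 2 sd bound, 3 sd bound).
    Each list needs at least two losses; sample standard deviation is assumed."""
    results = {}
    for length, losses in sorted(runs.items()):
        avg = mean(losses)
        sd = stdev(losses)
        results[length] = (avg, bound(avg, sd, 2), bound(avg, sd, 3))
    return results


# Check against the aggregate figures quoted in the text (sd of 3551 is implied).
print(bound(3619, 3551, 2), bound(3619, 3551, 3))

# Hypothetical losses grouped by the number of trades in the drawdown run.
sample_runs = {1: [400.0, 650.0, 900.0, 1200.0], 2: [1500.0, 2100.0, 2600.0]}
for length, (avg, two_sd, three_sd) in analyze_runs(sample_runs).items():
    print(length, round(avg), round(two_sd), round(three_sd))
```

With an average of 3619 and an implied standard deviation of 3551, the two bounds work out to the 10721 and 14272 quoted above.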
Again, please note that the above measures are all based upon the out-of-sample real-time* performance period, a fact which makes them much more reliable than measures based on in-sample performance.
Statistics make no guarantee, but they are suggestive. With a reasonable idea of drawdown, it is possible to determine capitalization, and then return/risk.