For decades, people have run the experiment of throwing darts at a dartboard to pick stocks and seeing whether the darts outperform the picks of experts. The darts win and lose at about the same rate as the average expert.

Some people clearly make money in financial markets, but the old saying among brokers, 'we make money selling stocks, not buying them', still applies.

Yet someone must know what they are doing. And backtesting is a good way to seem empirical. Example: Your financial advisor calls you up to suggest a new
investment scheme. Drawing on 20 years of data, he has set his
computer to work on this question: If you had invested according to
this scheme in the past, which portfolio would have been the best?
His model assembles thousands of such simulated portfolios and
calculates for each one an industry-standard measure of return on
risk. Out of this gargantuan calculation, your advisor has chosen the
optimal portfolio. After briefly reminding you of the oft-repeated
slogan that "past performance is not an indicator of future results",
the advisor enthusiastically recommends the portfolio, noting that it
is based on sound mathematical methods. Should you invest?

If that isn't a metaphor for any numerical model-driven projection, it's hard to know what is.

The answer is, probably not. This backtesting - examining a huge
number of sample past portfolios - might seem
like a good way to zero in on the best future portfolio, but if the
number of portfolios in the backtest is so large as to be out of
balance with the number of years of data in the backtest, the
portfolios that look best are actually just those that target extremes
in the dataset. When an investment strategy "overfits" a backtest in
this way, the strategy is not capitalizing on any general financial
structure but is simply highlighting vagaries in the data.

"Recent computational advances allow investment managers to
methodically search through thousands or even millions of potential
options for a profitable investment strategy," write the authors of a forthcoming article in the Notices of the American Mathematical Society. "In
many instances, that search involves a pseudo-mathematical argument
which is spuriously validated through a backtest."

Unfortunately, the overfitting of backtests is commonplace not only in
the offerings of financial advisors but also in research papers in
mathematical finance. One way to lessen the problems of backtest
overfitting is to test how well the investment strategy performs on
data outside of the original dataset on which the strategy is based;
this is called "out-of-sample" testing. However, few investment
companies and researchers do out-of-sample testing.
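
To make the out-of-sample idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' procedure: the 200 "strategies" are pure noise with no skill at all, the 4% monthly volatility is arbitrary, and `sharpe` and `select_and_validate` are hypothetical helper names. The first 60 months play the role of the backtest; the last 60 months are held back and only used to score the winner.

```python
import math
import random
import statistics

def sharpe(returns, periods_per_year=12):
    """Annualized mean-over-volatility score (risk-free rate assumed zero)."""
    return statistics.mean(returns) / statistics.stdev(returns) * math.sqrt(periods_per_year)

def select_and_validate(strategies, split):
    """Pick the strategy with the best in-sample score, then score it out of sample.

    strategies: list of return series of equal length
    split: index separating in-sample data from held-out data
    """
    best = max(range(len(strategies)), key=lambda i: sharpe(strategies[i][:split]))
    in_sample = sharpe(strategies[best][:split])
    out_of_sample = sharpe(strategies[best][split:])
    return best, in_sample, out_of_sample

# Hypothetical data: 200 strategies, 120 months each, zero true edge.
rng = random.Random(0)
strategies = [[rng.gauss(0.0, 0.04) for _ in range(120)] for _ in range(200)]

best, is_sr, oos_sr = select_and_validate(strategies, split=60)
print(f"strategy {best}: in-sample {is_sr:.2f}, out-of-sample {oos_sr:.2f}")
```

Because none of the simulated strategies has any real edge, whatever the winner shows in sample is luck, and the held-out months are the only place that becomes visible.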

The design of an investment strategy usually starts with identifying a
pattern that one believes will help to predict the future value of a
financial variable. The next step is to construct a mathematical
model of how that variable could change over time. The number of ways
of configuring the model is enormous, and the aim is to identify the
model configuration that maximizes the performance of the investment
strategy. To do this, practitioners often backtest the model using
historical data on the financial variable in question.

They also rely
on measures such as the "Sharpe ratio", which evaluates the
performance of a strategy on the basis of a sample of past returns.
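
For readers who have not met it, the Sharpe ratio is commonly computed as the average excess return divided by the standard deviation of returns, then annualized. The sketch below assumes monthly returns, a constant per-period risk-free rate, and the usual square-root-of-time annualization (which itself assumes independent returns); the function name and the sample numbers are made up for illustration.

```python
import math
import statistics

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=12):
    """Annualized Sharpe ratio from a sample of per-period returns.

    risk_free is the per-period risk-free rate (assumed constant here).
    """
    excess = [r - risk_free for r in returns]
    mean = statistics.mean(excess)
    stdev = statistics.stdev(excess)  # sample standard deviation
    if stdev == 0:
        raise ValueError("returns have zero variance; Sharpe ratio undefined")
    return mean / stdev * math.sqrt(periods_per_year)

# Hypothetical monthly returns, purely for illustration.
monthly = [0.02, -0.01, 0.03, 0.01, -0.02, 0.015]
print(round(sharpe_ratio(monthly), 2))
```

Note that the ratio is estimated from a finite sample, so it carries sampling error of its own; that error is what a large search over configurations ends up exploiting.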

But if a large number of backtests are performed, one can end up
zeroing in on a model configuration that has a misleadingly good
Sharpe ratio. As an example, the authors note that, for a model based
on 5 years of data, one can be misled by looking at even as few as 45
sample configurations. Within that set of 45 configurations, at least
one of them is almost guaranteed to stand out with a good Sharpe ratio for
the 5-year dataset while having a dismal Sharpe ratio for
out-of-sample data.
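
The 45-configurations-on-5-years example can be explored with a small Monte Carlo experiment. This is a sketch under simplifying assumptions rather than the authors' derivation: every configuration is modeled as independent Gaussian noise with zero true Sharpe ratio, returns are monthly (60 months in sample, 60 held out), and the function names are hypothetical. Under those assumptions one would expect the winning in-sample Sharpe ratio to average roughly 1, while the same winner averages close to zero out of sample.

```python
import math
import random

def annualized_sharpe(returns, periods_per_year=12):
    """Annualized Sharpe ratio with a zero risk-free rate (an assumption)."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def best_of_n(n_configs, n_in, n_out, rng):
    """Simulate n_configs skill-free strategies and return the
    (in-sample, out-of-sample) Sharpe ratios of the in-sample winner."""
    best_is, best_oos = -math.inf, 0.0
    for _ in range(n_configs):
        series = [rng.gauss(0.0, 0.04) for _ in range(n_in + n_out)]
        sr_is = annualized_sharpe(series[:n_in])
        if sr_is > best_is:
            best_is = sr_is
            best_oos = annualized_sharpe(series[n_in:])
    return best_is, best_oos

rng = random.Random(42)
trials = [best_of_n(45, 60, 60, rng) for _ in range(300)]
avg_is = sum(t[0] for t in trials) / len(trials)
avg_oos = sum(t[1] for t in trials) / len(trials)
print(f"average in-sample Sharpe of the winner: {avg_is:.2f}")
print(f"average out-of-sample Sharpe of the winner: {avg_oos:.2f}")
```

Raising `n_configs` while keeping the data length fixed pushes the winning in-sample figure higher without changing the out-of-sample average, which is the imbalance between trials and years of data described above.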

The authors note that, when a backtest does not report the number of
configurations that were computed in order to identify the selected
configuration, it is impossible to assess the risk of overfitting the
backtest. And yet, the number of model configurations used in a
backtest is very often not revealed, neither in academic papers on
finance nor by companies selling financial products.

"[W]e suspect
that a large proportion of backtests published in academic journals
may be misleading," the authors write. "The situation is not likely
to be better among practitioners. In our experience, overfitting is
pathological within the financial industry." Later in the article
they state: "We strongly suspect that such backtest overfitting is a
large part of the reason why so many algorithmic or systematic hedge
funds do not live up to the elevated expectations generated by their
managers."

Probably many fund managers unwittingly engage in backtest overfitting
without understanding what they are doing, and their lack of knowledge
leads them to overstate the promise of their offerings. Whether this
is fraudulent is not so clear. What is clear is that mathematical
scientists can do much to expose these problematic practices, and
this is why the authors wrote their article.

"[M]athematicians in the
twenty-first century have remained disappointingly silent with regard
to those in the investment community who, knowingly or not, misuse
mathematical techniques such as probability theory, statistics, and
stochastic calculus," they write. "Our silence is consent, making us
accomplices in these abuses."

Article: "Pseudo-Mathematics and Financial Charlatanism: The Effects of Backtest Overfitting on Out-of-Sample Performance", which will appear in the May 2014 issue of the Notices of the American Mathematical Society. The authors are David H. Bailey, Jonathan M. Borwein, Marcos Lopez de Prado, and Qiji Jim Zhu.

Source: American Mathematical Society