Machine Learning in Trading – Down the rabbit hole

Blog entry by newly appointed Head of Analysis, Kevin Wong

In recent years, machine learning has attracted growing attention. As seen in the graph, interest really took off in the summer of 2016.

At Lind Capital, we see ourselves as open-minded, and therefore we had a natural curiosity to find out whether there was anything useful to us within this field.

Operating within the financial markets means operating in a very challenging environment. Lind Capital does not have external clients; we trade our own capital. We find small inefficiencies and trade them frequently with a high degree of leverage in order to generate returns that are both high and stable. This places high demands on our systems, analysis and traders.

Technology serves as the basis for our trades. We develop our own models and are very data-driven. Therefore, it is only natural to us that we continuously extend our expertise and competencies to new technologies and techniques, such as machine learning.

So what is this machine learning thing? To quote an expert in machine learning:

“Machine learning is the science of getting computers to act without being explicitly programmed” – Andrew Ng

The general definition of the field is quite broad and encompasses many other areas that are not directly related to trading.

To narrow it down, we define it as:

“The study of how to find a function, or a relationship, that maps from input X to output Y.”

In this way, it is similar to applying statistical and econometric models to forecast future price movements in financial markets, and therefore one can think of machine learning as expanding the toolbox to more exotic models.
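To make the definition concrete, here is a minimal Python sketch of "finding a function that maps X to Y" in its simplest form: an ordinary least-squares line fitted to synthetic data. The data, the true slope of 0.5 and the names fit_line and make_predictor are assumptions chosen purely for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y ~ a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    b = cov_xy / var_x
    a = mean_y - b * mean_x
    return a, b


def make_predictor(a, b):
    """Return the learned function f that maps an input x to a predicted y."""
    return lambda x: a + b * x


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data: a weak linear relationship buried in noise.
    xs = [rng.gauss(0.0, 1.0) for _ in range(500)]
    ys = [0.5 * x + rng.gauss(0.0, 1.0) for x in xs]
    a, b = fit_line(xs, ys)
    f = make_predictor(a, b)
    print(f"estimated f(x) = {a:.3f} + {b:.3f} * x, so f(1.0) = {f(1.0):.3f}")
```

More exotic models replace the straight line with a more flexible function, but the task of estimating f from observed (X, Y) pairs stays the same.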

One big difference between machine learning in trading and machine learning at places such as Google or Facebook is the amount of data and the quality of the signal in that data. For a given stock, one might have only around 5,000 daily observations. Combine that with a low signal-to-noise ratio and the job suddenly becomes quite challenging. Pictures of cats have little noise compared to financial markets.
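A rough way to see why 5,000 observations is not much: when the true correlation between a signal and the next move is close to zero, the standard error of the estimated correlation is roughly 1/sqrt(n), or about 0.014 for n = 5,000. The Python sketch below simulates this under stated assumptions: Gaussian data and a hypothetical true correlation of 0.02 (a figure invented for illustration, not taken from our data). All function names are illustrative.

```python
import math
import random


def sample_correlation(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def approx_standard_error(n):
    """Rough standard error of a sample correlation when the true value is near zero."""
    return 1.0 / math.sqrt(n)


def simulate(n, true_corr, seed):
    """Draw n Gaussian pairs with the given true correlation and estimate it."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    noise_scale = math.sqrt(1.0 - true_corr ** 2)
    ys = [true_corr * x + noise_scale * rng.gauss(0.0, 1.0) for x in xs]
    return sample_correlation(xs, ys)


if __name__ == "__main__":
    n, true_corr = 5000, 0.02  # hypothetical signal strength
    print(f"approx. standard error: {approx_standard_error(n):.4f}")
    for seed in range(5):
        print(f"seed {seed}: estimated correlation {simulate(n, true_corr, seed):+.4f}")
```

Under these assumptions the true signal sits only about 1.4 standard errors away from zero, so individual samples can easily make it look stronger than it is, absent, or even negative.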

We began our journey by constructing a benchmark against which to evaluate the machine learning models. On our first attempt, we simply used our trading signals as features in the model, turned a few knobs, and compared its predictions to our own.

The initial results were positive and piqued our interest. However, it was difficult to improve on them outside the training set.

It was time to read up.

We began by turning to the general machine learning literature for advice. Much of it was irrelevant to our specific problems. Only after sorting through the hype did we find guidance.

You see, most machine learning models assume a structured dataset with predefined labels. This is called supervised learning.

For the structured dataset approach, a rough stepwise process is as follows:

1. Define your features, a.k.a. the independent variables: the X’s, so to speak.
2. Define your label, a.k.a. the dependent variable, Y.
3. Choose your model, e.g. Random Forest, XGBoost, ANN etc.
4. Choose your objective function, e.g. mean squared error, accuracy or log-loss.
5. Calibrate your model. Use time-series cross-validation, ensuring no data leakage and minimal overlap between labels. In short, make sure the model is correctly calibrated and does not overfit the sample data.
6. The final test: verify the results on virgin data.
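Steps 5 and 6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of our setup: the most recent 20% of the data is reserved as the final test set, the rest is split into walk-forward folds, and a gap of 10 observations is dropped between each training and test window (as one might do if the label were a 10-day forward return). The function names, the fold count and the gap are all hypothetical.

```python
def split_off_holdout(n_samples, holdout_fraction):
    """Reserve the most recent observations as untouched final-test data."""
    n_holdout = int(n_samples * holdout_fraction)
    n_dev = n_samples - n_holdout
    return range(0, n_dev), range(n_dev, n_samples)


def walk_forward_splits(n_samples, n_splits, gap):
    """Yield (train_indices, test_indices) pairs in time order.

    Each test fold comes strictly after its training data, and `gap`
    observations between the two are dropped so that labels built from
    overlapping windows cannot leak from the training set into the test set.
    """
    fold_size = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        test_start = k * fold_size
        test_end = test_start + fold_size if k < n_splits else n_samples
        train_end = test_start - gap
        if train_end <= 0:
            continue  # not enough history left after removing the gap
        yield range(0, train_end), range(test_start, test_end)


if __name__ == "__main__":
    n_obs = 5000  # roughly the number of daily observations for one stock
    dev, holdout = split_off_holdout(n_obs, holdout_fraction=0.2)
    for train, test in walk_forward_splits(len(dev), n_splits=4, gap=10):
        print(f"train 0..{train[-1]}  test {test[0]}..{test[-1]}")
    print(f"final test on {holdout[0]}..{holdout[-1]}, touched only once")
```

The important property is the ordering: the model is never tuned on data that comes after, or overlaps with, the data it is evaluated on, and the final test set plays no part in tuning at all.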

Many articles enjoy discussing step 3. Deep Learning, XGBoost! Hype anyone?

Our experience tells us that if you want to have a genuinely working model, steps 1, 2, 4 and 5 must not be overlooked.

When the features are rubbish, the model will be garbage. If the label is inherently hard to predict, you are making life harder than it has to be. Remember to find a reason to believe in the dataset and leave wishful thinking to those who can afford it. Test and carefully select an objective function that reflects your expectations. If you are careless, you will waste time and you might let an otherwise solid model escape you.
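To illustrate why the choice of objective function matters, the Python sketch below scores two hypothetical sets of predicted probabilities against the same five made-up labels. Both get the same four out of five calls right, yet the three objective functions do not agree on which set is better. The labels, the probabilities and the helper names are assumptions for illustration only.

```python
import math


def accuracy(y_true, p_pred, threshold=0.5):
    """Share of observations where the thresholded prediction matches the label."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, p_pred))
    return hits / len(y_true)


def log_loss(y_true, p_pred, eps=1e-12):
    """Average negative log-likelihood; punishes confident mistakes heavily."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def mean_squared_error(y_true, p_pred):
    """Average squared distance between label and predicted probability."""
    return sum((y - p) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


if __name__ == "__main__":
    y = [1, 0, 1, 1, 0]  # made-up up/down labels
    cautious = [0.6, 0.4, 0.6, 0.4, 0.4]
    confident = [0.99, 0.01, 0.99, 0.01, 0.01]
    for name, p in (("cautious", cautious), ("confident", confident)):
        print(
            f"{name:>9}: accuracy={accuracy(y, p):.2f} "
            f"log-loss={log_loss(y, p):.3f} mse={mean_squared_error(y, p):.4f}"
        )
```

Accuracy cannot tell the two apart, log-loss prefers the cautious predictions because the confident set is badly wrong once, and mean squared error marginally prefers the confident set. Which ranking is "right" depends on what you expect the model to deliver.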

Calibrate your model while making sure not to commit common modeling mistakes such as data leakage and selection bias. Proper practice is fundamental to creating a truly successful machine learning model.
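Data leakage can be subtle. One common form, picked here purely as an illustration, is standardizing a feature with a mean and standard deviation computed over the whole sample, test period included. The Python sketch below contrasts that with fitting the scaler on the training window only. The trending series and the helper names are hypothetical.

```python
import statistics


def fit_scaler(values):
    """Estimate mean and standard deviation from the given observations only."""
    return statistics.fmean(values), statistics.pstdev(values)


def standardize(values, mean, std):
    """Rescale values with statistics that were estimated elsewhere."""
    return [(v - mean) / std for v in values]


if __name__ == "__main__":
    # Hypothetical trending feature: 100 observations in time order.
    series = [float(t) for t in range(100)]
    train, test = series[:80], series[80:]

    # Leaky: the scaler has seen the test period, so the training inputs
    # already encode information about the future level of the series.
    leaky_mean, leaky_std = fit_scaler(series)

    # Leak-free: fit on the training window only, then reuse on the test window.
    clean_mean, clean_std = fit_scaler(train)
    train_scaled = standardize(train, clean_mean, clean_std)
    test_scaled = standardize(test, clean_mean, clean_std)

    print(f"leaky mean {leaky_mean:.1f} vs train-only mean {clean_mean:.1f}")
```

The same discipline applies to anything estimated from data, such as feature selection or hyperparameters: fit it on the training window, then apply it unchanged to the test window.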

The steps are outlined sequentially, but in practice they are continuously revisited, analyzed, transformed and tuned. In truth, the process is anything but linear.

In the end, proper modeling in trading takes effort. To quote an industry expert:

“Finance is not a plug-and-play subject as it relates to ML applications. Anyone who tells you otherwise will waste your time and money.” – Marcos Lopez de Prado, Advances in Financial Machine Learning

At Lind Capital, we are always on the lookout for an edge. In every business area of the company, we strive to challenge the status quo and find new and better ways of doing business, whether in trading, compliance, technology or analysis. Staying ahead in the financial markets is a constant battle, and it requires a will to win in every part of Lind Capital.
