Machine Learning for Trading

A comprehensive introduction to how ML can add value to the design and execution of algorithmic trading strategies

View the Project on GitHub stefan-jansen/machine-learning-for-trading

Chapter 23 - Next Steps

In this concluding chapter, we briefly summarize the key tools, applications, and lessons learned throughout the book to avoid losing sight of the big picture after so much detail. We then identify areas that we did not cover but that would be worthwhile to focus on as you expand on the many machine learning techniques we introduced and become productive in their daily use.

Content

  1. Key Takeaways and Lessons Learned
  2. Machine Learning for Trading in Practice

Key Takeaways and Lessons Learned

Important insights to keep in mind as you proceed to the practice of machine learning for trading include:

Data is the single most important ingredient

A key insight is that state-of-the-art ML techniques like deep neural networks are successful because their predictive performance continues to improve with more data. On the flip side, model and data complexity need to match to balance the bias-variance trade-off, which becomes more challenging the higher the noise-to-signal ratio of the data. Managing data quality and integrating data sets are key steps in realizing the potential value.

Domain expertise: separate the signal from the noise

We emphasized that informative data is a necessary condition for successful ML applications. However, domain expertise is equally essential to define the strategic direction, select relevant data, engineer informative features, and design robust models.

ML is a toolkit for solving problems with data

Machine learning offers algorithmic solutions and techniques that can be applied to many use cases. Parts 2, 3, and 4 of the book have presented machine learning as a diverse set of tools that can add value to various steps of the strategy process.

Beware of backtest overfitting

Throughout the book, we repeatedly covered the risk of false discoveries due to overfitting to historical data. Chapter 5, on strategy evaluation, lays out the main drivers and potential remedies. The low signal-to-noise ratio and relatively small datasets (compared to web-scale image or text data) make this challenge particularly serious in the trading domain. Awareness is critical because easy access to data and ML tools increases the risk significantly.

There are no easy answers because the risks are inevitable. However, we presented methods to adjust backtest metrics to account for repeated trials, such as the deflated Sharpe ratio. When working towards a live trading strategy, staged paper trading and close monitoring of performance during execution in the market need to be part of the implementation process.
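To make the adjustment for repeated trials concrete, the following is a minimal sketch of the deflated Sharpe ratio as formulated by Bailey and López de Prado, using only the Python standard library. It is an illustration under stated assumptions rather than the book's implementation: the function and parameter names are hypothetical, the Sharpe ratio is expected in non-annualized, per-period terms, `sharpe_variance` is the variance of Sharpe ratios across the trials, and `kurtosis` is the raw (not excess) kurtosis, so normally distributed returns have a value of 3.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def expected_max_sharpe(n_trials: int, sharpe_variance: float) -> float:
    """Approximate the expected maximum Sharpe ratio among n_trials
    strategies that have no skill (true Sharpe ratio of zero)."""
    if n_trials < 2:
        raise ValueError("n_trials must be at least 2")
    norm = NormalDist()
    return math.sqrt(sharpe_variance) * (
        (1 - EULER_GAMMA) * norm.inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * norm.inv_cdf(1 - 1 / (n_trials * math.e))
    )


def deflated_sharpe_ratio(
    sharpe: float,           # observed per-period Sharpe ratio of the selected strategy
    n_obs: int,              # number of return observations in the backtest
    n_trials: int,           # number of strategy variants that were tried
    sharpe_variance: float,  # variance of the Sharpe ratios across trials
    skew: float = 0.0,       # skewness of the strategy returns
    kurtosis: float = 3.0,   # raw kurtosis of the strategy returns
) -> float:
    """Probability that the true Sharpe ratio is positive after accounting
    for selection among n_trials and for non-normal returns."""
    benchmark = expected_max_sharpe(n_trials, sharpe_variance)
    # Standard error term of the Sharpe ratio estimate under non-normality.
    denominator = math.sqrt(1 - skew * sharpe + (kurtosis - 1) / 4 * sharpe**2)
    z = (sharpe - benchmark) * math.sqrt(n_obs - 1) / denominator
    return NormalDist().cdf(z)


# The same observed Sharpe ratio is less convincing after more trials.
dsr_few_trials = deflated_sharpe_ratio(0.1, 1250, n_trials=10, sharpe_variance=0.01)
dsr_many_trials = deflated_sharpe_ratio(0.1, 1250, n_trials=100, sharpe_variance=0.01)
```

The benchmark that the observed Sharpe ratio has to beat rises with the number of trials, which is why `dsr_many_trials` comes out lower than `dsr_few_trials` for an otherwise identical backtest.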

How to gain insights from black-box models

Deep neural networks and complex ensembles can raise suspicion when they are considered impenetrable black-box models, in particular in light of the risks of backtest overfitting. We introduced several methods to gain insights into how these models make predictions in Chapter 12, Boosting Your Trading Strategy.

In addition to conventional measures of feature importance, the recent game-theoretic innovation of SHapley Additive exPlanations (SHAP) is a significant step towards understanding the mechanics of complex models. SHAP values allow for the exact attribution of features and their values to predictions so that it becomes easier to validate the logic of a model in the light of specific theories about market behavior for a given investment target. Besides justification, exact feature importance scores and attribution of predictions allow for deeper insights into the drivers of the investment outcome of interest.
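The additive attribution that SHAP builds on can be illustrated with a brute-force computation of exact Shapley values for a single prediction. This is a sketch under simplifying assumptions and does not use the `shap` library: `exact_shapley_values` and `toy_model` are hypothetical names, features that are absent from a coalition are replaced by a single baseline vector rather than averaged over background data, and the enumeration over all coalitions is exponential in the number of features, so it only works for a handful of inputs.

```python
from itertools import combinations
from math import factorial


def exact_shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction by enumerating all coalitions.

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)

    def value(coalition):
        # Coalition members take their actual value, all others the baseline.
        inputs = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(inputs)

    shapley = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                # Marginal contribution of feature i when it joins the coalition.
                phi += weight * (value(members | {i}) - value(members))
        shapley.append(phi)
    return shapley


def toy_model(features):
    """Hypothetical return forecast with a momentum-volatility interaction."""
    momentum, volatility, value_factor = features
    return (
        0.5 * momentum
        - 0.2 * volatility
        + 0.3 * momentum * volatility
        + 0.1 * value_factor
    )


x = [1.0, 2.0, -1.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley_values(toy_model, x, baseline)

# Additivity: the attributions sum to the prediction minus the baseline prediction.
gap = toy_model(x) - toy_model(baseline)
```

In this toy case, the interaction term is split evenly between momentum and volatility, and the three attributions add up exactly to `gap`. That additivity is what makes it possible to check each feature's contribution against a specific theory of market behavior.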

Machine Learning for Trading in Practice

As you proceed to integrate the numerous tools and techniques into your investment and trading process, there are many areas where you can focus your efforts. If your goal is to make better decisions, you should select projects that are realistic yet ambitious given your current skill set. This will help you to develop an efficient workflow underpinned by productive tools and gain practical experience.

Data management technologies

The central role of data in the ML4T process requires familiarity with a range of technologies to store, transform, and analyze data at scale, including the use of cloud-based services like Amazon Web Services, Microsoft Azure, and Google Cloud.

Machine learning tools

We covered many libraries of the Python ecosystem in this book. Python has evolved to become the language of choice for data science and machine learning. The set of open-source libraries continues to both diversify and mature, and it is built on the robust core of the scientific computing libraries NumPy and SciPy.

There are several providers that aim to facilitate the machine learning workflow:

There are also several open-source initiatives led by companies that build on and expand the Python ecosystem:

Online trading platforms

The main options to develop trading strategies that use machine learning are online platforms, which often look for and allocate capital to successful trading strategies.

Popular solutions include

In addition, Interactive Brokers offers a Python API that you can use to develop your own trading solution.

Alpaca offers commission-free execution of algorithmic trading strategies. Several libraries provide integration:

Backtrader is intended for both backtesting and trading with multiple broker integrations.