Machine Learning for Trading


A comprehensive introduction to how ML can add value to the design and execution of algorithmic trading strategies

View the Project on GitHub stefan-jansen/machine-learning-for-trading

Efficient data storage with pandas

The notebook storage_benchmark compares the main storage formats for efficiency and performance.

In particular, it compares:

It uses a test DataFrame that can be configured to contain numerical data, text data, or both. For the HDF5 library, we test both the fixed and table formats. Unlike the fixed format, the table format supports queries and can be appended to.
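As a rough illustration of the setup described above, the following sketch builds a configurable test DataFrame and writes it to HDF5 in both formats. The helper names (`make_test_df`, `hdf_fixed_vs_table`), the column names, and the default sizes are hypothetical and are not taken from the notebook. The HDF5 helper assumes the optional PyTables dependency is installed.

```python
import numpy as np
import pandas as pd


def make_test_df(n_rows=1000, n_num=2, n_txt=2, str_len=8, seed=42):
    """Build a DataFrame with n_num float columns and n_txt string columns.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    data = {}
    # Numerical columns: standard normal floats
    for i in range(n_num):
        data[f"num_{i}"] = rng.standard_normal(n_rows)
    # Text columns: random lowercase strings of fixed length
    letters = np.array(list("abcdefghijklmnopqrstuvwxyz"))
    for i in range(n_txt):
        chars = rng.choice(letters, size=(n_rows, str_len))
        data[f"txt_{i}"] = ["".join(row) for row in chars]
    return pd.DataFrame(data)


def hdf_fixed_vs_table(df, path):
    """Write df in both HDF5 formats, then append to and query the table.

    Requires PyTables. Assumes df has a column named 'num_0'.
    """
    # Fixed format: no appends and no `where` queries
    df.to_hdf(path, key="fixed", format="fixed", mode="w")
    # Table format: data_columns=True makes the columns queryable
    df.to_hdf(path, key="table", format="table", data_columns=True)
    # Appending is only possible in table format
    df.to_hdf(path, key="table", format="table", data_columns=True, append=True)
    # Select only the rows that satisfy the condition
    return pd.read_hdf(path, key="table", where="num_0 > 0")
```

Setting `n_txt=0` or `n_num=0` yields a purely numerical or purely text frame, so the same helper covers all three configurations.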

Test Results

In short, the results are:

The notebook illustrates how to configure, test, and collect the timings using the %%timeit cell magic. It also demonstrates the pandas commands required to use these storage formats.
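Outside a notebook, a similar measurement can be sketched with the standard-library `timeit` module, which the `%%timeit` magic builds on. The example below is a minimal sketch rather than the notebook's code. It uses CSV only because it needs no extra dependency, and the `time_call` helper, the frame size, and the result keys are illustrative assumptions.

```python
import os
import tempfile
import timeit

import numpy as np
import pandas as pd

# Small numerical test frame (size chosen arbitrarily for illustration)
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.standard_normal((10_000, 5)),
    columns=[f"num_{i}" for i in range(5)],
)


def time_call(func, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` single runs."""
    return min(timeit.repeat(func, number=1, repeat=repeat))


results = {}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.csv")
    # Time the write first so the file exists for the read test
    results["csv_write_s"] = time_call(lambda: df.to_csv(path, index=False))
    results["csv_read_s"] = time_call(lambda: pd.read_csv(path))
    # File size on disk as a simple measure of storage efficiency
    results["csv_size_mb"] = os.path.getsize(path) / 1e6

# Collect the measurements in one Series for comparison across formats
print(pd.Series(results))
```

Repeating the same three measurements for each format, and collecting the results in one DataFrame, gives a side-by-side comparison. The absolute numbers depend on hardware, data size, and library versions.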