Machine Learning for Trading

A comprehensive introduction to how ML can add value to the design and execution of algorithmic trading strategies

View the Project on GitHub stefan-jansen/machine-learning-for-trading

How to process AlgoSeek intraday NASDAQ 100 data

You can download a sample of Algoseek’s NASDAQ100 Minute Bar data with Trade & Quote information for 2015-2017 from Algoseek’s website here. The notebook algoseek_minute_data contains the code to extract and combine the data that we will use in Chapter 12 to develop a Gradient Boosting model that predicts one-minute returns for an intraday trading strategy.

Unzip the directory, rename it 1min_taq, and move it into a new nasdaq100 folder in the data directory. It contains around 5GB worth of NASDAQ 100 minute bar data in trade-and-quote format. See documentation for details on the definition of the numerous fields. The following information is from the Algoseek Trade & Quote Minute Bar data linked above.

Trade & Quote Minute Bar Fields

The Quote fields are based on changes to the NBBO (National Best Bid and Offer), derived from the top-of-book price and size at each of the exchanges.

The enhanced Trade & Quote bars include the following fields:

id Field Q/T Type No Value Description
1 Date   YYYYMMDD Never Trade Date
2 Ticker   String Never Ticker Symbol
3 TimeBarStart   HHMM, HHMMSS or HHMMSSMMM Never For minute bars: HHMM. For second bars: HHMMSS. Examples: one second bar 130302 is from time greater than 130301 to 130302; one minute bar 1104 is from time greater than 1103 to 1104.
4 OpenBarTime Q HHMMSSMMM Never Open Time of the Bar, for example one minute: 11:03:00.000
5 OpenBidPrice Q Number Never NBBO Bid Price as of bar Open
6 OpenBidSize Q Number Never Total Size from all Exchanges with OpenBidPrice
7 OpenAskPrice Q Number Never NBBO Ask Price as of bar Open
8 OpenAskSize Q Number Never Total Size from all Exchanges with OpenAskPrice
9 FirstTradeTime T HHMMSSMMM Blank Time of first Trade
10 FirstTradePrice T Number Blank Price of first Trade
11 FirstTradeSize T Number Blank Number of shares of first trade
12 HighBidTime Q HHMMSSMMM Never Time of highest NBBO Bid Price
13 HighBidPrice Q Number Never Highest NBBO Bid Price
14 HighBidSize Q Number Never Total Size from all Exchanges with HighBidPrice
15 AskPriceAtHighBidPrice Q Number Never Ask Price at time of Highest Bid Price
16 AskSizeAtHighBidPrice Q Number Never Total Size from all Exchanges with AskPriceAtHighBidPrice
17 HighTradeTime T HHMMSSMMM Blank Time of Highest Trade
18 HighTradePrice T Number Blank Price of highest Trade
19 HighTradeSize T Number Blank Number of shares of highest trade
20 LowBidTime Q HHMMSSMMM Never Time of lowest Bid
21 LowBidPrice Q Number Never Lowest NBBO Bid price of bar.
22 LowBidSize Q Number Never Total Size from all Exchanges with LowBidPrice
23 AskPriceAtLowBidPrice Q Number Never Ask Price at lowest Bid price
24 AskSizeAtLowBidPrice Q Number Never Total Size from all Exchanges with AskPriceAtLowBidPrice
25 LowTradeTime T HHMMSSMMM Blank Time of lowest Trade
26 LowTradePrice T Number Blank Price of lowest Trade
27 LowTradeSize T Number Blank Number of shares of lowest trade
28 CloseBarTime Q HHMMSSMMM Never Close Time of the Bar, for example one minute: 11:03:59.999
29 CloseBidPrice Q Number Never NBBO Bid Price at bar Close
30 CloseBidSize Q Number Never Total Size from all Exchanges with CloseBidPrice
31 CloseAskPrice Q Number Never NBBO Ask Price at bar Close
32 CloseAskSize Q Number Never Total Size from all Exchanges with CloseAskPrice
33 LastTradeTime T HHMMSSMMM Blank Time of last Trade
34 LastTradePrice T Number Blank Price of last Trade
35 LastTradeSize T Number Blank Number of shares of last trade
36 MinSpread Q Number Never Minimum Bid-Ask spread size. This may be 0 if the market was crossed during the bar. If negative spread due to back quote, make it 0.
37 MaxSpread Q Number Never Maximum Bid-Ask spread in bar
38 CancelSize T Number 0 Total shares canceled. Default=blank
39 VolumeWeightPrice T Number Blank Trade Volume weighted average price: Sum((Trade1Shares × Trade1Price) + (Trade2Shares × Trade2Price) + …) / TotalShares. Note: Blank if no trades.
40 NBBOQuoteCount Q Number 0 Number of Bid and Ask NBBO quotes during bar period.
41 TradeAtBid Q,T Number 0 Sum of trade volume that occurred at or below the bid (a trade reported/printed late can be below current bid).
42 TradeAtBidMid Q,T Number 0 Sum of trade volume that occurred between the bid and the mid-point: (Trade Price > NBBO Bid) & (Trade Price < NBBO Mid)
43 TradeAtMid Q,T Number 0 Sum of trade volume that occurred at mid: Trade Price = NBBO Mid
44 TradeAtMidAsk Q,T Number 0 Sum of trade volume that occurred between the mid and ask: (Trade Price > NBBO Mid) & (Trade Price < NBBO Ask)
45 TradeAtAsk Q,T Number 0 Sum of trade volume that occurred at or above the Ask.
46 TradeAtCrossOrLocked Q,T Number 0 Sum of trade volume for bar when national best bid/offer is locked or crossed. Locked is Bid = Ask. Crossed is Bid > Ask.
47 Volume T Number 0 Total number of shares traded
48 TotalTrades T Number 0 Total number of trades
49 FinraVolume T Number 0 Number of shares traded that are reported by FINRA. Trades reported by FINRA are from broker-dealer internalization, dark pools, Over-The-Counter, etc. FINRA trades represent volume that is hidden or not publicly available to trade.
50 UptickVolume T Integer 0 Total number of shares traded with upticks during bar. An uptick = ( trade price > last trade price )
51 DowntickVolume T Integer 0 Total number of shares traded with downticks during bar. A downtick = ( trade price < last trade price )
52 RepeatUptickVolume T Integer 0 Total number of shares where trade price is the same (repeated) and last price change was up during bar. Repeat uptick = ( trade price == last trade price ) & ( last tick direction == up )
53 RepeatDowntickVolume T Integer 0 Total number of shares where trade price is the same (repeated) and last price change was down during bar. Repeat downtick = ( trade price == last trade price ) & ( last tick direction == down )
54 UnknownVolume T Integer 0 When the first trade of the day takes place, the tick direction is “unknown” as there is no previous Trade to compare it to. This field is the volume of the first trade after 4am and acts as an initiation value for the tick volume directions. In the future this field will be renamed to UnkownTickDirectionVolume.
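
The tick-direction fields (50 to 54) can be read as a simple running classification of each trade against the previous one. The following is a minimal sketch of that logic, not AlgoSeek's implementation: the function name tick_volumes, the (price, shares) trade tuples, and the handling of a repeated price before any direction is known are assumptions made for illustration.

```python
def tick_volumes(trades, last_price=None, last_direction=None):
    """Split traded shares into the five tick-direction buckets.

    trades: iterable of (price, shares) tuples in time order (hypothetical layout).
    last_price, last_direction: state carried over from earlier bars;
    None means there is no previous trade, e.g. the first trade of the day.
    """
    volumes = {
        "UptickVolume": 0,
        "DowntickVolume": 0,
        "RepeatUptickVolume": 0,
        "RepeatDowntickVolume": 0,
        "UnknownVolume": 0,
    }
    for price, shares in trades:
        if last_price is None:
            # No previous trade to compare with: direction is unknown.
            volumes["UnknownVolume"] += shares
        elif price > last_price:
            volumes["UptickVolume"] += shares
            last_direction = "up"
        elif price < last_price:
            volumes["DowntickVolume"] += shares
            last_direction = "down"
        elif last_direction == "up":
            volumes["RepeatUptickVolume"] += shares
        elif last_direction == "down":
            volumes["RepeatDowntickVolume"] += shares
        else:
            # Assumption: a repeated price with no established direction
            # is also counted as unknown.
            volumes["UnknownVolume"] += shares
        last_price = price
    return volumes


example = tick_volumes(
    [(10.00, 100), (10.01, 200), (10.01, 50), (10.00, 300), (10.00, 25)]
)
```

In this example the first trade lands in UnknownVolume, and the remaining four trades fall into the uptick, repeat-uptick, downtick, and repeat-downtick buckets in that order.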

Notes

Empty Fields

An empty field has no value and is “Blank”, for example FirstTradeTime when there are no trades during the bar period. The Volume field, which measures the total number of shares traded in the bar, will be 0 if there are no trades (see the No Value column above for each field).
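
To make the Blank versus 0 distinction concrete, here is a small sketch that summarizes the trades of one bar. It assumes a hypothetical list of (price, shares) tuples with positive share counts, and uses None to stand in for a blank field; the helper names summarize_trades and csv_field are illustrative, not part of any AlgoSeek tooling.

```python
def summarize_trades(trades):
    """Summarize one bar's trades; trades is a time-ordered list of (price, shares)."""
    if not trades:
        # No trades: price fields are blank (None), count fields are 0.
        return {
            "FirstTradePrice": None,
            "LastTradePrice": None,
            "VolumeWeightPrice": None,
            "Volume": 0,
            "TotalTrades": 0,
        }
    volume = sum(shares for _, shares in trades)
    # Volume-weighted average price: sum(shares * price) / total shares.
    vwap = sum(price * shares for price, shares in trades) / volume
    return {
        "FirstTradePrice": trades[0][0],
        "LastTradePrice": trades[-1][0],
        "VolumeWeightPrice": vwap,
        "Volume": volume,
        "TotalTrades": len(trades),
    }


def csv_field(value):
    """Render a field for CSV output: blank for None, plain text otherwise."""
    return "" if value is None else str(value)


empty_bar = summarize_trades([])
active_bar = summarize_trades([(10.0, 100), (11.0, 300)])
```

When reading the files, the same convention applies in reverse: a blank trade field signals that no trade occurred, whereas a 0 in Volume or TotalTrades is a real count.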

No Bid/Ask/Trade OHLC

During a bar timeframe there may not be a change in the NBBO or an actual Trade. For example, there can be a bar with OHLC Bid/Ask but no Trade OHLC.

Single Event

For bars with only one trade, one NBBO bid, or one NBBO ask, the Open/High/Low/Close price, size, and time will be the same.

AskPriceAtHighBidPrice, AskSizeAtHighBidPrice, AskPriceAtLowBidPrice, AskSizeAtLowBidPrice Fields

To provide consistent Bid/Ask prices at a point in time while showing the low/high Bid/Ask for the bar, AlgoSeek uses the low/high Bid and the corresponding Ask at that price.

FAQ

Why are Trade Prices often inside the Bid Price to Ask Price range?

The Low/High Bid/Ask is the low and high NBBO price for the bar range. Very often, no Trade occurs at these prices, either because the price lasts only a few seconds or because executions are crossed at the mid-point, whether through hidden order types that execute at mid-point or as price improvement over the current Bid/Ask.
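
The TradeAt* fields (41 to 46) record where trades printed relative to the NBBO, which is how mid-point and inside-the-spread executions show up in the bars. The sketch below classifies a single trade using the definitions from the field table. It is illustrative only: the function name trade_location is hypothetical, and treating the locked-or-crossed bucket as exclusive of the other five is an assumption the source does not confirm. Decimal is used so that the equality test against the mid-point is exact.

```python
from decimal import Decimal


def trade_location(price, bid, ask):
    """Return the name of the TradeAt* field a trade would be added to."""
    if bid >= ask:
        # Locked (bid == ask) or crossed (bid > ask) market.
        # Assumption: such trades are counted only in this bucket.
        return "TradeAtCrossOrLocked"
    mid = (bid + ask) / 2
    if price <= bid:
        return "TradeAtBid"
    if price < mid:
        return "TradeAtBidMid"
    if price == mid:
        return "TradeAtMid"
    if price < ask:
        return "TradeAtMidAsk"
    return "TradeAtAsk"


bid, ask = Decimal("10.00"), Decimal("10.02")
locations = [
    trade_location(Decimal(p), bid, ask)
    for p in ("9.99", "10.005", "10.01", "10.015", "10.02")
]
```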

How to get exchange tradable shares?

To get the exchange tradable volume in a bar, subtract FinraVolume from Volume.
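
Since FinraVolume is the off-exchange portion of the bar's total Volume, the exchange tradable volume is Volume minus FinraVolume. A minimal sketch, assuming each bar is available as a dictionary keyed by the field names above (the function name and the dictionary layout are illustrative):

```python
def exchange_tradable_volume(bar):
    """Shares traded on listed exchanges: total Volume minus FINRA-reported volume."""
    return bar["Volume"] - bar["FinraVolume"]


bars = [
    {"Volume": 1500, "FinraVolume": 400},
    {"Volume": 0, "FinraVolume": 0},  # bar without trades
]
tradable = [exchange_tradable_volume(bar) for bar in bars]
```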

When a trade is done that is off the listed exchanges, it must be reported to FINRA by the brokerage firm or dark pool. Examples include: