Python interface to financial data provided by Norgate Data

Python interface to financial market data provided by Norgate Data.

Installation

pip install norgatedata

Upgrades

To receive upgrades/updates

pip install norgatedata --upgrade

Requirements

  • Python 3.5 or above
  • Microsoft Windows
  • Either NumPy or Pandas
  • Active Norgate Data subscription
  • Writable local user folder named .norgatedata (or defined in environment variable NORGATEDATA_ROOT)

Usage

import norgatedata

Timeseries data

Price

Price data is provided in multiple formats: NumPy recarray, NumPy ndarray or Pandas DataFrame. This is determined through the format parameter. If not specified, the default is NumPy recarray.

Dates are determined by passing in any (or none) of the following named parameters:

# Date can be provided as a string (YYYY-MM-DD or YYYYMMDD format), datetime, Pandas Timestamp, or NumPy datetime64.  All types work!
start_date = '1990-01-01'   
end_date = '2000-01-01'   # If not specified, the end date is today
limit = 50  # This provides the last X records

Price & Volume adjustment allows you to adjust historical stock prices and volumes to account for the effect of capital events and dividends.

priceadjust = norgatedata.StockPriceAdjustmentType.NONE
priceadjust = norgatedata.StockPriceAdjustmentType.CAPITAL
priceadjust = norgatedata.StockPriceAdjustmentType.CAPITALSPECIAL
priceadjust = norgatedata.StockPriceAdjustmentType.TOTALRETURN # Default

Date padding allows you to repeat the prior close on days where no price record would otherwise exist.

padding_setting = norgatedata.PaddingType.NONE  # Default
padding_setting = norgatedata.PaddingType.ALLMARKETDAYS
padding_setting = norgatedata.PaddingType.ALLWEEKDAYS
padding_setting = norgatedata.PaddingType.ALLCALENDARDAYS
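
To make the effect of padding concrete, here is a minimal pure-Python sketch of what weekday padding does conceptually: a weekday with no price record receives the prior close. The pad_weekdays helper and the sample prices are hypothetical and are not part of norgatedata; the library applies padding internally according to padding_setting.

```python
from datetime import date, timedelta

def pad_weekdays(closes):
    """Fill missing weekdays by repeating the prior close.

    closes: dict mapping datetime.date -> close (actual records only).
    Returns a list of (date, close, was_padded) tuples, oldest first.
    """
    if not closes:
        return []
    days = sorted(closes)
    rows = []
    current = days[0]
    last_close = closes[current]
    while current <= days[-1]:
        if current in closes:
            # A real price record: keep it and remember its close
            last_close = closes[current]
            rows.append((current, last_close, False))
        elif current.weekday() < 5:  # Monday=0 ... Friday=4
            # No record on a weekday: repeat the prior close
            rows.append((current, last_close, True))
        current += timedelta(days=1)
    return rows

# Made-up records: Wednesday 2020-01-01 has no price record
sample = {
    date(2019, 12, 31): 100.0,
    date(2020, 1, 2): 101.5,
}
padded_rows = pad_weekdays(sample)
```

The third element of each tuple plays the same role as the Padding Status indicator described later: it marks the rows that were filled in rather than traded.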

Columns returned include Date, Open, High, Low, and Close. For certain instruments, additional columns are provided where applicable including: Volume (Stocks, some Indices, some Indicators, Futures), Turnover (Stocks, some Indices, some Indicators), Unadjusted Close (Stocks), Dividend (Stocks), Open Interest (Futures, ASX Exchange Traded Options), Delivery Month (Continuous Futures).

Examples

import numpy as np
import pandas as pd
import norgatedata
priceadjust = norgatedata.StockPriceAdjustmentType.TOTALRETURN 
padding_setting = norgatedata.PaddingType.NONE   
symbol = 'AAPL'
start_date = '1990-01-01'
timeseriesformat = 'numpy-recarray'

# Provides data on AAPL from 1990 until today in
# a NumPy recarray format, with explicitly set stock price 
# adjustment and padding settings
pricedata_recarray = norgatedata.price_timeseries(
    symbol,
    stock_price_adjustment_setting = priceadjust,
    padding_setting = padding_setting,
    start_date = start_date,
    format=timeseriesformat)

# Now in a Pandas-compatible format
timeseriesformat = 'pandas-dataframe'
start_date = pd.Timestamp('1900-01-01') # we can also provide dates as a Pandas Timestamp
pricedata_dataframe = norgatedata.price_timeseries(
    symbol,
    stock_price_adjustment_setting = priceadjust,
    padding_setting = padding_setting,
    start_date = start_date,
    format=timeseriesformat)

# Now in a Numpy Ndarray
timeseriesformat = 'numpy-ndarray'
start_date = np.datetime64('1900-01-01') # we can also provide dates as a Numpy datetime64
pricedata_ndarray = norgatedata.price_timeseries(
    symbol,
    stock_price_adjustment_setting = priceadjust,
    padding_setting = padding_setting,
    start_date = start_date,
    format=timeseriesformat)

# Now limiting results to the final 500 bars, back in Pandas DataFrame format
timeseriesformat = 'pandas-dataframe'
pricedata_dataframe = norgatedata.price_timeseries(
    symbol,
    stock_price_adjustment_setting = priceadjust,
    padding_setting = padding_setting,
    limit=500,
    format=timeseriesformat)

# Now limiting results to a specific date range
end_date='1999-12-31'
pricedata_dataframe = norgatedata.price_timeseries(
    symbol,
    stock_price_adjustment_setting = priceadjust,
    padding_setting = padding_setting,
    start_date = start_date,
    end_date = end_date,
    format=timeseriesformat)

# Now using a numerical unchanging assetid instead of a symbol
timeseriesformat = 'pandas-dataframe'
assetid = 136817
pricedata_dataframe = norgatedata.price_timeseries(
    assetid,
    limit=500,
    format=timeseriesformat)

Dividend Notes

The dividends shown in the price_timeseries Dividend column depend on whether or not they have already been accounted for in the data by the "Price Adjustment" method.

  • If the data is adjusted for capital reconstructions only, then the sum of both special and ordinary dividends for that day is shown.
  • If the data is adjusted for capital reconstructions and special dividends, then only the sum of ordinary dividends for that day is shown.
  • If the data is adjusted for capital reconstructions, special dividends and ordinary dividends, then no information is shown.

Dividend/Distribution information is shown as of the day before the ex-date - i.e. if you are holding the security at the close, you will be entitled to the dividend/distribution.
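
The three rules above can be restated as a small lookup. This is only an illustration of the rules, not the library's code: the dividend_column_value helper is hypothetical, the string names mirror the StockPriceAdjustmentType members on the assumption that CAPITAL, CAPITALSPECIAL and TOTALRETURN correspond to the three bullet points in order, and None stands in for "no information is shown".

```python
def dividend_column_value(adjustment, ordinary, special):
    """Return what the Dividend column would show for one day.

    adjustment: 'CAPITAL', 'CAPITALSPECIAL' or 'TOTALRETURN'.
    ordinary, special: that day's ordinary and special dividend totals.
    """
    if adjustment == 'CAPITAL':
        # Neither dividend type is in the prices yet, so both are shown
        return ordinary + special
    if adjustment == 'CAPITALSPECIAL':
        # Special dividends are already in the prices
        return ordinary
    if adjustment == 'TOTALRETURN':
        # All dividends are already in the prices
        return None
    # The notes above do not cover other settings (e.g. NONE)
    raise ValueError('unsupported adjustment: ' + str(adjustment))

shown_capital = dividend_column_value('CAPITAL', ordinary=0.25, special=1.0)
shown_capitalspecial = dividend_column_value('CAPITALSPECIAL', ordinary=0.25, special=1.0)
shown_totalreturn = dividend_column_value('TOTALRETURN', ordinary=0.25, special=1.0)
```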

Index Constituent

To determine whether a stock was an index constituent on a particular date, you can use the index constituent timeseries function. You can also pass in an existing NumPy ndarray or Pandas DataFrame, and a new column will be added and returned.

symbol = 'AAPL'
indexname = 'Russell 3000'
indexname = 'S&P 500'  # Can also be an index symbol, such as $SPX, $RUI etc.

idx = norgatedata.index_constituent_timeseries(
    symbol,
    indexname,
    format = "numpy-recarray")

idx = norgatedata.index_constituent_timeseries(
    symbol,
    indexname,
    padding_setting = padding_setting,
    start_date = start_date,
    limit = -1,
    format = "numpy-ndarray")

idx = norgatedata.index_constituent_timeseries(
    symbol,
    indexname,
    padding_setting = padding_setting,
    start_date = start_date,
    limit = -1,
    format = "pandas-dataframe")

pricedata_recarray2 = norgatedata.index_constituent_timeseries(
    symbol,
    indexname,
    padding_setting = padding_setting,
    start_date = start_date,
    limit = -1,
    numpy_recarray = pricedata_recarray,
    format = "numpy-recarray")

Major Exchange Listed

majexch = norgatedata.major_exchange_listed_timeseries(
    symbol,
    format = "numpy-recarray")

Indicates, for each trading date, whether a US stock is listed on a major exchange (e.g. NYSE, Nasdaq, NYSE American, NYSE Arca, Cboe BZX, IEX) (value = 1) or trades as an OTC/Pink Sheet stock (value = 0). This is available for all Equities and ETPs.

Note: Data is only available for this item from 2000 onwards.

Capital Event

capevent = norgatedata.capital_event_timeseries(
    symbol,
    format = "numpy-recarray")

This indicator shows when a capital event occurred (value = 1 when there is an event, value = 0 otherwise). It is effective for anyone holding the security at the close on the day prior to the ex-date. Events include splits, reverse splits, bonus issues, stock dividends (dividends paid as stock) and complex reorganizations of capital.

Dividend Yield

divyield = norgatedata.dividend_yield_timeseries(
    symbol,
    format = "numpy-recarray")

This indicator uses a trailing 12 month sum of all split-adjusted ordinary dividends and is calculated daily against the close price. New dividends are incorporated on the entitlement date (the trading day prior to the ex-dividend date) after which the trailing 12 month sum of dividends is recalculated. Special dividends, distributions and spin-offs are not included. The lookback period is adaptive to take into account slight variations in ex-dividend dates from year-to-year.
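
As a rough illustration of the calculation described above, the sketch below sums the ordinary dividends whose entitlement date falls in the trailing window and divides by the close. It is a simplification under stated assumptions: the trailing_dividend_yield helper is hypothetical, the window is a fixed 365 days rather than the adaptive lookback the indicator uses, the result is a fraction (0.02 means 2%), and the sample dividends are made up.

```python
from datetime import date, timedelta

def trailing_dividend_yield(dividends, as_of, close, lookback_days=365):
    """Trailing dividend yield as a fraction of the close.

    dividends: list of (entitlement_date, split-adjusted ordinary dividend).
    as_of: the date the yield is calculated for.
    close: the close price on as_of.
    """
    window_start = as_of - timedelta(days=lookback_days)
    total = sum(
        amount
        for entitlement_date, amount in dividends
        if window_start < entitlement_date <= as_of
    )
    return total / close

# Made-up quarterly dividends; the first one falls outside the window
sample_dividends = [
    (date(2019, 11, 15), 0.5),
    (date(2020, 2, 14), 0.5),
    (date(2020, 5, 15), 0.5),
    (date(2020, 8, 14), 0.5),
    (date(2020, 11, 13), 0.5),
]
sample_yield = trailing_dividend_yield(sample_dividends, date(2020, 12, 31), close=100.0)
```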

Padding Status

paddingstatus = norgatedata.padding_status_timeseries(
    symbol,
    format = "numpy-recarray")

This indicator will show when a price record has been padded in accordance with the Date Padding setting. If the Date Padding is set to "No padding" then this indicator will not return any values.

Unadjusted Close

This is not normally needed, as Unadjusted Close is provided in the price timeseries.
It is provided here to be used as a helper routine for other Python libraries such as zipline-norgatedata.

unadjclose = norgatedata.unadjusted_close_timeseries(
    symbol,
    format = "numpy-recarray")

Watchlists

The symbols of a watchlist can be retrieved into a Python list using the watchlist_symbols function

watchlistname = 'S&P 500'
symbols = norgatedata.watchlist_symbols(watchlistname)

watchlistname = 'Russell 3000 Current & Past'
symbols = norgatedata.watchlist_symbols(watchlistname)

If you want the symbol, assetid and name of each security, use the watchlist function

wlcontents = norgatedata.watchlist(watchlistname)

To retrieve the names of all of the watchlists within Norgate Data's watchlist library, use the watchlists function

allwatchlistnames = norgatedata.watchlists()

Security metadata

symbol = 'AMZN'
assetid = norgatedata.assetid(symbol)

Provides a unique unchanging ID generated by Norgate.

assetid = 129769 
symbol = norgatedata.symbol(assetid)

Translates assetid to the current symbol.

domicile = norgatedata.domicile(symbol)

Provides the country code for the domicile of the company.

currency = norgatedata.currency(symbol)

Currency that the security trades in.

exchange_name = norgatedata.exchange_name(symbol)

Short name of the exchange for the security (e.g. NYSE, Nasdaq, NYSE Arca, NYSE American, ASX etc.)

exchange_name_full = norgatedata.exchange_name_full(symbol)

Provides long name of the exchange (e.g. New York Stock Exchange, Australian Securities Exchange etc.)

security_name = norgatedata.security_name(symbol)

Provides the name of the security. e.g. GE would return General Electric Co Common.

base_type = norgatedata.base_type(symbol)

Provides the base type of a security. Values include Stock Market, Futures Market, Commodity Cash & Forwards, Foreign Exchange, Economic.

subtype1 = norgatedata.subtype1(symbol)

Provides subtype1 of the security. Values include Equity, Hybrid, Derivative, Debt, Exchange Traded Product, Business Activity, Employment, Prices, Money, National Accounts, Index, Currency Cross, Bullion Cross, Cryptocurrency.

subtype2 = norgatedata.subtype2(symbol)

Provides subtype2 of the security. Values include Operating/Holding Company, Investment Company, Special Purpose Company, Exchange Traded Note, Structured Product, Exchange Traded Fund, Exchange Traded Managed Fund, Third Party Trust Preferred, Corporate Unit, Convertible Corporate Unit, Convertible Preferred, Preferred, Convertible Debt, Exchange Traded Option, Right, Company Option, Warrant, Senior Debt, Junior Debt, Coin, Token.

subtype3 = norgatedata.subtype3(symbol)

Provides subtype3 of the security. Values include Master Limited Partnership, Royalty Trust, Infrastructure Fund, Closed End Fund, Other Listed Managed Investment, Business Development Company/Pooled Development Fund, Other Listed Investment Vehicle, Absolute Return Fund, Equity Unit, Right, Contingent Value Right, Litigation Trust, Liquidation Trust, Special Purpose Acquisition Company.

financial_summary = norgatedata.financial_summary(symbol)

Provides a few paragraphs summarising the current financial status of the company.

business_summary = norgatedata.business_summary(symbol)

Provides a few paragraphs summarising the operations of the company.

last_quoted_date = norgatedata.last_quoted_date(symbol)
last_quoted_date = norgatedata.last_quoted_date(symbol,format = 'iso')
last_quoted_date = norgatedata.last_quoted_date(symbol,format = 'pandas-timestamp')
last_quoted_date = norgatedata.last_quoted_date(symbol,format = 'numpy-datetime64')
last_quoted_date = norgatedata.last_quoted_date(symbol,format = 'datetime')

Provides the last day of trading for instruments that have a finite lifespan (such as futures). For delisted instruments, this provides the final day of trading. By default the format is an ISO string (YYYY-MM-DD), but you can specify other formats. Returns None if there is no last quoted date.

second_last_quoted_date = norgatedata.second_last_quoted_date(symbol)
second_last_quoted_date = norgatedata.second_last_quoted_date(symbol,format = 'iso')
second_last_quoted_date = norgatedata.second_last_quoted_date(symbol,format = 'pandas-timestamp')
second_last_quoted_date = norgatedata.second_last_quoted_date(symbol,format = 'numpy-datetime64')
second_last_quoted_date = norgatedata.second_last_quoted_date(symbol,format = 'datetime')

Provides the second-last day of trading for instruments that have a finite lifespan (such as futures). For delisted instruments, this provides the trading day prior to the last quoted day. Returns None if there is no last quoted date.

Futures metadata

symbol='CL-2017X'
tick_size = norgatedata.tick_size(symbol)

Provides the current tick value (i.e. the change in value of one contract for a single tick that the futures contract moves) for a given market.

symbol='CL-2017X'
point_value = norgatedata.point_value(symbol)

Provides the point value (i.e. the change in value of one contract for a whole point that the futures contract moves) for a given futures market.
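
Since the point value is the change in value of one contract per whole point, the profit or loss of a position follows by simple arithmetic. A minimal sketch, with a hypothetical futures_pnl helper and illustrative numbers; in practice point_value would come from norgatedata.point_value(symbol).

```python
def futures_pnl(entry_price, exit_price, point_value, contracts=1):
    """Profit or loss of a long futures position in the contract's currency."""
    # Points moved, times value per point, times number of contracts
    return (exit_price - entry_price) * point_value * contracts

# Illustrative numbers only: a 2-point rise on 2 contracts,
# assuming a point value of 1000 per contract
pnl = futures_pnl(entry_price=50.0, exit_price=52.0, point_value=1000.0, contracts=2)
```

For a short position, swap the entry and exit prices or negate the result.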

symbol='CL-2017X'
margin = norgatedata.margin(symbol)

Provides the current margin for a given futures contract, futures market session symbol or futures market symbol.

symbol='CL-2017X'
first_notice_date = norgatedata.first_notice_date(symbol)
first_notice_date = norgatedata.first_notice_date(symbol,format = 'iso')
first_notice_date = norgatedata.first_notice_date(symbol,format = 'pandas-timestamp')
first_notice_date = norgatedata.first_notice_date(symbol,format = 'numpy-datetime64')
first_notice_date = norgatedata.first_notice_date(symbol,format = 'datetime')

For deliverable commodity contracts that permit delivery prior to the end of trading, this is the first date that delivery notices can be provided for a given futures contract. For futures contracts that are not deliverable (or only deliverable at the end of trading) then None is provided.

symbol='CL-2017X'
lowest_ever_tick_size = norgatedata.lowest_ever_tick_size(symbol)

Provides the lowest ever tick size for a given futures market.

symbol='FDAX-2019Z'
sessiontype = norgatedata.session_type(symbol)
# Returns "Electronic"

symbol='FDAX9-2019Z'
sessiontype = norgatedata.session_type(symbol)
# Returns "Day (Last)"

market_symbol='CL'
marketname = norgatedata.futures_market_name(market_symbol)

Provides the name of the futures market.

session_symbol='FDAX9'
session_name = norgatedata.futures_market_session_name(session_symbol)

Provides the name of the futures market session.

Note: For all futures metadata, the "symbol" can be an individual futures contract, a continuous contract symbol, the futures market session symbol or the futures market symbol (e.g. FDAX9-2019Z, &FDAX9, FDAX9 and FDAX respectively). The only exception is date-related metadata, which is only available for an individual futures contract.
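
Because date-related metadata only applies to individual contracts, it can help to tell the symbol forms apart before calling those functions. The sketch below is a hypothetical helper, not part of norgatedata. It assumes that the contract suffix is a four-digit year followed by a standard futures month code (the examples in this document only show X and Z), and it cannot distinguish a market symbol from a session symbol by syntax alone.

```python
import re

# Standard futures delivery month codes
MONTH_CODES = {
    'F': 1, 'G': 2, 'H': 3, 'J': 4, 'K': 5, 'M': 6,
    'N': 7, 'Q': 8, 'U': 9, 'V': 10, 'X': 11, 'Z': 12,
}
CONTRACT_RE = re.compile(r'^(?P<root>[^-]+)-(?P<year>\d{4})(?P<month>[FGHJKMNQUVXZ])$')

def describe_futures_symbol(symbol):
    """Classify a futures symbol by its syntax."""
    if symbol.startswith('&'):
        return {'kind': 'continuous', 'root': symbol[1:]}
    match = CONTRACT_RE.match(symbol)
    if match:
        return {
            'kind': 'contract',
            'root': match.group('root'),
            'year': int(match.group('year')),
            'month': MONTH_CODES[match.group('month')],
        }
    # Market and session symbols look alike, e.g. FDAX and FDAX9
    return {'kind': 'market or session', 'root': symbol}

contract_info = describe_futures_symbol('CL-2017X')
continuous_info = describe_futures_symbol('&FDAX9')
other_info = describe_futures_symbol('FDAX')
```

With this in place, a call such as first_notice_date would only be attempted when kind is 'contract'.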

Fundamental data

symbol = 'GE'
# A selection of field examples - many more are available
fieldname = 'mktcap'
fieldname = 'ttmepsxlcx'
fieldname = 'peexclxor'
fieldname = 'projepsq'
fundavalue,fundadate = norgatedata.fundamental(symbol,fieldname)
fundavalue,fundadate = norgatedata.fundamental(symbol,fieldname,format = 'iso')
fundavalue,fundadate = norgatedata.fundamental(symbol,fieldname,format = 'pandas-timestamp')
fundavalue,fundadate = norgatedata.fundamental(symbol,fieldname,format = 'numpy-datetime64')
fundavalue,fundadate = norgatedata.fundamental(symbol,fieldname,format = 'datetime')

Provides the current fundamentals

Returns the field and the date applicable to the field (e.g. the last day of the quarter to which it applies, or for current ratios the most recent date of change). By default, the date is provided as a string in YYYY-MM-DD format, but other date formats can be specified.

Returns None,None if the fieldname is not available for that security.
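
Since an unavailable field comes back as (None, None), a loop over several fields should check for that before using the value. A minimal sketch: collect_fundamentals is a hypothetical helper that accepts any callable shaped like norgatedata.fundamental(symbol, fieldname), and fake_fundamental is a stand-in with made-up values so the example runs without a Norgate Data installation.

```python
def collect_fundamentals(fetch, symbol, fieldnames):
    """Split requested fields into those found and those unavailable.

    fetch: callable returning (value, date), e.g. norgatedata.fundamental.
    Returns (found, missing): a dict of fieldname -> (value, date)
    and a list of the fieldnames that returned (None, None).
    """
    found = {}
    missing = []
    for fieldname in fieldnames:
        value, as_of = fetch(symbol, fieldname)
        if value is None:
            missing.append(fieldname)
        else:
            found[fieldname] = (value, as_of)
    return found, missing

# Stand-in for norgatedata.fundamental; the values are made up
def fake_fundamental(symbol, fieldname):
    sample = {'mktcap': (1234.5, '2020-03-31')}
    return sample.get(fieldname, (None, None))

found, missing = collect_fundamentals(fake_fundamental, 'GE', ['mktcap', 'projepsq'])
```

With the real package installed, the call would be collect_fundamentals(norgatedata.fundamental, 'GE', ['mktcap', 'projepsq']).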

Classifications

schemename = 'NorgateDataFuturesClassification'
schemename = 'TRBC'
schemename = 'GICS'
classificationresulttype = 'ClassificationId'
classificationresulttype = 'Name'
classification = norgatedata.classification(
    symbol,
    schemename,
    classificationresulttype)

Provides the classification for a given security as a string. Returns None if there is no classification available.

schemename = 'TRBC'
schemename = 'GICS'
classificationresulttype = 'ClassificationId'
classificationresulttype = 'Name'
level = 1
level = 4
classificationatlevel = norgatedata.classification_at_level(
    symbol,
    schemename,
    classificationresulttype,
    level)

Provides the classification for a given security, at a given level of the classification scheme as a string. Returns None if there is no classification available.

symbol='GE'
indexfamilycode = '$SPX'
indexfamilycode = '$SP1500'
level = 3
indexreturntype = 'PR'
indexreturntype = 'TR'
indexsymbol = norgatedata.corresponding_industry_index(
    symbol,
    indexfamilycode,
    level,
    indexreturntype)

Provides the symbol of a corresponding index, at a given level of the classification scheme. Returns None if there is no classification for the current symbol or no currently-trading index that matches.

Other informational functions

  • norgatedata.last_database_update_time(database) - returns a datetime object when information was updated for a given database, where database is the shortened form for each database. Database names include au, aueto, auwarrant, auindex, ca, caindex, us, usindex, cashcommodity, economic, future, forex, contfuture, worldindex

  • norgatedata.last_price_update_time(symbol) - returns a datetime object for when the price was last updated for the given symbol

  • norgatedata.status() - shows whether NDU is running (returns True if running, or False if not)
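
These functions can be combined into a pre-flight check before a long-running job. The sketch below is one possible approach, not an official recipe: check_data_ready is a hypothetical helper, the two-day threshold is arbitrary, and it assumes that a returned datetime without timezone information is in local time. A stand-in object replaces the norgatedata module so the example runs anywhere.

```python
from datetime import datetime, timedelta
from types import SimpleNamespace

def check_data_ready(nd, database, max_age=timedelta(days=2), now=None):
    """Raise RuntimeError if NDU is not running or the database looks stale.

    nd: the imported norgatedata module (or an object with the same functions).
    database: shortened database name, e.g. 'us'.
    """
    if not nd.status():
        raise RuntimeError('NDU is not running')
    last_update = nd.last_database_update_time(database)
    if now is None:
        # Match the timezone-awareness of the returned datetime
        now = datetime.now(tz=last_update.tzinfo)
    age = now - last_update
    if age > max_age:
        raise RuntimeError(database + ' database was last updated ' + str(age) + ' ago')
    return last_update

# Stand-in for the norgatedata module with a made-up update time
stand_in = SimpleNamespace(
    status=lambda: True,
    last_database_update_time=lambda database: datetime(2024, 1, 2, 18, 0),
)
last_update = check_data_ready(stand_in, 'us', now=datetime(2024, 1, 3, 9, 0))
```

With the real package, the first argument would simply be the norgatedata module itself.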

Accessing data by assetid instead of symbol

Instead of using a security's symbol, you can obtain its unique Norgate-provided identity known as assetid. This is an unchanging number and is therefore useful when storing positions/orders into data files where the symbol may change in the future.

All of the calls above that reference 'symbol' (as a string) can also take an assetid (as an integer). For example, symbol MSFT = assetid 134016. symbol AMZN = assetid 129769.
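
As an illustration of storing positions by assetid, the sketch below converts a symbol-keyed dictionary before writing it out as JSON. The positions_by_assetid helper and the position sizes are hypothetical; the lookup is passed in as a callable so that norgatedata.assetid can be used in practice, while a plain dictionary built from the two assetids quoted above stands in for it here.

```python
import json

def positions_by_assetid(positions, to_assetid):
    """Re-key a {symbol: quantity} dict by assetid.

    to_assetid: callable mapping a symbol to its assetid,
    e.g. norgatedata.assetid.
    """
    return {to_assetid(symbol): quantity for symbol, quantity in positions.items()}

# Stand-in lookup using the assetids quoted in this section
known_ids = {'MSFT': 134016, 'AMZN': 129769}

keyed = positions_by_assetid({'MSFT': 10, 'AMZN': 5}, known_ids.__getitem__)
stored = json.dumps(keyed, sort_keys=True)
```

Note that JSON object keys are always strings, so the assetids need converting back to integers when the file is read.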

Error Handling

If an invalid symbol or invalid parameters are specified, a ValueError exception is raised.

If no data is available, None is returned. Where multiple values are expected, each one is None, e.g. (None, None).
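
When processing a list of symbols, both outcomes usually need handling: a ValueError for an invalid symbol and a None result when there is no data. A minimal sketch of that pattern follows. The split_results helper is hypothetical and works with any single-argument lookup such as norgatedata.security_name; fake_security_name is a stand-in that mimics the documented behaviour so the example runs anywhere.

```python
def split_results(symbols, lookup):
    """Apply lookup to each symbol and sort the outcomes.

    Returns (valid, no_data, invalid):
      valid   - dict of symbol -> result
      no_data - symbols for which lookup returned None
      invalid - symbols for which lookup raised ValueError
    """
    valid = {}
    no_data = []
    invalid = []
    for symbol in symbols:
        try:
            result = lookup(symbol)
        except ValueError:
            invalid.append(symbol)
            continue
        if result is None:
            no_data.append(symbol)
        else:
            valid[symbol] = result
    return valid, no_data, invalid

# Stand-in lookup: raises ValueError for unknown symbols,
# returns None when there is nothing to report
def fake_security_name(symbol):
    names = {'GE': 'General Electric Co Common', 'XYZ': None}
    if symbol not in names:
        raise ValueError('invalid symbol: ' + symbol)
    return names[symbol]

valid, no_data, invalid = split_results(['GE', 'XYZ', '???'], fake_security_name)
```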

Multithreading / Multiprocessing compatibility

The norgatedata package is compatible with multithreading and multiprocessing libraries/packages, to take advantage of multiple CPU cores. This can significantly reduce runtime by running operations in parallel.
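
One way to take advantage of this is the standard library's ThreadPoolExecutor. The sketch below is an example under assumptions rather than a recommended configuration: fetch_many is a hypothetical helper, the worker count is arbitrary, and fake_fetch stands in for a real request such as a call to norgatedata.price_timeseries.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_many(symbols, fetch, max_workers=4):
    """Run fetch(symbol) for every symbol on a pool of worker threads.

    Returns a dict of symbol -> result.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with symbols
        results = list(pool.map(fetch, symbols))
    return dict(zip(symbols, results))

# Stand-in for a real data request
def fake_fetch(symbol):
    return len(symbol)

lengths = fetch_many(['AAPL', 'GE', 'AMZN'], fake_fetch)
```

With the real package, fetch could be a small function that calls norgatedata.price_timeseries(symbol, limit=500, format='pandas-dataframe').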

Python Quant/Backtesting Package Integration

All of the routines in this module provide data that can be used for backtesting and scanning. Many backtesting packages have been developed in Python. This section describes how to use Norgate Data with popular Python backtesting packages, with sample code and/or links to Norgate-developed integration packages.

Zipline

The zipline-norgatedata package provides a tight integration between Zipline and Norgate Data.

Key Features

  • Survivorship bias-free bundles
  • Incorporates time series data such as historical index membership and dividend yield into Zipline's Pipeline mechanism
  • No modifications to the Zipline code base (except to fix problems with installation and obsolete calls that crash Zipline)
  • Simple bundle creation
  • Easy ingestion of equities data into Zipline with a simple list of symbols you want to incorporate and/or import your own symbol lists and/or use pre-built or user-created watchlists from Norgate Data
  • Easy ingestion of futures data into Zipline with a simple list of symbols you want to incorporate, futures market sessions you want to incorporate, and/or use pre-built or user-created watchlists from Norgate Data

Backtrader

To create a "data feed" of a given symbol:

import backtrader as bt
import norgatedata

# ... your code here ...
cerebro = bt.Cerebro()  # create a "Cerebro" engine instance
# ... your code here ...

# Obtain a Pandas dataframe for a given security from Norgate Data
priceadjust = norgatedata.StockPriceAdjustmentType.TOTALRETURN 
padding_setting = norgatedata.PaddingType.NONE   
symbol = 'AAPL'
start_date = '2010-01-01'
timeseriesformat = 'pandas-dataframe'
pricedata_dataframe = norgatedata.price_timeseries(
    symbol,
    stock_price_adjustment_setting = priceadjust,
    padding_setting = padding_setting,
    start_date = start_date,
    format=timeseriesformat)

# Rename columns to suit Backtrader
pricedata_dataframe.rename(
    columns={'Open':'open', 'High':'high', 'Low':'low', 'Close':'close', 'Volume':'volume', 'Open Interest':'openinterest'},
    inplace=True)

# Backtrader can convert this dataframe into its own internal format with:
data = bt.feeds.PandasData(dataname=pricedata_dataframe)
cerebro.adddata(data)  # Add the data feed
# ... your code here ...

pybacktest

Integration with pybacktest just requires us to change the column headers in a Pandas dataframe.

After installing pybacktest and norgatedata, we also need to install pandas_datareader:

pip install pandas_datareader

Here's a sample piece of code that uses data from Norgate Data.

import norgatedata
import pybacktest
import pandas

# Obtain data from Norgate Data first, in Pandas dataframe format
priceadjust = norgatedata.StockPriceAdjustmentType.TOTALRETURN 
padding_setting = norgatedata.PaddingType.NONE   
symbol = 'AAPL'
start_date = '2010-01-01'
timeseriesformat = 'pandas-dataframe'
ohlc = norgatedata.price_timeseries(
    symbol,
    stock_price_adjustment_setting = priceadjust,
    padding_setting = padding_setting,
    start_date = start_date,
    format=timeseriesformat)

# Change column names
ohlc.rename(
    columns={'Open':'O', 'High':'H', 'Low':'L', 'Close':'C', 'Volume':'V'},
    inplace=True)

# sample pybacktest code:
ms = ohlc.C.rolling(50).mean()   # short moving average of the close
ml = ohlc.C.rolling(100).mean()  # long moving average of the close
buy = cover = (ms > ml) & (ms.shift() < ml.shift())
sell = short = (ms < ml) & (ms.shift() > ml.shift())

backtestresults = pybacktest.Backtest(locals())

TensorFlow

Use the convert_to_tensor() function to convert a NumPy array for use with the TensorFlow machine learning platform.

Also: TF Quant Finance

Keras

Use tensorflow.keras.backend.variable() to convert a NumPy array for use with the Keras deep learning library.

Future possible Python Quant/Backtesting/Machine Learning/Neural Network Integration

Here is a list of Python packages that might be worth examining to incorporate our data. Some of these might be abandonware too. Let us know if there are more worthy of integration.

Books/publications that use Zipline, adapted for Norgate Data use

We have adapted the Python code in the following books to use Norgate Data. Let us know if you'd like a copy of the source code.

Trading Evolved: Killer Trading Strategies for Python

If there are other publications worthwhile considering, let us know.

Support

Norgate Data support

Built Distribution

norgatedata-1.0.30-py3-none-any.whl (24.9 kB)
