era-level data for Numerai

Numerai Era Data

Numerai Era Data is a Python project that provides supplemental era-level data for the Numerai tournament. This data may give participants additional signal to use in their models.

Introduction

The Numerai tournament is a platform where data scientists build predictive models on financial data. Numerai Era Data supplements the tournament data with additional data at the era level. This data can be merged into a participant's modeling pipeline to explore new approaches and potentially improve model performance.

Era-level data could be incorporated into a model pipeline in many ways. The simplest approach would be to add the supplemental columns directly to the Numerai data as new feature columns. Initial tests of this approach have not shown any benefit. Another avenue would be to use the era-level data to help predict and respond to regime changes, which seem to have troubled Numerai participants periodically, including during the heavy drawdowns of Q2 2023. One approach along this avenue would be to cluster eras by era-level feature similarity and then train one or more separate models on each cluster. These models could then be combined in an ensemble or mixture-of-experts system.
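The clustering idea above can be sketched with a small k-means over standardized era-level columns. This is only an illustration, not part of the package: the function name cluster_eras, the toy column names, and the toy values are all hypothetical, and the centers are initialized from evenly spaced eras purely to keep the result deterministic.

```python
from typing import Sequence

import numpy as np
import pandas as pd


def cluster_eras(era_data: pd.DataFrame, feature_columns: Sequence[str],
                 n_clusters: int, n_iter: int = 50) -> pd.Series:
    """Assign each era to a cluster with a plain k-means (hypothetical helper)."""
    values = era_data[list(feature_columns)].to_numpy(dtype=float)

    # Standardize each column so no single feature dominates the distance.
    std = values.std(axis=0)
    std[std == 0] = 1.0
    values = (values - values.mean(axis=0)) / std

    # Deterministic initialization: pick centers from evenly spaced rows.
    start = np.linspace(0, len(values) - 1, n_clusters).round().astype(int)
    centers = values[start]
    labels = np.zeros(len(values), dtype=int)

    for _ in range(n_iter):
        # Distance from every era to every center, then the nearest center.
        distances = np.linalg.norm(values[:, None, :] - centers[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        new_centers = np.array([
            values[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    return pd.Series(labels, index=pd.Index(era_data["era"], name="era"), name="cluster")


# Toy era-level frame with made-up values: eras 0001-0003 and 0004-0006
# form two clearly separated groups.
toy_era_data = pd.DataFrame({
    "era": ["0001", "0002", "0003", "0004", "0005", "0006"],
    "era_feature_a": [0.1, 0.0, 0.2, 5.0, 5.1, 4.9],
    "era_feature_b": [1.0, 1.1, 0.9, -1.0, -1.1, -0.9],
})
era_clusters = cluster_eras(toy_era_data, ["era_feature_a", "era_feature_b"], n_clusters=2)
```

The resulting labels could then be used to select the training rows for each per-cluster model. In a live setting the cluster of the current era would have to be assigned from data available at prediction time.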

Installation

To use Numerai Era Data, install it from PyPI:

pip install numerai-era-data

Usage

Numerai Era Data can be merged into your Numerai modeling pipeline. Here's how you can use it:

from numerai_era_data.era_data_api import EraDataAPI

# Get data for all eras and latest daily data for live era
era_data_api = EraDataAPI()
era_data = era_data_api.get_all_eras()
daily_data = era_data_api.get_current_daily()

# Exclude raw columns
era_feature_columns = [f for f in era_data.columns if f != "era" and not f.startswith("era_feature_raw_")]

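# Merge era data with Numerai data.
# all_data and live_data are assumed to be your already-loaded Numerai
# DataFrames (historical and live), each with an "era" column.
all_data = all_data.merge(era_data[["era"] + era_feature_columns], on="era", how="left")
live_data = live_data.merge(daily_data[["era"] + era_feature_columns], on="era", how="outer")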
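To make the effect of the merge concrete, here is a small sketch on toy data. The frames, column names, and values are made up for illustration. It shows that a left merge on "era" copies each era-level value onto every row of that era, and that eras without supplemental data end up with NaN. The validate argument is an optional pandas check that the era-level frame has at most one row per era.

```python
import pandas as pd

# Toy stand-ins with made-up values: numerai_rows plays the role of all_data
# (many rows per era), era_level plays the role of era_data (one row per era).
numerai_rows = pd.DataFrame({
    "era": ["0001", "0001", "0002", "0002", "0003"],
    "feature_x": [0.25, 0.75, 0.5, 0.0, 1.0],
})
era_level = pd.DataFrame({
    "era": ["0001", "0002"],
    "era_feature_example": [0.1, -0.3],
})

# validate="many_to_one" raises if era_level has duplicate eras, which would
# otherwise silently duplicate rows in the merged result.
merged = numerai_rows.merge(era_level, on="era", how="left", validate="many_to_one")

# A left merge keeps every original row; eras missing from era_level get NaN.
missing_eras = merged.loc[merged["era_feature_example"].isna(), "era"].unique().tolist()
```

Checking missing_eras after the merge is a cheap way to spot eras that have no supplemental data before training on the combined frame.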

Data Types

Numerai Era Data provides two types of columns: normal and raw. Raw features, indicated by the prefix "era_feature_raw_", require additional processing to be useful in modeling. They include data such as the S&P 500 closing price, which is a price level rather than a ready-made feature. Once processed, these columns may add information that the normal columns do not capture.
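One possible way to process a raw column is to turn a level into a period-over-period change. The sketch below is an assumption about what "additional processing" could look like, not something the package prescribes. The helper raw_to_pct_change and the column name era_feature_raw_example_close are hypothetical, and the frame is assumed to hold one row per era with zero-padded era labels that sort chronologically.

```python
import pandas as pd


def raw_to_pct_change(era_data: pd.DataFrame, raw_column: str, periods: int = 1) -> pd.Series:
    """Convert a raw level column into era-over-era percentage changes (hypothetical helper)."""
    # Sort by era so each change compares an era with the one before it.
    ordered = era_data.sort_values("era")
    # fill_method=None leaves gaps as NaN instead of forward-filling them.
    return ordered[raw_column].pct_change(periods=periods, fill_method=None)


# Toy frame with made-up closing prices.
toy_raw = pd.DataFrame({
    "era": ["0001", "0002", "0003"],
    "era_feature_raw_example_close": [100.0, 110.0, 99.0],
})
toy_raw["example_close_change"] = raw_to_pct_change(toy_raw, "era_feature_raw_example_close")
```

The first era has no predecessor, so its change is NaN and needs to be dropped or imputed before modeling.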

Contributing

Numerai Era Data welcomes contributors to expand its capabilities by implementing new data sources. To add a new data source, follow these steps:

  1. Create a new class that extends numerai_era_data.data_sources.base_data_source.BaseDataSource.
  2. Implement the get_data() function in the new class, returning a Pandas DataFrame. The DataFrame should have a "date" column and one or more data columns whose names start with the prefix held in _BASE_PREFIX or, for raw columns, _BASE_PREFIX_RAW. Each data column should contain the value available at noon UTC for each date in the DataFrame's range.
  3. Implement the get_columns() function to return the list of data columns provided by the new data source.
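The steps above can be sketched as follows. So that it runs without the package installed, the sketch defines a stand-in BaseDataSource; in a real contribution you would import the class from numerai_era_data.data_sources.base_data_source instead. The prefix values, the get_data(start_date, end_date) signature, and the DayOfWeekDataSource class with its column names are all assumptions made for illustration, so check the real base class before relying on them.

```python
import pandas as pd


class BaseDataSource:
    """Stand-in for the package's BaseDataSource so this sketch is self-contained.

    The prefix values and method signatures here are assumptions.
    """

    _BASE_PREFIX = "era_feature_"
    _BASE_PREFIX_RAW = "era_feature_raw_"

    def get_data(self, start_date, end_date) -> pd.DataFrame:
        raise NotImplementedError

    def get_columns(self) -> list:
        raise NotImplementedError


class DayOfWeekDataSource(BaseDataSource):
    """Hypothetical data source built only from the calendar date."""

    COLUMN_DAY_OF_WEEK = BaseDataSource._BASE_PREFIX + "example_day_of_week"
    COLUMN_DAY_OF_YEAR_RAW = BaseDataSource._BASE_PREFIX_RAW + "example_day_of_year"

    def get_data(self, start_date, end_date) -> pd.DataFrame:
        # One row per calendar day; these values are known well before noon UTC.
        dates = pd.date_range(start_date, end_date, freq="D")
        return pd.DataFrame({
            "date": dates,
            # Normal column: day of week scaled to [0, 1] (Monday = 0).
            self.COLUMN_DAY_OF_WEEK: dates.dayofweek / 6.0,
            # Raw column: unscaled day of year, left for later processing.
            self.COLUMN_DAY_OF_YEAR_RAW: dates.dayofyear,
        })

    def get_columns(self) -> list:
        return [self.COLUMN_DAY_OF_WEEK, self.COLUMN_DAY_OF_YEAR_RAW]


source = DayOfWeekDataSource()
example_data = source.get_data("2023-01-02", "2023-01-08")
```

A calendar-only source avoids any external download, which keeps the example focused on the required shape: a "date" column plus prefixed data columns that match get_columns().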

License

Numerai Era Data is released under the MIT License. You are free to use, modify, and distribute the code according to the terms of the license.
