era-level data for Numerai

Numerai Era Data

Numerai Era Data is a Python project that provides supplemental era-level data for the Numerai tournament. Participants may find this data useful as an additional input to their models.

Introduction

The Numerai tournament is a platform where data scientists build predictive models for financial data. Numerai Era Data adds supplemental data at the era level. This data can be merged into a participant's modeling pipeline, making it possible to explore new approaches that may improve model performance.

Era-level data could be incorporated into a model pipeline in many ways. The simplest approach is to add the supplemental columns directly to the Numerai data as new feature columns, although initial tests of this approach have not shown any benefit. Another avenue is to use the era-level data to help predict and respond to regime changes, which seem to affect Numerai participants periodically, including during the heavy drawdowns of Q2 2023. One approach along this avenue is to cluster eras by era-level feature similarity and then train a separate model or models on each cluster. These models could then be combined in an ensemble or mixture-of-experts system.
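The clustering idea can be sketched as follows. This is a minimal illustration on synthetic data, not a tested tournament approach: the column names, the number of clusters, and the small hand-written k-means are all assumptions made for the example (a library implementation such as scikit-learn's KMeans would be the more usual choice).

```python
import numpy as np
import pandas as pd


def cluster_eras(era_features: pd.DataFrame, n_clusters: int,
                 n_iter: int = 50, seed: int = 0) -> pd.Series:
    """Assign each era to a cluster with a small k-means (Lloyd's algorithm)."""
    # Standardize columns so no single feature dominates the distance.
    values = era_features.to_numpy(dtype=float)
    std = values.std(axis=0)
    std[std == 0] = 1.0
    values = (values - values.mean(axis=0)) / std

    rng = np.random.default_rng(seed)
    # Start from n_clusters distinct eras chosen at random.
    centers = values[rng.choice(len(values), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # Distance from every era to every center, shape (n_eras, n_clusters).
        distances = np.linalg.norm(values[:, None, :] - centers[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        new_centers = np.array([
            values[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return pd.Series(labels, index=era_features.index, name="era_cluster")


# Synthetic stand-in for era_data: two well-separated regimes of 20 eras each.
# The era_feature_a / era_feature_b column names are hypothetical.
rng = np.random.default_rng(42)
eras = [f"{i:04d}" for i in range(1, 41)]
era_data = pd.DataFrame({
    "era": eras,
    "era_feature_a": np.concatenate([rng.normal(-3, 0.3, 20), rng.normal(3, 0.3, 20)]),
    "era_feature_b": np.concatenate([rng.normal(-3, 0.3, 20), rng.normal(3, 0.3, 20)]),
}).set_index("era")

era_clusters = cluster_eras(era_data, n_clusters=2)

# Map each training row to its era's cluster, then handle each cluster separately.
train = pd.DataFrame({"era": rng.choice(eras, size=200)})
train["era_cluster"] = train["era"].map(era_clusters)
for cluster_id, rows in train.groupby("era_cluster"):
    # A separate model would be fitted on `rows` here.
    print(cluster_id, len(rows))
```

At prediction time, the live era would be assigned to the nearest cluster and scored by that cluster's model, or by a weighted blend of all cluster models.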

Installation

Install Numerai Era Data from PyPI with the following command:

pip install numerai-era-data

Usage

The following example loads the era-level data and merges it with your Numerai data:

from numerai_era_data.era_data_api import EraDataAPI

# Get data for all eras and latest daily data for live era
era_data_api = EraDataAPI()
era_data = era_data_api.get_all_eras()
daily_data = era_data_api.get_current_daily()

# Exclude raw columns
era_feature_columns = [f for f in era_data.columns if f != "era" and not f.startswith("era_feature_raw_")]

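# Merge era data with Numerai data.
# all_data and live_data are your existing Numerai training and live DataFrames;
# both must already contain an "era" column.
all_data = all_data.merge(era_data[["era"] + era_feature_columns], on="era", how="left")
live_data = live_data.merge(daily_data[["era"] + era_feature_columns], on="era", how="outer")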
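
To make the effect of the left merge concrete, here is a small self-contained sketch on toy data. The DataFrames and the era_feature_example column are made up for illustration and are not produced by the package. It shows that each era-level value is repeated across every row of its era, and how to spot eras that have no era-level data.

```python
import pandas as pd

# Toy stand-ins: Numerai data has many rows per era, era data has one row per era.
all_data = pd.DataFrame({
    "era": ["0001", "0001", "0002", "0003"],
    "feature_x": [0.25, 0.75, 0.5, 1.0],
})
era_data = pd.DataFrame({
    "era": ["0001", "0002"],
    "era_feature_example": [0.1, 0.9],  # hypothetical column name
})

# validate="many_to_one" raises if era_data ever contains duplicate eras,
# which would otherwise silently duplicate rows in the result.
merged = all_data.merge(era_data, on="era", how="left", validate="many_to_one")

# Eras present in all_data but absent from era_data end up with NaN.
missing_eras = merged.loc[merged["era_feature_example"].isna(), "era"].unique()
print(merged)
print("eras without era-level data:", list(missing_eras))
```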

Data Types

Numerai Era Data provides two types of columns: normal and raw. Raw columns, identified by the prefix "era_feature_raw_", hold unprocessed values such as the S&P 500 closing price and require additional processing before they are useful in modeling. Once processed, they can be used alongside the normal columns.
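
As one possible form of that additional processing, the sketch below turns a raw price-like column into relative changes. The column name era_feature_raw_example_close and its values are hypothetical, and the choice of transforms is an assumption, not the package's own method.

```python
import pandas as pd

# Toy raw column; the name and values are made up for illustration.
raw = pd.DataFrame({
    "era": ["0001", "0002", "0003", "0004", "0005"],
    "era_feature_raw_example_close": [100.0, 110.0, 99.0, 99.0, 108.9],
})
col = "era_feature_raw_example_close"

# Era-over-era percent change removes the long-run price level.
raw["example_close_pct_change"] = raw[col].pct_change()

# Deviation from the trailing 3-era mean. The window includes the current era
# and earlier eras only, so no future information leaks into the feature.
trailing_mean = raw[col].rolling(window=3, min_periods=3).mean()
raw["example_close_vs_trailing_mean"] = raw[col] / trailing_mean - 1.0

print(raw)
```

The first rows of each derived column are NaN because there is not enough history yet; how to fill or drop them depends on the model.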

Contributing

Numerai Era Data welcomes contributors to expand its capabilities by implementing new data sources. To add a new data source, follow these steps:

  1. Create a new class that extends numerai_era_data.data_sources.base_data_source.BaseDataSource.
  2. Implement the get_data() function in the new class, returning a Pandas DataFrame. The DataFrame should have a "date" column and one or more data columns whose names start with the value of either the _BASE_PREFIX or the _BASE_PREFIX_RAW constant. These columns should contain the values available at noon UTC for each date in the DataFrame's range.
  3. Implement the get_columns() function to return the list of data columns provided by the new data source.
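
The steps above can be sketched roughly as follows. To keep the example runnable without the package installed, it defines a stand-in BaseDataSource; the real class in numerai_era_data.data_sources.base_data_source may differ. The value of _BASE_PREFIX, the method signatures, and the ExampleDataSource class with its column names are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import List

import pandas as pd


class BaseDataSource(ABC):
    """Stand-in for the package's BaseDataSource (assumed shape, not the real class)."""

    _BASE_PREFIX = "era_feature_"          # assumed value
    _BASE_PREFIX_RAW = "era_feature_raw_"  # raw prefix as described under Data Types

    @abstractmethod
    def get_data(self) -> pd.DataFrame:
        ...

    @abstractmethod
    def get_columns(self) -> List[str]:
        ...


class ExampleDataSource(BaseDataSource):
    """Hypothetical data source that returns a raw value and its daily change."""

    _COLUMN_RAW = BaseDataSource._BASE_PREFIX_RAW + "example_value"
    _COLUMN = BaseDataSource._BASE_PREFIX + "example_change"

    def get_data(self) -> pd.DataFrame:
        # A real source would fetch values as they were available at noon UTC
        # on each date; fixed numbers are used here for illustration only.
        dates = pd.date_range("2023-01-02", periods=4, freq="D")
        values = pd.Series([10.0, 11.0, 11.0, 12.1])
        return pd.DataFrame({
            "date": dates,
            self._COLUMN_RAW: values,
            self._COLUMN: values.pct_change(),
        })

    def get_columns(self) -> List[str]:
        return [self._COLUMN_RAW, self._COLUMN]


source = ExampleDataSource()
df = source.get_data()
print(df)
```

When writing a real data source, subclass the package's own BaseDataSource instead of the stand-in and check its method signatures in the source code.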

License

Numerai Era Data is released under the MIT License. You are free to use, modify, and distribute the code according to the terms of the license.

