Tool for retrieving and combining financial and related data for informing security investments

KaxaNuk Data Curator

A tool for building a structured database of market, fundamental, and alternative data obtained from the web services of different financial data providers.

It also makes it easy to add your own calculated feature functions.

Requirements

The system can run either on a local Python installation or in Docker.

Requirements for Running on Local Python

  • Python 3.12 or 3.13
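If you are unsure which interpreter a given environment uses, a quick check along these lines can confirm it before installing. This is only a minimal sketch: the names SUPPORTED_VERSIONS and is_supported are illustrative helpers and are not part of the Data Curator package.

```python
import sys

# Versions listed under "Requirements for Running on Local Python".
SUPPORTED_VERSIONS = {(3, 12), (3, 13)}


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) pair is one of the supported versions."""
    return tuple(version_info[:2]) in SUPPORTED_VERSIONS


print("Supported interpreter:", is_supported())
```

Run it with the same interpreter you intend to use for the installation, for example from inside the activated virtual environment.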

Supported Data Providers

  • Financial Modeling Prep
  • Yahoo Finance (requires installing a separate extension package, and doesn't support most data types)

Running on Python

Installation

  1. Make sure you're running the required version of Python, preferably in its own virtual environment.

  2. Open a terminal and run:

    pip install --upgrade pip
    pip install kaxanuk.data_curator
    
  3. If you want to use the Yahoo Finance data provider, install the extension package:

    pip install kaxanuk.data_curator_extensions.yahoo_finance
    

Configuration

  1. Open a terminal in any directory and run the following command:

    kaxanuk.data_curator init excel
    
    This should create two subdirectories, Config and Output, as well as the entry script __main__.py, in the current directory.
  2. Open the Config/parameters_datacurator.xlsx file in Excel, fill out the fields in all the sheets, save the file and close it.
  3. If your data provider requires an API key, open the Config/.env file in a text editor, and paste the key after the = sign of the provider's corresponding API_KEY variable. Don't add any quotes or spaces before or after the key.

*On macOS, the .env file is hidden in Finder by default. Press Command + Shift + . to toggle the visibility of hidden files.
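The configuration steps above can be checked with a short script before the first run. The following is a minimal sketch, not part of the package: missing_paths and env_line_problems are hypothetical helper names, and the check treats every NAME=value line in the .env file the same way, since the exact variable names in that file are not listed here.

```python
from pathlib import Path

# Paths that the configuration steps say should exist after `init excel`
# (the .xlsx and .env locations are taken from steps 2 and 3).
EXPECTED_PATHS = (
    "Config",
    "Output",
    "__main__.py",
    "Config/parameters_datacurator.xlsx",
    "Config/.env",
)


def missing_paths(base_dir):
    """Return the expected paths that do not exist under base_dir."""
    base = Path(base_dir)
    return [p for p in EXPECTED_PATHS if not (base / p).exists()]


def env_line_problems(env_text):
    """Return (line_number, message) pairs for lines with quotes or stray spaces."""
    problems = []
    for number, line in enumerate(env_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue  # blank lines, comments, and non-assignments are skipped
        name, value = line.split("=", 1)
        if name != name.strip() or value != value.strip():
            problems.append((number, "spaces around the name or the key"))
        elif value[:1] in ("'", '"') or value[-1:] in ("'", '"'):
            problems.append((number, "quotes around the key"))
    return problems
```

For example, calling missing_paths(".") from the directory where you ran the init command, and env_line_problems(Path("Config/.env").read_text()), should both return empty lists when the setup matches the steps above.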

Usage

Now you can run the entry script, either with:

kaxanuk.data_curator run

or by executing the __main__.py script directly with Python:

python __main__.py

The system will download the data for the tickers configured in the Config/parameters_datacurator.xlsx file and save it to the Output folder.

Running on Docker

Pull the Docker image:

docker pull ghcr.io/kaxanuk/data-curator:latest

Docker Configuration

Volumes

You need to mount the following volume to the container:

  • Path on the host: (select the directory on your PC where you want the Data Curator configuration and output files to be created)
  • Path inside the container: /app

Environment Variables

If your data provider requires an API key, you need to pass it as an environment variable when running the container.

  • Name: KNDC_API_KEY_FMP
  • Value: API key for the Financial Modeling Prep data provider, as a string.

Running the Container

  1. On the first run, the container will create the Config and Output subdirectories in the mounted volume, as well as the entry script __main__.py.
  2. Open the Config/parameters_datacurator.xlsx file in Excel, fill out the fields in all the sheets, save the file and close it.

Now that the configuration is set up, each time you run the container again, it will download the data for the tickers/identifiers as configured in the parameters file, and save it to the Output folder.
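The volume and environment variable settings above map onto a standard docker run invocation. The sketch below builds that command from Python so it can be scripted or scheduled. The helper names build_docker_command and run_container are illustrative, and the --rm flag (which removes the stopped container after each run) is a choice made for this example, not something the project prescribes.

```python
import subprocess

IMAGE = "ghcr.io/kaxanuk/data-curator:latest"  # image name from the pull command


def build_docker_command(host_dir, fmp_api_key=None):
    """Build the argument list for running the container once."""
    # Mount the host directory at /app, as described under "Volumes".
    command = ["docker", "run", "--rm", "--volume", f"{host_dir}:/app"]
    if fmp_api_key:
        # Only needed when the data provider requires an API key.
        command += ["--env", f"KNDC_API_KEY_FMP={fmp_api_key}"]
    command.append(IMAGE)
    return command


def run_container(host_dir, fmp_api_key=None):
    """Run the container; raises CalledProcessError if Docker exits with an error."""
    subprocess.run(build_docker_command(host_dir, fmp_api_key), check=True)
```

Pass host_dir as an absolute path, because Docker interprets a bare name as a named volume instead of a host directory. Calling run_container with that path and your key is then equivalent to typing the docker run command by hand.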

Customization

The __main__.py entry script is customizable, so you can implement your own data providers, configuration handlers, and output handlers, and inject them from there.

You can also create your own calculated feature functions by adding them to the Config/custom_calculations.py file and adding each function's name (which starts with the c_ prefix) to the Columns sheet in the Config/parameters_datacurator.xlsx file.

Check the API Reference to learn how to easily implement your own calculated features.
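As a rough illustration of the shape such a function might take, here is a hypothetical sketch. It assumes that a calculated function receives its input columns as arguments and that those column objects support element-wise subtraction. The function name c_daily_range and the parameter names m_high and m_low are placeholders invented for this example, so confirm the actual signature and available column names in the API Reference.

```python
def c_daily_range(m_high, m_low):
    """Hypothetical calculated feature: the day's high minus the day's low."""
    # Assumes both arguments support subtraction; the real column type
    # and naming rules are defined by the library, not by this sketch.
    return m_high - m_low
```

Under these assumptions, you would place the function in Config/custom_calculations.py and add c_daily_range to the Columns sheet of the parameters file.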
