Project Eden is a fundamental analysis engine, empowering investors to make smarter investment decisions.

Project Eden is a fundamental analysis engine that empowers investors to make smarter investment decisions by ingesting and analyzing financial data from publicly traded companies.

Features

  • Financial Data Ingestion: Automatically fetch financial data for any publicly traded company

  • Database Management: Create and manage database tables for financial data storage

  • CLI: Easy-to-use command-line interface for all operations

  • Flexible Data Periods: Support for quarterly and fiscal year data

  • Batch Processing: Process multiple companies at once or all SEC-registered companies

Requirements

  • Python 3.12+

  • Poetry (for dependency management)

Installation

  1. Clone the repository:

    git clone <repository-url>
    cd project_eden
  2. Install Poetry (if not already installed):

    # On Windows (PowerShell); the installer is a Python script, so pipe it to Python rather than iex
    (Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | py -
    
    # On macOS/Linux
    curl -sSL https://install.python-poetry.org | python3 -
  3. Install dependencies:

    poetry install
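  4. Activate the virtual environment (Poetry 2.0 and later no longer bundle the shell command; install the poetry-plugin-shell plugin or prefix commands with poetry run instead):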

    poetry shell

Usage

Project Eden provides a command-line interface through the eden command.

Initialize Database and Ingest Data

To set up the database and start ingesting financial data:

# Initialize with all publicly traded companies
eden init

# Initialize with specific tickers
eden init AAPL MSFT GOOG

# Initialize with tickers from a file
eden init --file tickers.txt

# Initialize with specific data period
eden init --period quarter AAPL MSFT

Create Database Tables

To create database tables without ingesting data:

# Create all tables
eden create

# Create specific tables
eden create table1 table2

Ingest Financial Data

To ingest financial data for specific companies:

# Ingest data for specific tickers
eden ingest AAPL MSFT GOOG

# Ingest data for all publicly traded companies
eden ingest

# Ingest data from a file of tickers
eden ingest --file portfolio_tickers.txt

# Ingest only quarterly data
eden ingest --period quarter AAPL

# Ingest only fiscal year data
eden ingest --period fy AAPL

Using ZenML Pipeline Mode

For enhanced tracking, observability, and reproducibility, you can use the --pipeline flag to execute ingestion through ZenML pipelines:

# Use ZenML pipeline for ingestion
eden ingest --pipeline AAPL MSFT GOOG

# Initialize with pipeline mode
eden init --pipeline AAPL MSFT

Benefits of Pipeline Mode:

  • Tracking & Observability: All pipeline runs are tracked with metadata, parameters, and results

  • Reproducibility: Every run is versioned with exact parameters for easy re-execution

  • Error Handling: Better failure isolation and built-in retry mechanisms

  • Lineage Tracking: Complete audit trail of what data was ingested, when, and with what configuration

  • Scalability: Easy to switch from local execution to cloud orchestrators (e.g., Kubernetes) without code changes

When to Use Pipeline Mode:

  • Production data ingestion workflows

  • Large-scale ingestion (using --parallel as well is recommended; see below)

  • Scheduled or automated runs

  • When you need compliance and audit trails

  • When you want to track ingestion history and performance

When to Use Direct Mode (default):

  • Quick testing or debugging (1-5 tickers)

  • Development and experimentation

  • Simple one-off ingestions

Using Parallel Pipeline Mode

For maximum performance when ingesting many tickers, use the --parallel flag with --pipeline to enable parallel execution with automatic rate limiting:

# Parallel ingestion with rate limiting
eden ingest --pipeline --parallel AAPL MSFT GOOG AMZN META

# Initialize with parallel mode
eden init --pipeline --parallel AAPL MSFT

# Ingest all tickers in parallel
eden ingest --pipeline --parallel

How Parallel Mode Works:

The parallel pipeline uses a token bucket rate limiter that coordinates across all parallel workers to ensure the total API call rate never exceeds your configured limit (rate_limit_per_min in config.json).

  • Workers process tickers simultaneously

  • Each worker acquires “tokens” before making API calls

  • Tokens refill at the configured rate (e.g., 300 per minute)

  • Workers automatically wait if insufficient tokens are available
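
The behavior described above can be illustrated with a minimal token bucket sketch. This is not the code in project_eden/utils/rate_limiter.py; the class name TokenBucket, its parameters, and the injectable clock and sleep functions are assumptions made for illustration only.

```python
import threading
import time


class TokenBucket:
    """Hypothetical thread-safe token bucket (illustrative, not the project's class)."""

    def __init__(self, rate_per_min, capacity=None, clock=time.monotonic, sleep=time.sleep):
        self.rate_per_sec = rate_per_min / 60.0
        # Assumption: the bucket holds at most one minute's worth of tokens by default.
        self.capacity = float(capacity if capacity is not None else rate_per_min)
        self.tokens = self.capacity  # start with a full bucket
        self._clock = clock
        self._sleep = sleep
        self._last = clock()
        self._lock = threading.Lock()

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, capped at capacity.
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_sec)

    def acquire(self, n=1):
        """Block until n tokens are available, then consume them."""
        if n > self.capacity:
            raise ValueError("requested more tokens than the bucket can hold")
        while True:
            with self._lock:
                self._refill()
                if self.tokens >= n:
                    self.tokens -= n
                    return
                # Not enough tokens: compute how long the shortfall takes to refill.
                wait = (n - self.tokens) / self.rate_per_sec
            # Sleep outside the lock so other workers are not blocked meanwhile.
            self._sleep(wait)
```

In a setup like this, every worker thread would share one bucket and call acquire() before each API request, so the combined request rate stays at or below the configured limit regardless of how many workers run.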

Performance Benefits:

  • Sequential mode: Processes ~1 ticker per minute (with 5 datasets)

  • Parallel mode: Processes up to 60 tickers per minute (with 300 calls/min limit)

  • Example: 100 tickers take ~100 minutes in sequential mode vs. ~2-3 minutes in parallel mode
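
The parallel figures follow from simple arithmetic, sketched below. The sketch assumes that "5 datasets" means five API calls per ticker, which is consistent with 300 calls/min yielding 60 tickers/min; the function names are hypothetical. The result is a lower bound set by the rate limit alone, so real runs take somewhat longer once network latency and database writes are included.

```python
def tickers_per_minute(rate_limit_per_min=300, calls_per_ticker=5):
    """Upper bound on throughput when the rate limit is the only constraint."""
    return rate_limit_per_min / calls_per_ticker


def estimate_minutes(num_tickers, rate_limit_per_min=300, calls_per_ticker=5):
    """Lower bound on wall-clock minutes needed to ingest num_tickers."""
    return num_tickers / tickers_per_minute(rate_limit_per_min, calls_per_ticker)


if __name__ == "__main__":
    print(f"{tickers_per_minute():.0f} tickers/min")
    print(f"{estimate_minutes(100):.2f} minutes for 100 tickers")
```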

Command Options

All commands support the following options:

  • --config, -c: Path to configuration file (default: db/db/config.json)

  • --help: Show help information for any command

For data ingestion commands (init and ingest):

  • --file, -f: Path to file containing ticker symbols (one per line)

  • --period, -p: Data period to ingest (quarter, fy, or all)

  • --pipeline: Use ZenML pipeline for execution (enables tracking, observability, and reproducibility)

  • --parallel: Use parallel execution with rate limiting (requires --pipeline flag)
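
As an illustration of the one-symbol-per-line format expected by --file, the following sketch parses such a file. The function read_tickers is hypothetical, and the upper-casing, comment skipping, and de-duplication are assumptions rather than confirmed behavior of the eden CLI.

```python
from pathlib import Path


def read_tickers(path):
    """Return ticker symbols from a text file with one symbol per line."""
    tickers = []
    seen = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        symbol = line.strip().upper()
        # Assumption: blank lines and lines starting with '#' are ignored.
        if not symbol or symbol.startswith("#"):
            continue
        # Assumption: duplicates are dropped, keeping the first occurrence.
        if symbol not in seen:
            seen.add(symbol)
            tickers.append(symbol)
    return tickers
```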

Configuration

The application uses a JSON configuration file located at project_eden/db/db/config.json. You can specify a custom configuration file using the --config option.

Example configuration files are available in the project_eden/db/ directory:

  • config_example.json - Template configuration file

  • config_dev.json - Development configuration
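
The only configuration key named in this README is rate_limit_per_min. The sketch below shows one way such a value could be read and validated from a JSON file. It assumes the key sits at the top level of the file, and the helper load_rate_limit and its default of 300 are hypothetical; consult config_example.json for the actual schema.

```python
import json
from pathlib import Path


def load_rate_limit(config_path, default=300):
    """Read rate_limit_per_min from a JSON config file (assumed top-level key)."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    value = config.get("rate_limit_per_min", default)
    # Reject booleans, non-numbers, and non-positive limits.
    if isinstance(value, bool) or not isinstance(value, (int, float)) or value <= 0:
        raise ValueError(f"rate_limit_per_min must be a positive number, got {value!r}")
    return value
```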

Development

This project uses Poetry for dependency management and includes development tools:

  • Black: Code formatting (line length: 99 characters)

  • Python 3.12: Target Python version

To contribute:

  1. Install development dependencies:

    poetry install --with dev
  2. Format code with Black:

    poetry run black .
  3. Run the CLI in development mode:

    poetry run python -m project_eden.cli

Project Structure

project_eden/
├── assets/                 # Project assets (logos, images)
├── project_eden/           # Main package
│   ├── cli.py             # Command-line interface
│   ├── db/                # Database modules
│   │   ├── create_tables.py
│   │   ├── data_ingestor.py
│   │   └── utils.py
│   ├── pipeline/          # ZenML pipelines
│   │   ├── __init__.py
│   │   ├── data_ingestion_etl.py          # Sequential ingestion pipeline
│   │   └── data_ingestion_parallel.py     # Parallel ingestion pipeline with rate limiting
│   ├── steps/             # ZenML pipeline steps
│   │   ├── __init__.py
│   │   └── data_ingestion.py              # Data ingestion steps (load config, fetch data, etc.)
│   ├── utils/             # Shared utilities
│   │   ├── __init__.py
│   │   └── rate_limiter.py                # Token bucket rate limiter
│   └── __init__.py
├── scripts/               # Utility scripts
├── tests/                 # Unit tests
│   └── utils/
│       └── test_rate_limiter.py           # Tests for rate limiter
├── pyproject.toml         # Project configuration
└── README.rst             # This file

Author

  • George Labaria

License

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). For those who do not wish to adhere to the AGPL’s open-source requirements, a separate commercial license is available. Please contact the author to discuss licensing options. See the AGPL-3.0 text here: https://www.gnu.org/licenses/agpl-3.0.en.html

Getting Help

For more information on any command, use the --help option:

eden --help
eden init --help
eden ingest --help
eden create --help
