Project Eden is a fundamental analysis engine that empowers investors to make smarter investment decisions by ingesting and analyzing financial data from publicly traded companies.

Features

  • Financial Data Ingestion: Automatically fetch financial data for any publicly traded company

  • Database Management: Create and manage database tables for financial data storage

  • CLI: Easy-to-use command-line interface for all operations

  • Flexible Data Periods: Support for quarterly and fiscal year data

  • Batch Processing: Process several companies in one run, or every SEC-registered company

Requirements

  • Python 3.12+

  • Poetry (for dependency management)

Installation

  1. Clone the repository:

    git clone <repository-url>
    cd project_eden
  2. Install Poetry (if not already installed):

    # On Windows
    powershell -ExecutionPolicy ByPass -c "irm https://install.python-poetry.org | iex"
    
    # On macOS/Linux
    curl -sSL https://install.python-poetry.org | python3 -
  3. Install dependencies:

    poetry install
  4. Activate the virtual environment:

    poetry shell

Usage

Project Eden provides a command-line interface through the eden command.

Initialize Database and Ingest Data

To set up the database and start ingesting financial data:

# Initialize with all publicly traded companies
eden init

# Initialize with specific tickers
eden init AAPL MSFT GOOG

# Initialize with tickers from a file
eden init --file tickers.txt

# Initialize with specific data period
eden init --period quarter AAPL MSFT

Create Database Tables

To create database tables without ingesting data:

# Create all tables
eden create

# Create specific tables
eden create table1 table2

Ingest Financial Data

To ingest financial data for specific companies:

# Ingest data for specific tickers
eden ingest AAPL MSFT GOOG

# Ingest data for all publicly traded companies
eden ingest

# Ingest data from a file of tickers
eden ingest --file portfolio_tickers.txt

# Ingest only quarterly data
eden ingest --period quarter AAPL

# Ingest only fiscal year data
eden ingest --period fy AAPL
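
The --file option takes a text file with one ticker symbol per line. If you want to check such a file before passing it to eden, a minimal sketch might look like the following. The helper name read_tickers is hypothetical, and the handling of blank lines, '#' comments, case, and duplicates is an assumption for illustration, not documented eden behavior.

```python
from pathlib import Path


def read_tickers(path: str) -> list[str]:
    """Read ticker symbols from a text file, one symbol per line."""
    tickers: list[str] = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        symbol = line.strip().upper()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not symbol or symbol.startswith("#"):
            continue
        # Drop duplicates while keeping first-seen order.
        if symbol not in tickers:
            tickers.append(symbol)
    return tickers
```

Calling read_tickers("tickers.txt") returns the cleaned list of symbols, which makes typos or accidental duplicates easy to spot before a long ingestion run.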

Using ZenML Pipeline Mode

For enhanced tracking, observability, and reproducibility, you can use the --pipeline flag to execute ingestion through ZenML pipelines:

# Use ZenML pipeline for ingestion
eden ingest --pipeline AAPL MSFT GOOG

# Initialize with pipeline mode
eden init --pipeline AAPL MSFT

Benefits of Pipeline Mode:

  • Tracking & Observability: All pipeline runs are tracked with metadata, parameters, and results

  • Reproducibility: Every run is versioned with exact parameters for easy re-execution

  • Error Handling: Better failure isolation and built-in retry mechanisms

  • Lineage Tracking: Complete audit trail of what data was ingested, when, and with what configuration

  • Scalability: Easy to switch from local execution to cloud orchestrators (e.g., Kubernetes) without code changes

When to Use Pipeline Mode:

  • Production data ingestion workflows

  • Large-scale ingestion (combining it with --parallel is recommended; see below)

  • Scheduled or automated runs

  • When you need compliance and audit trails

  • When you want to track ingestion history and performance

When to Use Direct Mode (default):

  • Quick testing or debugging (1-5 tickers)

  • Development and experimentation

  • Simple one-off ingestions

Using Parallel Pipeline Mode

For maximum performance when ingesting many tickers, use the --parallel flag with --pipeline to enable parallel execution with automatic rate limiting:

# Parallel ingestion with rate limiting
eden ingest --pipeline --parallel AAPL MSFT GOOG AMZN META

# Initialize with parallel mode
eden init --pipeline --parallel AAPL MSFT

# Ingest all tickers in parallel
eden ingest --pipeline --parallel

How Parallel Mode Works:

The parallel pipeline uses a token bucket rate limiter that coordinates across all parallel workers to ensure the total API call rate never exceeds your configured limit (rate_limit_per_min in config.json).

  • Workers process tickers simultaneously

  • Each worker acquires “tokens” before making API calls

  • Tokens refill at the configured rate (e.g., 300 per minute)

  • Workers automatically wait if insufficient tokens are available
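
The behavior described above can be sketched as a small thread-safe token bucket. This is an illustration of the general technique, not Project Eden's actual implementation: the class name TokenBucket, its parameters, and the injectable clock and sleep functions are all assumptions made for the example.

```python
import threading
import time


class TokenBucket:
    """Token bucket shared by several worker threads (illustrative sketch)."""

    def __init__(self, rate_per_min, capacity=None,
                 clock=time.monotonic, sleep=time.sleep):
        self.rate_per_sec = rate_per_min / 60.0
        # Assumption: the bucket holds at most one minute's worth of tokens.
        self.capacity = capacity if capacity is not None else rate_per_min
        self.tokens = self.capacity
        self._clock = clock
        self._sleep = sleep
        self._last = clock()
        self._lock = threading.Lock()

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, up to capacity.
        now = self._clock()
        elapsed = now - self._last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_sec)
        self._last = now

    def acquire(self, n=1.0):
        """Block until n tokens are available, then consume them."""
        if n > self.capacity:
            raise ValueError("cannot acquire more tokens than the bucket holds")
        while True:
            with self._lock:
                self._refill()
                if self.tokens >= n:
                    self.tokens -= n
                    return
                # Time needed for the missing tokens to accumulate.
                wait = (n - self.tokens) / self.rate_per_sec
            # Sleep outside the lock so other workers are not blocked.
            self._sleep(wait)
```

In this sketch every worker calls acquire() on the same TokenBucket instance before each API request, so the combined request rate stays at or below the configured limit no matter how many workers run.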

Performance Benefits:

  • Sequential mode: Processes ~1 ticker per minute (with 5 datasets)

  • Parallel mode: Processes up to 60 tickers per minute (with 300 calls/min limit)

  • Example: 100 tickers take ~100 minutes in sequential mode vs. ~2-3 minutes in parallel mode
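
The parallel figures follow from simple arithmetic, sketched below. The function names are hypothetical, and the sketch assumes one API call per dataset (so 5 calls per ticker), which is inferred from the numbers above rather than stated by the project.

```python
def tickers_per_minute(rate_limit_per_min: int, calls_per_ticker: int) -> float:
    """Upper bound on throughput when the API rate limit is the bottleneck."""
    return rate_limit_per_min / calls_per_ticker


def minutes_to_ingest(n_tickers: int, rate_limit_per_min: int,
                      calls_per_ticker: int) -> float:
    """Lower bound on wall-clock minutes, ignoring processing overhead."""
    return n_tickers / tickers_per_minute(rate_limit_per_min, calls_per_ticker)


# Assumed inputs: 300 calls/min limit, 5 datasets = 5 calls per ticker.
print(tickers_per_minute(300, 5))      # 60.0 tickers per minute
print(minutes_to_ingest(100, 300, 5))  # about 1.67 minutes for 100 tickers
```

The result of about 1.67 minutes is a best case; database writes and other overhead account for the ~2-3 minutes quoted above.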

Command Options

All commands support the following options:

  • --config, -c: Path to configuration file (default: db/db/config.json)

  • --help: Show help information for any command

For data ingestion commands (init and ingest):

  • --file, -f: Path to file containing ticker symbols (one per line)

  • --period, -p: Data period to ingest (quarter, fy, or all)

  • --pipeline: Use ZenML pipeline for execution (enables tracking, observability, and reproducibility)

  • --parallel: Use parallel execution with rate limiting (requires the --pipeline flag)
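
For scheduled or scripted runs, the options above can be assembled into a command line programmatically. The sketch below uses a hypothetical helper, build_eden_command, that only combines the flags documented in this README and enforces the rule that --parallel requires --pipeline. The option ordering (options first, tickers last) mirrors the examples in this document and is otherwise an assumption.

```python
VALID_PERIODS = {"quarter", "fy", "all"}


def build_eden_command(command, tickers=(), file=None, period=None,
                       pipeline=False, parallel=False, config=None):
    """Return an argv list for `eden init` or `eden ingest`."""
    if command not in {"init", "ingest"}:
        raise ValueError("command must be 'init' or 'ingest'")
    if parallel and not pipeline:
        raise ValueError("--parallel requires --pipeline")
    if period is not None and period not in VALID_PERIODS:
        raise ValueError(f"period must be one of {sorted(VALID_PERIODS)}")

    argv = ["eden", command]
    if config:
        argv += ["--config", config]
    if file:
        argv += ["--file", file]
    if period:
        argv += ["--period", period]
    if pipeline:
        argv.append("--pipeline")
    if parallel:
        argv.append("--parallel")
    # Ticker symbols go last, as in the usage examples.
    argv += list(tickers)
    return argv
```

The returned list can be passed to subprocess.run(argv, check=True) from a scheduler or wrapper script; validating the flag combination first avoids starting a run that the CLI would reject.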

Configuration

The application uses a JSON configuration file located at project_eden/db/db/config.json. You can specify a custom configuration file using the --config option.

Example configuration files are available in the project_eden/db/ directory:

  • config_example.json - Template configuration file

  • config_dev.json - Development configuration
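
To inspect a configuration file before a run, a short script along these lines may help. It is a sketch only: rate_limit_per_min is the one setting named in this README, the assumption that it is a top-level integer key is unverified, and the helper name load_rate_limit is hypothetical. Consult config_example.json for the real layout.

```python
import json
from pathlib import Path


def load_rate_limit(config_path: str) -> int:
    """Read rate_limit_per_min from a JSON configuration file."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    # Assumption: the setting is a top-level key in the JSON object.
    if "rate_limit_per_min" not in config:
        raise KeyError("rate_limit_per_min is missing from the configuration")
    rate = config["rate_limit_per_min"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(rate, bool) or not isinstance(rate, int) or rate <= 0:
        raise ValueError("rate_limit_per_min must be a positive integer")
    return rate
```

A check like this catches a missing or malformed rate limit early, before parallel workers start making API calls.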

Development

This project uses Poetry for dependency management and includes development tools:

  • Black: Code formatting (line length: 99 characters)

  • Python 3.12: Target Python version

To contribute:

  1. Install development dependencies:

    poetry install --with dev
  2. Format code with Black:

    poetry run black .
  3. Run the CLI in development mode:

    poetry run python -m project_eden.cli

Project Structure

project_eden/
├── assets/                 # Project assets (logos, images)
├── examples/              # Example scripts
│   └── run_ingestion_pipeline.py
├── project_eden/          # Main package
│   ├── cli.py            # Command-line interface
│   ├── db/               # Database modules
│   │   ├── config.json   # Configuration files
│   │   ├── create_tables.py
│   │   ├── data_ingestor.py
│   │   └── utils.py
│   ├── pipeline/         # ZenML pipelines
│   │   ├── __init__.py
│   │   ├── data_ingestion_etl.py          # Sequential ingestion pipeline
│   │   └── data_ingestion_parallel.py     # Parallel ingestion pipeline with rate limiting
│   ├── steps/            # ZenML pipeline steps
│   │   ├── __init__.py
│   │   └── data_ingestion.py              # Data ingestion steps (load config, fetch data, etc.)
│   └── __init__.py
├── scripts/              # Utility scripts
├── pyproject.toml        # Project configuration
└── README.rst           # This file

Author

  • George Labaria

License

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). For those who do not wish to adhere to the AGPL’s open-source requirements, a separate commercial license is available. Please contact the author to discuss licensing options. See the AGPL-3.0 text here: https://www.gnu.org/licenses/agpl-3.0.en.html

Getting Help

For more information on any command, use the --help option:

eden --help
eden init --help
eden ingest --help
eden create --help
