Project Eden is a fundamental analysis engine, empowering investors to make smarter investment decisions.
Project Eden is a fundamental analysis engine that empowers investors to make smarter investment decisions by ingesting and analyzing financial data from publicly traded companies.
Features
Financial Data Ingestion: Automatically fetch financial data for any publicly traded company
Database Management: Create and manage database tables for financial data storage
CLI Interface: Easy-to-use command-line interface for all operations
Flexible Data Periods: Support for quarterly and fiscal year data
Batch Processing: Process multiple companies at once or all SEC-registered companies
Requirements
Python 3.12+
Poetry (for dependency management)
Installation
Clone the repository:
    git clone <repository-url>
    cd project_eden
Install Poetry (if not already installed):
    # On Windows (PowerShell)
    (Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | py -

    # On macOS/Linux
    curl -sSL https://install.python-poetry.org | python3 -
Install dependencies:
poetry install
Activate the virtual environment:
poetry shell
Usage
Project Eden provides a command-line interface through the eden command.
Initialize Database and Ingest Data
To set up the database and start ingesting financial data:
    # Initialize with all publicly traded companies
    eden init

    # Initialize with specific tickers
    eden init AAPL MSFT GOOG

    # Initialize with tickers from a file
    eden init --file tickers.txt

    # Initialize with specific data period
    eden init --period quarter AAPL MSFT
Create Database Tables
To create database tables without ingesting data:
    # Create all tables
    eden create

    # Create specific tables
    eden create table1 table2
Ingest Financial Data
To ingest financial data for specific companies:
    # Ingest data for specific tickers
    eden ingest AAPL MSFT GOOG

    # Ingest data for all publicly traded companies
    eden ingest

    # Ingest data from a file of tickers
    eden ingest --file portfolio_tickers.txt

    # Ingest only quarterly data
    eden ingest --period quarter AAPL

    # Ingest only fiscal year data
    eden ingest --period fy AAPL
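The --file option expects a plain-text file with one ticker symbol per line. The sketch below shows one way such a file could be parsed. It is not Project Eden's actual loader: the function name read_tickers is hypothetical, and skipping blank lines and "#" comments, upper-casing symbols, and dropping duplicates are all assumptions made for illustration.

```python
from pathlib import Path


def read_tickers(path):
    """Return ticker symbols from a one-symbol-per-line text file.

    Hypothetical helper; Project Eden's own parsing rules may differ.
    """
    tickers = []
    seen = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        symbol = line.strip().upper()  # assumption: symbols are case-insensitive
        if not symbol or symbol.startswith("#"):
            continue  # assumption: blank lines and comments are ignored
        if symbol not in seen:  # assumption: duplicates are dropped, order kept
            seen.add(symbol)
            tickers.append(symbol)
    return tickers
```

For a file containing AAPL, msft, a blank line, and AAPL again, this sketch would return the two symbols AAPL and MSFT in that order.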
Using ZenML Pipeline Mode
For enhanced tracking, observability, and reproducibility, you can use the --pipeline flag to execute ingestion through ZenML pipelines:
    # Use ZenML pipeline for ingestion
    eden ingest --pipeline AAPL MSFT GOOG

    # Initialize with pipeline mode
    eden init --pipeline AAPL MSFT
Benefits of Pipeline Mode:
Tracking & Observability: All pipeline runs are tracked with metadata, parameters, and results
Reproducibility: Every run is versioned with exact parameters for easy re-execution
Error Handling: Better failure isolation and built-in retry mechanisms
Lineage Tracking: Complete audit trail of what data was ingested, when, and with what configuration
Scalability: Easy to switch from local execution to cloud orchestrators (e.g., Kubernetes) without code changes
When to Use Pipeline Mode:
Production data ingestion workflows
Large-scale ingestion (recommended to also use --parallel, see below)
Scheduled or automated runs
When you need compliance and audit trails
When you want to track ingestion history and performance
When to Use Direct Mode (default):
Quick testing or debugging (1-5 tickers)
Development and experimentation
Simple one-off ingestions
Using Parallel Pipeline Mode
For maximum performance when ingesting many tickers, use the --parallel flag with --pipeline to enable parallel execution with automatic rate limiting:
    # Parallel ingestion with rate limiting
    eden ingest --pipeline --parallel AAPL MSFT GOOG AMZN META

    # Initialize with parallel mode
    eden init --pipeline --parallel AAPL MSFT

    # Ingest all tickers in parallel
    eden ingest --pipeline --parallel
How Parallel Mode Works:
The parallel pipeline uses a token bucket rate limiter that coordinates across all parallel workers to ensure the total API call rate never exceeds your configured limit (rate_limit_per_min in config.json).
Workers process tickers simultaneously
Each worker acquires “tokens” before making API calls
Tokens refill at the configured rate (e.g., 300 per minute)
Workers automatically wait if insufficient tokens are available
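The behaviour listed above can be illustrated with a minimal, thread-safe token bucket. This is a sketch of the general technique, not Project Eden's implementation: the class name TokenBucket, its parameters, and the injectable clock and sleep hooks are all hypothetical, and it only coordinates threads within a single process.

```python
import threading
import time


class TokenBucket:
    """Token bucket shared by all workers (illustrative sketch only)."""

    def __init__(self, rate_per_min, capacity=None,
                 clock=time.monotonic, sleep=time.sleep):
        if rate_per_min <= 0:
            raise ValueError("rate_per_min must be positive")
        self.rate_per_sec = rate_per_min / 60.0
        # Assumption: the bucket holds at most one minute's worth of tokens.
        self.capacity = float(capacity if capacity is not None else rate_per_min)
        self.tokens = self.capacity
        self._clock = clock
        self._sleep = sleep
        self._last = clock()
        self._lock = threading.Lock()

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, up to capacity.
        now = self._clock()
        elapsed = now - self._last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_sec)
        self._last = now

    def acquire(self, n=1.0):
        """Block until n tokens are available; return the seconds spent waiting."""
        if n > self.capacity:
            raise ValueError("cannot acquire more tokens than the bucket holds")
        waited = 0.0
        while True:
            with self._lock:
                self._refill()
                if self.tokens >= n:
                    self.tokens -= n
                    return waited
                # Time needed for the missing tokens to refill.
                delay = (n - self.tokens) / self.rate_per_sec
            # Sleep outside the lock so other workers are not blocked.
            self._sleep(delay)
            waited += delay
```

In this sketch every worker thread would share one TokenBucket(rate_per_min=300) and call acquire() before each API request, so the combined request rate stays at or below the configured limit regardless of how many workers are running.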
Performance Benefits:
Sequential mode: Processes ~1 ticker per minute (with 5 datasets)
Parallel mode: Processes up to 60 tickers per minute (with 300 calls/min limit)
Example: 100 tickers take ~100 minutes in sequential mode vs. ~2-3 minutes in parallel mode
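The parallel figures follow from the rate limit alone. The sketch below reproduces that arithmetic under the assumption that each dataset costs one API call, so a ticker with 5 datasets costs 5 calls; the function names are hypothetical and the result is a lower bound that ignores network latency and database writes.

```python
def tickers_per_minute(calls_per_ticker=5, rate_limit_per_min=300):
    """Upper bound on throughput when the API rate limit is the only constraint."""
    return rate_limit_per_min / calls_per_ticker


def min_minutes(n_tickers, calls_per_ticker=5, rate_limit_per_min=300):
    """Lower bound on wall-clock minutes needed to ingest n_tickers."""
    total_calls = n_tickers * calls_per_ticker
    return total_calls / rate_limit_per_min


if __name__ == "__main__":
    print(tickers_per_minute())        # 300 / 5 = 60.0 tickers per minute
    print(round(min_minutes(100), 2))  # 500 / 300 = 1.67 minutes
```

The rate-limit bound of about 1.7 minutes for 100 tickers is consistent with the quoted ~2-3 minutes once per-request overhead is included.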
Command Options
All commands support the following options:
--config, -c: Path to configuration file (default: db/db/config.json)
--help: Show help information for any command
For data ingestion commands (init and ingest):
--file, -f: Path to file containing ticker symbols (one per line)
--period, -p: Data period to ingest (quarter, fy, or all)
--pipeline: Use ZenML pipeline for execution (enables tracking, observability, and reproducibility)
--parallel: Use parallel execution with rate limiting (requires --pipeline flag)
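The relationship between these options, in particular that --parallel is only valid together with --pipeline, can be sketched with argparse. This is an illustration of the documented option rules, not the eden CLI's source: the program name eden-sketch, the function names, and the default of "all" for --period are assumptions, and the real CLI may be built on a different library.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented ingestion options (sketch only)."""
    parser = argparse.ArgumentParser(prog="eden-sketch")
    parser.add_argument("tickers", nargs="*", help="ticker symbols to ingest")
    parser.add_argument("--file", "-f", help="file with one ticker per line")
    parser.add_argument(
        "--period", "-p",
        choices=["quarter", "fy", "all"],
        default="all",  # assumption: the README does not state the default
        help="data period to ingest",
    )
    parser.add_argument("--pipeline", action="store_true",
                        help="run through a ZenML pipeline")
    parser.add_argument("--parallel", action="store_true",
                        help="parallel execution (requires --pipeline)")
    return parser


def parse_ingest_args(argv):
    """Parse argv and enforce that --parallel is combined with --pipeline."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.parallel and not args.pipeline:
        # parser.error prints a usage message and exits with status 2.
        parser.error("--parallel requires --pipeline")
    return args
```

Calling parse_ingest_args(["--pipeline", "--parallel", "AAPL"]) succeeds in this sketch, while parse_ingest_args(["--parallel", "AAPL"]) exits with a usage error.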
Configuration
The application uses a JSON configuration file located at project_eden/db/db/config.json. You can specify a custom configuration file using the --config option.
Example configuration files are available in the project_eden/db/ directory:
config_example.json - Template configuration file
config_dev.json - Development configuration
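The only configuration key named in this README is rate_limit_per_min. The sketch below shows how a script might read and validate that value from a JSON configuration file. It assumes the key sits at the top level of the file, which this README does not confirm; the function name load_rate_limit is hypothetical, and config_example.json remains the authoritative reference for the real layout.

```python
import json
from pathlib import Path


def load_rate_limit(config_path):
    """Read rate_limit_per_min from a JSON config file (illustrative sketch).

    Assumption: the key is stored at the top level of the JSON document.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    rate = config.get("rate_limit_per_min")
    # Reject missing, non-numeric, boolean, or non-positive values.
    is_number = isinstance(rate, (int, float)) and not isinstance(rate, bool)
    if not is_number or rate <= 0:
        raise ValueError("rate_limit_per_min must be a positive number")
    return rate
```

Validating the value up front means a bad configuration fails before any API calls are made.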
Development
This project uses Poetry for dependency management and includes development tools:
Black: Code formatting (line length: 99 characters)
Python 3.12: Target Python version
To contribute:
Install development dependencies:
poetry install --with dev
Format code with Black:
poetry run black .
Run the CLI in development mode:
poetry run python -m project_eden.cli
Project Structure
    project_eden/
    ├── assets/                 # Project assets (logos, images)
    ├── examples/               # Example scripts
    │   └── run_ingestion_pipeline.py
    ├── project_eden/           # Main package
    │   ├── cli.py              # Command-line interface
    │   ├── db/                 # Database modules
    │   │   ├── config.json     # Configuration files
    │   │   ├── create_tables.py
    │   │   ├── data_ingestor.py
    │   │   └── utils.py
    │   ├── pipeline/           # ZenML pipelines
    │   │   ├── __init__.py
    │   │   ├── data_ingestion_etl.py       # Sequential ingestion pipeline
    │   │   └── data_ingestion_parallel.py  # Parallel ingestion pipeline with rate limiting
    │   ├── steps/              # ZenML pipeline steps
    │   │   ├── __init__.py
    │   │   └── data_ingestion.py           # Data ingestion steps (load config, fetch data, etc.)
    │   └── __init__.py
    ├── scripts/                # Utility scripts
    ├── pyproject.toml          # Project configuration
    └── README.rst              # This file
License
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). For those who do not wish to adhere to the AGPL’s open-source requirements, a separate commercial license is available. Please contact the author to discuss licensing options. See the AGPL-3.0 text here: https://www.gnu.org/licenses/agpl-3.0.en.html
Getting Help
For more information on any command, use the --help option:
    eden --help
    eden init --help
    eden ingest --help
    eden create --help