Financial equity data aggregation toolkit

Equity Aggregator

Description

Equity Aggregator is a financial data tool that collects and normalises raw equity data from authoritative sources (Euronext, LSE, SEC, XETRA), before enriching it with third-party market vendor data to produce a unified canonical dataset of unique equities.

Altogether, this tool makes it possible to retrieve up-to-date information on more than 9,500 equities from ten countries worldwide:

Source	Country	Description
🇪🇺 Euronext	Europe	Pan-European stock exchange operating in Netherlands, France, Belgium, Portugal, Ireland, and Norway
🇬🇧 LSE	United Kingdom	London Stock Exchange
🇺🇸 SEC	United States	Securities and Exchange Commission
🇩🇪 XETRA	Germany	Deutsche Börse electronic trading platform

What kind of Equity Data is available?

Each equity in the canonical collection has a profile defined by validated schemas, which keep essential identity metadata separate from the more extensive financial metrics:

Identity Metadata

Field	Description
name	Full company name
symbol	Trading symbol
share_class_figi	Authoritative OpenFIGI identifier
isin	International Securities Identification Number
cusip	CUSIP identifier
cik	Central Index Key for SEC filings

Financial Metrics

Category	Fields
Market Data	`last_price`, `market_cap`, `currency`, `market_volume`
Trading Venues	`mics`
Price Performance	`fifty_two_week_min`, `fifty_two_week_max`, `performance_1_year`
Share Structure	`shares_outstanding`, `share_float`, `dividend_yield`
Ownership	`held_insiders`, `held_institutions`, `short_interest`
Profitability	`profit_margin`, `gross_margin`, `operating_margin`
Cash Flow	`free_cash_flow`, `operating_cash_flow`
Valuation	`trailing_pe`, `price_to_book`, `trailing_eps`
Returns	`return_on_equity`, `return_on_assets`
Fundamentals	`revenue`, `revenue_per_share`, `ebitda`, `total_debt`
Classification	`industry`, `sector`, `analyst_rating`

[!NOTE] The OpenFIGI Share Class FIGI is the only definitive unique identifier for each equity in this dataset. While other identifiers like ISIN, CUSIP, and CIK are also collected, they may not be universally available across all global markets or may have inconsistencies in formatting and coverage.

OpenFIGI provides standardised, globally unique identifiers that work consistently across all equity markets and exchanges, hence its selection for Equity Aggregator.
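
Because the Share Class FIGI is the one identifier every record is expected to carry, it is the natural key for lookups. The following is a minimal sketch, assuming records are plain dictionaries with a `share_class_figi` key. The record shape, the `index_by_figi` helper, and the second FIGI value are illustrative placeholders, not output from the package:

```python
# Hypothetical records: only share_class_figi is assumed to be present;
# other identifiers such as ISIN may be missing for some markets.
records = [
    {"share_class_figi": "BBG000B9XRY4", "symbol": "AAPL", "isin": "US0378331005"},
    {"share_class_figi": "BBG00EXAMPLE", "symbol": "XMPL", "isin": None},
]


def index_by_figi(rows):
    """Build a lookup table keyed by Share Class FIGI."""
    index = {}
    for row in rows:
        figi = row["share_class_figi"]
        if figi in index:
            raise ValueError(f"duplicate Share Class FIGI: {figi}")
        index[figi] = row
    return index


by_figi = index_by_figi(records)
print(by_figi["BBG000B9XRY4"]["symbol"])  # AAPL
```

Keying on ISIN instead would fail for the second record, which is the situation the note above describes.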

How do I get started?

Package Installation

Equity Aggregator can be installed with pip as the equity-aggregator package:

pip install equity-aggregator

CLI Usage

Once installed, Equity Aggregator provides a comprehensive command-line interface for managing equity data operations. The CLI offers three main commands:

seed - Aggregate and populate the local database with fresh equity data
export - Export the local canonical equity database to compressed JSONL format
download - Download the latest canonical equity data from remote repository

Run equity-aggregator --help for more information:

usage: equity-aggregator [-h] [-v] [-d] [-q] {seed,export,download} ...

aggregate, download, and export canonical equity data

options:
  -h, --help            show this help message and exit
  -v, --verbose         enable verbose logging (INFO level)
  -d, --debug           enable debug logging (DEBUG level)
  -q, --quiet           quiet mode - only show warnings and errors

commands:
  Available operations

  {seed,export,download}
    seed                aggregate enriched canonical equity data sourced from data feeds
    export              export local canonical equity data to compressed JSONL format
    download            download latest canonical equity data from remote repository

Use 'equity-aggregator <command> --help' for help
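
The help output above has the shape produced by Python's `argparse`. As a rough sketch only (this is not the package's actual CLI code), the flags and sub-commands could be declared as follows. The `build_parser` and `log_level` names are hypothetical, and the precedence between the logging flags is an assumption:

```python
import argparse
import logging


def build_parser() -> argparse.ArgumentParser:
    """Declare the global logging flags and the three sub-commands."""
    parser = argparse.ArgumentParser(prog="equity-aggregator")
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument("-d", "--debug", action="store_true")
    parser.add_argument("-q", "--quiet", action="store_true")
    commands = parser.add_subparsers(dest="command", required=True)
    commands.add_parser("seed")
    export = commands.add_parser("export")
    export.add_argument("--output-dir")
    commands.add_parser("download")
    return parser


def log_level(args: argparse.Namespace) -> int:
    """Map the flags to a logging level (assumed precedence: debug, then verbose)."""
    if args.debug:
        return logging.DEBUG
    if args.verbose:
        return logging.INFO
    # Assumption: without -v or -d, only warnings and errors are shown.
    return logging.WARNING


args = build_parser().parse_args(["-v", "export", "--output-dir", "/tmp"])
print(args.command, logging.getLevelName(log_level(args)))  # export INFO
```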

Download Command

The download command retrieves the latest pre-processed canonical equity dataset from GitHub Releases, eliminating the need to run the full aggregation pipeline via seed locally. This command:

Downloads compressed equity data (canonical_equities.jsonl.gz) from the latest nightly build
Automatically rebuilds the database locally from the downloaded data
Makes 9,500+ equities available immediately, without running the aggregation pipeline

[!TIP] Optional: Increase Rate Limits

Set GITHUB_TOKEN to raise the GitHub API rate limit from 60 to 5,000 requests per hour:
export GITHUB_TOKEN="your_personal_access_token_here"
Create a token at GitHub Settings - no special scopes needed. Recommended for frequent downloads or CI/CD pipelines.
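
How the package passes the token to GitHub internally is not documented here. As a hedged illustration, this is how a personal access token is typically attached to a GitHub REST API request. The `github_request` helper is hypothetical and `OWNER/REPO` is a placeholder:

```python
import os
import urllib.request
from typing import Optional


def github_request(url: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a GitHub API request, adding authentication only if a token is set."""
    headers = {"Accept": "application/vnd.github+json"}
    token = token or os.environ.get("GITHUB_TOKEN")
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)


# The request is only constructed here, not sent.
req = github_request(
    "https://api.github.com/repos/OWNER/REPO/releases/latest",
    token="example-token",
)
print(req.get_header("Authorization"))  # Bearer example-token
```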

Export Command

The export command reads all canonical equities from the local database and writes them to canonical_equities.jsonl.gz, a gzip-compressed JSONL (JSON Lines) file, in the specified output directory.

This creates a portable, standardised dataset suitable for analysis, sharing, or backup while preserving all equity metadata and financial metrics in structured JSON format.

# Export aggregated data to compressed JSONL in the specified directory
equity-aggregator export --output-dir ~/Downloads
equity-aggregator export --output-dir /path/to/export/location
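
An exported file can be read back with the standard library alone. The sketch below writes a tiny stand-in file first so that it runs without a real export. The `read_jsonl_gz` helper and the record layout are assumptions for illustration, not the package's schema:

```python
import gzip
import json
import tempfile
from pathlib import Path


def read_jsonl_gz(path: Path) -> list:
    """Read one JSON object per line from a gzip-compressed JSONL file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Stand-in data: the real export may use a different record layout.
sample = [{"identity": {"symbol": "AAPL", "name": "APPLE INC"}}]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "canonical_equities.jsonl.gz"
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        for record in sample:
            handle.write(json.dumps(record) + "\n")
    rows = read_jsonl_gz(path)

print(rows[0]["identity"]["symbol"])  # AAPL
```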

Seed Command

The seed command executes the complete equity aggregation pipeline, collecting raw data from authoritative sources (Euronext, LSE, SEC, XETRA), enriching it with market data from enrichment feeds, and storing the processed results in the local database. This command runs the full transformation pipeline to create a fresh canonical equity dataset.

The following API keys must be set before running this command:

export EXCHANGE_RATE_API_KEY="your_key_here"
export OPENFIGI_API_KEY="your_key_here"

# Run the main aggregation pipeline (requires API keys)
equity-aggregator seed

[!IMPORTANT] The seed command processes thousands of equities and is intentionally rate-limited to respect external API constraints. A full run typically takes about 90 minutes, depending on network conditions and API response times.

This is mitigated by the automated nightly CI pipeline that runs seed and publishes the latest canonical equity dataset. Users can download this pre-built data using equity-aggregator download instead of running the full aggregation pipeline locally.
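
Given how long a full run takes, it can be worth checking for the required keys before starting. This is a small sketch of such a pre-flight check. The `missing_keys` helper is hypothetical and not part of the package:

```python
import os

REQUIRED_KEYS = ("EXCHANGE_RATE_API_KEY", "OPENFIGI_API_KEY")


def missing_keys(env=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Pass a mapping explicitly to test; omit it to check the real environment.
print(missing_keys({"OPENFIGI_API_KEY": "abc"}))  # ['EXCHANGE_RATE_API_KEY']
```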

Python API Integration

Beyond the CLI, Equity Aggregator exposes a small public Python API for use in other applications. The API automatically downloads the latest canonical equity dataset from the remote repository when needed, so users work with up-to-date data.

Retrieving All Equities

The retrieve_canonical_equities() function returns the complete dataset of canonical equities. It handles data retrieval and local database management automatically, downloading the latest canonical equity dataset when needed.

from equity_aggregator import retrieve_canonical_equities

# Retrieve all canonical equities (downloads if database doesn't exist locally)
equities = retrieve_canonical_equities()
print(f"Retrieved {len(equities)} canonical equities")

# Iterate through equities
for equity in equities[:3]:  # Show first 3
    print(f"{equity.identity.symbol}: {equity.identity.name}")

Example Output:

Retrieved 9547 canonical equities
AAPL: APPLE INC
MSFT: MICROSOFT CORP
GOOGL: ALPHABET INC
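
Once the list is in memory it can be processed like any other Python sequence. The sketch below ranks equities by market capitalisation. It uses stand-in objects rather than real CanonicalEquity instances so that it runs without the package. It assumes the `identity.symbol` and `financials.market_cap` attribute paths used in this document's examples, and that `market_cap` may be missing for some equities:

```python
from types import SimpleNamespace


def make_equity(symbol, market_cap):
    """Create a stand-in exposing identity.symbol and financials.market_cap."""
    return SimpleNamespace(
        identity=SimpleNamespace(symbol=symbol),
        financials=SimpleNamespace(market_cap=market_cap),
    )


def largest_by_market_cap(equities, n):
    """Return the n largest equities, ignoring those without a market cap."""
    priced = [e for e in equities if e.financials.market_cap is not None]
    return sorted(priced, key=lambda e: e.financials.market_cap, reverse=True)[:n]


equities = [make_equity("AAA", 5e9), make_equity("BBB", None), make_equity("CCC", 9e9)]
top = largest_by_market_cap(equities, 2)
print([e.identity.symbol for e in top])  # ['CCC', 'AAA']
```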

Retrieving Individual Equities

The retrieve_canonical_equity() function retrieves a single equity by its Share Class FIGI identifier. This function works independently and automatically downloads data if needed.

from equity_aggregator import retrieve_canonical_equity

# Retrieve a specific equity by FIGI identifier
apple_equity = retrieve_canonical_equity("BBG000B9XRY4")

print(f"Company: {apple_equity.identity.name}")
print(f"Symbol: {apple_equity.identity.symbol}")
print(f"Market Cap: ${apple_equity.financials.market_cap:,.0f}")
print(f"Currency: {apple_equity.pricing.currency}")

Example Output:

Company: APPLE INC
Symbol: AAPL
Market Cap: $3,500,000,000,000
Currency: USD

Data Models

All data is returned as type-safe Pydantic models, ensuring data validation and integrity. The CanonicalEquity model provides structured access to identity metadata, pricing information, and financial metrics.

from equity_aggregator import retrieve_canonical_equity, CanonicalEquity

equity: CanonicalEquity = retrieve_canonical_equity("BBG000B9XRY4")

# Access identity metadata
identity = equity.identity
print(f"FIGI: {identity.share_class_figi}")
print(f"ISIN: {identity.isin}")
print(f"CUSIP: {identity.cusip}")

# Access financial metrics
financials = equity.financials
print(f"P/E Ratio: {financials.trailing_pe}")
print(f"Market Cap: {financials.market_cap}")

Example Output:

FIGI: BBG000B9XRY4
ISIN: US0378331005
CUSIP: 037833100
P/E Ratio: 28.5
Market Cap: 3500000000000

[!NOTE] Both functions work independently - retrieve_canonical_equity() automatically downloads data if needed, so there's no requirement to call retrieve_canonical_equities() first.

Data Storage

Equity Aggregator stores its database (data_store.db) in the platform's standard application data directory:

macOS: ~/Library/Application Support/equity-aggregator/
Windows: %APPDATA%\equity-aggregator\
Linux: ~/.local/share/equity-aggregator/

Log files are also automatically written to the system-appropriate log directory:

macOS: ~/Library/Logs/equity-aggregator/
Windows: %LOCALAPPDATA%\equity-aggregator\Logs\
Linux: ~/.local/state/equity-aggregator/

This ensures consistent integration with the host operating system's data and log management practices.
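
To locate the database from a script, the data directories listed above can be reproduced with the standard library. This is a sketch only: the `data_dir` helper is hypothetical, the package's own resolution logic may differ (for example, it may honour `XDG_DATA_HOME` on Linux), and the Windows fallback path is an assumption:

```python
import os
import sys
from pathlib import Path

APP_NAME = "equity-aggregator"


def data_dir(platform=None, home=None, appdata=None) -> Path:
    """Mirror the per-platform data directories listed in this section."""
    platform = platform or sys.platform
    home = Path(home) if home else Path.home()
    if platform == "darwin":
        return home / "Library" / "Application Support" / APP_NAME
    if platform.startswith("win"):
        # Assumption: fall back to the usual roaming profile if APPDATA is unset.
        base = appdata or os.environ.get("APPDATA") or str(home / "AppData" / "Roaming")
        return Path(base) / APP_NAME
    return home / ".local" / "share" / APP_NAME


db_path = data_dir("linux", home="/home/alice") / "data_store.db"
print(db_path.as_posix())  # /home/alice/.local/share/equity-aggregator/data_store.db
```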

Development Setup

Follow these steps to set up the development environment for the Equity Aggregator application.

Prerequisites

Before starting, ensure the following are installed:

Python 3.12+: The application requires Python 3.12 or later
uv: Python package manager
Git: For version control
Docker (optional): For containerised development and deployment

Environment Setup

Clone the repository:

git clone <repository-url>
cd equity-aggregator

Create and activate virtual environment:

# Create virtual environment with Python 3.12
uv venv --python 3.12

# Activate the virtual environment
source .venv/bin/activate

Install dependencies:

# Install all dependencies and sync workspace
uv sync --all-packages

Environment Variables

The application requires API keys for external data sources. A template file .env_example is provided in the project root for guidance.

Copy the example environment file:

cp .env_example .env

Configure API keys by editing `.env` and adding the following:

Mandatory Keys:

EXCHANGE_RATE_API_KEY - Required for currency conversion
  - Retrieve from: ExchangeRate-API
  - Used for converting equity prices to USD reference currency
OPENFIGI_API_KEY - Required for equity identification
  - Retrieve from: OpenFIGI
  - Used for equity identification and deduplication

Optional Keys:

INTRINIO_API_KEY - For additional data enrichment
  - Retrieve from: Intrinio
  - Provides supplementary equity enrichment data
GITHUB_TOKEN - For increased GitHub API rate limits
  - Retrieve from: GitHub Settings
  - Increases release download rate limits from 60/hour to 5,000/hour
  - No special scopes required for public repositories

Verify Installation

This setup provides access to the full development environment with all dependencies, testing frameworks, and development tools configured.

Verify that the setup works by running the following commands with uv:

# Verify the application is properly installed
uv run equity-aggregator --help

# Run unit tests to confirm functionality
uv run pytest -m unit

# Check code formatting and linting
uv run ruff check src

# Test API key configuration (runs the full seed pipeline, which can take a long time)
uv run --env-file .env equity-aggregator seed

Running Tests

Run the test suites using the following commands:

# Run all unit tests
uv run pytest -m unit

# Run with verbose output
uv run pytest -m unit -v

# Run with coverage reporting
uv run pytest -m unit --cov=equity_aggregator --cov-report=term-missing

# Run with detailed coverage and HTML report
uv run pytest -vvv -m unit --cov=equity_aggregator --cov-report=term-missing --cov-report=html

# Run live tests (requires API keys and internet connection)
uv run pytest -m live

# Run all tests
uv run pytest

Code Quality and Linting

The project uses ruff for static analysis, code formatting, and linting:

# Format code automatically
uv run ruff format

# Check for linting issues
uv run ruff check

# Fix auto-fixable linting issues
uv run ruff check --fix

# Check formatting without making changes
uv run ruff format --check

# Run linting on specific directory
uv run ruff check src

[!NOTE] Ruff checks only apply to the src directory - tests are excluded from formatting and linting requirements.

Docker

The Equity Aggregator project can optionally be containerised using Docker. The docker-compose.yml defines the equity-aggregator service.

Docker Commands

# Build and run the container
docker compose up --build

# Run in background
docker compose up -d

# Stop and remove containers
docker compose down

# View container logs
docker logs equity-aggregator

# Execute commands in running container
docker compose exec equity-aggregator bash

[!NOTE] The Docker container mounts the data/ directory as a volume for persistent database storage.

Architecture

Project Structure

The codebase is organised to keep a clear separation between core domain logic, external adapters, and infrastructure components:

equity-aggregator/
├── src/equity_aggregator/           # Main application source
│   ├── cli/                         # Command-line interface
│   ├── domain/pipeline/             # Core aggregation pipeline
│   │   └── transforms/              # Transformation stages
│   ├── adapters/data_sources/       # External data integrations
│   │   ├── authoritative_feeds/     # Primary sources (Euronext, LSE, SEC, XETRA)
│   │   └── enrichment_feeds/        # Yahoo Finance integration
│   ├── schemas/                     # Data validation and types
│   └── storage/                     # Database operations
├── data/                            # Database and cache
├── tests/                           # Unit and integration tests
├── docker-compose.yml               # Container configuration
└── pyproject.toml                   # Project metadata and dependencies

Data Transformation Pipeline

The aggregation pipeline consists of six sequential transformation stages, each with a specific responsibility:

Parse: Extract and validate raw equity data from the authoritative feeds
Convert: Normalise currency values to USD reference currency using live exchange rates
Identify: Attach authoritative identification metadata (i.e. Share Class FIGI) via OpenFIGI API integration
Deduplicate: Merge duplicate equity records based on Share Class FIGI
Enrich: Supplement core data with additional market metrics sourced from enrichment feeds
Canonicalise: Transform enriched data into the final canonical equity schema
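
The sequential structure described above can be sketched as a list of functions applied in order. Only two of the six stages are shown, in heavily simplified form. The field names, the exchange rate, and the `run_pipeline` helper are made up for illustration, and this deduplication keeps the first record rather than merging:

```python
from functools import reduce

# Toy records with an intentionally unrealistic rate to keep the arithmetic exact.
RAW = [
    {"symbol": "ABC", "price": 10.0, "currency": "EUR", "figi": "FIGI-1"},
    {"symbol": "ABC", "price": 10.0, "currency": "EUR", "figi": "FIGI-1"},
    {"symbol": "XYZ", "price": 5.0, "currency": "USD", "figi": "FIGI-2"},
]
RATES_TO_USD = {"EUR": 2.0, "USD": 1.0}


def convert(rows):
    """Convert: express every price in USD."""
    return [
        {**r, "price": r["price"] * RATES_TO_USD[r["currency"]], "currency": "USD"}
        for r in rows
    ]


def deduplicate(rows):
    """Deduplicate: keep the first record seen for each FIGI."""
    unique = {}
    for r in rows:
        unique.setdefault(r["figi"], r)
    return list(unique.values())


def run_pipeline(rows, stages):
    """Feed the output of each stage into the next, in order."""
    return reduce(lambda data, stage: stage(data), stages, rows)


result = run_pipeline(RAW, [convert, deduplicate])
print([(r["symbol"], r["price"]) for r in result])  # [('ABC', 20.0), ('XYZ', 5.0)]
```

Keeping each stage as a function from records to records means stages can be tested in isolation and reordered or extended without touching the runner.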

Clean Architecture Layers

The codebase adheres to clean architecture principles with distinct layers:

Domain Layer (domain/): Contains core business logic, pipeline orchestration, and transformation rules independent of external dependencies
Adapter Layer (adapters/): Implements interfaces for external systems including data feeds, APIs, and third-party services
Infrastructure Layer (storage/, cli/): Handles system concerns such as database operations and command-line tooling
Schema Layer (schemas/): Defines data contracts and validation rules using Pydantic models for type safety
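
One common way to keep the domain layer independent of its adapters is to have it depend on an interface rather than on a concrete feed. The following is an illustrative sketch using `typing.Protocol`. The `EquityFeed`, `InMemoryFeed`, and `count_equities` names are hypothetical and do not come from the codebase:

```python
from typing import List, Protocol


class EquityFeed(Protocol):
    """Contract the domain layer depends on (hypothetical name)."""

    def fetch(self) -> List[dict]: ...


class InMemoryFeed:
    """Adapter stand-in; a real adapter would call an exchange or vendor API."""

    def __init__(self, rows: List[dict]) -> None:
        self._rows = rows

    def fetch(self) -> List[dict]:
        return list(self._rows)


def count_equities(feeds: List[EquityFeed]) -> int:
    """Domain logic: works with any object that satisfies EquityFeed."""
    return sum(len(feed.fetch()) for feed in feeds)


print(count_equities([InMemoryFeed([{"symbol": "ABC"}]), InMemoryFeed([])]))  # 1
```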

Disclaimer

[!IMPORTANT] Important Legal Notice

This software aggregates data from various third-party sources including Yahoo Finance, Euronext, London Stock Exchange, SEC, and XETRA. Equity Aggregator is not affiliated with, endorsed by, or vetted by any of these organisations.

Data Sources and Terms:

Yahoo Finance: This tool uses Yahoo's publicly available APIs. Refer to Yahoo!'s terms of use for details on your rights to use the actual data downloaded. The Yahoo! Finance API is intended for personal use only.

Market Data: All market data is obtained from publicly available sources and is intended for research and educational purposes only.

Usage Responsibility:

Users are responsible for complying with all applicable terms of service and legal requirements of the underlying data providers

This software is provided for informational and educational purposes only

No warranty is provided regarding data accuracy, completeness, or fitness for any particular purpose

Users should independently verify any data before making financial decisions

Commercial Use: Users intending commercial use should review and comply with the terms of service of all underlying data providers.

Release history

1.0.16

Feb 19, 2026

1.0.15

Feb 18, 2026

1.0.14

Feb 18, 2026

1.0.13

Feb 18, 2026

1.0.12

Feb 18, 2026

1.0.11

Feb 18, 2026

1.0.10

Feb 16, 2026

1.0.9

Feb 16, 2026

1.0.8

Feb 16, 2026

1.0.7

Feb 15, 2026

1.0.6

Feb 15, 2026

1.0.5

Feb 10, 2026

1.0.4

Feb 10, 2026

1.0.3

Feb 8, 2026

1.0.2

Feb 5, 2026

1.0.1

Feb 4, 2026

1.0.0

Jan 22, 2026

0.1.8

Jan 22, 2026

0.1.7

Jan 22, 2026

0.1.6

Jan 22, 2026

0.1.5

Jan 22, 2026

0.1.4

Jan 21, 2026

This version

0.1.1

Sep 21, 2025

Download files

Source Distribution

equity_aggregator-0.1.1.tar.gz (353.8 kB)

Uploaded Sep 21, 2025 Source

Built Distribution

equity_aggregator-0.1.1-py3-none-any.whl (105.1 kB)

Uploaded Sep 21, 2025 Python 3

File details

Details for the file equity_aggregator-0.1.1.tar.gz.

File metadata

Download URL: equity_aggregator-0.1.1.tar.gz
Upload date: Sep 21, 2025
Size: 353.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.8.19

File hashes

Hashes for equity_aggregator-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`97cb08cbf98f5f5482e3aec2c76a03e630096be7c98290592396559d80e2ed53`
MD5	`a36c7f6022171190e842f00aa32be880`
BLAKE2b-256	`8efd01da432180f1c3a1dc3e8ac9d10e9b009c5aa5265d27ad5f604ca7a9afda`

File details

Details for the file equity_aggregator-0.1.1-py3-none-any.whl.

File metadata

Download URL: equity_aggregator-0.1.1-py3-none-any.whl
Upload date: Sep 21, 2025
Size: 105.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.8.19

File hashes

Hashes for equity_aggregator-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f63f9d8296f051b57bea8561734dfd043b62316cc5c9150214d10936868aee38`
MD5	`a3ef08d22a00232c84edbae34767dda0`
BLAKE2b-256	`ddf8d336bc9d7a6972d986ab8c7784f1587f2b170e3e6fa13bb9911212a905a8`
