Skip to main content

Add your description here

Project description

MC Postgres DB

A Python package containing ORM models for a PostgreSQL database that powers a personal quantitative trading and investment analysis platform.

Overview

This package provides SQLAlchemy ORM models and database utilities for managing financial data, trading strategies, portfolio analytics, and market research. The database serves as the backbone for a personal "quant hedge fund" project, storing everything from market data and content data.

Features

  • Asset Models: AssetType and Asset tables for categorizing and managing financial instruments and various fiat and digital currencies
  • Provider Models: ProviderType and Provider tables for handling data sources and exchanges
  • Market Data Models: ProviderAssetMarket table for storing OHLCV and bid/ask price data
  • Order Models: ProviderAssetOrder table for tracking trading orders between assets
  • Content Models: ContentType, ProviderContent, and AssetContent tables for managing news articles and social content
  • Sentiment Models: SentimentType and ProviderContentSentiment tables for analyzing content sentiment
  • Asset Group Models: AssetGroup, AssetGroupMember, and ProviderAssetGroupAttribute tables for pairs trading with asset grouping and statistical analytics
  • Relation Models: ProviderAsset table for mapping relationships between providers and assets

Installation

From PyPI

pip install mc-postgres-db

From Source

# Clone the repository
git clone <repository-url>
cd mc-postgres-db

# Install using uv (recommended)
uv sync

Database Setup

  1. PostgreSQL Setup: Ensure PostgreSQL is installed and running
  2. Environment Variables: Set up your database connection string
    export SQLALCHEMY_DATABASE_URL="postgresql://username:password@localhost:5432/mc_trading_db"
    

Usage

from sqlalchemy import create_engine, select
from sqlalchemy.orm import Session
from mc_postgres_db.models import Asset, Provider, ProviderAssetMarket

# Create database connection
url = "postgresql://username:password@localhost:5432/mc_trading_db"
engine = create_engine(url)

# Query assets
with Session(engine) as session:
    stmt = select(Asset).where(Asset.is_active)
    assets = session.scalars(stmt).all()
    asset_pairs = {asset.id: asset.name for asset in assets}
    print("Available assets:")
    for asset_id, asset_name in asset_pairs.items():
        print(f"{asset_id}: {asset_name}")

# Query market data
with Session(engine) as session:
    stmt = (
        select(ProviderAssetMarket)
        .where(
            ProviderAssetMarket.from_asset_id == 1,  # Bitcoin for example
            ProviderAssetMarket.to_asset_id == 2,    # USD for example
            ProviderAssetMarket.provider_id == 3,    # Binance for example
        )
        .order_by(ProviderAssetMarket.timestamp.desc())
        .limit(10)
    )
    market_data = session.scalars(stmt).all()
    for data in market_data:
        print(f"Timestamp: {data.timestamp}, Close: {data.close}, Volume: {data.volume}")

# Get assets from a provider
with Session(engine) as session:
    stmt = select(Provider).where(Provider.id == 1)
    provider = session.scalars(stmt).one()
    provider_assets = provider.get_all_assets(engine)
    print(f"Assets available from {provider.name}:")
    for provider_asset in provider_assets:
        print(f"Asset code: {provider_asset.asset_code}")

Models Overview

Core Models

  • AssetType: Categorizes assets (e.g., stocks, bonds, cryptocurrencies) with names and descriptions
  • Asset: Represents financial instruments with references to asset types, symbols, and optional underlying assets
  • ProviderType: Categorizes data providers (e.g., exchanges, news services) with names and descriptions
  • Provider: Represents data sources with references to provider types and optional underlying providers
  • ProviderAsset: Maps the relationship between providers and assets with asset codes and active status
  • ProviderAssetOrder: Tracks orders for assets from providers including timestamp, price, and volume
  • ProviderAssetMarket: Stores OHLCV (Open, High, Low, Close, Volume) market data and bid/ask prices for asset pairs
  • ContentType: Categorizes content (e.g., news articles, social media posts) with names and descriptions
  • ProviderContent: Stores content from providers with timestamps, titles, descriptions, and full content
  • AssetContent: Maps the relationship between content and assets
  • SentimentType: Categorizes sentiment analysis methods (e.g., PROVIDER, NLTK, VADER) with names and descriptions
  • ProviderContentSentiment: Stores sentiment analysis results for content with positive, negative, neutral, and overall sentiment scores
  • AssetGroup: Groups assets that share common attributes with names and descriptions
  • AssetGroupMember: Maps asset pairs to asset groups for pairs trading (from_asset_id, to_asset_id)
  • ProviderAssetGroupAttribute: Stores comprehensive statistical analytics for pairs trading including:
    • Cointegration p-values for stationarity testing
    • Ornstein-Uhlenbeck process parameters (mu, theta, sigma) for mean reversion modeling
    • Linear regression parameters (alpha, beta) and fit statistics (MSE, R-squared, standard errors, t-values, p-values, F-statistics)
    • Supports multiple lookback windows and provider-specific calculations

Database Schema Features

  • Inheritance Support: Assets and providers can reference underlying entities for hierarchical relationships
  • Timestamped Records: All tables include creation and update timestamps
  • Soft Delete Pattern: Uses is_active flags to mark records as inactive without deletion
  • Time Series Data: Market data is organized by timestamp for efficient time-series operations
  • Cross-Reference Tables: Enables many-to-many relationships between assets, providers, and content
  • Pairs Trading Support: Specialized tables for pairs trading with statistical measures and multiple time windows
  • Composite Primary Keys: Ensures uniqueness across multiple dimensions (timestamp, provider, asset group, lookback window)

Development

Setting up Development Environment

# Install development dependencies using uv
uv sync --dev

# Run tests
uv run pytest

# Run linting
uv run ruff check
uv run ruff format

Database Migrations

# Generate new migration
uv run alembic revision --autogenerate -m "Description of changes"

# Apply migrations
uv run alembic upgrade head

# Rollback migration
uv run alembic downgrade -1

Project Structure

mc-postgres-db/
├── src/                       # Source code directory
│   └── mc_postgres_db/        # Main package directory
│       ├── __init__.py
│       ├── models.py
│       ├── operations.py
│       ├── prefect/
│       │   ├── __init__.py
│       │   ├── tasks.py
│       │   └── asyncio/
│       │       ├── __init__.py
│       │       └── tasks.py
│       └── testing/
│           ├── __init__.py
│           └── utilities.py
├── tests/                    # Unit and integration tests
├── alembic/                  # Database migrations
├── pyproject.toml            # Project configuration and dependencies
├── uv.lock                   # Locked dependency versions
└── README.md                 # Project documentation

Data Sources

This database integrates with various financial data providers:

  • Market data APIs (Alpha Vantage, IEX Cloud, etc.)
  • Fundamental data providers
  • Alternative data sources
  • Custom scraped data

Security & Compliance

  • Database connections use SSL encryption
  • Sensitive data is encrypted at rest
  • Access controls and audit logging implemented
  • Regular backups and disaster recovery procedures

Performance Considerations

  • Optimized indexes for common query patterns
  • Partitioned tables for large time-series data
  • Connection pooling for high-throughput operations
  • Caching layer for frequently accessed data

Testing Utilities

This package provides a robust testing harness for database-related tests, allowing you to run your tests against a temporary SQLite database that mirrors your PostgreSQL schema. This is especially useful for testing Prefect flows and tasks that interact with the database, without requiring a live PostgreSQL instance or extensive mocking.

postgres_test_harness

The postgres_test_harness context manager (found in mc_postgres_db.testing.utilities) creates a temporary SQLite database file, initializes all ORM models, and patches the Prefect tasks used to obtain the SQLAlchemy engine (both sync and async) so that all database operations in your flows and tasks are transparently redirected to this SQLite database.

Key benefits:

  • No need to change or mock every Prefect flow or task that uses the database engine.
  • All Prefect tasks that call get_engine (sync or async) will automatically use the temporary SQLite database.
  • The database is created fresh for each test session or function (depending on fixture scope), ensuring isolation and repeatability.
  • At the end of the test, the database and all tables are cleaned up.

Usage with Pytest

You can use the harness as a fixture in your tests. For example:

import pytest
from mc_postgres_db.testing.utilities import postgres_test_harness

@pytest.fixture(scope="function", autouse=True)
def postgres_harness():
    with postgres_test_harness():
        yield

def test_my_flow():
    # Any Prefect task that calls get_engine() will use the SQLite test DB
    ...

If you are also testing Prefect flows, the postgres harness will already use Prefect's harness to ensure isolation:

import pytest
from mc_postgres_db.testing.utilities import postgres_test_harness

@pytest.fixture(scope="session", autouse=True)
def postgres_harness():
    with postgres_test_harness(prefect_server_startup_timeout=45):
        yield

Now, all your tests (including those that run Prefect flows) will use the temporary SQLite database, and you don't need to modify your flows or tasks to support testing.

Contributing

This is a personal project, but suggestions and improvements are welcome:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes with tests
  4. Submit a pull request

License

This project is for personal use and learning purposes.

Disclaimer

This software is for educational and personal use only. It is not intended for production trading or investment advice. Use at your own risk.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mc_postgres_db-1.1.7.tar.gz (17.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

mc_postgres_db-1.1.7-py3-none-any.whl (12.8 kB view details)

Uploaded Python 3

File details

Details for the file mc_postgres_db-1.1.7.tar.gz.

File metadata

  • Download URL: mc_postgres_db-1.1.7.tar.gz
  • Upload date:
  • Size: 17.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.19

File hashes

Hashes for mc_postgres_db-1.1.7.tar.gz
Algorithm Hash digest
SHA256 fa28ea749bdd48a08b1a44f7ceecc30894ebd1e91b3591328fb759b0060e4a05
MD5 b7b11f2f082bf95b018f86e4a34f0819
BLAKE2b-256 42becdb06e397a25755b34620f119e692453bd2d186a946cfdff013790dfef0b

See more details on using hashes here.

File details

Details for the file mc_postgres_db-1.1.7-py3-none-any.whl.

File metadata

File hashes

Hashes for mc_postgres_db-1.1.7-py3-none-any.whl
Algorithm Hash digest
SHA256 0d923723b0b0ecbde314562ca07fe2897c86c7cfa085350ac1b0bd44a1c9bf79
MD5 48255ec752bba08dc44352f087bb31ab
BLAKE2b-256 597bf37c944f7f09bdd2721c6d9f8733a82f5dfd22feb98a4da4a4151bec2c6e

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page