A Python library for registering JSON objects in PostgreSQL with canonicalisation and caching

json-register

Note: This library is currently in beta. The API is mostly stable, but it may change in future releases based on user feedback and production usage.

json-register is a caching registry for JSON objects that stores them in a PostgreSQL database using the JSONB encoding. It ensures that semantically equivalent JSON objects are registered only once by canonicalising objects in the cache and by relying on JSONB comparison in the database. The database assigns a unique 32-bit integer identifier to each object.
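As a rough illustration of why canonicalisation makes equivalent objects compare equal, here is a minimal Python sketch. It assumes the canonical form is "sorted keys, no insignificant whitespace"; `canonicalise` is a hypothetical helper written for this example and is not part of the json-register API, whose exact encoding may differ.

```python
import json


def canonicalise(obj):
    """Serialise obj with sorted keys and compact separators (illustrative only)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


a = {"name": "Alice", "role": "Engineer", "active": True}
b = {"active": True, "role": "Engineer", "name": "Alice"}

# Same content, different key order: both map to one canonical string.
assert canonicalise(a) == canonicalise(b)
print(canonicalise(a))  # {"active":true,"name":"Alice","role":"Engineer"}
```

A string like this can serve as a cache key, so two inputs that differ only in key order or spacing resolve to the same entry.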

This library is written in Rust and provides native bindings for Python, allowing for seamless integration into applications written in either language.

Features

  • Canonicalisation: JSON objects are canonicalised (keys sorted, whitespace removed) before storage to ensure uniqueness based on content.
  • Caching: An in-memory Least Recently Used (LRU) cache minimizes database lookups for frequently accessed objects.
  • PostgreSQL Integration: Efficiently stores and retrieves JSON data using PostgreSQL's JSONB type.
  • Batch Processing: Supports batch registration of objects to reduce network round-trips and improve throughput.
  • Cross-Language Support: Provides a native Rust API and a Python extension module.
  • Security: SQL injection prevention through identifier validation and automatic password sanitization in error messages.
  • Configurable Timeouts: Optional connection pool timeouts for acquire, idle, and maximum lifetime settings.
  • Monitoring: Query methods for connection pool metrics and cache hit rate statistics.
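The caching behaviour described above can be pictured with a small LRU cache in Python. This is a sketch under assumptions, not the library's implementation: the `LruCache` class, its method names, and the use of canonical JSON strings as keys are all hypothetical and chosen only to show how hits, misses, and evictions arise.

```python
from collections import OrderedDict


class LruCache:
    """Illustrative LRU cache mapping canonical JSON strings to integer IDs."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0
        self.evictions = 0

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        return None

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
            self.evictions += 1


cache = LruCache(capacity=2)
cache.put('{"name":"Alice"}', 1)
cache.put('{"name":"Bob"}', 2)
cache.get('{"name":"Alice"}')        # hit: Alice becomes most recently used
cache.put('{"name":"Charlie"}', 3)   # capacity exceeded: Bob is evicted
print(cache.get('{"name":"Bob"}'))   # None
print(cache.hits, cache.misses, cache.evictions)  # 1 1 1
```

On a miss, a registry of this kind would fall back to the database and then store the returned ID in the cache.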

Installation

Rust

Add the following to your Cargo.toml:

[dependencies]
json-register = "0.3.0"
tokio = { version = "1.0", features = ["full"] }
serde_json = "1.0"

Python

Ensure you have a compatible Python environment (3.8+) and install the package.

Currently available on TestPyPI:

pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ json-register-rust

Once published to PyPI:

pip install json-register-rust

Database Schema

Before using json-register, create the required table and index in your PostgreSQL database:

CREATE TABLE IF NOT EXISTS json_objects (
    id SERIAL PRIMARY KEY,
    json_object JSONB UNIQUE NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_json_objects_gin ON json_objects USING GIN (json_object);

The GIN index enables efficient containment and path queries on the JSONB column. You can customise the table name, the id column name, and the JSONB column name; ensure they match your Register / JsonRegister configuration.
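If you customise these names, it can help to generate the DDL from the same values you pass to the register. The Python sketch below does that and rejects anything that is not a plain identifier. The helpers `validate_identifier` and `build_schema_sql` are hypothetical, and the validation rule (ASCII letters, digits, underscores, at most 63 characters) is an assumption for illustration; it is not necessarily the rule the library applies.

```python
import re

# Assumed rule for this sketch: unquoted, ASCII-only identifiers.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_identifier(name):
    """Return name unchanged if it looks like a plain SQL identifier."""
    if not _IDENTIFIER.fullmatch(name) or len(name) > 63:
        raise ValueError(f"invalid SQL identifier: {name!r}")
    return name


def build_schema_sql(table_name, id_column, jsonb_column):
    """Build the CREATE TABLE statement for custom table and column names."""
    table = validate_identifier(table_name)
    id_col = validate_identifier(id_column)
    json_col = validate_identifier(jsonb_column)
    return (
        f"CREATE TABLE IF NOT EXISTS {table} (\n"
        f"    {id_col} SERIAL PRIMARY KEY,\n"
        f"    {json_col} JSONB UNIQUE NOT NULL\n"
        ");"
    )


print(build_schema_sql("json_objects", "id", "json_object"))
```

Identifiers cannot be bound as query parameters, which is why names like these are checked before being placed into SQL text.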

Usage

Rust Example

The following example demonstrates how to initialize the registry and register JSON objects using the Rust API.

use json_register::Register;
use serde_json::json;
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    // Configuration parameters
    let connection_string = "postgres://user:password@localhost:5432/dbname";
    let table_name = "json_objects";
    let id_column = "id";
    let jsonb_column = "json_object"; // must match the JSONB column in your schema
    let pool_size = 10;
    let lru_cache_size = 1000;

    // Initialize the register
    let register = Register::new(
        connection_string,
        table_name,
        id_column,
        jsonb_column,
        pool_size,
        lru_cache_size,
        None, // acquire_timeout_secs (defaults to 5)
        None, // idle_timeout_secs (defaults to 600)
        None, // max_lifetime_secs (defaults to 1800)
        None, // use_tls (defaults to false)
        None, // ca_cert_path (for private CAs)
    ).await?;

    // Register a single object
    let object = json!({
        "name": "Alice",
        "role": "Engineer",
        "active": true
    });

    let id = register.register_object(&object).await?;
    println!("Registered object with ID: {}", id);

    // Register a batch of objects
    let batch = vec![
        json!({"name": "Bob", "role": "Manager"}),
        json!({"name": "Charlie", "role": "Designer"}),
    ];

    let ids = register.register_batch_objects(&batch).await?;
    println!("Registered batch IDs: {:?}", ids);

    Ok(())
}

Python Example (Synchronous)

The following example demonstrates how to use the library within a Python application using the synchronous API.

from json_register import JsonRegister

def main():
    # Initialize the register
    register = JsonRegister(
        database_name="dbname",
        database_host="localhost",
        database_port=5432,
        database_user="user",
        database_password="password",
        lru_cache_size=1000,
        table_name="json_objects",
        id_column="id",
        jsonb_column="json_object",  # must match the JSONB column in your schema
        pool_size=10
    )

    # Register a single object
    obj = {
        "name": "Alice",
        "role": "Engineer",
        "active": True
    }

    obj_id = register.register_object(obj)
    print(f"Registered object with ID: {obj_id}")

    # Register a batch of objects
    batch = [
        {"name": "Bob", "role": "Manager"},
        {"name": "Charlie", "role": "Designer"}
    ]

    batch_ids = register.register_batch_objects(batch)
    print(f"Registered batch IDs: {batch_ids}")

if __name__ == "__main__":
    main()

Python Example (Asynchronous)

For async Python applications (FastAPI, aiohttp, etc.), use the async variants to avoid blocking the event loop.

from json_register import JsonRegister
import asyncio

async def main():
    # Initialize the register (constructor is synchronous)
    register = JsonRegister(
        database_name="dbname",
        database_host="localhost",
        database_port=5432,
        database_user="user",
        database_password="password",
        lru_cache_size=1000,
        table_name="json_objects",
        id_column="id",
        jsonb_column="json_object",  # must match the JSONB column in your schema
        pool_size=10
    )

    # Register a single object asynchronously
    obj = {
        "name": "Alice",
        "role": "Engineer",
        "active": True
    }

    obj_id = await register.register_object_async(obj)
    print(f"Registered object with ID: {obj_id}")

    # Register a batch of objects asynchronously
    batch = [
        {"name": "Bob", "role": "Manager"},
        {"name": "Charlie", "role": "Designer"}
    ]

    batch_ids = await register.register_batch_objects_async(batch)
    print(f"Registered batch IDs: {batch_ids}")

if __name__ == "__main__":
    asyncio.run(main())

Configuration

Timeout Parameters

Optional timeout parameters can be specified when initializing the register. All timeouts are in seconds.

  • acquire_timeout_secs: Timeout for acquiring a connection from the pool (default: 5)
  • idle_timeout_secs: Timeout before closing idle connections (default: 600)
  • max_lifetime_secs: Maximum lifetime of a connection (default: 1800)
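The way unset timeouts fall back to the defaults listed above can be sketched in Python as follows. `PoolTimeouts` and `resolve_timeouts` are hypothetical names used only for this illustration; the library resolves these values internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PoolTimeouts:
    """Timeouts in seconds, with the documented defaults."""
    acquire_timeout_secs: int = 5
    idle_timeout_secs: int = 600
    max_lifetime_secs: int = 1800


def resolve_timeouts(
    acquire: Optional[int] = None,
    idle: Optional[int] = None,
    max_lifetime: Optional[int] = None,
) -> PoolTimeouts:
    """Replace each None with its documented default."""
    defaults = PoolTimeouts()
    return PoolTimeouts(
        acquire_timeout_secs=defaults.acquire_timeout_secs if acquire is None else acquire,
        idle_timeout_secs=defaults.idle_timeout_secs if idle is None else idle,
        max_lifetime_secs=defaults.max_lifetime_secs if max_lifetime is None else max_lifetime,
    )


print(resolve_timeouts())            # all defaults: 5, 600, 1800
print(resolve_timeouts(acquire=10))  # only the acquire timeout overridden
```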

Rust Example with Custom Timeouts

let register = Register::new(
    connection_string,
    table_name,
    id_column,
    jsonb_column,
    pool_size,
    lru_cache_size,
    Some(10),   // 10 second acquire timeout
    Some(300),  // 5 minute idle timeout
    Some(3600), // 1 hour max lifetime
    None,       // use_tls
    None,       // ca_cert_path
).await?;

Python Example with Custom Timeouts

register = JsonRegister(
    database_name="dbname",
    database_host="localhost",
    database_port=5432,
    database_user="user",
    database_password="password",
    acquire_timeout_secs=10,   # 10 second acquire timeout
    idle_timeout_secs=300,     # 5 minute idle timeout
    max_lifetime_secs=3600,    # 1 hour max lifetime
)

TLS Configuration

The library supports TLS for secure database connections, including custom CA certificates for private/internal environments.

Public CA (AWS RDS, Cloud SQL, etc.)

register = JsonRegister(
    database_name="dbname",
    database_host="db.example.com",
    database_port=5432,
    database_user="user",
    database_password="password",
    use_tls=True
)

Private CA (On-Premises / Internal)

For environments where PostgreSQL uses certificates signed by an internal CA:

register = JsonRegister(
    database_name="dbname",
    database_host="db.internal",
    database_port=5432,
    database_user="user",
    database_password="password",
    use_tls=True,
    ca_cert_path="/etc/ssl/certs/internal-ca.pem"
)

The same configuration in Rust:

let register = Register::new(
    "postgres://user:password@db.internal:5432/dbname",
    "json_objects", "id", "json_object",
    10, 1000,
    None, None, None,
    Some(true),  // use_tls
    Some("/etc/ssl/certs/internal-ca.pem"),  // ca_cert_path
).await?;

When ca_cert_path is provided, TLS is automatically enabled (you don't need to also set use_tls=True, though it's recommended for clarity).
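The rule in the previous sentence can be written as a small decision function. This Python sketch uses a hypothetical helper, `tls_enabled`, and it assumes that a CA path takes precedence even when `use_tls` is explicitly false; the source does not say how the library treats that combination.

```python
from typing import Optional


def tls_enabled(use_tls: Optional[bool] = None, ca_cert_path: Optional[str] = None) -> bool:
    """Return True if the connection would use TLS under the documented rule."""
    if ca_cert_path is not None:
        # Providing a CA certificate implies TLS.
        return True
    # Otherwise TLS is off unless explicitly requested (documented default: false).
    return bool(use_tls)


print(tls_enabled())                                               # False
print(tls_enabled(use_tls=True))                                   # True
print(tls_enabled(ca_cert_path="/etc/ssl/certs/internal-ca.pem"))  # True
```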

Security Logging

The library emits structured warnings at connection time for security-sensitive configurations:

  • WARN when TLS is disabled (plaintext connections)
  • WARN when no password is configured
  • WARN when custom CA certificates are loaded (with cert count and path)
  • ERROR if the CA certificate file cannot be read or parsed

These warnings are emitted via tracing (Rust) and bridged to Python's logging module automatically.
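On the Python side, one way to act on these warnings is to attach a handler to the `json_register` logger and collect anything at WARNING level or above. In this sketch, `SecurityWarningCollector` is a hypothetical class and the messages are stand-ins logged by the example itself; the library's actual message text is not reproduced here.

```python
import logging


class SecurityWarningCollector(logging.Handler):
    """Keep WARNING-and-above records so they can be audited or alerted on."""

    def __init__(self):
        super().__init__(level=logging.WARNING)
        self.records = []

    def emit(self, record):
        self.records.append(record)


collector = SecurityWarningCollector()
logger = logging.getLogger("json_register")
logger.addHandler(collector)

# Stand-ins for what the library might log at connection time (assumed text).
logger.warning("TLS is disabled; connection is plaintext")
logger.info("connection established")  # below WARNING, not collected

print([r.getMessage() for r in collector.records])
```

A deployment could fail fast, or raise an alert, if the collector is non-empty after the register has been constructed.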

Monitoring

The library provides comprehensive telemetry metrics for integration with monitoring systems such as Prometheus, OpenTelemetry, or custom logging. All metrics can be retrieved individually or as a complete snapshot.

Connection Pool Metrics

  • pool_size(): Total number of connections in the pool (idle and active)
  • idle_connections(): Number of idle connections available for use
  • active_connections(): Number of connections currently in use
  • is_closed(): Whether the connection pool is closed

Cache Metrics

  • cache_hits(): Total number of successful cache lookups
  • cache_misses(): Total number of unsuccessful cache lookups
  • cache_hit_rate(): Hit rate as a percentage (0.0 to 100.0)
  • cache_size(): Current number of items in the cache
  • cache_capacity(): Maximum cache capacity
  • cache_evictions(): Total number of items evicted from the cache
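The hit rate is a percentage derived from the hit and miss counters. A minimal Python sketch of that arithmetic follows; returning 0.0 when no lookups have happened yet is an assumption for this example, and the free function `cache_hit_rate` here is illustrative rather than the library's method.

```python
def cache_hit_rate(hits: int, misses: int) -> float:
    """Hit rate as a percentage in the range 0.0 to 100.0."""
    total = hits + misses
    if total == 0:
        return 0.0  # assumed convention when there have been no lookups
    return 100.0 * hits / total


print(cache_hit_rate(3, 1))  # 75.0
print(cache_hit_rate(0, 0))  # 0.0
```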

Database Metrics

  • db_queries_total(): Total number of database queries executed
  • db_query_errors(): Total number of failed database queries

Operation Metrics

  • register_single_calls(): Number of times register_object was called
  • register_batch_calls(): Number of times register_batch_objects was called
  • total_objects_registered(): Total number of objects registered across all calls

Telemetry Snapshot

The telemetry_metrics() method (Rust only) returns a complete snapshot of all metrics in a single call, which is useful for OpenTelemetry exporters.

Rust Monitoring Example

// Get all metrics at once (recommended for OpenTelemetry)
let metrics = register.telemetry_metrics();
println!("Cache: {}/{} items, {} evictions", metrics.cache_size, metrics.cache_capacity, metrics.cache_evictions);
println!("Cache performance: {} hits, {} misses ({:.2}% hit rate)",
    metrics.cache_hits, metrics.cache_misses, metrics.cache_hit_rate);
println!("Pool: {} total, {} active, {} idle",
    metrics.pool_size, metrics.active_connections, metrics.idle_connections);
println!("Database: {} queries, {} errors",
    metrics.db_queries_total, metrics.db_query_errors);
println!("Operations: {} objects registered ({} single + {} batch calls)",
    metrics.total_objects_registered, metrics.register_single_calls, metrics.register_batch_calls);

// Or query individual metrics
let hit_rate = register.cache_hit_rate();
let active = register.active_connections();

Python Monitoring Example

# Individual metrics
print(f"Cache: {register.cache_size()}/{register.cache_capacity()} items")
print(f"Cache evictions: {register.cache_evictions()}")
print(f"Active connections: {register.active_connections()}")
print(f"DB queries: {register.db_queries_total()}, errors: {register.db_query_errors()}")
print(f"Objects registered: {register.total_objects_registered()}")
print(f"Single calls: {register.register_single_calls()}, Batch calls: {register.register_batch_calls()}")
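Because telemetry_metrics() is Rust only, a Python application that wants a single snapshot can gather the individual metrics into a dictionary. The sketch below assumes the Python binding exposes every method named in the Monitoring section; `collect_metrics` and `StubRegister` are hypothetical, and the stub exists only so the example runs without a database.

```python
METRIC_METHODS = (
    "pool_size", "idle_connections", "active_connections", "is_closed",
    "cache_hits", "cache_misses", "cache_hit_rate", "cache_size",
    "cache_capacity", "cache_evictions",
    "db_queries_total", "db_query_errors",
    "register_single_calls", "register_batch_calls", "total_objects_registered",
)


def collect_metrics(register):
    """Call each metric method once and return the results keyed by name."""
    return {name: getattr(register, name)() for name in METRIC_METHODS}


class StubRegister:
    """Stand-in for JsonRegister: every metric method returns 0."""

    def __getattr__(self, name):
        if name in METRIC_METHODS:
            return lambda: 0
        raise AttributeError(name)


snapshot = collect_metrics(StubRegister())
print(len(snapshot))  # 15
print(snapshot["cache_size"], snapshot["db_queries_total"])  # 0 0
```

With a real register, the resulting dictionary could be handed to a Prometheus or OpenTelemetry exporter on a timer. Note that the values are read one call at a time, so they are not an atomic snapshot.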

Logging

The library uses the tracing crate for structured logging. Logs include connection info, cache hit/miss statistics, and batch sizes.

Rust

Use tracing-subscriber to see logs:

use tracing_subscriber::EnvFilter;

tracing_subscriber::fmt()
    .with_env_filter(EnvFilter::from_default_env())
    .init();

Set the RUST_LOG environment variable to control log levels:

# See debug logs from json-register
RUST_LOG=json_register=debug cargo run

# See trace logs (cache hits/misses)
RUST_LOG=json_register=trace cargo run

Python

Logs are automatically bridged to Python's logging module:

import logging

# Configure Python logging as usual
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s %(levelname)s %(name)s: %(message)s'
)

# Logs from json-register will appear with logger name 'json_register'
# You can also configure just the json_register logger:
logging.getLogger('json_register').setLevel(logging.DEBUG)

Log Levels

Level   Content
INFO    Connection events, configuration
DEBUG   Cache statistics, batch sizes, database queries
TRACE   Individual cache hits/misses (verbose)

License

This project is licensed under the Apache-2.0 License.
