A Python library for registering JSON objects in PostgreSQL with canonicalisation and caching
json-register
Note: This library is currently in beta. The API is stable but may change in future releases based on user feedback and production usage.
json-register is a caching registry for JSON objects, backed by a PostgreSQL database that stores them in JSONB encoding. It ensures that semantically equivalent JSON objects are cached only once by canonicalising objects in the cache and relying on JSONB comparisons in the database. The database assigns a unique 32-bit integer identifier to each object.
This library is written in Rust and provides native bindings for Python, allowing for seamless integration into applications written in either language.
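To illustrate what canonicalisation buys, here is a minimal Python sketch. It is not the library's implementation (which is written in Rust); the function name `canonical_form` is hypothetical, and the approach (sorted keys, no insignificant whitespace) simply follows the description in the Features list.

```python
import json


def canonical_form(obj):
    """Serialise obj with sorted keys and no insignificant whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


a = {"name": "Alice", "role": "Engineer", "active": True}
b = {"active": True, "role": "Engineer", "name": "Alice"}  # same content, different key order

# Both spellings collapse to one canonical string, so they can share one cache entry.
assert canonical_form(a) == canonical_form(b)
```

Because the canonical string depends only on content, it can serve as a cache key regardless of how the caller happened to order the keys.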
Features
- Canonicalisation: JSON objects are canonicalised (keys sorted, whitespace removed) before storage to ensure uniqueness based on content.
- Caching: An in-memory Least Recently Used (LRU) cache minimizes database lookups for frequently accessed objects.
- PostgreSQL Integration: Efficiently stores and retrieves JSON data using PostgreSQL's JSONB type.
- Batch Processing: Supports batch registration of objects to reduce network round-trips and improve throughput.
- Cross-Language Support: Provides a native Rust API and a Python extension module.
- Security: SQL injection prevention through identifier validation and automatic password sanitization in error messages.
- Configurable Timeouts: Optional connection pool timeouts for acquire, idle, and maximum lifetime settings.
- Monitoring: Query methods for connection pool metrics and cache hit rate statistics.
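The Security bullet mentions password sanitization in error messages. As an illustration of the idea only (the library's actual masking logic is not shown in this README), the sketch below redacts the password component of a connection URL using the Python standard library. The function name `redact_password` and the `****` mask are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_password(connection_string, mask="****"):
    """Return the connection URL with its password replaced by a mask."""
    parts = urlsplit(connection_string)
    if parts.password is None:
        return connection_string  # nothing to hide

    # Rebuild the network location without the original password.
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username or ''}:{mask}@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


safe = redact_password("postgres://user:password@localhost:5432/dbname")
assert "password" not in safe
```

This sketch does not handle every URL shape (for example, bracketed IPv6 hosts), so treat it as a starting point rather than a drop-in sanitizer.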
Installation
Rust
Add the following to your Cargo.toml:
[dependencies]
json-register = "0.3.0"
tokio = { version = "1.0", features = ["full"] }
serde_json = "1.0"
Python
Ensure you have a compatible Python environment (3.8+) and install the package.
Currently available on TestPyPI:
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ json-register-rust
Once published to PyPI:
pip install json-register-rust
Database Schema
Before using json-register, create the required table and index in your PostgreSQL database:
CREATE TABLE IF NOT EXISTS json_objects (
id SERIAL PRIMARY KEY,
json_object JSONB UNIQUE NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_json_objects_gin ON json_objects USING GIN (json_object);
The GIN index enables efficient containment and path queries on the JSONB column. You can customise the table name, id column, and jsonb column names - just ensure they match your Register / JsonRegister configuration.
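If you do rename the table or columns, the DDL has to change with them. The following sketch renders the statements above for custom names and rejects anything that is not a plain identifier. The helper names and the identifier rule (ASCII letters, digits and underscores, not starting with a digit) are assumptions for illustration; the library's own validation rules may differ.

```python
import re

# Assumed rule: a conservative subset of valid unquoted PostgreSQL identifiers.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def checked_identifier(name):
    """Return name unchanged if it is a plain identifier, else raise ValueError."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"invalid SQL identifier: {name!r}")
    return name


def render_schema(table_name="json_objects", id_column="id", jsonb_column="json_object"):
    """Render the CREATE TABLE / CREATE INDEX statements for the given names."""
    table = checked_identifier(table_name)
    id_col = checked_identifier(id_column)
    json_col = checked_identifier(jsonb_column)
    return (
        f"CREATE TABLE IF NOT EXISTS {table} (\n"
        f"    {id_col} SERIAL PRIMARY KEY,\n"
        f"    {json_col} JSONB UNIQUE NOT NULL\n"
        f");\n"
        f"CREATE INDEX IF NOT EXISTS idx_{table}_gin ON {table} USING GIN ({json_col});"
    )
```

Calling `render_schema("documents", "doc_id", "payload")` yields DDL whose names can then be passed unchanged as `table_name`, `id_column` and `jsonb_column` when constructing the register.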
Usage
Rust Example
The following example demonstrates how to initialize the registry and register JSON objects using the Rust API.
use json_register::Register;
use serde_json::json;
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    // Configuration parameters
    let connection_string = "postgres://user:password@localhost:5432/dbname";
    let table_name = "json_objects";
    let id_column = "id";
    let jsonb_column = "json_object"; // must match the column created in the schema
    let pool_size = 10;
    let lru_cache_size = 1000;

    // Initialize the register
    let register = Register::new(
        connection_string,
        table_name,
        id_column,
        jsonb_column,
        pool_size,
        lru_cache_size,
        None, // acquire_timeout_secs (defaults to 5)
        None, // idle_timeout_secs (defaults to 600)
        None, // max_lifetime_secs (defaults to 1800)
        None, // use_tls (defaults to false)
        None, // ca_cert_path (for private CAs)
    ).await?;

    // Register a single object
    let object = json!({
        "name": "Alice",
        "role": "Engineer",
        "active": true
    });
    let id = register.register_object(&object).await?;
    println!("Registered object with ID: {}", id);

    // Register a batch of objects
    let batch = vec![
        json!({"name": "Bob", "role": "Manager"}),
        json!({"name": "Charlie", "role": "Designer"}),
    ];
    let ids = register.register_batch_objects(&batch).await?;
    println!("Registered batch IDs: {:?}", ids);

    Ok(())
}
Python Example (Synchronous)
The following example demonstrates how to use the library within a Python application using the synchronous API.
from json_register import JsonRegister


def main():
    # Initialize the register
    register = JsonRegister(
        database_name="dbname",
        database_host="localhost",
        database_port=5432,
        database_user="user",
        database_password="password",
        lru_cache_size=1000,
        table_name="json_objects",
        id_column="id",
        jsonb_column="json_object",  # must match the column created in the schema
        pool_size=10
    )

    # Register a single object
    obj = {
        "name": "Alice",
        "role": "Engineer",
        "active": True
    }
    obj_id = register.register_object(obj)
    print(f"Registered object with ID: {obj_id}")

    # Register a batch of objects
    batch = [
        {"name": "Bob", "role": "Manager"},
        {"name": "Charlie", "role": "Designer"}
    ]
    batch_ids = register.register_batch_objects(batch)
    print(f"Registered batch IDs: {batch_ids}")


if __name__ == "__main__":
    main()
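Batch registration reduces round-trips, but a very large list may be easier to handle in slices. The sketch below is one way to do that, assuming `register_batch_objects` returns one ID per input object in input order (the example above suggests this but does not state it). `register_in_chunks` and `chunk_size` are hypothetical names, not part of the library.

```python
def register_in_chunks(register, objects, chunk_size=500):
    """Register objects in fixed-size slices and return all IDs in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    ids = []
    for start in range(0, len(objects), chunk_size):
        chunk = objects[start:start + chunk_size]
        # One call per slice keeps each request to a bounded size.
        ids.extend(register.register_batch_objects(chunk))
    return ids
```

Any object with a compatible `register_batch_objects` method works here, which also makes the helper easy to exercise with a stub in unit tests.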
Python Example (Asynchronous)
For async Python applications (FastAPI, aiohttp, etc.), use the async variants to avoid blocking the event loop.
from json_register import JsonRegister
import asyncio


async def main():
    # Initialize the register (constructor is synchronous)
    register = JsonRegister(
        database_name="dbname",
        database_host="localhost",
        database_port=5432,
        database_user="user",
        database_password="password",
        lru_cache_size=1000,
        table_name="json_objects",
        id_column="id",
        jsonb_column="json_object",  # must match the column created in the schema
        pool_size=10
    )

    # Register a single object asynchronously
    obj = {
        "name": "Alice",
        "role": "Engineer",
        "active": True
    }
    obj_id = await register.register_object_async(obj)
    print(f"Registered object with ID: {obj_id}")

    # Register a batch of objects asynchronously
    batch = [
        {"name": "Bob", "role": "Manager"},
        {"name": "Charlie", "role": "Designer"}
    ]
    batch_ids = await register.register_batch_objects_async(batch)
    print(f"Registered batch IDs: {batch_ids}")


if __name__ == "__main__":
    asyncio.run(main())
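When many independent coroutines each register a single object, it can help to cap the number of calls in flight so they do not all queue on the connection pool at once. The following is a sketch under that assumption; `register_concurrently` and `max_in_flight` are hypothetical names, and the only library call used is `register_object_async` as shown above.

```python
import asyncio


async def register_concurrently(register, objects, max_in_flight=10):
    """Register objects concurrently, with at most max_in_flight calls active."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def register_one(obj):
        async with semaphore:  # wait here if the limit is reached
            return await register.register_object_async(obj)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(register_one(obj) for obj in objects))
```

A reasonable starting value for `max_in_flight` is the configured `pool_size`; for large lists, `register_batch_objects_async` is likely the better fit.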
Configuration
Timeout Parameters
Optional timeout parameters can be specified when initializing the register. All timeouts are in seconds.
- acquire_timeout_secs: Timeout for acquiring a connection from the pool (default: 5)
- idle_timeout_secs: Timeout before closing idle connections (default: 600)
- max_lifetime_secs: Maximum lifetime of a connection (default: 1800)
Rust Example with Custom Timeouts
let register = Register::new(
    connection_string,
    table_name,
    id_column,
    jsonb_column,
    pool_size,
    lru_cache_size,
    Some(10),   // 10 second acquire timeout
    Some(300),  // 5 minute idle timeout
    Some(3600), // 1 hour max lifetime
    None,       // use_tls
    None,       // ca_cert_path
).await?;
Python Example with Custom Timeouts
register = JsonRegister(
    database_name="dbname",
    database_host="localhost",
    database_port=5432,
    database_user="user",
    database_password="password",
    acquire_timeout_secs=10,  # 10 second acquire timeout
    idle_timeout_secs=300,    # 5 minute idle timeout
    max_lifetime_secs=3600,   # 1 hour max lifetime
)
TLS Configuration
The library supports TLS for secure database connections, including custom CA certificates for private/internal environments.
Public CA (AWS RDS, Cloud SQL, etc.)
register = JsonRegister(
    database_name="dbname",
    database_host="db.example.com",
    database_port=5432,
    database_user="user",
    database_password="password",
    use_tls=True
)
Private CA (On-Premises / Internal)
For environments where PostgreSQL uses certificates signed by an internal CA:
register = JsonRegister(
    database_name="dbname",
    database_host="db.internal",
    database_port=5432,
    database_user="user",
    database_password="password",
    use_tls=True,
    ca_cert_path="/etc/ssl/certs/internal-ca.pem"
)
let register = Register::new(
    "postgres://user:password@db.internal:5432/dbname",
    "json_objects", "id", "json_object",
    10, 1000,
    None, None, None,
    Some(true),                             // use_tls
    Some("/etc/ssl/certs/internal-ca.pem"), // ca_cert_path
).await?;
When ca_cert_path is provided, TLS is automatically enabled (you don't need to also set use_tls=True, though it's recommended for clarity).
Security Logging
The library emits structured warnings at connection time for security-sensitive configurations:
- WARN when TLS is disabled (plaintext connections)
- WARN when no password is configured
- WARN when custom CA certificates are loaded (with cert count and path)
- ERROR if the CA certificate file cannot be read or parsed
These warnings are emitted via tracing (Rust) and bridged to Python's logging module automatically.
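Since an unreadable or unparsable CA file only surfaces as an error at connection time, it may be convenient to check the file before constructing the register. The sketch below does this with Python's standard `ssl` module; it is independent of the library, `check_ca_bundle` is a hypothetical helper name, and passing the check does not guarantee that the library's own TLS stack will accept the same file.

```python
import ssl
from pathlib import Path


def check_ca_bundle(ca_cert_path):
    """Load a PEM CA bundle and return how many CA certificates it contains.

    Raises FileNotFoundError if the path is not a file, and ssl.SSLError
    if the file cannot be parsed as PEM certificates.
    """
    path = Path(ca_cert_path)
    if not path.is_file():
        raise FileNotFoundError(f"CA certificate file not found: {path}")

    # With cafile given, only certificates from that file are loaded.
    context = ssl.create_default_context(cafile=str(path))
    return len(context.get_ca_certs())
```

Running this at application start-up gives a clear failure before any database connection is attempted.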
Monitoring
The library provides comprehensive telemetry metrics for integration with monitoring systems such as Prometheus, OpenTelemetry, or custom logging. All metrics can be retrieved individually or as a complete snapshot.
Connection Pool Metrics
- pool_size(): Total number of connections in the pool (idle and active)
- idle_connections(): Number of idle connections available for use
- active_connections(): Number of connections currently in use
- is_closed(): Whether the connection pool is closed
Cache Metrics
- cache_hits(): Total number of successful cache lookups
- cache_misses(): Total number of unsuccessful cache lookups
- cache_hit_rate(): Hit rate as a percentage (0.0 to 100.0)
- cache_size(): Current number of items in the cache
- cache_capacity(): Maximum cache capacity
- cache_evictions(): Total number of items evicted from the cache
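To make these counters concrete, here is a small, self-contained LRU cache that tracks hits, misses and evictions in the same spirit. It is a teaching sketch, not the library's cache: the class name `CountingLRU` is hypothetical, and returning a hit rate of 0.0 before any lookup is an assumption about how the edge case is handled.

```python
from collections import OrderedDict


class CountingLRU:
    """A least-recently-used cache that counts hits, misses and evictions."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first
        self.hits = 0
        self.misses = 0
        self.evictions = 0

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        return None

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
            self.evictions += 1

    def size(self):
        return len(self._items)

    def hit_rate(self):
        """Hit rate as a percentage between 0.0 and 100.0."""
        lookups = self.hits + self.misses
        return 100.0 * self.hits / lookups if lookups else 0.0
```

A steadily rising eviction count together with a low hit rate is usually a sign that the cache capacity is too small for the working set.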
Database Metrics
- db_queries_total(): Total number of database queries executed
- db_query_errors(): Total number of failed database queries
Operation Metrics
- register_single_calls(): Number of times register_object was called
- register_batch_calls(): Number of times register_batch_objects was called
- total_objects_registered(): Total number of objects registered across all calls
Telemetry Snapshot
The telemetry_metrics() method (Rust only) returns a complete snapshot of all metrics in a single call, which is useful for OpenTelemetry exporters.
Rust Monitoring Example
// Get all metrics at once (recommended for OpenTelemetry)
let metrics = register.telemetry_metrics();
println!("Cache: {}/{} items, {} evictions",
    metrics.cache_size, metrics.cache_capacity, metrics.cache_evictions);
println!("Cache performance: {} hits, {} misses ({:.2}% hit rate)",
    metrics.cache_hits, metrics.cache_misses, metrics.cache_hit_rate);
println!("Pool: {} total, {} active, {} idle",
    metrics.pool_size, metrics.active_connections, metrics.idle_connections);
println!("Database: {} queries, {} errors",
    metrics.db_queries_total, metrics.db_query_errors);
println!("Operations: {} objects registered ({} single + {} batch calls)",
    metrics.total_objects_registered, metrics.register_single_calls, metrics.register_batch_calls);

// Or query individual metrics
let hit_rate = register.cache_hit_rate();
let active = register.active_connections();
Python Monitoring Example
# Individual metrics
print(f"Cache: {register.cache_size()}/{register.cache_capacity()} items")
print(f"Cache evictions: {register.cache_evictions()}")
print(f"Active connections: {register.active_connections()}")
print(f"DB queries: {register.db_queries_total()}, errors: {register.db_query_errors()}")
print(f"Objects registered: {register.total_objects_registered()}")
print(f"Single calls: {register.register_single_calls()}, Batch calls: {register.register_batch_calls()}")
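Because `telemetry_metrics()` is described as Rust only, a Python exporter has to assemble its own snapshot from the individual methods. The helper below is a sketch of that idea: `collect_metrics` and `METRIC_METHODS` are hypothetical names, and the list is limited to the methods used in the Python example above. Other metric methods from this section can be added if your installed version exposes them.

```python
# Metric methods taken from the Python monitoring example above.
METRIC_METHODS = (
    "cache_size",
    "cache_capacity",
    "cache_evictions",
    "active_connections",
    "db_queries_total",
    "db_query_errors",
    "total_objects_registered",
    "register_single_calls",
    "register_batch_calls",
)


def collect_metrics(register, names=METRIC_METHODS):
    """Call each named metric method and return the values as a dict."""
    return {name: getattr(register, name)() for name in names}
```

The resulting dictionary can be logged as-is or mapped onto gauges and counters in whichever monitoring client you use. Note that the values are read one call at a time, so the snapshot is not atomic.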
Logging
The library uses the tracing crate for structured logging. Logs include connection info, cache hit/miss statistics, and batch sizes.
Rust
Use tracing-subscriber to see logs:
use tracing_subscriber::EnvFilter;
tracing_subscriber::fmt()
    .with_env_filter(EnvFilter::from_default_env())
    .init();
Set the RUST_LOG environment variable to control log levels:
# See debug logs from json-register
RUST_LOG=json_register=debug cargo run
# See trace logs (cache hits/misses)
RUST_LOG=json_register=trace cargo run
Python
Logs are automatically bridged to Python's logging module:
import logging
# Configure Python logging as usual
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s %(levelname)s %(name)s: %(message)s'
)
# Logs from json-register will appear with logger name 'json_register'
# You can also configure just the json_register logger:
logging.getLogger('json_register').setLevel(logging.DEBUG)
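If you want the library's messages in a separate stream without raising the log level of the whole application, you can attach a dedicated handler to the `json_register` logger using only the standard `logging` module. This is a sketch; the helper name `attach_json_register_handler` is hypothetical, and it relies only on the logger name stated above.

```python
import logging


def attach_json_register_handler(stream, level=logging.INFO):
    """Send 'json_register' log records to stream with their own level."""
    logger = logging.getLogger("json_register")
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False  # keep these records out of the root logger's handlers
    return handler
```

For example, `attach_json_register_handler(sys.stderr, logging.DEBUG)` (after `import sys`) prints the library's debug output to standard error while the rest of the application keeps its existing logging configuration.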
Log Levels
| Level | Content |
|---|---|
| INFO | Connection events, configuration |
| DEBUG | Cache statistics, batch sizes, database queries |
| TRACE | Individual cache hits/misses (verbose) |
License
This project is licensed under the Apache-2.0 License.