
A simple collection type for homogeneous, immutable and ordered sequences.

Reason this release was yanked:

Package file missing

Project description

PureSet

PureSet is an immutable, ordered, and hashable collection type for Python. It enforces type homogeneity across its elements, so it can replace both sets and sequences in production code wherever a collection must hold values of a single, predictable type and shape.


Core Features

  • Immutability: Elements cannot be changed after creation, which protects data integrity.
  • Ordering: Retains insertion sequence, making it predictable for iteration, exporting, or display use cases.
  • Hashability: Collections of hashable objects are themselves hashable; can be used as dictionary keys.
  • Uniqueness: Removes duplicates according to standard Python object equality.
  • Type and Schema Homogeneity: All elements must share the same type and the same shape. For dicts and custom objects, shape means matching attribute/property names and types.
  • Performance: Optimized for high efficiency in membership, intersection, union, and set-like operations.
  • Signature Inspection: Provides a .signature property representing the canonical type/structure of the set’s contents, critical for debugging, API contracts, and documentation.
  • Universal Container: Works seamlessly with primitives, tuples, dicts, custom classes, and even mixed nested containers.
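
To make the first four properties concrete, here is a minimal sketch of a collection that is immutable, ordered, deduplicated, and hashable. It is a hypothetical stand-in written with the standard library only, not PureSet's implementation; the class name `OrderedFrozenSketch` is an assumption introduced for illustration, and it checks only the top-level type of each element.

```python
from typing import Any, Hashable, Iterator


class OrderedFrozenSketch:
    """Hypothetical stand-in for illustration; NOT PureSet's implementation."""

    __slots__ = ("_items",)

    def __init__(self, *items: Hashable) -> None:
        if items:
            expected = type(items[0])  # the first element sets the expected type
            for position, item in enumerate(items, start=1):
                if type(item) is not expected:
                    raise TypeError(
                        f"element {position} is {type(item).__name__}, "
                        f"expected {expected.__name__}"
                    )
        # dict.fromkeys drops duplicates while keeping first-seen order.
        object.__setattr__(self, "_items", tuple(dict.fromkeys(items)))

    def __setattr__(self, name: str, value: Any) -> None:
        raise AttributeError("instances are immutable")

    def __iter__(self) -> Iterator[Hashable]:
        return iter(self._items)

    def __len__(self) -> int:
        return len(self._items)

    def __contains__(self, item: object) -> bool:
        return item in self._items

    def __hash__(self) -> int:
        return hash(self._items)

    def __eq__(self, other: object) -> bool:
        return isinstance(other, OrderedFrozenSketch) and self._items == other._items


states = OrderedFrozenSketch("Pending", "Shipped", "Pending", "Delivered")
print(list(states))  # ['Pending', 'Shipped', 'Delivered']

# Hashable, so it can serve as a dictionary key.
lookup = {states: "ok"}
```

The sketch stores its elements in a tuple, which gives ordering and hashing for free; a real implementation would likely add a faster membership structure.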

Installation & Requirements

To install the PureSet package, simply use pip:

pip install pureset

  • Python Versions: Compatible with Python 3.9 and above.
  • Dependencies: Pure Python, with no external dependencies.

API Overview

This section presents expanded, realistic examples of PureSet in production-grade scenarios, demonstrating its capabilities beyond simple collections.

Real-World Usability

  • Contracts in APIs: Require or emit only valid structures to callers; enforce contract at runtime.
  • Data Pipelines (ETL): Guarantee all records are clean, normalized, and of valid shape before aggregation or transformation.
  • State Machines: Prevent illegal state transitions by checking membership in a PureSet of allowed values.
  • Unique Entity Sets: Model deduplicated entities (users, objects, configurations) with order preserved and structure enforced.
  • Distributed Computing: Share, serialize, or hash-combine validated and immutable data blocks across processes or systems.

PureSet's .signature is especially useful for audits, logging, and debugging mismatches. It can also be serialized for external schema verification.
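
As a rough illustration of what serializing a type signature for external verification could involve, the sketch below turns a value's structure into JSON. The helper `describe_shape` is hypothetical and uses only the standard library; the output format is an assumption for this example and is not the format returned by PureSet's `.signature`.

```python
import json
from typing import Any


def describe_shape(value: Any) -> Any:
    """Return a JSON-friendly description of a value's type and structure."""
    if isinstance(value, dict):
        # Describe each value under its key; keys are rendered as strings.
        return {"dict": {str(key): describe_shape(item) for key, item in value.items()}}
    if isinstance(value, (tuple, list)):
        # Describe each position of the sequence in order.
        return {type(value).__name__: [describe_shape(item) for item in value]}
    # Leaf values are described by their type name only.
    return type(value).__name__


record = {"id": 1, "name": "Alice Smith", "tags": ("admin", "ops")}
print(json.dumps(describe_shape(record), sort_keys=True))
# {"dict": {"id": "int", "name": "str", "tags": {"tuple": ["str", "str"]}}}
```

A description like this can be logged next to a rejected record or compared against a schema kept outside the Python process.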

from pureset import PureSet

1. Robust Enum Replacement and State Management

PureSet provides a type-safe, ordered, and immutable alternative for defining a finite set of states or options, offering clear advantages over traditional string literals or basic tuples. It's particularly useful for defining state machine transitions or valid configuration options.

# Define a set of valid order states for an e-commerce system
# The order guarantees a predictable sequence for UI display or reporting.
ORDER_STATES = PureSet("Pending", "Processing", "Shipped", "Delivered", "Cancelled")

def process_order_status_update(order_id: str, new_status: str) -> None:
    if new_status not in ORDER_STATES:
        raise ValueError(
            f"Invalid order status '{new_status}' for order {order_id}.\n"
            f"Allowed states are: [{ORDER_STATES.join(' | ')}]"
        )

    # In a real system, this would interact with a database or external service
    print(f"Order {order_id}: Status updated to '{new_status}'.")


# Simulate a valid status update
process_order_status_update("ORD12345", "Shipped")

# Simulate an invalid status update
try:
    process_order_status_update("ORD12346", "Returned")
except ValueError as e:
    print(e)
    # Invalid order status 'Returned' for order ORD12346. 
    # Allowed states are: [Pending | Processing | Shipped | Delivered | Cancelled]
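
The same membership check extends to state machine transitions. The sketch below uses plain tuples as stand-ins so that it runs without the package; any ordered collection that supports `in` could take their place. The names `ALLOWED_TRANSITIONS` and `can_transition`, and the transition rules themselves, are assumptions made up for illustration.

```python
# Hypothetical transition table: each state maps to the states it may move to.
ALLOWED_TRANSITIONS = {
    "Pending": ("Processing", "Cancelled"),
    "Processing": ("Shipped", "Cancelled"),
    "Shipped": ("Delivered",),
    "Delivered": (),
    "Cancelled": (),
}


def can_transition(current: str, new: str) -> bool:
    """Return True if moving from `current` to `new` is allowed."""
    # Unknown current states allow no transitions at all.
    return new in ALLOWED_TRANSITIONS.get(current, ())


print(can_transition("Pending", "Processing"))  # True
print(can_transition("Delivered", "Pending"))   # False
```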

2. Validating Homogeneity and Schema Consistency for Complex Data Structures

When a data pipeline or API handles collections of dictionaries or custom objects, every element must conform to the same schema. PureSet enforces structural consistency as well as type homogeneity, and raises a TypeError when an element's schema does not match. NOTE: PureSet always uses the first element as the reference against which every later element is validated. You can inspect that reference schema at any time through the .signature property.

# Define a PureSet of user profiles, each represented by a dictionary.
# PureSet ensures all dictionaries have the same keys and value types.
user_profiles = PureSet(
    {"id": 1, "name": "Alice Smith", "age": 28, "email": "alice@example.com"},
    {"id": 2, "name": "Bob Johnson", "age": 35, "email": "bob@example.com"},
)

# Attempt to build a PureSet in which one profile has a mismatched schema
# (here 'years_old' replaces 'age' and 'email' is missing)
try:
    mismatched_profiles = PureSet(
        {"id": 3, "name": "Charlie Brown", "age": 42, "email": "charlie@example.com"},
        {"id": 4, "name": "Diana Prince", "years_old": 30},  # Schema mismatch
    )
except TypeError as e:
    print(e)
    # Incompatible element type or shape at position 2:
    # Exp: (<class 'dict'>, {'age': <class 'int'>, 'email': <class 'str'>, 'id': <class 'int'>, 'name': <class 'str'>});
    # Got: (<class 'dict'>, {'id': <class 'int'>, 'name': <class 'str'>, 'years_old': <class 'int'>})


# Example with tuples: every tuple must have the same element types in the same positions.
data_points = PureSet((1, "x_coord", 10.5), (2, "y_coord", 20.3))

# Attempt to create a PureSet with inconsistent tuple element types
try:
    invalid_data_points = PureSet(
        (1, "x_coord", 10.5),
        (2, "y_coord", "invalid_value"),  # Type mismatch within tuple
    )
except TypeError as e:
    print(e)
    # Incompatible element type or shape at position 2:
    # Exp: (<class 'tuple'>, (<class 'int'>, <class 'str'>, <class 'float'>));
    # Got: (<class 'tuple'>, (<class 'int'>, <class 'str'>, <class 'str'>))
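
The first-element rule described in the note above can be approximated in plain Python. This is a minimal sketch under stated assumptions, not PureSet's actual validation logic: `shape_of` and `first_mismatch` are hypothetical helpers, and only dicts, tuples, and lists are inspected recursively.

```python
from typing import Any, Optional, Sequence


def shape_of(value: Any) -> Any:
    """Compute a comparable description of a value's type and structure."""
    if isinstance(value, dict):
        return (dict, {key: shape_of(item) for key, item in value.items()})
    if isinstance(value, (tuple, list)):
        return (type(value), tuple(shape_of(item) for item in value))
    return type(value)


def first_mismatch(elements: Sequence[Any]) -> Optional[int]:
    """Return the 1-based position of the first element whose shape differs
    from the first element's shape, or None if all shapes agree."""
    if not elements:
        return None
    expected = shape_of(elements[0])  # the first element acts as the validator
    for position, element in enumerate(elements[1:], start=2):
        if shape_of(element) != expected:
            return position
    return None


points = [(1, "x_coord", 10.5), (2, "y_coord", "invalid_value")]
print(first_mismatch(points))  # 2

profiles = [{"id": 1, "name": "Alice Smith"}, {"id": 2, "name": "Bob Johnson"}]
print(first_mismatch(profiles))  # None
```

Because dict shapes are compared as dicts, key order does not matter, but a renamed or missing key does.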

3. Layer Validation in ML/DL Model Pipelines or Validation of Nested Containers

PureSet can also validate nested containers such as sequences, matrix inputs, or data layers:

batch = PureSet(
    ([1.4, 2.8, 3.1], 'class_a'),
    ([0.9, 2.2, 3.5], 'class_b'),
)
print(batch.signature)
# Output: (tuple, ([float, float, float], str))

Testing


License

This project is released under the Apache License 2.0. Please review the LICENSE file for further details.


PureSet is engineered to give your Python data code the safety, transparency, and power required for production-scale scenarios—across API, analytics, ML, and system development!

