secureml

A Python library for privacy-preserving machine learning

SecureML is an open-source Python library that integrates with popular machine learning frameworks like TensorFlow and PyTorch. It provides developers with easy-to-use utilities that help AI agents handle sensitive data in compliance with data protection regulations.

Key Features

Data Anonymization Utilities:
- K-anonymity implementation with adaptive generalization
- Pseudonymization with format-preserving encryption
- Configurable data masking with statistical property preservation
- Hierarchical data generalization with taxonomy support
- Automatic sensitive data detection
Privacy-Preserving Training Methods:
- Differential privacy integration with PyTorch (via Opacus) and TensorFlow (via TF Privacy)
- Federated learning with Flower, allowing training on distributed data without centralization
- Support for secure aggregation and privacy-preserving federated learning
Compliance Checkers: Tools to analyze datasets and model configurations for potential privacy risks
Synthetic Data Generation:
- Multiple generation methods including statistical modeling, GANs, and copulas
- SDV integration with Gaussian Copula, CTGAN, and TVAE synthesizers
- Automatic sensitive data detection and special handling
- Preservation of statistical properties and correlations between variables
- Support for mixed data types (numeric, categorical, datetime)
- Configurable privacy-utility tradeoff controls
- Tabular data synthesis with relation preservation
Regulation-Specific Presets:
- Pre-configured YAML settings aligned with major regulations (GDPR, CCPA, HIPAA, LGPD)
- Detailed compliance requirements for each regulation
- Customizable identifiers for personal data and sensitive information
- Integration with compliance checking functionality
Audit Trails and Reporting:
- Comprehensive audit logging of data access, transformations, and model operations
- Detailed event tracking for privacy-related operations with timestamps and contexts
- Function-level auditing through decorators
- Automated compliance reports in HTML and PDF formats
- Visual dashboards with charts showing privacy metrics and event distributions
- Integration with compliance checkers for continuous monitoring
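
To illustrate the idea behind function-level auditing through decorators, here is a minimal sketch using only the standard library. It is not SecureML's decorator: the names `audited`, `AUDIT_LOG`, and `drop_column` are hypothetical, and the in-memory list stands in for whatever audit sink a real deployment would use.

```python
import functools
import json
import time

AUDIT_LOG = []  # hypothetical in-memory stand-in for a real audit sink


def audited(operation):
    """Record one audit event per call of the wrapped function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            event = {
                "operation": operation,
                "function": func.__name__,
                "timestamp": time.time(),
            }
            try:
                result = func(*args, **kwargs)
                event["status"] = "ok"
                return result
            except Exception as exc:
                event["status"] = "error"
                event["error"] = repr(exc)
                raise
            finally:
                # Runs on success and on failure, so every call is logged.
                AUDIT_LOG.append(event)
        return wrapper
    return decorator


@audited("data_transformation")
def drop_column(rows, column):
    """Remove one column from a list of row dictionaries."""
    return [{k: v for k, v in row.items() if k != column} for row in rows]


cleaned = drop_column([{"name": "Jane", "age": 45}], "name")
print(json.dumps(AUDIT_LOG[-1], indent=2))
```

Appending the event in a `finally` clause means failed operations leave a trace as well, which is usually what an audit trail needs.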

Installation

With pip (Python 3.11-3.12):

pip install secureml

Optional Dependencies

# For generating PDF reports for compliance and audit trails
pip install "secureml[pdf]"

# For secure key management with HashiCorp Vault
pip install "secureml[vault]"

# For all optional components
pip install "secureml[pdf,vault]"

Quick Start

Data Anonymization

Anonymizing a dataset to comply with privacy regulations:

import pandas as pd
from secureml import anonymize

# Load your dataset
data = pd.DataFrame({
    "name": ["John Doe", "Jane Smith", "Bob Johnson"],
    "age": [32, 45, 28],
    "email": ["john.doe@example.com", "jane.smith@example.com", "bob.j@example.com"],
    "ssn": ["123-45-6789", "987-65-4321", "456-78-9012"],
    "zip_code": ["10001", "94107", "60601"],
    "income": [75000, 82000, 65000]
})
    
# Anonymize using k-anonymity
anonymized_data = anonymize(
    data,
    method="k-anonymity",
    k=2,
    sensitive_columns=["name", "email", "ssn"]
)

print(anonymized_data)
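
To make the k-anonymity property concrete, the following sketch checks it by hand with the standard library. It does not use or reproduce SecureML's implementation: `generalize` and `smallest_group` are hypothetical helpers, and the generalization rules (10-year age bands, 3-digit ZIP prefixes) and the toy rows are assumptions chosen for illustration.

```python
from collections import Counter


def generalize(row):
    """Coarsen quasi-identifiers: 10-year age bands and 3-digit ZIP prefixes."""
    low = (row["age"] // 10) * 10
    return {
        "age": f"{low}-{low + 9}",
        "zip_code": row["zip_code"][:3] + "**",
    }


def smallest_group(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(counts.values())


rows = [
    {"age": 32, "zip_code": "10001"},
    {"age": 38, "zip_code": "10027"},
    {"age": 45, "zip_code": "94107"},
    {"age": 41, "zip_code": "94110"},
]

generalized = [generalize(r) for r in rows]
k = smallest_group(generalized, ["age", "zip_code"])
print(k)
```

A dataset is k-anonymous with respect to the chosen quasi-identifiers when the smallest group has at least k rows. The raw rows above are all unique, while the generalized rows fall into two groups of two.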

Compliance Checking with Regulation Presets

SecureML includes built-in presets for major regulations (GDPR, CCPA, HIPAA, LGPD) that define the compliance requirements specific to each regulation:

import pandas as pd
from secureml import check_compliance
    
# Load your dataset
data = pd.read_csv("your_dataset.csv")
    
# Model configuration
model_config = {
    "model_type": "neural_network",
    "input_features": ["age", "income", "zip_code"],
    "output": "purchase_likelihood",
    "training_method": "standard_backprop"
}
    
# Check compliance with GDPR
report = check_compliance(
    data=data,
    model_config=model_config,
    regulation="GDPR"
)
    
# View compliance issues
if report.has_issues():
    print("Compliance issues found:")
    for issue in report.issues:
        print(f"- {issue['component']}: {issue['issue']} ({issue['severity']})")
        print(f"  Recommendation: {issue['recommendation']}")
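
When a report contains many issues, it can help to review the most severe ones first. The sketch below groups issue dictionaries by severity. The dictionary keys match those read in the loop above, but the sample issues, the severity values ("high", "medium", "low"), and the `group_by_severity` helper are assumptions for illustration, not documented SecureML output.

```python
from collections import defaultdict

# Hypothetical issues, shaped like the dictionaries the loop above reads.
issues = [
    {"component": "model", "issue": "No privacy-preserving training",
     "severity": "medium", "recommendation": "Consider differential privacy"},
    {"component": "data", "issue": "Unmasked email column",
     "severity": "high", "recommendation": "Pseudonymize or drop the column"},
    {"component": "data", "issue": "Unmasked SSN column",
     "severity": "high", "recommendation": "Remove direct identifiers"},
]

# Assumed severity scale; unknown values sort last.
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def group_by_severity(issues):
    """Group issues by severity, most severe group first."""
    grouped = defaultdict(list)
    for issue in issues:
        grouped[issue["severity"]].append(issue)
    return dict(
        sorted(
            grouped.items(),
            key=lambda item: SEVERITY_ORDER.get(item[0], len(SEVERITY_ORDER)),
        )
    )


for severity, group in group_by_severity(issues).items():
    print(f"{severity}: {len(group)} issue(s)")
```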

Privacy-Preserving Machine Learning

Train a model with differential privacy guarantees:

import torch.nn as nn
import pandas as pd
from secureml import differentially_private_train
    
# Create a simple PyTorch model
model = nn.Sequential(
    nn.Linear(10, 64),
    nn.ReLU(),
    nn.Linear(64, 2),
    nn.Softmax(dim=1)
)
    
# Load your dataset
data = pd.read_csv("your_dataset.csv")
    
# Train with differential privacy
private_model = differentially_private_train(
    model=model,
    data=data,
    epsilon=1.0,  # Privacy budget
    delta=1e-5,   # Privacy delta parameter
    epochs=10,
    batch_size=64
)
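
For intuition about what differentially private training does at each step, here is a simplified sketch of the core of DP-SGD, the algorithm that Opacus and TF Privacy implement: clip each example's gradient, sum, add Gaussian noise, then average. This is not the code behind `differentially_private_train`; the function names are hypothetical, gradients are plain Python lists, and privacy accounting (turning the noise level into an epsilon) is omitted.

```python
import math
import random


def clip(grad, clip_norm):
    """Scale a gradient down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip per-example gradients, add Gaussian noise to their sum, and average."""
    clipped = [clip(g, clip_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # The noise scale is tied to the clipping bound, which caps each example's influence.
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [n / len(clipped) for n in noisy]


grads = [[3.0, 4.0], [0.3, 0.4]]  # toy per-example gradients
rng = random.Random(0)
print(private_mean_gradient(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single record can move the model, and the noise hides whatever influence remains. A larger noise multiplier gives stronger privacy at the cost of noisier updates.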

Synthetic Data Generation

Generate synthetic data that maintains the statistical properties of the original data:

import pandas as pd
from secureml import generate_synthetic_data
    
# Load your dataset
data = pd.read_csv("your_dataset.csv")
    
# Generate synthetic data
synthetic_data = generate_synthetic_data(
    template=data,
    num_samples=1000,
    method="statistical",  # Options: simple, statistical, sdv-copula, gan
    sensitive_columns=["name", "email", "ssn"]
)
    
print(synthetic_data.head())
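
As a rough illustration of generating data from fitted statistics, the sketch below fits a mean and standard deviation to each numeric column and samples new rows from independent normal distributions. This is a deliberately naive stand-in, not what SecureML's "statistical" method does: the helpers `fit_columns` and `sample_rows` and the toy values are hypothetical, and independent sampling discards the correlations between columns that copula or GAN-based methods aim to preserve.

```python
import random
import statistics


def fit_columns(rows):
    """Estimate (mean, standard deviation) for each numeric column."""
    params = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        params[column] = (statistics.mean(values), statistics.stdev(values))
    return params


def sample_rows(params, num_samples, rng):
    """Draw rows with each column sampled independently from a normal distribution."""
    return [
        {column: rng.gauss(mu, sigma) for column, (mu, sigma) in params.items()}
        for _ in range(num_samples)
    ]


# Toy values for illustration only.
real = [
    {"age": 32, "income": 75000},
    {"age": 45, "income": 82000},
    {"age": 28, "income": 65000},
    {"age": 51, "income": 91000},
    {"age": 39, "income": 70000},
]

params = fit_columns(real)
synthetic = sample_rows(params, num_samples=1000, rng=random.Random(42))
print(synthetic[0])
```

No synthetic row corresponds to a real record, but the per-column means and spreads stay close to the originals.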

Documentation

For detailed documentation, examples, and API reference, visit our documentation.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request or open an Issue. Our current focus is expanding support to regulations beyond GDPR, CCPA, HIPAA, and LGPD, and we would appreciate help with that.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Release history

- 0.3.1: Jun 27, 2025
- 0.3.0: Apr 17, 2025 (this version)
- 0.2.4: Apr 10, 2025
- 0.2.2: Apr 6, 2025
- 0.2.1: Apr 2, 2025
- 0.1.9a0 (pre-release): Apr 2, 2025
- 0.1.8a0 (pre-release): Apr 2, 2025
- 0.1.7a0 (pre-release): Apr 2, 2025
- 0.1.6a0 (pre-release): Apr 2, 2025
- 0.1.5a0 (pre-release): Apr 2, 2025
- 0.1.4a0 (pre-release): Apr 2, 2025
- 0.1.3a0 (pre-release): Apr 2, 2025
- 0.1.2a0 (pre-release): Apr 2, 2025
- 0.1.1a0 (pre-release): Apr 2, 2025
- 0.1.0: Apr 2, 2025

Download files

Source Distribution

secureml-0.3.0.tar.gz (87.6 kB)

Uploaded Apr 17, 2025 Source

Built Distribution

secureml-0.3.0-py3-none-any.whl (95.6 kB)

Uploaded Apr 17, 2025 Python 3

File details

Details for the file secureml-0.3.0.tar.gz.

File metadata

Download URL: secureml-0.3.0.tar.gz
Upload date: Apr 17, 2025
Size: 87.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

Hashes for secureml-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`96d22fea878099464fe3483d955b4d6bd5115bc2f0ed79fca5d1b7a1312c37ad`
MD5	`d1a92a86f2a7cc158de4ee024b1d5d85`
BLAKE2b-256	`d65c5a9b326083cc5ca38b02dd4ccf570747e647069adba2e83e7b1dfd550224`

File details

Details for the file secureml-0.3.0-py3-none-any.whl.

File metadata

Download URL: secureml-0.3.0-py3-none-any.whl
Upload date: Apr 17, 2025
Size: 95.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

Hashes for secureml-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f30565cb85696a29d504818689e938697eade8982121ef65ffe3d5f1fd7b64cc`
MD5	`faa0c52af8dfcd8a2b69fa883d8ad5bf`
BLAKE2b-256	`38e4c9429363973c4318982c1da43ec75349d4158d6c8cca84904198f4cab717`
