
A Python package for Altastata data processing and machine learning integration


Altastata Python Package

A powerful Python package for secure, encrypted cloud storage with seamless integration for data processing, AI, machine learning, and RAG applications.

Installation

pip install altastata

Features

  • fsspec filesystem interface - Use standard Python file operations with encrypted cloud storage
  • Real-time Event Notifications - Listen for file share, delete, and create events
  • LangChain Integration - Native support for document loaders and vector stores
  • PyTorch & TensorFlow Support - Custom datasets for machine learning workflows
  • Multi-cloud Support - Works with AWS, Azure, GCP, and more
  • End-to-end Encryption - AES-256 encryption with zero-trust architecture

Quick Start

from altastata import AltaStataFunctions, AltaStataPyTorchDataset, AltaStataTensorFlowDataset
from altastata.altastata_tensorflow_dataset import register_altastata_functions_for_tensorflow
from altastata.altastata_pytorch_dataset import register_altastata_functions_for_pytorch

# Configuration parameters
user_properties = """#My Properties
#Sun Jan 05 12:10:23 EST 2025
AWSSecretKey=*****
AWSAccessKeyId=*****
myuser=bob123
accounttype=amazon-s3-secure
................................................................
region=us-east-1"""

private_key = """-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: DES-EDE3,F26EBECE6DDAEC52

poe21ejZGZQ0GOe+EJjDdJpNvJcq/Yig9aYXY2rCGyxXLGVFeYJFg7z6gMCjIpSd
................................................................
wV5BUmp5CEmbeB4r/+BlFttRZBLBXT1sq80YyQIVLumq0Livao9mOg==
-----END RSA PRIVATE KEY-----"""

# Create an instance of AltaStataFunctions
altastata_functions = AltaStataFunctions.from_credentials(user_properties, private_key)
altastata_functions.set_password("my_password")

# Register the altastata functions for PyTorch or TensorFlow as a custom dataset
register_altastata_functions_for_pytorch(altastata_functions, "bob123_rsa")
register_altastata_functions_for_tensorflow(altastata_functions, "bob123_rsa")

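# root_dir, pattern, transform, and preprocess_fn below are placeholders;
# define them for your own data before running this snippet.

# For PyTorch applications
torch_dataset = AltaStataPyTorchDataset(
    "bob123_rsa",
    root_dir=root_dir,
    file_pattern=pattern,
    transform=transform
)

# For TensorFlow applications
tensorflow_dataset = AltaStataTensorFlowDataset(
    "bob123_rsa",  # same identifier passed to the register call above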
    root_dir=root_dir,
    file_pattern=pattern,
    preprocess_fn=preprocess_fn
)
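
The Quick Start embeds the properties text, private key, and password directly in the source. A minimal sketch of keeping them out of the code is shown below. The helper names, the file paths, and the `ALTASTATA_PASSWORD` environment variable are hypothetical and are not part of the altastata package; only `from_credentials` and `set_password` come from the example above.

```python
import os
from pathlib import Path
from typing import Tuple


def load_credentials(properties_path: str, key_path: str) -> Tuple[str, str]:
    """Read the user properties and the private key from two local files."""
    user_properties = Path(properties_path).read_text(encoding="utf-8")
    private_key = Path(key_path).read_text(encoding="utf-8")
    return user_properties, private_key


def get_password(env_var: str = "ALTASTATA_PASSWORD") -> str:
    """Fetch the account password from an environment variable (hypothetical name)."""
    password = os.environ.get(env_var)
    if password is None:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return password


# Intended use with the Quick Start calls (file names are placeholders):
# user_properties, private_key = load_credentials("bob123.properties", "bob123.pem")
# altastata_functions = AltaStataFunctions.from_credentials(user_properties, private_key)
# altastata_functions.set_password(get_password())
```

Both helpers return plain strings, which is the form the Quick Start passes to `from_credentials` and `set_password`.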

fsspec Integration

Altastata implements the fsspec filesystem interface, so libraries that accept an fsspec filesystem or an ordinary Python file object can read from and write to encrypted storage:

from altastata import AltaStataFunctions
from altastata.fsspec import create_filesystem

# Create AltaStata connection
altastata_functions = AltaStataFunctions.from_account_dir('/path/to/account')
altastata_functions.set_password("your_password")

# Create fsspec filesystem
fs = create_filesystem(altastata_functions, "my_account")

# Use it like any Python file system
files = fs.ls("Public/")
with fs.open("Public/Documents/file.txt", "r") as f:
    content = f.read()
    print(content)

This means you can use Altastata with pandas, dask, xarray, and other fsspec-aware libraries by handing them the filesystem or a file object opened from it.
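
As an illustration of the file-object route, the sketch below parses a CSV file with the standard library through `fs.open`. The helper name `read_csv_rows` and the example path are hypothetical; the only call it relies on is `fs.open(path, "r")`, used the same way as in the example above.

```python
import csv
from typing import Any, Dict, List


def read_csv_rows(fs: Any, path: str) -> List[Dict[str, str]]:
    """Read a CSV file through an fsspec-style filesystem into a list of dicts."""
    # fs.open(path, "r") returns a text-mode file object, as in the example above.
    with fs.open(path, "r") as f:
        return list(csv.DictReader(f))


# Intended use (path is a placeholder):
# rows = read_csv_rows(fs, "Public/Documents/data.csv")
```

Because the function only needs a file object, the same pattern should carry over to other readers that accept one, such as `json.load`.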

Event Listener

Get real-time notifications when file operations occur:

from altastata import AltaStataFunctions

# Event handler
def event_handler(event_name, data):
    print(f"📢 Event: {event_name}, Data: {data}")
    if event_name == "SHARE":
        print("File was shared!")
    elif event_name == "DELETE":
        print("File was deleted!")

# Initialize with callback server
altastata = AltaStataFunctions.from_account_dir(
    '/path/to/account',
    enable_callback_server=True,
    callback_server_port=25334
)
altastata.set_password("your_password")

# Register listener
listener = altastata.add_event_listener(event_handler)

# Events will now be delivered in real-time!
# See event-listener-example/ for complete demos

Perfect for:

  • Audit logging and compliance
  • Real-time sync and backup
  • Security monitoring
  • RAG vector store updates
  • Workflow automation

See event-listener-example/ for complete documentation and working examples.
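
For the audit-logging use case, one possible shape is a small router that forwards each event to handlers registered per event name. This is a sketch, not part of the package: `EventRouter` and `make_audit_logger` are hypothetical names, and the only things taken from the example above are the `(event_name, data)` handler signature and the `"SHARE"` and `"DELETE"` event names.

```python
import json
import time
from typing import Callable, Dict, List

Handler = Callable[[str, object], None]


class EventRouter:
    """Callable that forwards (event_name, data) to handlers registered per event."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def on(self, event_name: str, handler: Handler) -> None:
        """Register a handler for one event name."""
        self._handlers.setdefault(event_name, []).append(handler)

    def __call__(self, event_name: str, data: object) -> None:
        # Events with no registered handler are ignored.
        for handler in self._handlers.get(event_name, []):
            handler(event_name, data)


def make_audit_logger(log_path: str) -> Handler:
    """Return a handler that appends one JSON line per event to log_path."""

    def log_event(event_name: str, data: object) -> None:
        record = {"ts": time.time(), "event": event_name, "data": str(data)}
        with open(log_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    return log_event


# Intended use with the listener API shown above:
# router = EventRouter()
# router.on("SHARE", make_audit_logger("audit.jsonl"))
# router.on("DELETE", make_audit_logger("audit.jsonl"))
# listener = altastata.add_event_listener(router)
```

The router is itself a callable with the same two-argument signature as `event_handler`, so it can be passed wherever a plain function is expected.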

LangChain Integration

Use Altastata as a document source for LangChain applications:

from langchain.document_loaders import DirectoryLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from altastata.fsspec import create_filesystem
from altastata import AltaStataFunctions

# Create AltaStata connection
altastata_functions = AltaStataFunctions.from_account_dir('/path/to/account')
altastata_functions.set_password("your_password")

# Create fsspec filesystem
fs = create_filesystem(altastata_functions, "my_account")

# Use with LangChain document loaders
loader = DirectoryLoader("Public/Documents/", filesystem=fs)
documents = loader.load()

# Use with vector stores
vectorstore = FAISS.from_documents(documents, OpenAIEmbeddings())

Perfect for:

  • RAG (Retrieval-Augmented Generation) applications
  • Document processing pipelines
  • Knowledge base construction
  • Multi-modal AI applications

PyTorch & TensorFlow Integration

Altastata provides custom datasets for machine learning workflows:

from altastata import AltaStataFunctions, AltaStataPyTorchDataset
from altastata.altastata_pytorch_dataset import register_altastata_functions_for_pytorch

# Create AltaStata connection
altastata_functions = AltaStataFunctions.from_account_dir('/path/to/account')
altastata_functions.set_password("your_password")

# Register for PyTorch
register_altastata_functions_for_pytorch(altastata_functions, "my_account")

# Use as PyTorch dataset
dataset = AltaStataPyTorchDataset(
    "my_account",
    root_dir="Public/Documents/",
    file_pattern="*.txt",
    transform=your_transform  # placeholder: supply your own transform callable
)
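
The `file_pattern="*.txt"` argument looks like a shell-style glob. Assuming that is how it is interpreted (the package's exact matching rules are not stated here, including whether it matches the file name or the whole path), the sketch below previews which paths such a pattern would select. `preview_matches` is a hypothetical helper, not part of altastata.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def preview_matches(paths: Iterable[str], file_pattern: str) -> List[str]:
    """Return the paths whose final component matches a shell-style pattern."""
    # Match against the file name only; this is an assumption about the dataset.
    return [p for p in paths if fnmatchcase(p.rsplit("/", 1)[-1], file_pattern)]


# Intended use with a listing from the fsspec filesystem:
# candidates = preview_matches(fs.ls("Public/Documents/", detail=False), "*.txt")
```

Checking the pattern against a directory listing first can catch an empty dataset before a training run starts.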

See the full documentation for more examples and advanced usage.

This project is licensed under the MIT License - see the LICENSE file for details.

