
A Python package for Altastata data processing and machine learning integration


Altastata Python Package

Secure, encrypted cloud storage for Python — with fsspec, PyTorch/TensorFlow, boto3/S3, gRPC, and a bundled Web Console file manager.

pip install altastata

What you get

| Capability | How |
| --- | --- |
| Encrypted files in S3, Azure, IBM COS, etc. | AltaStataFunctions + account folder |
| Standard Python file APIs | fsspec (create_filesystem) |
| ML datasets | AltaStataPyTorchDataset, AltaStataTensorFlowDataset |
| S3 tools (boto3, aws CLI, s3fs) | Bundled S3 gateway on port 9876 |
| gRPC API + browser UI | Bundled gateway on port 9877 |
| Real-time share/delete events | gRPC EventsService or Py4J callback server |

Account folder

Each user keeps one directory under ~/.altastata/accounts/<display-name>/, for example:

amazon.rsa.bob123/
  altastata-myorg-bob123.user.properties   # cloud credentials (from your admin)
  private.key                              # RSA (password-encrypted PEM)
  public.key

| Account type | Key files in folder | Password at login |
| --- | --- | --- |
| RSA | private.key, public.key | Yes (decrypts private.key) |
| PQC | kyber_private.key, dilithium_private.key, … | Yes |
| HPCS | hpcs-privkey.blob, public.key, hpcs.marker | No (GREP11 on gateway) |
| HSM | *user.properties only (keys in cloud KMS/HSM) | No |

Your org admin creates *user.properties after you send them public.key (RSA/PQC/HPCS).
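The account-type table can be turned into a small pre-login check. The sketch below is a heuristic built only from the file names listed above; `guess_account_type` is a hypothetical helper, not part of the altastata package, and the real gateway may use different rules.

```python
from pathlib import Path


def guess_account_type(account_dir):
    """Return (account_type, needs_password) inferred from the files present.

    Heuristic based on the documented key-file names; not an altastata API.
    """
    names = {p.name for p in Path(account_dir).iterdir() if p.is_file()}
    if not any(n.endswith("user.properties") for n in names):
        raise ValueError(f"no *user.properties file in {account_dir}")
    # HPCS folders also contain public.key, so test the HPCS markers first.
    if "hpcs.marker" in names or "hpcs-privkey.blob" in names:
        return "HPCS", False
    if any(n.startswith(("kyber_", "dilithium_")) for n in names):
        return "PQC", True
    if "private.key" in names:
        return "RSA", True
    # Only *user.properties present: keys live in the cloud KMS/HSM.
    return "HSM", False
```

The second element of the result tells a caller whether to ask for a password or pass an empty string at login.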


Quick start (recommended: gRPC)

transport="grpc" starts the bundled Java gateway if it is not already running.

from altastata import AltaStataFunctions

# RSA / PQC — use your real password
f = AltaStataFunctions.from_account_dir(
    "/path/to/.altastata/accounts/amazon.rsa.bob123",
    transport="grpc",
    password="your_password",
)

# HPCS / HSM — empty password
f = AltaStataFunctions.from_account_dir(
    "/path/to/.altastata/accounts/amazon.rsa.hpcs.myuser",
    transport="grpc",
    password="",
)

versions = f.list_cloud_versions("Public/", True)
print(versions)
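To avoid hard-coding the password as in the quick start, it can be resolved at run time. This is a minimal sketch: `resolve_password` and `open_account` are hypothetical helpers, and `ALTASTATA_PASSWORD` is a variable name invented for this example, not one the package is known to read.

```python
import getpass
import os


def resolve_password(needs_password=True, env=None, prompt=getpass.getpass):
    """Return the login password without storing it in source code.

    ALTASTATA_PASSWORD is a hypothetical variable name used by this sketch.
    """
    if not needs_password:  # HPCS / HSM accounts log in with an empty password
        return ""
    env = os.environ if env is None else env
    value = env.get("ALTASTATA_PASSWORD")
    # Fall back to an interactive prompt when the variable is unset or empty.
    return value if value else prompt("AltaStata password: ")


def open_account(account_dir, needs_password=True):
    """Open an account over gRPC using the call shown in the quick start."""
    from altastata import AltaStataFunctions  # imported lazily

    return AltaStataFunctions.from_account_dir(
        str(account_dir),
        transport="grpc",
        password=resolve_password(needs_password),
    )
```

`open_account("/path/to/.altastata/accounts/amazon.rsa.bob123")` would then prompt for the password unless the environment variable is set.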

Ports (local gateway)

| Port | Service |
| --- | --- |
| 9877 | gRPC + Web Console (open http://127.0.0.1:9877) |
| 9876 | S3-compatible REST API |
| 25333 | Py4J (legacy in-process bridge) |

Start the gateway manually:

altastata-grpc-server
# same as: python -m altastata.grpc_server
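To see which of the local gateway ports are accepting connections, a plain TCP probe is enough. The sketch below uses only the standard library; `port_open` and `gateway_status` are hypothetical helper names, and an open port only shows that something is listening, not that it is the AltaStata gateway.

```python
import socket

# Port numbers taken from the ports table above.
GATEWAY_PORTS = {
    9877: "gRPC + Web Console",
    9876: "S3-compatible REST API",
    25333: "Py4J",
}


def port_open(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def gateway_status(ports=GATEWAY_PORTS):
    """Map each known gateway port to whether it is currently listening."""
    return {port: port_open(port) for port in ports}
```

Calling `gateway_status()` before and after running altastata-grpc-server gives a quick view of which services came up.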

When the wheel includes altastata/lib/altastata-console-static/, the launcher serves the AltaStata Console SPA on 9877. In the browser: Settings → choose your account folder → Sign in (HSM/HPCS: leave password blank).

Set ALTASTATA_WEB_UI_DIR to an empty value (ALTASTATA_WEB_UI_DIR=) to disable the UI and run the gateway as gRPC only.
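When the gateway is launched from Python rather than a shell, the same setting can be passed through the child process environment. This is a sketch under the assumption that altastata-grpc-server is on PATH and reads ALTASTATA_WEB_UI_DIR at startup; `gateway_env` and `start_gateway` are hypothetical helpers.

```python
import os
import subprocess


def gateway_env(disable_ui=True, base=None):
    """Return a copy of the environment, optionally with the Web UI disabled."""
    env = dict(os.environ if base is None else base)
    if disable_ui:
        env["ALTASTATA_WEB_UI_DIR"] = ""  # empty value = gRPC only
    return env


def start_gateway(disable_ui=True):
    """Launch altastata-grpc-server in the background and return the Popen handle."""
    return subprocess.Popen(["altastata-grpc-server"], env=gateway_env(disable_ui))
```

Building the environment in a separate function keeps the current process untouched and makes the setting easy to inspect before anything is started.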

HPCS in Docker / Jupyter

The gateway needs a populated grep11client.yaml (mount at /etc/ep11client/grep11client.yaml) and access to your hpcs-privkey.blob. See containers/jupyter/README-Docker.md and .cursor/rules in the mycloud repo for compose overrides.
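A short pre-flight check can catch a missing mount before login fails. The sketch below assumes it runs in the same container as the gateway, so the paths it sees are the ones the gateway sees; `hpcs_preflight` is a hypothetical helper and it only checks that the files exist, not that their contents are valid.

```python
from pathlib import Path

# Mount point named in the text above.
DEFAULT_GREP11_CONFIG = "/etc/ep11client/grep11client.yaml"


def hpcs_preflight(account_dir, config_path=DEFAULT_GREP11_CONFIG):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    config = Path(config_path)
    if not config.is_file():
        problems.append(f"missing {config}")
    elif config.stat().st_size == 0:
        problems.append(f"{config} is empty")
    blob = Path(account_dir) / "hpcs-privkey.blob"
    if not blob.is_file():
        problems.append(f"missing {blob}")
    return problems
```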


Legacy Py4J transport (default)

from altastata import AltaStataFunctions

f = AltaStataFunctions.from_account_dir("/path/to/account")
f.set_password("your_password")   # omit for HSM; HPCS uses GREP11 on gateway

Or inline credentials:

f = AltaStataFunctions.from_credentials(user_properties_text, private_key_pem)
f.set_password("your_password")

fsspec

from altastata import AltaStataFunctions
from altastata.fsspec import create_filesystem

f = AltaStataFunctions.from_account_dir("/path/to/account", transport="grpc", password="secret")
fs = create_filesystem(f, "my_account")

for name in fs.ls("Public/"):
    print(name)

with fs.open("Public/readme.txt", "r") as fh:
    print(fh.read())

Works with pandas, dask, LangChain DirectoryLoader, and other fsspec consumers.
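Because the filesystem object follows the fsspec interface, helpers can be written against `ls` and `open` alone. The sketch below collects the text of every matching file under a prefix; `read_text_files` is a hypothetical helper, and it assumes the AltaStata filesystem accepts the standard fsspec `detail=False` argument and returns paths that `open` accepts.

```python
def read_text_files(fs, prefix, suffix=".txt"):
    """Return {path: text} for entries under prefix whose name ends with suffix.

    fs is any object with fsspec-style ls(path, detail=False) and open(path, mode).
    """
    texts = {}
    for path in fs.ls(prefix, detail=False):
        if not path.endswith(suffix):
            continue  # skip directories and other file types
        with fs.open(path, "r") as fh:
            texts[path] = fh.read()
    return texts
```

With the `fs` created above, `read_text_files(fs, "Public/")` would return the contents of Public/readme.txt along with any other .txt files in that folder.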


S3-compatible API (boto3, aws CLI, s3fs)

The bundled Java gateway exposes an S3-compatible API on port 9876, backed by the same account as the Python API.

f = AltaStataFunctions.from_account_dir("/path/to/account", transport="grpc", password="secret")

# Convenience wrapper (requires: pip install boto3)
s3 = f.boto3_s3()
s3.put_object(Bucket="altastata-bucket", Key="hello.txt", Body=b"hi")

# Or pass creds to any S3 client
creds = f.s3_credentials()
# endpoint_url http://127.0.0.1:9876, aws_access_key_id, aws_secret_access_key, ...

f.install_aws_env()   # sets AWS_* for shell / !aws s3 ls in Jupyter

The S3 gateway is enabled by default in the Jupyter Docker image (ALTASTATA_SERVICES_S3GATEWAY_ENABLED=true).
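For the "pass creds to any S3 client" route, one way to build a boto3 client by hand is sketched below. It assumes `s3_credentials()` returns a dict-like object using the key names mentioned in the comment above; any other keys it may contain are unknown here, so the helper forwards only arguments that `boto3.client` is known to accept. `s3_client_kwargs` and `make_s3_client` are hypothetical names.

```python
# Keyword arguments accepted by boto3.client that are relevant for a custom endpoint.
BOTO3_CLIENT_KEYS = (
    "endpoint_url",
    "aws_access_key_id",
    "aws_secret_access_key",
    "aws_session_token",
    "region_name",
)


def s3_client_kwargs(creds):
    """Keep only the non-empty entries that boto3.client understands."""
    return {k: creds[k] for k in BOTO3_CLIENT_KEYS if creds.get(k)}


def make_s3_client(f):
    """Build a boto3 S3 client from an AltaStataFunctions instance."""
    import boto3  # requires: pip install boto3

    return boto3.client("s3", **s3_client_kwargs(f.s3_credentials()))
```

Filtering the keys keeps the call from failing with a TypeError if the credentials object carries extra fields.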


PyTorch & TensorFlow

from altastata import AltaStataFunctions, AltaStataPyTorchDataset, AltaStataTensorFlowDataset
from altastata.altastata_pytorch_dataset import register_altastata_functions_for_pytorch
from altastata.altastata_tensorflow_dataset import register_altastata_functions_for_tensorflow

f = AltaStataFunctions.from_account_dir("/path/to/account", transport="grpc", password="secret")
register_altastata_functions_for_pytorch(f, "my_account")

dataset = AltaStataPyTorchDataset("my_account", root_dir="Public/", file_pattern="*.jpg")

See examples/pytorch-example/ and examples/tensorflow-example/.


Event notifications

def on_event(name, data):
    print(name, data)

f = AltaStataFunctions.from_account_dir(
    "/path/to/account",
    enable_callback_server=True,
    callback_server_port=25334,
)
f.set_password("secret")
f.add_event_listener(on_event)

With transport="grpc", use the Web Console or gRPC EventsService.Watch for cross-user SHARE/DELETE notifications.

See examples/event-listener-example/.
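If events need to be handled outside the callback, a listener can hand them to a queue. This sketch assumes the callback may run on a different thread than the main program and that event names arrive as the strings "SHARE" and "DELETE"; both are assumptions, and `make_queue_listener` and `drain` are hypothetical helpers.

```python
import queue


def make_queue_listener(events):
    """Return an (name, data) callback that pushes each event onto a queue."""

    def on_event(name, data):
        events.put((name, data))  # Queue.put is thread-safe

    return on_event


def drain(events, wanted=("SHARE", "DELETE")):
    """Return all queued events whose name is in wanted, without blocking."""
    matched = []
    while True:
        try:
            name, data = events.get_nowait()
        except queue.Empty:
            return matched
        if name in wanted:
            matched.append((name, data))
```

Usage would be `events = queue.Queue()` followed by `f.add_event_listener(make_queue_listener(events))`, with the main loop calling `drain(events)` when convenient.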


Docker Jupyter (optional)

Pre-built images: ghcr.io/sergevil/altastata/jupyter-datascience-{arm64,amd64}:latest

cd containers/jupyter
docker compose -f docker-compose.yml -f docker-compose-ghcr.yml up -d

Full build/run guide: containers/jupyter/README-Docker.md


More documentation

  • Developers (build wheel, bundle JAR + Console SPA, PyPI): README-developer.md
  • gRPC design (LoginV2, account setup): mycloud/altastata-grpc/CONSOLE_ACCOUNT_SETUP_DESIGN.md
  • Examples: examples/

License

MIT License — see LICENSE.
