A-Alpha Bio SDK for accessing Atlas datasets

aablocks

A-Alpha Bio SDK for accessing Atlas data blocks.

Installation

pip install aablocks

For DataFrame support:

pip install aablocks pandas   # For pandas
pip install aablocks polars   # For polars

Quick Start

Python

import aablocks as aa

# Login (opens browser)
aa.login()

# List data blocks
blocks = aa.list_blocks()

# Get data as pandas DataFrame
df = aa.get_data("ab1001")

CLI

# Login (opens browser)
> aablocks login

# List data blocks
> aablocks list

# Download a data block
> aablocks get ab1614 -o data.csv

# Download structure files
> aablocks structures ab1614

Python API

aa.login()

Authenticate with the Atlas. Opens browser for OAuth. No-op if already logged in.

import aablocks as aa
aa.login()

aa.logout()

Clear cached authentication token.

aa.logout()

aa.list_blocks(all_versions=False, format=None)

List all accessible data blocks.

Parameters:

Name          Type  Description
all_versions  bool  Return all versions (default: latest only)
format        str   "csv", "list", "pandas", or "polars" (default: config)

Returns: list[DataBlock] | str | DataFrame

# As pandas DataFrame
df = aa.list_blocks()

# As DataBlock objects
blocks = aa.list_blocks(format="list")
for d in blocks:
    print(f"{d.id}: {d.name}")

# All versions as polars
df = aa.list_blocks(all_versions=True, format="polars")

aa.get_details(block_id, version=None)

Get metadata for a specific block.

Parameters:

Name      Type  Description
block_id  str   Data block ID (e.g., "ab1001")
version   str   Version (default: latest)

Returns: DataBlock

block = aa.get_details("ab1001")
print(block.name)
print(block.modes)  # ['source', 'ml']

aa.get_datacard(block_id, version=None)

Get the datacard content for a block.

Parameters:

Name      Type  Description
block_id  str   Data block ID
version   str   Version (default: latest)

Returns: dict (see OpenAPI DatacardResponse schema)

datacard = aa.get_datacard("ab1001")
print(datacard["exec"]["scientific_value"])
for uc in datacard["exec"]["use_cases"]:
    print(uc["title"], uc["description"])

aa.get_data(block_id, version=None, mode=None, format=None, output_path=None, output_compressed=False, progress=None, schema=True, load_negatives=False)

Download data for a data block. When schema=True (default), column types are automatically applied when reading as pandas or polars — no manual schema handling needed.

Parameters:

Name               Type  Description
block_id           str   Data block ID (e.g., "ab1001")
version            str   Version (default: latest)
mode               str   Data mode: "source" or "ml"
format             str   "csv", "pandas", or "polars"
output_path        str   Write to file instead of returning
output_compressed  bool  Keep gzip compression
progress           bool  Show progress bar
schema             bool  Apply column types automatically (default: True)
load_negatives     bool  Keep negative control rows (default: False). Set True to include them.

Returns: str | DataFrame | None

# Pandas DataFrame with proper types (Int64, Float64, boolean, string)
df = aa.get_data("ab1001")
print(df.dtypes)

# Polars DataFrame with proper types
df = aa.get_data("ab1001", format="polars")
print(df.schema)

# ML-ready data
df = aa.get_data("ab1001", mode="ml")

# Include negative control rows (ANeg/AlphaNeg)
df = aa.get_data("ab1001", load_negatives=True)

# Skip schema (raw pandas inference)
df = aa.get_data("ab1001", schema=False)

# Download to file
aa.get_data("ab1001", output_path="data.csv")

# Download compressed
aa.get_data("ab1001", output_path="data.csv.gz", output_compressed=True)
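
Assuming a file saved with output_compressed=True is an ordinary gzip-compressed CSV, it can be read back with the standard library alone. The sketch below writes a tiny stand-in file and reads it again; read_gzipped_csv is a hypothetical helper (not part of aablocks), and the rows are invented for illustration.

```python
import csv
import gzip
import os
import tempfile


def read_gzipped_csv(path):
    """Return the rows of a gzip-compressed CSV file as a list of dicts."""
    # "rt" opens the compressed stream in text mode; newline="" is what csv expects.
    with gzip.open(path, "rt", newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Build a small stand-in for a downloaded data.csv.gz (values are invented).
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "data.csv.gz")
with gzip.open(sample_path, "wt", newline="", encoding="utf-8") as handle:
    writer = csv.writer(handle)
    writer.writerow(["mata_description", "pos_a"])
    writer.writerow(["variant_1", "12"])
    writer.writerow(["variant_2", ""])

rows = read_gzipped_csv(sample_path)
print(rows[0]["mata_description"], len(rows))
```

Reading this way yields plain strings for every field, so no column types are applied.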

aa.download_structures(block_id, version=None, output_path=None, progress=None)

Download all structure files for a data block as a zip archive.

Parameters:

Name         Type  Description
block_id     str   Data block ID (e.g., "ab1001")
version      str   Version (default: latest)
output_path  str   Output file path (default: filename from server)
progress     bool  Show progress bar (default: config)

Returns: str (output file path)

# Download structures (filename from server)
path = aa.download_structures("ab1614")

# Download to specific path
path = aa.download_structures("ab1614", output_path="structures.zip")
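
Because the structures arrive as a zip archive, the standard zipfile module is enough to inspect what was downloaded. The sketch below is a minimal illustration: list_structure_files is a hypothetical helper, and the archive is built in memory with invented member names so the example runs without an Atlas login.

```python
import io
import zipfile


def list_structure_files(archive):
    """Return the sorted names of all non-directory members of a zip archive."""
    with zipfile.ZipFile(archive) as zf:
        return sorted(info.filename for info in zf.infolist() if not info.is_dir())


# Stand-in archive built in memory; a real call would pass the path returned
# by aa.download_structures(...). The member names here are invented.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("structure_a.cif", "data_a\n")
    zf.writestr("structure_b.cif", "data_b\n")
buffer.seek(0)

names = list_structure_files(buffer)
print(names)
```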

aa.download_structure(block_id, filename, version=None, output_path=None)

Download a single structure file for a block.

Parameters:

Name         Type  Description
block_id     str   Data block ID (e.g., "ab1614")
filename     str   Structure filename (e.g., "structure.cif")
version      str   Version (default: latest)
output_path  str   Write to file. If None, returns content as string.

Returns: str (file content or output file path)

# Get structure content as string
content = aa.download_structure("ab1614", "structure.cif")

# Download to file
path = aa.download_structure("ab1614", "structure.cif", output_path="my_file.cif")

aa.set_config(key, value)

Set a configuration value.

Key         Description                          Default
api_format  Output format (csv, pandas, polars)  "pandas"
progress    Show download progress               True

aa.set_config("api_format", "polars")
aa.set_config("progress", False)

DataBlock

Data block metadata container returned by list_blocks(format="list") and get_details().

Attribute             Type         Description
id                    str          Data block identifier
name                  str          Human-readable name
experiment            str          Overview/use case
details               str          Experimental details
modes                 list[str]    Data modes (e.g., ["source", "ml"])
version               str          Version number
locked                bool         Locked for current user's tier
url                   str | None   Direct URL
structure_count       int | None   Number of structure files
a_size                int | None   A-library size
alpha_size            int | None   Alpha-library size
total_ppi_count       int | None   Total PPI count
unique_ppi_count      int | None   Unique PPI count
product               str | None   Product slug (e.g., atlas-vhh)
product_display_name  str | None   Human-readable product name (e.g., Atlas VHH Consortia)
product_kind          str | None   Product category: consortium, exclusive, or open-source
source                str | None   Product release name (e.g., VHH Q1 2025); None when unreleased
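
The attributes above make it straightforward to filter a block list before downloading anything. The sketch below uses a stand-in dataclass (BlockStub, hypothetical) carrying only a few of the listed attributes, plus invented records, so it runs without the SDK; with real data the list would come from aa.list_blocks(format="list").

```python
from dataclasses import dataclass, field


@dataclass
class BlockStub:
    """Stand-in with a few of the DataBlock attributes listed above."""
    id: str
    name: str
    modes: list = field(default_factory=list)
    locked: bool = False


def usable_blocks(blocks, mode):
    """Keep blocks that are not locked and that offer the requested data mode."""
    return [b for b in blocks if not b.locked and mode in b.modes]


# Invented sample records (including the locked flag), for illustration only.
sample = [
    BlockStub("ab1001", "AlphaBlock 1001", ["source", "ml"]),
    BlockStub("ab1479", "AlphaBlock 1479", ["source"]),
    BlockStub("ab1614", "AlphaBlock 1614", ["source", "ml"], locked=True),
]

ml_ready = usable_blocks(sample, "ml")
print([b.id for b in ml_ready])
```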

CLI

After installation, the aablocks command is available in your terminal.

Global Options

Option     Description
--version  Show version and exit
--help     Show help and exit

aablocks login

Log in to the Atlas. Opens your browser for authentication. Tokens are cached locally and automatically refreshed.

> aablocks login
Opening browser for authentication...
Logged in successfully.

aablocks logout

Log out and clear cached credentials.

> aablocks logout
Logged out successfully.

aablocks list [OPTIONS]

List all data blocks accessible to the current user.

Option          Description
--all-versions  Include all versions of each data block
-f, --format    Output format: table, csv, or json (default: table)

# List as table (default); columns: ID, Name, Version, Released, Structures, A Size, Alpha Size, Total PPI, Unique PPI, Modes
> aablocks list
ID           Name                           Version  Released     Structures   A Size     Alpha Size Total PPI    Unique PPI   Modes
--------------------------------------------------------------------------------------------------------------------------------------
ab1001       AlphaBlock 1001                1        2026-01-21                500        200        100000       50000        source, ml
ab1479       AlphaBlock 1479                1        2026-01-21                800        300        240000       95000        source
ab1614       AlphaBlock 1614                1        2026-01-21   34570

# List as JSON
> aablocks list -f json

# List as CSV
> aablocks list -f csv

# Include all versions
> aablocks list --all-versions

aablocks details <block_id> [OPTIONS]

Show detailed metadata for a specific block.

Option         Description
-v, --version  Specific version to retrieve

> aablocks details ab1001
ID:           ab1001
Name:         AlphaBlock 1001
Version:      1
Released:     2026-01-21
Groups:       tier1
Modes:        source, ml
Experiment:   Local affinity landscape on VHH72-SARS-CoV-2 RBD

aablocks datacard <block_id> [OPTIONS]

Show the datacard for a block.

Option         Description
-v, --version  Specific version to retrieve
--raw          Output raw JSON instead of formatted text

# Display formatted datacard text
> aablocks datacard ab1001

# Get raw JSON
> aablocks datacard ab1001 --raw

# Save to file
> aablocks datacard ab1001 --raw > datacard.json
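
A saved datacard.json can then be processed with any JSON tool. The sketch below assumes the raw JSON has the same shape as the dict returned by aa.get_datacard() (an "exec" object with "scientific_value" and "use_cases"); summarize_datacard is a hypothetical helper and the text values are placeholders, not real datacard content.

```python
import json

# Stand-in for the contents of datacard.json; the text values are placeholders,
# and only the "exec" keys used elsewhere in this README are assumed to exist.
raw = """
{
  "exec": {
    "scientific_value": "placeholder summary",
    "use_cases": [
      {"title": "placeholder title", "description": "placeholder description"}
    ]
  }
}
"""


def summarize_datacard(datacard):
    """Return (scientific_value, [use-case titles]), tolerating missing keys."""
    section = datacard.get("exec", {})
    titles = [uc.get("title", "") for uc in section.get("use_cases", [])]
    return section.get("scientific_value", ""), titles


value, titles = summarize_datacard(json.loads(raw))
print(value, titles)
```

Using .get() with defaults keeps the helper from raising if a datacard lacks one of these keys.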

aablocks get <block_id> [OPTIONS]

Download CSV data for a block.

Option                    Description
-v, --version             Specific version to retrieve
-m, --mode                Data mode (e.g., source, ml). Default: source
-f, --format              Output format: csv, table, or gz
-o, --output              Output file path
--progress/--no-progress  Show download progress bar
--load-negatives          Include negative control rows (ANeg/AlphaNeg). Excluded by default.

# Print CSV to stdout
> aablocks get ab1001

# Download to file
> aablocks get ab1001 -o data.csv

# Download compressed
> aablocks get ab1001 -o data.csv.gz -f gz

# Get ML-ready variant
> aablocks get ab1001 --mode ml

# Include negative control rows
> aablocks get ab1001 --load-negatives

# Display as table
> aablocks get ab1001 -f table

# Pipe to other tools
> aablocks get ab1001 | head -100 > sample.csv

aablocks structures <block_id> [OPTIONS]

Download structure files for a block. By default, downloads all structures as a zip archive. Use --file to download a single structure file.

Option                    Description
-v, --version             Specific version to retrieve
-f, --file                Download a single structure file by name
-o, --output              Write to file at the given path. If no path is given, defaults to the --file filename or the server-provided filename.
--progress/--no-progress  Show download progress bar

# Download all structures as zip (filename from server)
> aablocks structures ab1614
Structures written to ab1614_structures.zip

# Download to specific path
> aablocks structures ab1614 -o my_structures.zip
Structures written to my_structures.zip

# Download a single structure file (prints content to stdout)
> aablocks structures ab1614 --file structure.cif

# Download a single structure file to disk
> aablocks structures ab1614 --file structure.cif -o
Structure written to structure.cif

# Download a single structure file to a specific path
> aablocks structures ab1614 --file structure.cif -o my_file.cif
Structure written to my_file.cif

aablocks config [key] [value]

Get or set configuration options.

# Show all settings
> aablocks config

# Get a specific value
> aablocks config cli_format

# Set a value
> aablocks config cli_format table

Key         Values               Default  Description
api_format  csv, pandas, polars  pandas   Default format for Python API
cli_format  csv, table           csv      Default format for CLI output
progress    true, false          true     Show download progress bars

Examples

Complete Python Workflow

import aablocks as aa

# Authenticate
aa.login()

# Browse data blocks
blocks = aa.list_blocks(format="list")
print(f"Found {len(blocks)} data blocks")

for d in blocks[:3]:
    print(f"{d.id}: {d.name} (v{d.version})")

# Get details
details = aa.get_details("ab1001")
print(f"Modes: {details.modes}")

# Download data (schema applied automatically)
df = aa.get_data("ab1001")
print(df.head())

# ML version
df_ml = aa.get_data("ab1001", mode="ml")

# Read datacard
datacard = aa.get_datacard("ab1001")
print(datacard["exec"]["scientific_value"])

# Download structures
aa.download_structures("ab1614")

Scripting

# List data block IDs only
> aablocks list -f csv | tail -n +2 | cut -d, -f1

# Download all accessible data blocks
> for id in $(aablocks list -f csv | tail -n +2 | cut -d, -f1); do
    aablocks get "$id" -o "${id}.csv"
done
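
The same ID extraction can be done in Python with the csv module, which also copes with quoted fields that contain commas. This is a minimal sketch: block_ids is a hypothetical helper, and the sample text is invented, so the real header and columns of aablocks list -f csv may differ.

```python
import csv
import io


def block_ids(csv_text):
    """Return the first-column value of every row after the header."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row, like `tail -n +2`
    return [row[0] for row in reader if row]


# Invented sample; the real output of `aablocks list -f csv` may look different.
sample = 'ID,Name,Version\nab1001,"AlphaBlock 1001, demo",1\nab1479,AlphaBlock 1479,1\n'
print(block_ids(sample))
```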

Schema

Each data block has a column schema that defines proper dtypes for pandas and polars. When using get_data(), schemas are applied automatically (schema=True by default).

Automatic Typing

import aablocks as aa

aa.login()

# Pandas — columns are typed as Int64, Float64, boolean, string
df = aa.get_data("ab1001")
print(df.dtypes)
# mata_description                       string
# alphaseq_affinity                      Float64
# above_background                       boolean
# pos_a                                  Int64

# Nullable integers stay Int64 (not float64) even with missing values
print(df["pos_a"].dtype)  # Int64

# Polars — columns are typed via schema_overrides
df = aa.get_data("ab1001", format="polars")
print(df.schema)
# {'mata_description': Utf8, 'alphaseq_affinity': Float64, 'pos_a': Int64, ...}

# Skip schema (raw inference)
df = aa.get_data("ab1001", schema=False)

Manual Schema Usage

# Fetch the schema separately
schema = aa.get_schema("ab1001")
# {'dtype': {'mata_description': 'string', 'pos_a': 'Int64', ...}, 'parse_dates': []}

# Use directly with pandas
import pandas as pd
df = pd.read_csv("local_data.csv", **schema)

# Convert to polars kwargs
polars_kwargs = aa.pandas_schema_to_polars(schema)
import polars as pl
df = pl.read_csv("local_data.csv", **polars_kwargs)
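
To see what the schema changes in practice, the sketch below reads the same small CSV twice with pandas: once with a schema dict in the shape shown above and once with plain inference. It requires pandas; the schema is abbreviated and the rows are invented, so it is an illustration rather than real block data.

```python
import io

import pandas as pd

# Schema in the shape returned by get_schema(); column list abbreviated for the example.
schema = {
    "dtype": {"mata_description": "string", "pos_a": "Int64", "above_background": "boolean"},
    "parse_dates": [],
}

# Invented rows; the second row has a missing pos_a value.
csv_text = "mata_description,pos_a,above_background\nvariant_1,12,True\nvariant_2,,False\n"

typed = pd.read_csv(io.StringIO(csv_text), **schema)
inferred = pd.read_csv(io.StringIO(csv_text))

print(typed["pos_a"].dtype)     # nullable integer column
print(inferred["pos_a"].dtype)  # plain inference falls back to a float column
```

With the schema, the missing value is kept as a null in an integer column; without it, pandas has to widen the column to floating point to represent the gap.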

CLI

# Show schema for default mode
> aablocks schema ab1001
{
  "dtype": {
    "mata_description": "string",
    "alphaseq_affinity": "Float64",
    "above_background": "boolean",
    "pos_a": "Int64"
  }
}

# Show schema for ML mode
> aablocks schema ab1001 -m ml
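
The JSON printed by aablocks schema is easy to post-process, for example to pick out the numeric columns. The sketch below parses the sample output shown above; columns_by_dtype is a hypothetical helper, and real blocks will generally have more columns than this excerpt.

```python
import json
from collections import defaultdict

# JSON copied from the `aablocks schema ab1001` sample output.
schema_json = """
{
  "dtype": {
    "mata_description": "string",
    "alphaseq_affinity": "Float64",
    "above_background": "boolean",
    "pos_a": "Int64"
  }
}
"""


def columns_by_dtype(schema):
    """Group column names by their declared dtype string."""
    groups = defaultdict(list)
    for column, dtype in schema.get("dtype", {}).items():
        groups[dtype].append(column)
    return dict(groups)


groups = columns_by_dtype(json.loads(schema_json))
numeric = sorted(groups.get("Float64", []) + groups.get("Int64", []))
print(numeric)
```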

License

Apache 2.0 — see LICENSE for details.
