Python bindings for SILO - high-performance sequence database

LAPIS-SILO

Sequence Indexing engine for Large Order of genomic data

For information on how to build, test, and contribute to SILO, see Contributing.

Python Bindings

SILO provides Python bindings via Cython. The bindings wrap the core C++ Database and can be installed with pip install silodb.

See Contributing for build instructions.

Usage

from silodb import Database

# Create a new database
db = Database()

# Or load from a saved state
db = Database("/path/to/saved/database")

# Create a nucleotide sequence table
db.create_nucleotide_sequence_table(
    table_name="sequences",
    primary_key_name="id",
    sequence_name="main",
    reference_sequence="ACGT..."
)

# Append data from file
db.append_data_from_file("sequences", "/path/to/data.ndjson")

# Get reference sequence
ref = db.get_nucleotide_reference_sequence("sequences", "main")

# Get filtered bitmap (list of matching row indices)
indices = db.get_filtered_bitmap("sequences", "some_filter")

# Get prevalent mutations
mutations = db.get_prevalent_mutations(
    table_name="sequences",
    sequence_name="main",
    prevalence_threshold=0.05,
    filter_expression=""
)

# Save database state
db.save_checkpoint("/path/to/save/directory")

# Print all data (to stdout)
db.print_all_data("sequences")
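
The append_data_from_file call above reads an NDJSON file, that is, one JSON object per line. As a minimal sketch of producing such a file with the standard library: the helper write_ndjson and the record fields below are hypothetical and only illustrate the file format; the fields SILO actually expects depend on your table definition.

```python
import json
from pathlib import Path


def write_ndjson(records, path):
    """Write one JSON object per line (NDJSON) and return the number of lines written."""
    path = Path(path)
    count = 0
    with path.open("w", encoding="utf-8") as handle:
        for record in records:
            # json.dumps without indent emits a single line, as NDJSON requires.
            handle.write(json.dumps(record) + "\n")
            count += 1
    return count


# Hypothetical records: the real field names depend on your table definition.
example_records = [
    {"id": "sample-1", "main": "ACGT"},
    {"id": "sample-2", "main": "ACGA"},
]
```

The resulting path can then be passed as the second argument of db.append_data_from_file.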

Configuration Files

For SILO, there are three different configuration files:

  • DatabaseConfig, described in database_config.h.
  • PreprocessingConfig, used when SILO is started with preprocessing and described in preprocessing_config.h. For details, see silo preprocessing --help.
  • RuntimeConfig, used when SILO is started with api and described in runtime_config.h. For details, see silo api --help.

The database config contains the schema of the database and is always required when preprocessing data. The database config will be saved together with the output of the preprocessing and is therefore not required when starting SILO as an API.

An example configuration file can be seen in testBaseData/exampleDataset/database_config.yaml.

By default, the config files are expected to be YAML files in the current working directory in snake_case (database_config.yaml, preprocessing_config.yaml, runtime_config.yaml), but their location can be overridden using the options --database-config=X, --preprocessing-config=X, and --runtime-config=X.
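
The lookup rule above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not SILO's implementation; the function resolve_config_path and the dictionary DEFAULT_CONFIG_NAMES are hypothetical names.

```python
from pathlib import Path

# Default file names as listed in the text above.
DEFAULT_CONFIG_NAMES = {
    "database": "database_config.yaml",
    "preprocessing": "preprocessing_config.yaml",
    "runtime": "runtime_config.yaml",
}


def resolve_config_path(kind, override=None, cwd=None):
    """Return the override if given, else the default file name in the working directory."""
    if override is not None:
        return Path(override)
    base = Path(cwd) if cwd is not None else Path.cwd()
    return base / DEFAULT_CONFIG_NAMES[kind]
```

Here override stands for the value passed to an option such as --database-config=X.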

The preprocessing and runtime configurations have default values for all fields, so both files are optional. Their parameters can also be provided as command-line arguments in snake_case and as environment variables prefixed with SILO_ in capital SNAKE_CASE (e.g. SILO_INPUT_DIRECTORY).

The precedence is: CLI argument > environment variable > configuration file > default value.
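
To make the ordering concrete, here is a minimal sketch of how a single field could be resolved across the four sources. It is not SILO's code: the functions env_var_name and resolve_option are hypothetical, and each source is modelled as a plain dictionary.

```python
def env_var_name(field):
    """Map a snake_case field name to its SILO_-prefixed environment variable."""
    return "SILO_" + field.upper()


def resolve_option(field, cli_args, environ, config_file, defaults):
    """Return the value from the highest-precedence source that defines the field."""
    if field in cli_args:
        return cli_args[field]
    env_name = env_var_name(field)
    if env_name in environ:
        return environ[env_name]
    if field in config_file:
        return config_file[field]
    # Every field has a default, so this lookup is the final fallback.
    return defaults[field]
```

For example, a value set through SILO_INPUT_DIRECTORY wins over the same field in the configuration file, but loses to a command-line argument.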

Run The Preprocessing

The preprocessing runs as a program that reads the data to be processed from an input directory and writes the processed data to an output directory. Both directories need to be mounted into the container.

SILO expects a preprocessing config, which can be mounted to the default location /app/preprocessing_config.yaml.

Additionally, a database config and an ndjson file containing the data are required. They should typically be mounted in /preprocessing/input.

docker run \
  -v your/input/directory:/preprocessing/input \
  -v your/preprocessing/output:/preprocessing/output \
  -v your/preprocessing_config.yaml:/app/preprocessing_config.yaml \
  silo preprocessing
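
When the preprocessing is launched from a script, the same command can be assembled as an argument list. The following is a minimal sketch: the function preprocessing_command and the host paths are hypothetical, and the image name silo is taken from the command above.

```python
import shlex


def preprocessing_command(input_dir, output_dir, config_file, image="silo"):
    """Build the argument list for the preprocessing container run."""
    return [
        "docker", "run",
        "-v", f"{input_dir}:/preprocessing/input",
        "-v", f"{output_dir}:/preprocessing/output",
        "-v", f"{config_file}:/app/preprocessing_config.yaml",
        image, "preprocessing",
    ]


# Hypothetical host paths, used only to show the resulting command line.
command = preprocessing_command(
    "/data/input", "/data/output", "/data/preprocessing_config.yaml"
)
print(shlex.join(command))
```

The list can be passed directly to subprocess.run(command, check=True), which avoids shell quoting problems with paths that contain spaces.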

Both config files can also be provided in custom locations:

silo preprocessing --preprocessing-config=./custom/preprocessing_config.yaml --database-config=./custom/database_config.yaml

The Docker image contains a default preprocessing config that sets defaults specific to running SILO in Docker. Fields that neither the user-provided config nor the Docker default config specifies fall back to built-in default values. The user-provided preprocessing config can be used to override any of these defaults. For a full reference, see the help text.

Run docker container (api)

After preprocessing the data, the API can be started with the following command:

docker run \
  -p 8081:8081 \
  -v your/preprocessing/output:/data \
  silo api

The directory where SILO expects the preprocessing output can be overridden via silo api --data-directory=/custom/data/directory or in a corresponding configuration file.

Acknowledgments

Original genome indexing logic with roaring bitmaps by Prof. Neumann: https://db.in.tum.de/~neumann/gi/

Built Distributions

No source distribution files are available for this release. The following wheels are published for silodb 0.11.0:

  • silodb-0.11.0-cp314-cp314-manylinux_2_35_x86_64.whl (9.3 MB): CPython 3.14, manylinux (glibc 2.35+), x86-64
  • silodb-0.11.0-cp314-cp314-manylinux_2_35_aarch64.whl (8.6 MB): CPython 3.14, manylinux (glibc 2.35+), ARM64
  • silodb-0.11.0-cp314-cp314-macosx_11_0_arm64.whl (5.8 MB): CPython 3.14, macOS 11.0+, ARM64
  • silodb-0.11.0-cp314-cp314-macosx_10_15_x86_64.whl (6.7 MB): CPython 3.14, macOS 10.15+, x86-64
  • silodb-0.11.0-cp313-cp313-manylinux_2_35_x86_64.whl (9.3 MB): CPython 3.13, manylinux (glibc 2.35+), x86-64
  • silodb-0.11.0-cp313-cp313-manylinux_2_35_aarch64.whl (8.6 MB): CPython 3.13, manylinux (glibc 2.35+), ARM64
  • silodb-0.11.0-cp313-cp313-macosx_11_0_arm64.whl (5.8 MB): CPython 3.13, macOS 11.0+, ARM64
  • silodb-0.11.0-cp313-cp313-macosx_10_15_x86_64.whl (6.7 MB): CPython 3.13, macOS 10.15+, x86-64
  • silodb-0.11.0-cp312-cp312-manylinux_2_35_x86_64.whl (9.3 MB): CPython 3.12, manylinux (glibc 2.35+), x86-64
  • silodb-0.11.0-cp312-cp312-manylinux_2_35_aarch64.whl (8.6 MB): CPython 3.12, manylinux (glibc 2.35+), ARM64
  • silodb-0.11.0-cp312-cp312-macosx_11_0_arm64.whl (5.8 MB): CPython 3.12, macOS 11.0+, ARM64
  • silodb-0.11.0-cp312-cp312-macosx_10_15_x86_64.whl (6.7 MB): CPython 3.12, macOS 10.15+, x86-64
  • silodb-0.11.0-cp311-cp311-manylinux_2_35_x86_64.whl (9.3 MB): CPython 3.11, manylinux (glibc 2.35+), x86-64
  • silodb-0.11.0-cp311-cp311-manylinux_2_35_aarch64.whl (8.6 MB): CPython 3.11, manylinux (glibc 2.35+), ARM64
  • silodb-0.11.0-cp311-cp311-macosx_11_0_arm64.whl (5.8 MB): CPython 3.11, macOS 11.0+, ARM64
  • silodb-0.11.0-cp311-cp311-macosx_10_15_x86_64.whl (6.7 MB): CPython 3.11, macOS 10.15+, x86-64
