Python bindings for SILO - high-performance sequence database

LAPIS-SILO

High-performance analytical database for sequence alignment data

For information on how to build, test, and contribute to SILO, see Contributing.

Python Bindings

SILO provides Python bindings via Cython. The bindings wrap the core C++ Database and can be installed with pip install silodb.

See Contributing for build instructions.

Usage

from silodb import Database

# Create a new database
db = Database()

# Or load from a saved state
db = Database("/path/to/saved/database")

# Create a nucleotide sequence table
db.create_nucleotide_sequence_table(
    table_name="sequences",
    primary_key_name="id",
    sequence_name="main",
    reference_sequence="ACGT..."
)

# Append data from file
db.append_data_from_file("sequences", "/path/to/data.ndjson")

# Get reference sequence
ref = db.get_nucleotide_reference_sequence("sequences", "main")

# Get filtered bitmap (list of matching row indices)
indices = db.get_filtered_bitmap("sequences", "some_filter")

# Get prevalent mutations
mutations = db.get_prevalent_mutations(
    table_name="sequences",
    sequence_name="main",
    prevalence_threshold=0.05,
    filter_expression=""
)

# Save database state
db.save_checkpoint("/path/to/save/directory")

# Print all data (to stdout)
db.print_all_data("sequences")
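
The usage example above appends data from an NDJSON file. NDJSON stores one JSON object per line, so such a file can be produced with the standard library alone. The sketch below is a minimal illustration; the record fields are placeholders and do not describe the input schema SILO actually requires, which depends on the table definition.

```python
import json
from pathlib import Path


def write_ndjson(records, path):
    """Write an iterable of dicts as NDJSON: one JSON object per line."""
    with Path(path).open("w", encoding="utf-8") as handle:
        for record in records:
            # json.dumps escapes newlines inside strings, so each record
            # occupies exactly one line of the output file.
            handle.write(json.dumps(record) + "\n")


def read_ndjson(path):
    """Read an NDJSON file back into a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


if __name__ == "__main__":
    # Placeholder records: the field names are hypothetical, not SILO's schema.
    write_ndjson([{"id": "sample-1"}, {"id": "sample-2"}], "data.ndjson")
    print(read_ndjson("data.ndjson"))
```

The resulting path could then be passed to db.append_data_from_file, provided the records match the table's schema.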

Configuration Files

SILO uses three configuration files:

  • DatabaseConfig, described in database_config.h.
  • PreprocessingConfig, used when SILO is started with preprocessing and described in preprocessing_config.h. For details, see silo preprocessing --help.
  • RuntimeConfig, used when SILO is started with api and described in runtime_config.h. For details, see silo api --help.

The database config contains the schema of the database and is always required when preprocessing data. The database config will be saved together with the output of the preprocessing and is therefore not required when starting SILO as an API.

An example configuration file can be seen in testBaseData/exampleDataset/database_config.yaml.

By default, the config files are expected to be YAML files in the current working directory in snake_case (database_config.yaml, preprocessing_config.yaml, runtime_config.yaml), but their location can be overridden using the options --database-config=X, --preprocessing-config=X, and --runtime-config=X.
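
The lookup rule described above (a default file name in the current working directory unless an explicit location is given) can be sketched as follows. This is an illustration of the documented behavior, not SILO's own code; the function name, the kind keys, and the parameters are hypothetical.

```python
from pathlib import Path

# Default file names as listed in this README.
DEFAULT_CONFIG_FILES = {
    "database": "database_config.yaml",
    "preprocessing": "preprocessing_config.yaml",
    "runtime": "runtime_config.yaml",
}


def config_path(kind, override=None, cwd=None):
    """Return the path a config file of the given kind would be read from.

    `override` stands in for the value of --database-config=X,
    --preprocessing-config=X, or --runtime-config=X.
    """
    if override is not None:
        return Path(override)
    base = Path.cwd() if cwd is None else Path(cwd)
    return base / DEFAULT_CONFIG_FILES[kind]


if __name__ == "__main__":
    print(config_path("database"))
    print(config_path("runtime", override="./custom/runtime_config.yaml"))
```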

Preprocessing and runtime configurations have default values for all fields, so both files are optional. Their parameters can also be provided as command-line arguments in snake_case and as environment variables prefixed with SILO_ in upper SNAKE_CASE (e.g. SILO_INPUT_DIRECTORY).

The precedence is: CLI argument > environment variable > configuration file > default value.
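
To make the order concrete, here is a minimal sketch of how a single option could be resolved from the four sources. It only models the rule stated above and is not taken from SILO's implementation; the function names and the dictionary-based inputs are assumptions made for illustration.

```python
import os


def env_var_name(option):
    """Map a snake_case option name to its SILO_-prefixed environment variable."""
    return "SILO_" + option.upper()


def resolve_option(option, cli_args, file_config, defaults, environ=None):
    """Return the effective value of `option` following the documented order."""
    environ = os.environ if environ is None else environ
    if option in cli_args:  # 1. CLI argument
        return cli_args[option]
    name = env_var_name(option)
    if name in environ:  # 2. environment variable
        return environ[name]
    if option in file_config:  # 3. configuration file
        return file_config[option]
    return defaults[option]  # 4. default value


if __name__ == "__main__":
    value = resolve_option(
        "input_directory",
        cli_args={},
        file_config={"input_directory": "from-file"},
        defaults={"input_directory": "from-default"},
        environ={"SILO_INPUT_DIRECTORY": "from-env"},
    )
    print(value)  # the environment variable wins over the file and the default
```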

Run The Preprocessing

The preprocessing acts as a program that takes an input directory that contains the to-be-processed data and an output directory where the processed data will be stored. Both need to be mounted to the container.

SILO expects a preprocessing config, which can be mounted to the default location /app/preprocessing_config.yaml.

Additionally, a database config and an NDJSON file containing the data are required. They should typically be mounted in /preprocessing/input.

docker run \
  -v your/input/directory:/preprocessing/input \
  -v your/preprocessing/output:/preprocessing/output \
  -v your/preprocessing_config.yaml:/app/preprocessing_config.yaml \
  silo preprocessing

Both config files can also be provided in custom locations:

silo preprocessing --preprocessing-config=./custom/preprocessing_config.yaml --database-config=./custom/database_config.yaml

The Docker image contains a default preprocessing config with defaults specific to running SILO in Docker. For any field that neither the user-provided config nor the Docker default config specifies, the general default value applies. A user-provided preprocessing config overrides the default values. For a full reference, see the help text.
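
When the preprocessing run is launched from a script, the docker run invocation shown above can be assembled as an argument list. The sketch below assumes the image is tagged silo, as in the command above; the function name and parameters are hypothetical. Host paths are made absolute because Docker bind mounts generally expect absolute host paths.

```python
import os
import shlex


def preprocessing_command(input_dir, output_dir, config_path, image="silo"):
    """Build the argv list for the documented preprocessing container run."""
    return [
        "docker", "run",
        "-v", f"{os.path.abspath(input_dir)}:/preprocessing/input",
        "-v", f"{os.path.abspath(output_dir)}:/preprocessing/output",
        "-v", f"{os.path.abspath(config_path)}:/app/preprocessing_config.yaml",
        image, "preprocessing",
    ]


if __name__ == "__main__":
    argv = preprocessing_command("input", "output", "preprocessing_config.yaml")
    # Print the command for inspection; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```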

Run docker container (api)

After preprocessing the data, the api can be started with the following command:

docker run \
  -p 8081:8081 \
  -v your/preprocessing/output:/data \
  silo api

The directory where SILO expects the preprocessing output can be overridden via silo api --data-directory=/custom/data/directory or in a corresponding configuration file.
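
The API start command can be scripted in the same way. In this sketch, the optional data_directory argument mounts the preprocessing output at a custom container path and appends the matching --data-directory option; passing that option as a container argument after api is an assumption about how the image forwards arguments. The function name and the host_port parameter are hypothetical.

```python
import os
import shlex


def api_command(output_dir, host_port=8081, data_directory=None, image="silo"):
    """Build the argv list for starting the SILO API container."""
    # /data is the mount point used in the documented command.
    container_dir = "/data" if data_directory is None else data_directory
    argv = [
        "docker", "run",
        "-p", f"{host_port}:8081",
        "-v", f"{os.path.abspath(output_dir)}:{container_dir}",
        image, "api",
    ]
    if data_directory is not None:
        # Assumption: extra arguments after "api" reach the silo executable.
        argv.append(f"--data-directory={data_directory}")
    return argv


if __name__ == "__main__":
    print(shlex.join(api_command("output")))
    print(shlex.join(api_command("output", data_directory="/custom/data/directory")))
```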

Acknowledgments

Original genome indexing logic with roaring bitmaps by Prof. Neumann: https://db.in.tum.de/~neumann/gi/

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

  • silodb-0.11.1-cp314-cp314-manylinux_2_35_x86_64.whl (9.5 MB): CPython 3.14, manylinux (glibc 2.35+), x86-64
  • silodb-0.11.1-cp314-cp314-manylinux_2_35_aarch64.whl (8.7 MB): CPython 3.14, manylinux (glibc 2.35+), ARM64
  • silodb-0.11.1-cp314-cp314-macosx_11_0_arm64.whl (5.9 MB): CPython 3.14, macOS 11.0+, ARM64
  • silodb-0.11.1-cp314-cp314-macosx_10_15_x86_64.whl (6.9 MB): CPython 3.14, macOS 10.15+, x86-64
  • silodb-0.11.1-cp313-cp313-manylinux_2_35_x86_64.whl (9.5 MB): CPython 3.13, manylinux (glibc 2.35+), x86-64
  • silodb-0.11.1-cp313-cp313-manylinux_2_35_aarch64.whl (8.7 MB): CPython 3.13, manylinux (glibc 2.35+), ARM64
  • silodb-0.11.1-cp313-cp313-macosx_11_0_arm64.whl (5.9 MB): CPython 3.13, macOS 11.0+, ARM64
  • silodb-0.11.1-cp313-cp313-macosx_10_15_x86_64.whl (6.9 MB): CPython 3.13, macOS 10.15+, x86-64
  • silodb-0.11.1-cp312-cp312-manylinux_2_35_x86_64.whl (9.5 MB): CPython 3.12, manylinux (glibc 2.35+), x86-64
  • silodb-0.11.1-cp312-cp312-manylinux_2_35_aarch64.whl (8.7 MB): CPython 3.12, manylinux (glibc 2.35+), ARM64
  • silodb-0.11.1-cp312-cp312-macosx_11_0_arm64.whl (5.9 MB): CPython 3.12, macOS 11.0+, ARM64
  • silodb-0.11.1-cp312-cp312-macosx_10_15_x86_64.whl (6.9 MB): CPython 3.12, macOS 10.15+, x86-64
  • silodb-0.11.1-cp311-cp311-manylinux_2_35_x86_64.whl (9.5 MB): CPython 3.11, manylinux (glibc 2.35+), x86-64
  • silodb-0.11.1-cp311-cp311-manylinux_2_35_aarch64.whl (8.7 MB): CPython 3.11, manylinux (glibc 2.35+), ARM64
  • silodb-0.11.1-cp311-cp311-macosx_11_0_arm64.whl (5.9 MB): CPython 3.11, macOS 11.0+, ARM64
  • silodb-0.11.1-cp311-cp311-macosx_10_15_x86_64.whl (6.9 MB): CPython 3.11, macOS 10.15+, x86-64
