PyO3 bindings and Python interface to PAPASMURF, a Platform-Accelerated Package for Alignment-free SMURF analysis.
🧙‍♂️ PAPASMURF
A Platform-Accelerated Package for Alignment-free SMURF analysis.
🗺️ Overview
SMURF (Short MUltiple Region Framework) is a method proposed by Fuks et al.[1] in 2018 for taxonomic profiling of 16S sequencing data. It uses several PCR-amplified regions inside the 16S rRNA gene to reach high taxonomic resolution despite the use of short read sequencing.
PAPASMURF is a from-scratch Rust reimplementation of the SMURF method. It does not aim to be a 1-to-1 port of the original MATLAB implementation; instead, it gives more control over the parameters used in the original in order to support sequencing data of lower quality.
This is the Python version; a Rust crate is available as well.
🔧 Installing
If you have to compile the package from source, all the required Rust libraries are vendored in the source distribution, and a Rust compiler will be set up automatically if there is none on the host machine.
💡 Example
Use Biopython to read 16S gene sequences in FASTA format (for instance from the Greengenes database) and generate a database from them:
import gzip

import Bio.SeqIO
import papasmurf

# Create a database builder with the two given primer pairs
builder = papasmurf.Builder([
    ("CCTACGGGNGGCWGCAG", "GACTACHVGGGTATCTAATCC"),   # V3-V4 primers
    ("GTGYCAGCMGCCGCGGTAA", "CCGYCAATTYMTTTRAGTTT"),  # V4-V5 primers
])

# Extract k-mers from the reference sequences
with gzip.open("gg_13_5.fasta.gz", "rt") as reader:
    for record in Bio.SeqIO.parse(reader, "fasta"):
        builder.add(record.id, str(record.seq))

# Build and index the database
database = builder.to_database()

# Save the database in JSON format
database.dump("gg.json", format="json")
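The primers given to the builder use IUPAC degenerate codes (N, W, H, V, Y, M, R), so each primer stands for a family of concrete sequences. The sketch below illustrates how such a primer pair delimits a region of a reference sequence. It is only an illustration of the idea, not PAPASMURF's own extraction code, and the helper names (`primer_regex`, `reverse_complement`, `amplicon`) are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to the concrete bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of every IUPAC code (e.g. R = A/G pairs with Y = T/C).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def primer_regex(primer):
    """Turn a degenerate primer into a regular expression (hypothetical helper)."""
    return "".join(f"[{IUPAC[base]}]" for base in primer.upper())


def reverse_complement(sequence):
    """Reverse-complement a sequence that may contain IUPAC codes."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def amplicon(sequence, forward, reverse):
    """Return the region between the two primers, or None if they do not match.

    The reverse primer anneals to the opposite strand, so its reverse
    complement is what appears on the forward strand.
    """
    pattern = (
        primer_regex(forward)
        + "(.*?)"
        + primer_regex(reverse_complement(reverse))
    )
    match = re.search(pattern, sequence.upper())
    return match.group(1) if match else None
```

For example, `amplicon(str(record.seq), "CCTACGGGNGGCWGCAG", "GACTACHVGGGTATCTAATCC")` would return the V3-V4 region of a reference sequence when both primers match exactly, and `None` otherwise.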
Then use the database to map reads from a sample:
import gzip

import Bio.SeqIO
import papasmurf

# Load database and create a new mapper
database = papasmurf.Database.load("gg.json", format="json")
mapper = papasmurf.Mapper(database)

# Map paired-end reads (R1 and R2) to the k-mers database
with gzip.open("data/Example_L001_R1_001.fastq.gz", "rt") as f1:
    with gzip.open("data/Example_L001_R2_001.fastq.gz", "rt") as f2:
        for r1, r2 in zip(Bio.SeqIO.parse(f1, "fastq"), Bio.SeqIO.parse(f2, "fastq")):
            mapper.add(str(r1.seq), str(r2.seq))
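Note that the built-in `zip` stops silently at the end of the shorter input, so a truncated R1 or R2 file would go unnoticed in the loop above. A small helper can make that case fail loudly. This is a minimal sketch, the name `zip_pairs` is hypothetical, and on Python 3.10 or later `zip(..., strict=True)` does the same job.

```python
import itertools


def zip_pairs(forward, reverse):
    """Yield (r1, r2) pairs, raising if one input runs out before the other."""
    missing = object()  # sentinel marking an exhausted iterator
    for r1, r2 in itertools.zip_longest(forward, reverse, fillvalue=missing):
        if r1 is missing or r2 is missing:
            raise ValueError("R1 and R2 inputs contain different numbers of reads")
        yield r1, r2
```

It could replace `zip` when iterating over the two `Bio.SeqIO.parse` iterators.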
Once all the reads have been mapped, compute the final bacterium frequencies:
# Obtain partial mapping result
result = mapper.finish()

# Run the iterative procedure 10 times to estimate the read proportion vector
result.refine(10)

# Print the names of the reference sequences with >5% relative abundance
for j, name in enumerate(database.names):
    if result.frequencies[j] > 0.05:
        print(name, result.frequencies[j])
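To give an intuition for what an iterative refinement of a proportion vector looks like, here is a minimal pure-Python sketch of expectation-maximization updates for mixture proportions. It is not PAPASMURF's implementation: the function `em_refine` and the dense `likelihoods` matrix (probability of each read given each reference) are assumptions made for illustration only.

```python
def em_refine(likelihoods, frequencies, iterations):
    """Run EM-style updates of mixture proportions (illustrative sketch).

    likelihoods[i][j] is the assumed probability of read i given reference j;
    frequencies[j] is the starting estimate for the proportion of reference j.
    """
    x = list(frequencies)
    for _ in range(iterations):
        updated = [0.0] * len(x)
        for row in likelihoods:
            # Probability of this read under the current proportions.
            total = sum(p * f for p, f in zip(row, x))
            if total == 0.0:
                continue  # read is not explained by any reference
            # Share the read between references according to responsibility.
            for j, p in enumerate(row):
                updated[j] += p * x[j] / total
        norm = sum(updated)
        if norm == 0.0:
            return x  # nothing could be assigned; keep the last estimate
        x = [u / norm for u in updated]
    return x
```

With two reads unique to the first reference, one unique to the second, and one ambiguous read, the estimate moves from an even split towards two thirds for the first reference:

```python
reads = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
print(em_refine(reads, [0.5, 0.5], 50))
```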
💭 Feedback
⚠️ Issue Tracker
Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.
📋 Changelog
This project adheres to Semantic Versioning and provides a changelog in the Keep a Changelog format.
⚖️ License
This library is provided under the open-source GPLv3 license.
This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original SMURF authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team, with support and testing from Fabian Springer.
All brand names and product names used in this material are trademarks or registered trademarks of their respective owners. The author/owner is not affiliated with, endorsed by, or sponsored by any product, organization, or company mentioned. Smurf is a registered trademark of Studio Peyo S.A.
📚 References
- [1] Fuks, Garold, Michael Elgart, Amnon Amir, Amit Zeisel, Peter J. Turnbaugh, Yoav Soen, and Noam Shental. ‘Combining 16S rRNA Gene Variable Regions Enables High-Resolution Microbial Community Profiling’. Microbiome 6 (26 January 2018): 17. doi:10.1186/s40168-017-0396-x.
- [2] Gustavson, Fred G. ‘Two Fast Algorithms for Sparse Matrices: Multiplication and Permuted Transposition’. ACM Transactions on Mathematical Software 4, no. 3 (September 1978): 250–69. doi:10.1145/355791.355796.