Uniform Sample Identifiers Parser

sampleids

Uniform sample ID parser

Available on PyPI at https://pypi.org/project/sampleids/ and on GitHub at https://github.com/tmcqueen-materials/sampleids

Quick Start

  1. Install: pip3 install sampleids
  2. In code, do:
from sampleids import parse as sid_parse, CONFIDENCE as sid_CONFIDENCE

# IDs have the form "AAA_BBB_YYYYMMDD_C_III_S_(QQQQQQQQQQ)-EE"; this example uses the date 20241001.
res = sid_parse("AAA_BBB_20241001_C_III_S_(QQQQQQQQQQ)-EE", ["AAA",...], ["BBB",...], ["III",...])
print(res)
# prints the tuple:
# SampleID(lab_id='AAA', tool_id='BBB', date='20241001', sample_id='C',
#          provenance_id=['III'], split_id='S',
#          parents=[SampleID(lab_id='', tool_id='', date='', sample_id='',
#                            provenance_id=[], split_id='', parents=[], extra='',
#                            raw='QQQQQQQQQQ', confidence=<CONFIDENCE.NONE: 0>,
#                            why='P_PARENT1_PI_V1_6L01_PI_V1_NOPARSE')],
#          extra='EE', raw='AAA_BBB_20241001_C_III_S_(QQQQQQQQQQ)-EE',
#          confidence=<CONFIDENCE.HIGH: 3>, why='P_PARENT1_PI_V1_6L01')

# You can check if confidence is greater than a minimum value, e.g.:
if res.confidence > sid_CONFIDENCE.LOW:
  print("Confidence is not low!")

# The "why" string is a log of the code paths taken by the parser.
# If you find an ID that fails to parse when you think it should,
# or one that is parsed incorrectly, include the why string in your report!
print(res.why) # prints 'P_PARENT1_PI_V1_6L01'
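
When many identifiers are parsed at once, the confidence check and the why string shown above can be combined to separate trusted results from ones that need review. The following is a minimal sketch, not part of sampleids: `partition_by_confidence`, `Confidence`, `Result`, and `fake_parse` are hypothetical names, and the toy parser and enum stand in for `sid_parse` and `sid_CONFIDENCE` so that the snippet runs without the package installed. The value `LOW = 1` is an assumption; only `NONE = 0` and `HIGH = 3` appear in the output above.

```python
from collections import namedtuple
from enum import IntEnum


# Hypothetical stand-in for sid_CONFIDENCE. LOW = 1 is an assumed value.
class Confidence(IntEnum):
    NONE = 0
    LOW = 1
    HIGH = 3


# Hypothetical stand-in for the parser result; only the fields used below.
Result = namedtuple("Result", ["raw", "confidence", "why"])


def fake_parse(raw):
    """Toy parser: HIGH if the ID has at least four '_'-separated parts."""
    ok = len(raw.split("_")) >= 4
    return Result(raw,
                  Confidence.HIGH if ok else Confidence.NONE,
                  "TOY_OK" if ok else "TOY_NOPARSE")


def partition_by_confidence(raw_ids, parse_fn, minimum):
    """Split parse results into those above `minimum` and the rest.

    Rejected entries keep the raw string and the why log, which is the
    information a bug report needs.
    """
    accepted, rejected = [], []
    for raw in raw_ids:
        res = parse_fn(raw)
        if res.confidence > minimum:
            accepted.append(res)
        else:
            rejected.append((res.raw, res.why))
    return accepted, rejected


accepted, rejected = partition_by_confidence(
    ["AAA_BBB_20241001_C", "garbage"], fake_parse, Confidence.LOW)
print([r.raw for r in accepted])  # ['AAA_BBB_20241001_C']
print(rejected)                   # [('garbage', 'TOY_NOPARSE')]
```

With the real package, `parse_fn` would presumably be a small wrapper that calls `sid_parse` with your three identifier lists, and `minimum` would be `sid_CONFIDENCE.LOW`.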

Specification

This module parses sample identifiers that follow the schema described at https://occamy.chemistry.jhu.edu/references/samples/index.php . It is a lenient parser that tolerates variations observed in real-world identifiers, e.g. a swapped month and day, or swapped identifier fragments.
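
To illustrate what tolerating a swapped month and day can look like, here is a minimal sketch using only the standard library. It is not the algorithm sampleids uses; `lenient_date` is a hypothetical helper that tries the documented YYYYMMDD order first and falls back to YYYYDDMM only when the first reading is not a valid calendar date.

```python
from datetime import date


def lenient_date(fragment):
    """Return (date, swapped) for an 8-digit fragment, or None.

    `swapped` is True when the fragment was only valid as YYYYDDMM.
    """
    if len(fragment) != 8 or not fragment.isdecimal():
        return None
    year, a, b = int(fragment[:4]), int(fragment[4:6]), int(fragment[6:])
    # Try (month, day) as documented, then the swapped order.
    for month, day, swapped in ((a, b, False), (b, a, True)):
        try:
            return date(year, month, day), swapped
        except ValueError:
            continue
    return None


print(lenient_date("20241001"))  # (datetime.date(2024, 10, 1), False)
print(lenient_date("20242510"))  # (datetime.date(2024, 10, 25), True)
print(lenient_date("20243131"))  # None
```

A fragment such as 20240305 is valid in both orders, so a fallback of this kind cannot detect that swap; a real parser would need additional context, which is one reason to report a confidence level alongside the result.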

Version Compatibility

sampleids is compatible with Python 3.4 and later.
