Uniform Sample Identifiers Parser

sampleids

Uniform sample ID parser

Available on PyPI at https://pypi.org/project/sampleids/
Available on GitHub at https://github.com/tmcqueen-materials/sampleids

Quick Start

  1. Install: pip3 install sampleids
  2. In code, do:
from sampleids import parse as sid_parse, CONFIDENCE as sid_CONFIDENCE

res = sid_parse("AAA_BBB_YYYYMMDD_C_III_S_(QQQQQQQQQQ)-EE", ["AAA",...], ["BBB",...], ["III",...])
print(res)
# With 20241001 in place of YYYYMMDD, this prints the tuple:
# SampleID(lab_id='AAA', tool_id='BBB', date='20241001', sample_id='C',
#   provenance_id=['III'], split_id='S',
#   parents=[SampleID(lab_id='', tool_id='', date='', sample_id='',
#     provenance_id=[], split_id='', parents=[], extra='', raw='QQQQQQQQQQ',
#     confidence=<CONFIDENCE.NONE: 0>,
#     why='P_PARENT1_PI_V1_6L01_PI_V1_NOPARSE')],
#   extra='EE', raw='AAA_BBB_20241001_C_III_S_(QQQQQQQQQQ)-EE',
#   confidence=<CONFIDENCE.HIGH: 3>, why='P_PARENT1_PI_V1_6L01')

# You can check if confidence is greater than a minimum value, e.g.:
if res.confidence > sid_CONFIDENCE.LOW:
  print("Confidence is not low!")

# The "why" string gives a log of the code paths taken by the parser.
# If you find a case that fails to parse, and you think it should,
# or a case it parses incorrectly, be sure to include the why string!
print(res.why) # prints 'P_PARENT1_PI_V1_6L01'
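
Because each entry in parents is itself a SampleID, a parsed result can be walked recursively. The sketch below is illustrative only. It does not import sampleids; it uses stand-in SampleID and CONFIDENCE definitions whose field names are copied from the printed output above. The numeric value of LOW is an assumption (only NONE = 0 and HIGH = 3 appear in that output), and unparsed_ancestors is a hypothetical helper name, not part of the package.

```python
from collections import namedtuple
from enum import IntEnum


class CONFIDENCE(IntEnum):
    """Stand-in for the package's CONFIDENCE; LOW = 1 is an assumed value."""
    NONE = 0
    LOW = 1
    HIGH = 3


# Stand-in with the same field names as the printed SampleID tuple.
SampleID = namedtuple(
    "SampleID",
    "lab_id tool_id date sample_id provenance_id split_id "
    "parents extra raw confidence why",
)


def unparsed_ancestors(sid):
    """Collect the raw strings of all ancestors parsed at LOW confidence or below."""
    found = []
    for parent in sid.parents:
        if parent.confidence <= CONFIDENCE.LOW:
            found.append(parent.raw)
        # Parents may have parents of their own, so recurse depth-first.
        found.extend(unparsed_ancestors(parent))
    return found


# Rebuild the Quick Start result by hand so the sketch runs without the package.
parent = SampleID("", "", "", "", [], "", [], "", "QQQQQQQQQQ",
                  CONFIDENCE.NONE, "P_PARENT1_PI_V1_6L01_PI_V1_NOPARSE")
child = SampleID("AAA", "BBB", "20241001", "C", ["III"], "S", [parent], "EE",
                 "AAA_BBB_20241001_C_III_S_(QQQQQQQQQQ)-EE",
                 CONFIDENCE.HIGH, "P_PARENT1_PI_V1_6L01")

print(unparsed_ancestors(child))  # prints ['QQQQQQQQQQ']
```

With the real package, the same function could presumably be applied to the value returned by sid_parse, provided its CONFIDENCE members support ordering comparisons as the Quick Start suggests.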

Specification

This module parses sample identifiers that follow the schema described at https://occamy.chemistry.jhu.edu/references/samples/index.php . It is a lenient parser: it tolerates variations observed in real-world identifiers, e.g. a swapped month and day, or swapped identifier fragments.
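
To make the idea of lenient parsing concrete, here is a minimal sketch of how a date fragment might be accepted even when the month and day are swapped. It is not the algorithm sampleids uses; lenient_date and its return shape are hypothetical and chosen only for illustration.

```python
import re
from datetime import date


def lenient_date(fragment):
    """Parse an 8-digit date fragment, tolerating a swapped month and day.

    Returns (date, swapped) on success, or None if neither order is valid.
    """
    if not re.fullmatch(r"[0-9]{8}", fragment):
        return None
    year = int(fragment[:4])
    first, second = int(fragment[4:6]), int(fragment[6:8])
    # Try the expected YYYYMMDD order first, then the swapped YYYYDDMM order.
    for month, day, swapped in ((first, second, False), (second, first, True)):
        try:
            return date(year, month, day), swapped
        except ValueError:
            continue
    return None


print(lenient_date("20241001"))  # expected order: 1 October 2024, swapped=False
print(lenient_date("20241310"))  # month 13 is invalid, so read as 13 October 2024
print(lenient_date("20241340"))  # invalid in both orders
```

A swapped flag like this one is the kind of signal a lenient parser could use to lower its reported confidence. Note that a fragment such as 20240503 is valid in both orders, so this sketch always prefers the expected order there.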

Version Compatibility

sampleids is compatible with Python 3.4 and later.

Download files

Source Distribution

sampleids-0.0.2.tar.gz (13.8 kB)

Uploaded Source

Built Distribution

sampleids-0.0.2-py3-none-any.whl (15.2 kB)

Uploaded Python 3

File details

Details for the file sampleids-0.0.2.tar.gz.

File metadata

  • Download URL: sampleids-0.0.2.tar.gz
  • Upload date:
  • Size: 13.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.6.0 importlib_metadata/4.8.2 pkginfo/1.8.1 requests/2.20.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.6.8

File hashes

Hashes for sampleids-0.0.2.tar.gz
Algorithm    Hash digest
SHA256       28872b39326d3534ce4fac4514f47da6970d2142b5923ffb641653b492d02388
MD5          b4c41ecbb2c04df786cbf1eff27c1ad8
BLAKE2b-256  2e4b3210778780f39f45b5aeaff2e57d6cfc2ab134315b8f6eab7805ccb5ec7a

File details

Details for the file sampleids-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: sampleids-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 15.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.6.0 importlib_metadata/4.8.2 pkginfo/1.8.1 requests/2.20.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.6.8

File hashes

Hashes for sampleids-0.0.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       9a7a74a038c37e9cfe92f264b24d95290eb8b3de555197d292df3cb2dd35fdb6
MD5          e080ddadba0dd48d031be61f0c9fd134
BLAKE2b-256  13f6bc8e1a0437a72ea1f5629705ec8b15944b1f7db66d2573065db82329b2c4
