General utilities (not related to chemistry)

Project description

RXN utilities package

This repository contains general Python utilities commonly used in the RXN universe. For utilities related to chemistry, see our other repository rxn-chemutils.

System Requirements

This package is supported on all operating systems. It has been tested on the following systems:

  • macOS: Big Sur (11.1)

  • Linux: Ubuntu 18.04.4

Python 3.6 or greater is recommended.

Installation guide

The package can be installed from PyPI:

pip install rxn-utils

For local development, the package can be installed with:

pip install -e ".[dev]"

Package highlights

File-related utilities

  • load_list_from_file: read a file into a list of strings.
  • iterate_lines_from_file: same as load_list_from_file, but produces an iterator instead of a list. This can be much more memory-efficient.
  • dump_list_to_file and append_to_file: write an iterable of strings to a file (one per line).
  • named_temporary_path and named_temporary_directory: provide a context with a file or directory that will be deleted when the context closes. Useful for unit tests.
    >>> from rxn.utilities.files import named_temporary_path
    >>> with named_temporary_path() as temporary_path:
    ...     # Do something with the temporary path.
    ...     # The file or directory at that path will be deleted at the
    ...     # end of the context, unless delete=False is given.
    ...     pass
    
  • ... and others.
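
To illustrate why an iterator over lines can be more memory-efficient than a list, here is a minimal standard-library sketch of the two reading patterns. The helper names `read_lines_as_list` and `iterate_lines` are hypothetical and exist only for this illustration; they are not the package's implementation, whose signatures may differ.

```python
import tempfile
from itertools import islice
from pathlib import Path
from typing import Iterator, List


def read_lines_as_list(path: Path) -> List[str]:
    """Hypothetical helper: load every line into memory at once."""
    with open(path, "rt") as f:
        return [line.rstrip("\r\n") for line in f]


def iterate_lines(path: Path) -> Iterator[str]:
    """Hypothetical helper: yield one line at a time, so that only
    the current line needs to be held in memory."""
    with open(path, "rt") as f:
        for line in f:
            yield line.rstrip("\r\n")


# Work in a temporary directory that is deleted when the context closes.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "lines.txt"
    path.write_text("first\nsecond\nthird\n")

    all_lines = read_lines_as_list(path)

    # Take only the first two lines; the rest of the file is never read.
    lines = iterate_lines(path)
    first_two = list(islice(lines, 2))
    lines.close()  # close the generator, which also closes the file
```

With the list variant, memory use grows with the file size; with the generator variant, it stays roughly constant regardless of how many lines the file contains.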

CSV-related functionality

  • The function iterate_csv_column and the related executable rxn-extract-csv-column provide an easy way to extract a single column from a CSV file.
  • The StreamingCsvEditor applies a series of operations to a CSV file without loading it fully into memory. It is used, for instance, in rxn-reaction-preprocessing. See a few examples in the unit tests.
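
As a rough idea of what streaming a single column out of a CSV file looks like, the following sketch uses only the standard `csv` module. The function `iterate_column` and the sample data are assumptions made for this example; the actual signature of iterate_csv_column may differ.

```python
import csv
import io
from typing import Iterator, TextIO


def iterate_column(csv_file: TextIO, column: str) -> Iterator[str]:
    """Hypothetical helper: yield the values of one column, row by row,
    without keeping the other rows in memory."""
    reader = csv.DictReader(csv_file)
    for row in reader:
        yield row[column]


# In-memory stand-in for a CSV file on disk.
data = io.StringIO("id,name,score\n1,alpha,0.5\n2,beta,0.7\n")
names = list(iterate_column(data, "name"))
```

Because rows are read and yielded one at a time, the same pattern works for files that are too large to load at once.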

Stable shuffling

For reproducible shuffling, or for shuffling two files of identical length so that the same permutation is obtained, one can use the stable_shuffle function. The executable rxn-stable-shuffle is also provided for this purpose.

Both also work with CSV files if the appropriate flag is provided.
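
One common way to obtain the same permutation for two sequences of identical length is to shuffle each of them with a freshly seeded random generator. The sketch below illustrates that idea with the standard library; `seeded_shuffle` is a hypothetical helper, and the package's stable_shuffle may use a different mechanism and signature.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def seeded_shuffle(values: Sequence[T], seed: int) -> List[T]:
    """Hypothetical helper: return a shuffled copy of the values.

    The permutation depends only on the seed and on the number of
    items, not on the items themselves.
    """
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    return shuffled


sources = ["a", "b", "c", "d", "e"]
targets = ["A", "B", "C", "D", "E"]

# Using the same seed keeps the two sequences aligned after shuffling.
shuffled_sources = seeded_shuffle(sources, seed=42)
shuffled_targets = seeded_shuffle(targets, seed=42)
```

Running the shuffle again with the same seed reproduces the same order, which is what makes the result reproducible across runs.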

chunker and remove_duplicates

For batching an iterable into lists of a specified size, chunker comes in handy. It also does so in a memory-efficient way.

>>> from rxn.utilities.containers import chunker
>>> for chunk in chunker(range(1, 10), chunk_size=4):
...     print(chunk)
[1, 2, 3, 4]
[5, 6, 7, 8]
[9]
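
For readers curious how such batching can stay memory-efficient, here is one possible sketch based on `itertools.islice`. The name `chunk_lazily` is hypothetical, and this is not necessarily how chunker is implemented.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunk_lazily(iterable: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Hypothetical helper: yield successive lists of at most chunk_size items."""
    iterator = iter(iterable)
    while True:
        # Pull at most chunk_size items from the shared iterator.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return  # the iterable is exhausted
        yield chunk


chunks = list(chunk_lazily(range(1, 10), chunk_size=4))
```

Only one chunk is materialized at a time, so the input can be an arbitrarily long iterator.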

remove_duplicates (or iterate_unique_values, its memory-efficient variant) removes duplicates from a container, possibly based on a callable instead of the values:

>>> from rxn.utilities.containers import remove_duplicates
>>> remove_duplicates([3, 6, 9, 2, 3, 1, 9])
[3, 6, 9, 2, 1]
>>> remove_duplicates(["ab", "cd", "efg", "hijk", "", "lmn"], key=lambda x: len(x))
['ab', 'efg', 'hijk', '']
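
The behavior shown above, keeping the first occurrence of each value or of each key, can be sketched with a set of already-seen keys. The generator `unique_values` below is a hypothetical illustration and not the package's code.

```python
from typing import Callable, Hashable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def unique_values(
    values: Iterable[T], key: Optional[Callable[[T], Hashable]] = None
) -> Iterator[T]:
    """Hypothetical helper: yield each value whose key was not seen before."""
    seen = set()
    for value in values:
        # Without a key function, the value itself identifies duplicates.
        marker = value if key is None else key(value)
        if marker not in seen:
            seen.add(marker)
            yield value


without_key = list(unique_values([3, 6, 9, 2, 3, 1, 9]))
with_key = list(unique_values(["ab", "cd", "efg", "hijk", "", "lmn"], key=len))
```

Written as a generator, the sketch stores only the keys seen so far rather than a second copy of the input.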

Regex utilities

regex.py provides a few functions that make it easier to build regex strings (considering whether segments should be optional, capturing, etc.).
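
To give a flavor of this kind of helper, the sketch below composes a regex from small building blocks. The functions `capture_group` and `optional_group` are hypothetical names chosen for this example; the functions actually provided by regex.py may be named and parameterized differently.

```python
import re


def capture_group(pattern: str) -> str:
    """Hypothetical helper: wrap a pattern in a capturing group."""
    return f"({pattern})"


def optional_group(pattern: str) -> str:
    """Hypothetical helper: make a pattern optional, without capturing it."""
    return f"(?:{pattern})?"


# A major number, optionally followed by a dot and a minor number.
version_regex = capture_group(r"\d+") + optional_group(r"\." + capture_group(r"\d+"))

match_full = re.fullmatch(version_regex, "12.5")
match_short = re.fullmatch(version_regex, "12")
```

Building the string from named pieces makes it easier to see which segments are optional and which are captured than reading the raw pattern.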

Others
