Provides data structures and data manipulation methods not available from default Python modules.

ataraxis-data-structures

Provides a wide range of non-standard data structures and related data-manipulation methods.


Detailed Description

This library provides data structures not readily available from standard Python libraries, such as nested dictionaries and shared memory arrays. In addition to these data structures, it exposes helper methods for manipulating data that are also not readily available from standard or common Python libraries.

Unlike many other Ataraxis modules, this library does not have a well-defined specialization beyond dealing with data storage, representation, and manipulation. More or less anything data-related that is not found in standard or popular Python libraries such as numpy, scipy, and pandas is a good candidate for inclusion. The library is designed to be updated frequently to scale with the needs of other Ataraxis modules, but it can also be used as a repository of helpful data structures and methods in non-Ataraxis projects.


Features

  • Supports Windows, Linux, and OSx.
  • Supports Multiprocessing.
  • Supports
  • Pure-python API.
  • GPL 3 License.


Dependencies

For users, all library dependencies are installed automatically for all supported installation methods (see Installation section). For developers, see the Developers section for information on installing additional development dependencies.


Installation

Source

  1. Download this repository to your local machine using your preferred method, such as git-cloning. Optionally, use one of the stable releases that include precompiled binary wheels in addition to source code.
  2. cd to the root directory of the project using your CLI of choice.
  3. Run python -m pip install . to install the project. Alternatively, if using a distribution with precompiled binaries, use python -m pip install WHEEL_PATH, replacing 'WHEEL_PATH' with the path to the wheel file.

PIP

Use the following command to install the library using PIP: pip install ataraxis-data-structures

Conda / Mamba

Note: Because the conda-forge contribution process is more involved than uploading to pip, conda versions may lag behind the pip and source code distributions.

Use the following command to install the library using Conda or Mamba: conda install ataraxis-data-structures
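
After installing with any of the methods above, it can be useful to confirm that the distribution is visible to the active interpreter. The following is a minimal sketch that relies only on the standard library; the helper name `installed_version` is hypothetical and is not part of this library.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str = "ataraxis-data-structures") -> Optional[str]:
    """Returns the installed version string, or None if the distribution is not found.

    Hypothetical helper for illustration; it only queries package metadata.
    """
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("ataraxis-data-structures is not installed in this environment.")
    else:
        print(f"ataraxis-data-structures {version} is installed.")
```

Checking the metadata rather than importing the package avoids running any package code, which keeps the check cheap.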


Usage

This section provides minimal examples of how to use the library. It is not an in-depth guide. Instead, it provides enough information to start using the library, with the expectation that the API documentation and code hints cover the remaining details.

Quickstart

PythonDataConverter

The PythonDataConverter is a class that validates and cross-converts data from one Python type to another. The core of the class functionality is the validate_value method. To use the class, initialize an instance of PythonDataConverter, which accepts one required positional argument, validator, which must be an instance of BoolConverter, NoneConverter, NumericConverter, or StringConverter. Each of these validator classes has its own configuration and must be initialized before being passed into the PythonDataConverter class. Some arguments have default values. Here is an example of creating a PythonDataConverter that uses a NumericConverter with default parameters.

converter = PythonDataConverter(validator=NumericConverter())
converter.validate_value("7.1")  # Returns the float 7.1
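
To make the validate-then-convert idea concrete, here is a minimal pure-Python sketch of numeric validation. It is not the library's implementation: the function name `sketch_validate_numeric` and the convention of returning None on failure are assumptions made for illustration, and only the parameter names are borrowed from the NumericConverter configuration.

```python
from typing import Optional, Union

Number = Union[int, float]


def sketch_validate_numeric(
    value: object,
    parse_number_strings: bool = True,
    allow_int: bool = True,
    allow_float: bool = True,
    number_lower_limit: Optional[Number] = None,
    number_upper_limit: Optional[Number] = None,
) -> Optional[Number]:
    """Hypothetical stand-in: returns the converted number, or None on failure."""
    # Booleans are a subclass of int in Python; this sketch rejects them.
    if isinstance(value, bool):
        return None
    # Strings are parsed only when parse_number_strings is enabled.
    if isinstance(value, str):
        if not parse_number_strings:
            return None
        try:
            value = float(value)
        except ValueError:
            return None
    if not isinstance(value, (int, float)):
        return None
    # Enforce the optional range limits.
    if number_lower_limit is not None and value < number_lower_limit:
        return None
    if number_upper_limit is not None and value > number_upper_limit:
        return None
    # Prefer int when the value is integer-valued and ints are allowed (an assumption).
    is_whole = isinstance(value, int) or value.is_integer()
    if allow_int and is_whole:
        return int(value)
    if allow_float:
        return float(value)
    return None
```

In this sketch, `sketch_validate_numeric("7.1")` yields a float, while the same call with `allow_float=False` fails because 7.1 cannot be represented as an integer without losing information.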

NumpyDataConverter

The NumpyDataConverter class is a converter and validator that converts Python datatypes to numpy datatypes. It extends the functionality of the PythonDataConverter to support conversion for a limited set of numpy datatypes; numpy strings are not supported. The NumpyDataConverter requires the filter_failed argument of the PythonDataConverter to be True; the default value of False is not allowed. Here is an example of a numeric NumpyDataConverter. Note that the NumericConverter cannot have both allow_int and allow_float set to True when passed into the NumpyDataConverter. Also, the NumpyDataConverter automatically optimizes the bit-width and, for integers only, the sign of numeric datatypes if no argument is passed for output_bit_width or signed.

validator = PythonDataConverter(validator=NumericConverter(allow_float=False), filter_failed=True)
converter = NumpyDataConverter(validator)
converter.python_to_numpy_converter("7")   # Returns 7 with type np.uint8

This can also convert from numpy datatypes to python natives. Using the same validator and converter:

converter.numpy_to_python_converter(np.uint8(7))   # Returns 7 with type int
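
The automatic bit-width and sign optimization described above can be pictured as picking the narrowest integer type that still holds the value. The sketch below is an illustration in plain Python and does not use numpy or this library; the function name `sketch_smallest_int_dtype` is hypothetical, and the rule that non-negative values prefer unsigned types is an assumption.

```python
def sketch_smallest_int_dtype(value: int) -> str:
    """Returns the name of the narrowest 8/16/32/64-bit integer type that can hold value.

    Hypothetical illustration: non-negative values map to unsigned types,
    negative values map to signed types.
    """
    for bits in (8, 16, 32, 64):
        if value >= 0:
            # An unsigned type with this width holds 0 .. 2**bits - 1.
            if value <= 2**bits - 1:
                return f"uint{bits}"
        elif value >= -(2 ** (bits - 1)):
            # A signed type with this width holds -2**(bits - 1) .. 2**(bits - 1) - 1.
            return f"int{bits}"
    raise OverflowError(f"{value} does not fit into a 64-bit integer type")


if __name__ == "__main__":
    for sample in (7, 256, -1, -129, 70000):
        print(sample, "->", sketch_smallest_int_dtype(sample))
```

Under these assumptions, 7 maps to the name "uint8" and -129 maps to "int16".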

Config

Config: BoolConverter, NoneConverter, NumericConverter, or StringConverter

The following examples initialize each converter class with the full suite of configuration options. Class fields that require assignment have default values.

# BoolConverter
bool_convert = BoolConverter(
   parse_bool_equivalents=True,  # Default True, allows conversion of true equivalents ("True", "true", 1, "1", 1.0)
                                 # and false equivalents ("False", "false", 0, "0", 0.0) to bool
)

# NoneConverter
none_convert = NoneConverter(
   parse_none_equivalents=True,  # Default True, allows conversion of none equivalents ("None", "none", "Null", "null") to None
)

# NumericConverter
num_convert = NumericConverter(
   parse_number_strings=True,    # Default True, converts numbers in string format
   allow_int=True,               # Default True, allows int conversion
   allow_float=True,             # Default True, allows float conversion
   number_lower_limit=7,         # Default None, rejects numbers lower than this threshold
   number_upper_limit=17,        # Default None, rejects numbers greater than this threshold
)

# StringConverter
string_convert = StringConverter(
   allow_string_conversion=False,                # Default False, allows non-string inputs to convert to strings
   string_options=['Bobby', 'Dobby', 'Poppy'],   # Default None, rejects string inputs not in this list/tuple
   string_force_lower=False,                     # Default False, forces string output to lowercase
)
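
The comments above list which inputs count as boolean equivalents and how string options restrict the accepted values. The following pure-Python sketch shows one way such rules could behave. It is an illustration under stated assumptions, not the library's code: the function names are hypothetical, failure is signaled by returning None, and the option check is applied before lowercasing, which may differ from the library.

```python
from typing import Optional, Sequence

# Equivalents copied from the configuration comments above.
TRUE_EQUIVALENTS = ("True", "true", 1, "1", 1.0)
FALSE_EQUIVALENTS = ("False", "false", 0, "0", 0.0)


def sketch_parse_bool(value: object, parse_bool_equivalents: bool = True) -> Optional[bool]:
    """Hypothetical stand-in: returns a bool, or None if value is not bool-like."""
    if isinstance(value, bool):
        return value
    if not parse_bool_equivalents:
        return None
    if value in TRUE_EQUIVALENTS:
        return True
    if value in FALSE_EQUIVALENTS:
        return False
    return None


def sketch_check_string(
    value: object,
    allow_string_conversion: bool = False,
    string_options: Optional[Sequence[str]] = None,
    string_force_lower: bool = False,
) -> Optional[str]:
    """Hypothetical stand-in: returns a validated string, or None on failure."""
    if not isinstance(value, str):
        # Non-string inputs are converted only when explicitly allowed.
        if not allow_string_conversion:
            return None
        value = str(value)
    # Assumption: options are checked before lowercasing.
    if string_options is not None and value not in string_options:
        return None
    return value.lower() if string_force_lower else value
```

For example, `sketch_parse_bool("true")` gives True, and `sketch_check_string("Robby", string_options=["Bobby", "Dobby", "Poppy"])` fails because "Robby" is not among the options.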

Config: PythonDataConverter

python_convert = PythonDataConverter(
   NumericConverter(),              # Required, StringConverter not supported
   iterable_output_type='tuple',    # Default None, pass in tuple/list for shallow array conversion support
   filter_failed=False,             # Default False, omits the values in an array that failed conversion
)
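
The interplay between iterable_output_type and filter_failed can be sketched in plain Python. This is an illustration, not the library's implementation: the names `sketch_convert_iterable` and `to_int_or_none` are hypothetical, and representing a failed conversion as None when it is not filtered out is an assumption.

```python
from typing import Callable, Iterable, List, Optional, Tuple, Union


def to_int_or_none(value: object) -> Optional[int]:
    """Hypothetical element converter: returns an int, or None on failure."""
    try:
        return int(value)  # type: ignore[call-overload]
    except (TypeError, ValueError):
        return None


def sketch_convert_iterable(
    values: Iterable[object],
    convert: Callable[[object], Optional[object]],
    iterable_output_type: str = "tuple",
    filter_failed: bool = False,
) -> Union[Tuple[object, ...], List[object]]:
    """Converts each element, optionally dropping the elements that failed."""
    converted = [convert(value) for value in values]
    if filter_failed:
        # Omit the values that failed conversion (marked as None in this sketch).
        converted = [item for item in converted if item is not None]
    if iterable_output_type == "tuple":
        return tuple(converted)
    if iterable_output_type == "list":
        return converted
    raise ValueError("iterable_output_type must be 'tuple' or 'list'")


if __name__ == "__main__":
    print(sketch_convert_iterable(["1", "x", "3"], to_int_or_none, filter_failed=True))
```

Filtering happens after every element has been converted, so the relative order of the surviving elements is preserved.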

Config: NumpyDataConverter

numpy_convert = NumpyDataConverter(
   PythonDataConverter(NumericConverter(allow_float=False), filter_failed=True),  # Required, filter_failed must be True
   output_bit_width="auto",                  # Default 'auto', forces numeric inputs to a specific bit width (8, 16, 32, 64)
                                             # Replaced with inf if too large
   signed=True,                              # Default True, optimize the signed or unsigned int based on which saves memory.
)

API Documentation

See the API documentation for the detailed description of the methods and classes exposed by components of this library. The documentation also covers any available cli/gui-interfaces (such as benchmarks).


Developers

This section provides installation, dependency, and build-system instructions for developers who want to modify the source code of this library. Additionally, it contains instructions for recreating the conda environments used during development from the included .yml files.

Installing the library

  1. Download this repository to your local machine using your preferred method, such as git-cloning.
  2. cd to the root directory of the project using your CLI of choice.
  3. Install development dependencies. You have multiple options of satisfying this requirement:
    1. Preferred Method: Use conda or pip to install tox, or use an environment that already has it installed, and call tox -e import-env to automatically import the OS-specific development environment included with the source code into your local conda distribution. Alternatively, see the Environments section for other environment installation methods.
    2. Run python -m pip install .'[dev]' command to install development dependencies and the library using pip. On some systems, you may need to use a slightly modified version of this command: python -m pip install .[dev].
    3. As long as you have an environment with tox installed and do not intend to run any code outside the predefined project automation pipelines, tox will automatically install all required dependencies for each task.

Note: When using tox automation, having a local version of the library may interfere with tox methods that attempt to build the library using an isolated environment. It is advised to remove the library from your test environment, or disconnect from the environment, prior to running any tox tasks. This problem is rarely observed with the latest version of the automation pipeline, but is worth mentioning.

Additional Dependencies

In addition to installing the required python packages, separately install the following dependencies:

  1. Python distributions, one for each version that you intend to support. Currently, this library supports version 3.10 and above. The easiest way to get tox to work as intended is to have separate python distributions, but using pyenv is a good alternative too. This is needed for the 'test' task to work as intended.

Development Automation

This project comes with a fully configured set of automation pipelines implemented using tox. Check the tox.ini file for details about the available pipelines and their implementation.

Note! All commits to this library have to successfully complete the tox task before being pushed to GitHub. To minimize the runtime of this task, use tox --parallel.

Environments

All environments used during development are exported as .yml files and as spec.txt files to the envs folder. The environment snapshots were taken on each of the three supported OS families: Windows 11, OSx 14.5 and Ubuntu 22.04 LTS.

To install the development environment for your OS:

  1. Download this repository to your local machine using your preferred method, such as git-cloning.
  2. cd into the envs folder.
  3. Use one of the installation methods below:
    1. Preferred Method: Install tox, or use another environment with tox already installed, and call tox -e import-env.
    2. Alternative Method: Run conda env create -f ENVNAME.yml or mamba env create -f ENVNAME.yml. Replace 'ENVNAME.yml' with the name of the environment you want to install (axds_dev_osx for OSx, axds_dev_win for Windows and axds_dev_lin for Linux).

Note: the OSx environment was built against M1 (Apple Silicon) platform and may not work on Intel-based Apple devices.
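
When scripting the alternative installation method, the environment file can be chosen from the current platform. The sketch below maps the value reported by the standard library's platform.system() to the environment names listed above; the helper `env_file_for` is hypothetical and is not part of the project's tooling.

```python
import platform
from typing import Optional

# Environment names taken from the list above, keyed by platform.system() values.
ENV_FILES = {
    "Darwin": "axds_dev_osx.yml",
    "Windows": "axds_dev_win.yml",
    "Linux": "axds_dev_lin.yml",
}


def env_file_for(system: Optional[str] = None) -> str:
    """Returns the environment file name for the given or current OS family."""
    system = platform.system() if system is None else system
    try:
        return ENV_FILES[system]
    except KeyError:
        raise ValueError(f"No development environment is provided for '{system}'") from None


if __name__ == "__main__":
    try:
        print(f"conda env create -f {env_file_for()}")
    except ValueError as error:
        print(error)
```

The printed command mirrors the conda invocation from the alternative method and still has to be run from inside the envs folder.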


Authors

  • Ivan Kondratyev.

License

This project is licensed under the GPL3 License: see the LICENSE file for details.


Acknowledgments

  • All Sun Lab members for providing the inspiration and comments during the development of this library.
