DataSALad

This is a pure-Python library with a collection of utilities for working with data in the vicinity of Git and git-annex. While this is a foundational library from and for the DataLad project, its implementations are standalone, and are meant to be equally well usable outside the DataLad system.

A focus of this library is efficient communication with subprocesses, such as Git or git-annex commands, which read and produce data in some format. The library provides utilities to integrate such subprocesses into Python algorithms, for example, to iteratively amend information in JSON-lines formatted data streams that are retrieved in arbitrary chunks over a network connection.

Here is a simple demo of how an iterable with inputs can be fed to the cat shell command, while reading its output back as a Python iterable.

>>> from datasalad.runners import iter_subproc
>>> with iter_subproc(['cat'], inputs=[b'one', b'two', b'three']) as proc:
...     for chunk in proc:
...         print(chunk)
b'onetwothree'
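
The JSON-lines scenario mentioned above can be illustrated without the library at all. The following is a minimal standard-library sketch, not datasalad's implementation: the helper names iter_lines and amend_records and the sample records are hypothetical. It shows the underlying problem that chunk boundaries do not align with record boundaries, so lines must be reassembled before each record can be decoded and amended.

```python
import json
from typing import Iterable, Iterator


def iter_lines(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield complete lines from byte chunks split at arbitrary positions."""
    pending = b""
    for chunk in chunks:
        pending += chunk
        # Everything before the last newline is complete; keep the remainder.
        *lines, pending = pending.split(b"\n")
        yield from lines
    if pending:
        # Trailing data that was not terminated by a newline.
        yield pending


def amend_records(chunks: Iterable[bytes], **extra) -> Iterator[dict]:
    """Decode each JSON line and add the given key/value pairs to it."""
    for line in iter_lines(chunks):
        if line.strip():
            yield {**json.loads(line), **extra}


# Hypothetical input: chunk boundaries fall in the middle of records.
chunks = [b'{"path": "a.t', b'xt"}\n{"pa', b'th": "b.txt"}\n']
records = list(amend_records(chunks, source="demo"))
```

Because both helpers are generators, records are produced as soon as their line is complete, and the whole stream never has to be held in memory.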

Developing with datasalad

API stability is important, as are adequate semantic versioning and informative changelogs.

Public vs internal API

Anything that can be imported directly from any of the sub-packages in datasalad is considered to be part of the public API. Changes to this API determine the versioning, and development is done with the aim to keep this API as stable as possible. This includes signatures and return value behavior.

As an example: from datasalad.runners import iter_git_subproc imports a part of the public API, but from datasalad.runners.git import iter_git_subproc does not.

Use of the internal API

Developers can obviously use parts of the non-public API. However, this should only be done with the understanding that these components may change from one release to another, with no guarantee of transition periods, deprecation warnings, etc.

Developers are advised to never reuse any components with names starting with _ (underscore). Their use should be limited to their individual subpackage.
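
The two rules above can be restated as a small check on import statements. This is only an illustrative sketch: is_public_import is a hypothetical helper, not part of datasalad, and it assumes that "sub-package" means a first-level package such as datasalad.runners.

```python
def is_public_import(module: str, name: str, package: str = "datasalad") -> bool:
    """Return True if `from <module> import <name>` follows the rules above."""
    parts = module.split(".")
    # Public: imported directly from a sub-package, i.e. `<package>.<subpackage>`
    # (assumption: only first-level sub-packages count).
    directly_from_subpackage = len(parts) == 2 and parts[0] == package
    # Names starting with an underscore are internal to their subpackage.
    no_private_parts = not any(p.startswith("_") for p in [*parts, name])
    return directly_from_subpackage and no_private_parts


# The two imports used as examples in the text above.
public = is_public_import("datasalad.runners", "iter_git_subproc")
internal = is_public_import("datasalad.runners.git", "iter_git_subproc")
```

A check like this could be run over a code base to flag imports that depend on internals and may therefore break between releases.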

Contributing

Contributions to this library are welcome! Please see the contributing guidelines for details on the scope and style of potential contributions.
