
fairly

A package to create, publish and clone research datasets.

License: MIT

Installation

fairly requires Python 3.8 or later, and ruamel.yaml version 0.17.26 or later. It can be installed directly using pip.

pip install fairly
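Before installing, it can help to confirm that the environment meets the requirements stated above. The following is a minimal sketch using only the standard library. The helper names `parse_version`, `meets_minimum`, and `check_environment` are illustrative and not part of fairly, and the parser assumes purely numeric dotted version strings.

```python
import sys
from importlib import metadata


def parse_version(text):
    """Turn a dotted numeric version such as '0.17.26' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment():
    """Report whether Python and ruamel.yaml satisfy the documented minimums."""
    python_ok = sys.version_info >= (3, 8)
    try:
        yaml_ok = meets_minimum(metadata.version("ruamel.yaml"), "0.17.26")
    except (metadata.PackageNotFoundError, ValueError):
        # Package not installed, or a non-numeric version string: result unknown
        yaml_ok = None
    return python_ok, yaml_ok


python_ok, yaml_ok = check_environment()
print("Python 3.8 or later:", python_ok)
print("ruamel.yaml 0.17.26 or later:", yaml_ok)
```

A result of `None` for ruamel.yaml only means this simple check could not determine the installed version.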

Installing from source

  1. Clone or download the source code:

    git clone https://github.com/ITC-CRIB/fairly.git

  2. Go to the root directory:

    cd fairly/

  3. Compile and install using pip:

    pip install .

Usage

Basic example of creating a local research dataset and depositing it in a repository:

import fairly

# Initialize a local dataset
dataset = fairly.init_dataset('/path/dataset')

# Set metadata
dataset.metadata['license'] = 'MIT'
dataset.set_metadata(
    title='My dataset',
    keywords=['FAIR', 'research', 'data'],
    authors=[
        '0000-0002-0156-185X',
        {'name': 'John', 'surname': 'Doe'}
    ]
)

# Add data files
dataset.includes.extend([
    'README.txt',
    '*.csv',
    'train/*.jpg'
])

# Save dataset
dataset.save()

# Upload to a data repository
remote_dataset = dataset.upload('zenodo')
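To get a feel for which files patterns such as those in the example above would select, the sketch below resolves them with `pathlib` against a temporary directory. It assumes the patterns are glob-style and relative to the dataset directory. The helper `match_patterns` and the file names are hypothetical, and fairly's own matching logic may differ.

```python
import tempfile
from pathlib import Path


def match_patterns(root, patterns):
    """Return sorted relative paths of files under root matching any pattern."""
    root = Path(root)
    matched = set()
    for pattern in patterns:
        for path in root.glob(pattern):
            if path.is_file():
                matched.add(path.relative_to(root).as_posix())
    return sorted(matched)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "train").mkdir()
    # Hypothetical directory content, created only for this illustration
    for name in ["README.txt", "data.csv", "notes.md",
                 "train/cat.jpg", "train/labels.json"]:
        (root / name).touch()

    selected = match_patterns(root, ["README.txt", "*.csv", "train/*.jpg"])
    print(selected)  # ['README.txt', 'data.csv', 'train/cat.jpg']
```

Files that match no pattern (`notes.md` and `train/labels.json` here) are left out of the selection.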

Basic example to access a remote dataset and store it locally:

import fairly

# Open a remote dataset
dataset = fairly.dataset('doi:10.4121/21588096.v1')

# Get dataset information
dataset.id
>>> {'id': '21588096', 'version': '1'}

dataset.url
>>> 'https://data.4tu.nl/articles/dataset/.../21588096/1'

dataset.size
>>> 33339

len(dataset.files)
>>> 6

dataset.metadata
>>> Metadata({'keywords': ['Earthquakes', 'precursor', ...], ...})

# Update metadata
dataset.metadata['keywords'] = ['Landslides', 'precursor']
dataset.save_metadata()

# Store dataset to a local directory (i.e. clone dataset)
local_dataset = dataset.store('/path/dataset')
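The relation between the DOI string and the `dataset.id` value shown above can be illustrated with a small parser. This is a hypothetical helper written for illustration only: it assumes the layout of the example DOI (a numeric suffix with an optional `.v<N>` version) and does not reflect how fairly resolves identifiers internally.

```python
import re


def split_versioned_doi(identifier):
    """Split 'doi:<prefix>/<suffix>[.v<N>]' into prefix, record ID, and version."""
    match = re.fullmatch(r"(?:doi:)?(10\.\d+)/(\d+)(?:\.v(\d+))?", identifier)
    if match is None:
        raise ValueError(f"Unrecognized identifier: {identifier!r}")
    prefix, record_id, version = match.groups()
    # version is None when the identifier carries no '.v<N>' part
    return {"prefix": prefix, "id": record_id, "version": version}


print(split_versioned_doi("doi:10.4121/21588096.v1"))
# {'prefix': '10.4121', 'id': '21588096', 'version': '1'}
```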

Currently, the package supports the following research data management platforms:

All research data repositories based on the listed platforms are supported.

For more details and examples, consult the package documentation.

Testing

Unit tests can be run with the pytest command from the root directory.

Contributions

Read the guidelines to learn how you can contribute to this open-source project.

JupyterLab Extension

An extension for JupyterLab is being developed in a separate repository.

Citation

Please cite this software as follows:

Girgin, S., Garcia Alvarez, M., & Urra Llanusa, J., fairly: a package to create, publish and clone research datasets [Computer software]

Acknowledgements

This research is funded by the Dutch Research Council (NWO) Open Science Fund, File No. 203.001.114.

Project members:
