Lightweight Dataverse interface in Python to upload, download and update datasets found in Dataverse instances.

EasyDataverse

EasyDataverse is a Python library for interfacing with Dataverse installations. It dynamically generates Python objects that are compatible with the metadata block configuration of a given Dataverse installation. In addition, EasyDataverse allows you to export and import datasets to and from various data formats.

Features

  • Classes that comply with the installation's metadata configuration, for flexible dataset creation.
  • Upload and download of files and directories to and from Dataverse installations.
  • Export and import of datasets to and from various formats (JSON, YAML and XML).
  • Fetch datasets from any Dataverse installation into an object-oriented structure that is ready to be integrated into other applications.

⚡️ Installation

Get started with EasyDataverse by running the following command:

# Using PyPI
pip install easyDataverse

Or install from source:

pip install git+https://github.com/gdcc/easyDataverse.git

⚙️ Quickstart

Dataset creation

EasyDataverse can connect to a given Dataverse installation and fetch all metadata fields and their properties. This allows you to create a dataset object that exposes exactly the metadata fields defined at that installation.

from easyDataverse import Dataverse

# Connect to a Dataverse installation
dataverse = Dataverse(
  server_url="https://demo.dataverse.org",
  api_token="MY_API_TOKEN",
)

# Initialize a dataset
dataset = dataverse.create_dataset()

# Fill metadata blocks
dataset.citation.title = "My dataset"
dataset.citation.subject = ["Other"]
dataset.citation.add_author(name="John Doe")
dataset.citation.add_dataset_contact(name="John Doe", email="john@doe.com")
dataset.citation.add_ds_description(value="This is a description of the dataset")

# Add files or directories to the dataset
dataset.add_file(local_path="./my.file", dv_dir="some/dir")
dataset.add_directory(dirpath="./my_directory", dv_dir="some/dir")

# Upload to the dataverse instance
dataset.upload("my_dataverse_id")
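
The idea of generating classes from a metadata configuration can be sketched with the standard library alone. The following is a minimal illustration, not EasyDataverse's actual implementation: the `CITATION_FIELDS` structure and the `build_block_class` helper are hypothetical, and a real Dataverse installation describes its metadata blocks in far more detail.

```python
from dataclasses import asdict, field, make_dataclass
from typing import List, Optional

# Hypothetical, heavily simplified description of a metadata block.
# Each entry names a field and states whether it accepts multiple values.
CITATION_FIELDS = [
    {"name": "title", "multiple": False},
    {"name": "subject", "multiple": True},
]


def build_block_class(block_name, fields):
    """Create a dataclass with one attribute per metadata field."""
    specs = []
    for spec in fields:
        if spec["multiple"]:
            # Multi-valued fields start out as an empty list.
            specs.append((spec["name"], List[str], field(default_factory=list)))
        else:
            # Single-valued fields start out unset.
            specs.append((spec["name"], Optional[str], field(default=None)))
    return make_dataclass(block_name.capitalize(), specs)


# Build the class at runtime and fill it like a regular object.
Citation = build_block_class("citation", CITATION_FIELDS)
citation = Citation()
citation.title = "My dataset"
citation.subject.append("Other")

print(asdict(citation))
```

Because the class is built at runtime from the field description, the same code adapts to installations that define different metadata fields.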

Dataset download and update

EasyDataverse allows you to download datasets from any Dataverse installation. The downloaded dataset is represented as an object-oriented structure that you can use to update metadata and files, export the dataset to various formats, or pass it to subsequent applications.

from easyDataverse import Dataverse

# Method 1: Download a dataset by its DOI
dataverse = Dataverse("https://demo.dataverse.org")
dataset = dataverse.load_dataset(
    pid="doi:10.70122/FK2/W5AGKD",
    version="1",
    filedir="place/for/data",
)

# Method 2: Download via URL
dataset, dataverse = Dataverse.from_ds_url(
    url="https://demo.dataverse.org/dataset.xhtml?persistentId=doi:10.70122/XX/XXXXX&version=DRAFT",
    api_token="MY_API_TOKEN"
)

# Display the content of the dataset
print(dataset)

# Update metadata
dataset.citation.title = "My even nicer dataset"

# Synchronize with the dataverse instance
dataset.update()
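
A dataset page URL like the one used in Method 2 already carries the server address, the persistent identifier and the version. The sketch below shows how these parts can be separated with the standard library; `split_dataset_url` is a hypothetical helper written for illustration and is not part of EasyDataverse, whose internal handling may differ.

```python
from urllib.parse import parse_qs, urlparse


def split_dataset_url(url):
    """Return (server_url, persistent_id, version) from a dataset page URL."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)

    server_url = f"{parsed.scheme}://{parsed.netloc}"
    persistent_id = query["persistentId"][0]
    # The version parameter is optional in the URL, so fall back to None.
    version = query.get("version", [None])[0]
    return server_url, persistent_id, version


server_url, pid, version = split_dataset_url(
    "https://demo.dataverse.org/dataset.xhtml"
    "?persistentId=doi:10.70122/FK2/W5AGKD&version=DRAFT"
)
print(server_url, pid, version)
```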

📖 Documentation and more examples

You can find a thorough example notebook in the examples directory. This notebook demonstrates the basic concepts of EasyDataverse and how to use it in practice.

✍️ Authors

  • Jan Range (EXC2075 SimTech, University of Stuttgart)

⚠️ License

EasyDataverse is free and open-source software licensed under the MIT License.
