iBridges plugin for interacting with Dataverse
Project description
iBridges Dataverse
This package provides a plugin to the iBridges CLI and GUI. The plugin allows to create a dataset on Dataverse and to uplaod iRODS data objects.
Dependencies
- Python 3.9 or higher
- HTTPX
- PyDataverse
- iBridges and iBridges GUI (ibridges, ibridgesgui)
All packages are installable with pip.
๐ก Highlights
- Checksum checks upon upload to Dataverse
- Mark files for upload as you browse and push to Dataverse later --> Git-like workflow
- GUI and CLI receive the same information, so you can switch between them without loosing information
:warning: NOTE
-
The plugin only transfers files which are smaller than 9GB!
-
All data from iRODS will be downloaded to your local computer.
The files will be downloaded one by one to a temporary folder. From there they are uploaded to Dataverse and are finally deleted from your computer. So you need at least 9GB of extra space. In case of errors in the upload to Dataverse, the file will not be deleted from them temporary folder so that files might accumulate on your computer.
Install the plugin
Install the package from pypi:
pip install ibridgesdvn
Install the CLI from the git repository:
pip install git+https://github.com/iBridges-for-iRODS/ibridges-plugin-dataverse.git
Install the GUI and CLI:
pip install git+https://github.com/iBridges-for-iRODS/ibridges-plugin-dataverse.git
pip install "ibridgesdvn[gui]"
This installs the python package ibridgesdvn.
When you start the iBridges CLI or shell you will see the following new commands:
ibridges -h
dv-add-file Mark one or more iRODS data objects to be uploaded to a Dataverse dataset.
dv-setup Print existing Dataverse configurations or create new ones.
dv-cleanup Cleanup all entries from the status, where the list of irods files is empty.
dv-create-ds Create a new dataset in a Dataverse collection.
dv-init Provide token and store for future use
dv-push Push all local changes to the dataverse collection.
dv-rm-file Remove one or more iRODS data objects from upload to a Dataverse dataset.
dv-status List all local changes to the dataset(s).
dv-switch Switch to another existing Dataverse configuration by providing a url or alias.
If you use the iBridges GUI you will be able to select a "Dataverse" view.
The Dataverse View
- Step 1: Configure your Dataverse Connection You can add new configurations or delete old ones.
- Step 2: Select a Dataverse collection to which files should be uploaded. If you do not have a dataset yet, use the "Create New Dataset" button and get a DOI to a newly created dataset.
- Step 3: Select iRODS data objects from the right pane of the view. With the button "<<" they will be added to the list of data that will be uploaded. You can also delete one or several entries from the list.
When you are happy with the list, click "Upload to Dataverse". Now go to your Dataverse dataset in the browser and finish you data publication.
The Dataverse commands
Configuring a Dataverse instance
With the command ibridges dv-setup you can see all existing Dataverse URLs and you can create an alias for a Dataverse URL.
ibridges dv-setup dvnl-demo https://demo.dataverse.nl
To activate this environment you need to provide an API token:
ibridges dv-init dvnl-demo
Your Dataverse token for dvnl-demo :
demo -> https://demo.dataverse.org
* dvnl-demo -> https://demo.dataverse.nl
Your token will not be shown, but stored for future use.
You see that there are two Dataverse URLs. You can switch between them with:
ibridges dv-switch https://demo.dataverse.org
* demo -> https://demo.dataverse.org
dvnl-demo -> https://demo.dataverse.nl
In all of those commands you can use the URLs and the aliases interchangeably.
Note, that those commands are only available for the CLI and not for the shell. All other commands can be used from the shell.
Creating a dataset
To create a dataset, you will need a specific Dataverse dataset.json which you can provide like this:
ibridges shell
ibshell:research-christine> dv-create-ds UUscience --metajson ibridgescontrib/ibridgesdvn/dataset.json
Dataset with pid 'doi:10.80227/PDVNL/RZQRAK' created.
This creates a dataset on the Dataverse we selected with dv-init or dv-switch, it uses the Dataverse collection UUscience. Please adjust to your Dataverse instance.
You will need the 10.80227/PDVNL/RZQRAK part of the pid to add files and to finally upload them to Dataverse, so please save it somewhere.
Alternatively, you can create some minimal metadata with the command:
ibshell:research-christine> dv-create-ds UUscience --metadata
This will open a small interactive questionnaire for the most basic metadata that Dtaaverse needs to create a Dataset.
Browsing files and adding it to Dataverse
After you created the new Dataset on Dataverse, you can now use the iBridges shell to browse and add files as you go along.
We implemented a git-like workflow. So the following command will only mark files to be uploaded to a specific Dataset on Dataverse.
Browse through collections in iRODS
ibshell:research-christine> ls my_books
/uu/home/research-christine/my_books:
/uu/home/research-christine/my_books/AdventuresSherlockHolmes.txt
/uu/home/research-christine/my_books/AliceAdventuresInWonderLand.txt
/uu/home/research-christine/my_books/DonQuixote.txt
/uu/home/research-christine/my_books/Dracula.txt
And select files as you go along:
ibshell:research-christine> dv-add-file 10.80227/PDVNL/RZQRAK irods:my_books/AdventuresSherlockHolmes.txt irods:my_books/DonQuixote.txt
In the iBridges CLI and shell you can use relative iRODS paths. All iRODS paths need to be prefixed with irods:.
You will receive a short summary of all files that you ever marked for upload. This summary also contains all Datavers instances you work on and all Datasets:
{'https://demo.dataverse.nl': {'add_file': [{'dataset': '10.80227/PDVNL/RZQRAK',
'irods_paths': ['/uu/home/research-christine/my_books/AdventuresSherlockHolmes.txt',
'/uu/home/research-christine/my_books/DonQuixote.txt']}]}}
You can also remove iRODS paths again with:
ibshell:research-christine> dv-rm-file 10.80227/PDVNL/RZQRAK irods:my_books/DonQuixote.txt
And with dv-status you can check which files are marked for upload:
ibshell:research-christine> dv-status
{'https://demo.dataverse.nl': {'add_file': [{'dataset': '10.80227/PDVNL/RZQRAK',
'irods_paths': ['/uu/home/research-christine/my_books/AdventuresSherlockHolmes.txt']}]}}
Upload data to dataset
Now that you are happy with the marked data, upload them to the dataset:
ibshell:research-christine> dv-push 10.80227/PDVNL/RZQRAK
Data stored in /Users/staig001/.dvn/data
/uu/home/research-christine/my_books/AdventuresSherlockHolmes.txt
100%|โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ| 580k/580k [00:00<00:00, 1.01MB/s]
Downloaded /uu/home/research-christine/my_books/AdventuresSherlockHolmes.txt --> /Users/staig001/.dvn/data/AdventuresSherlockHolmes.txt
{'pid': 'doi:dataset_id', 'filename': 'AdventuresSherlockHolmes.txt'}
Uploaded /Users/staig001/.dvn/data/AdventuresSherlockHolmes.txt --> 10.80227/PDVNL/RZQRAK
A successful download and upload will trigger that those files will be removed from the status.
ibshell:research-christine> dv-status
{'https://demo.dataverse.nl': {'add_file': [{'dataset': '10.80227/PDVNL/RZQRAK',
'irods_paths': []}]}}
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file ibridgesdvn-1.0.2.tar.gz.
File metadata
- Download URL: ibridgesdvn-1.0.2.tar.gz
- Upload date:
- Size: 767.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ef0f9553db7dce27382a0025cbbff8b9c659d98bfcbeb147c2f630cdcf95326e
|
|
| MD5 |
46e0c51b204c013bd0bf18a654ac45f8
|
|
| BLAKE2b-256 |
6b50639c63d55efcb35032636d5fd81460f125b334a59ff74ea2e006707dfa39
|
File details
Details for the file ibridgesdvn-1.0.2-py3-none-any.whl.
File metadata
- Download URL: ibridgesdvn-1.0.2-py3-none-any.whl
- Upload date:
- Size: 57.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
37d765fae86d26c564ab6b3679bd58f66a069d6a817eeea31b04bf1a48679661
|
|
| MD5 |
b343bd516fe7e52b3234a45baff9e830
|
|
| BLAKE2b-256 |
464d3983bd06abc51fdf4a79b27205540c779c5e0e653cc8282871e5480e00a1
|