Python code to collect metrics from datasets in Dataverse collections
dv-api-metrics
- This module provides Python scripts to collect metrics about Dataverse datasets and collections in demo.dataverse.org or dataverse.harvard.edu.
- The scripts are primarily used to support the CAFE project and will be archived when the Dataverse Hub supports additional metrics reports.
Requirements
- Requires Python 3.10 or greater
- Uses the command line (`bash` shell)
Installation
- Create a `dataverse.harvard.edu` and/or `demo.dataverse.org` account
- Retrieve your API key
- On the command line, create a new directory, such as `dv_api_metrics_reports`
- Then, type `pip install dv-api-metrics` to install the module
- On the command line, set `$DATAVERSE_API_TOKEN` to your API token
- Execute the desired script
The reports can be run with simple commands, such as:
- `dv-collection-subjects CAFE`
- `dv-collection-citations CAFE`
- `dv-harvest-counts CAFE`
- `dv-collection-inventory CAFE`
- `dv-monthly-datasets CAFE`
- `dv-harvest-views CAFE`
- `dv-monthly-downloads CAFE`
These commands write the reports to your current directory (e.g., `~/dv_api_metrics_reports`). Note that a report may fail intermittently because of problems reaching the REST APIs, most often a 403 Forbidden response that is probably due to rate limiting. If a report fails, rerunning it later usually succeeds.
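Because a failed report usually succeeds on a later attempt, one way to run the whole batch is a small retry wrapper. The following is a minimal sketch and not part of dv-api-metrics: the names `run_with_retries` and `main`, the attempt count, and the delays are illustrative assumptions, and it assumes each report command exits with a non-zero status when it fails.

```python
import os
import subprocess
import sys
import time

# Report commands listed above.
REPORTS = [
    "dv-collection-subjects",
    "dv-collection-citations",
    "dv-harvest-counts",
    "dv-collection-inventory",
    "dv-monthly-datasets",
    "dv-harvest-views",
    "dv-monthly-downloads",
]


def run_with_retries(cmd, attempts=4, base_delay=30.0,
                     runner=subprocess.run, sleep=time.sleep):
    """Run cmd until it exits with status 0 or the attempts are used up.

    Waits base_delay, 2*base_delay, 4*base_delay, ... seconds between
    attempts. Returns True on success, False otherwise.
    """
    for attempt in range(attempts):
        result = runner(cmd)
        if result.returncode == 0:
            return True
        if attempt < attempts - 1:
            # Back off before the next try (assumed helpful for rate limits).
            sleep(base_delay * 2 ** attempt)
    return False


def main(collection="CAFE"):
    """Run every report for one collection; return the names that failed."""
    if not os.environ.get("DATAVERSE_API_TOKEN"):
        sys.exit("Set DATAVERSE_API_TOKEN before running the reports.")
    failed = [name for name in REPORTS
              if not run_with_retries([name, collection])]
    if failed:
        print("Still failing after retries:", ", ".join(failed))
    return failed
```

Calling `main()` from the reports directory runs each command in turn. The growing delay between attempts is a design choice meant to avoid hitting the same rate limit again immediately.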
Detailed Usage
The module includes the following scripts to collect metrics about Dataverse datasets and collections in demo.dataverse.org or dataverse.harvard.edu.
get_collection_dataset_citations.py
- Get a collection's dataset citations. Includes its subcollections.
% python get_collection_dataset_citations.py <collection> --installation [hdv|demo] --filename <filename> --verbose
get_collection_dataset_inventory.py
- Get a collection's dataset inventory. Includes its subcollections.
% python get_collection_dataset_inventory.py <collection> --installation [hdv|demo] --filename <filename> --verbose
get_collection_datasets_per_subject_count.py
- Get the total count of datasets per subject in a collection and its subcollections.
% python get_collection_datasets_per_subject_count.py <collection> --installation [hdv|demo] --filename <filename> --verbose
get_collection_harvest_dataset_counts.py
- Get a collection's harvested dataset counts. Includes its subcollections.
% python get_collection_harvest_dataset_counts.py <collection> --installation [hdv|demo] --filename <filename> --verbose
get_collection_harvest_dataset_views.py
- Get a collection's harvested dataset unique views. Includes its subcollections.
% python get_collection_harvest_dataset_views.py <collection> --installation [hdv|demo] --filename <filename> --verbose
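The scripts above share one argument layout, so they can also be driven from Python. The following is a rough sketch that assembles the command line from the usage shown. The helper names `build_script_command` and `run_script` are hypothetical, and it is an assumption here that `hdv` is a sensible default and that `--filename` and `--verbose` are optional.

```python
import subprocess
import sys

INSTALLATIONS = ("hdv", "demo")  # values shown in the usage lines above


def build_script_command(script, collection, installation="hdv",
                         filename=None, verbose=False):
    """Return the argv list for one of the collection scripts."""
    if installation not in INSTALLATIONS:
        raise ValueError(f"installation must be one of {INSTALLATIONS}")
    cmd = [sys.executable, script, collection, "--installation", installation]
    if filename:
        cmd += ["--filename", filename]
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_script(script, collection, **options):
    """Run a script and raise CalledProcessError if it exits non-zero."""
    return subprocess.run(build_script_command(script, collection, **options),
                          check=True)
```

For example, `run_script("get_collection_dataset_citations.py", "CAFE", installation="demo", verbose=True)` would invoke the citations script against the demo installation.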
get_collection_metrics.py
- Get monthly unique dataset download metrics for a named collection. Saves metrics to a tab-delimited file.
% python get_collection_metrics.py <installation> <collection> --metrics [dvm|mdc] --filename <filename> --output [records|time_series] --verbose
get_collection_unique_monthly_downloads.py
- Get monthly unique dataset download metrics for a named collection. Saves metrics to a tab-delimited file.
% python get_collection_unique_monthly_downloads.py <installation> <collection> --metrics [dvm|mdc] --filename <filename> --output [records|time_series] --verbose
Limitations
- Module uses existing Dataverse Metrics API endpoints or Native API endpoints.
- Make Data Count (MDC) metrics range from 2020-09 to the present.
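Since MDC metrics begin at 2020-09, a monthly report cannot contain MDC values for earlier months. The sketch below lists the months such a report could cover. The helper `mdc_months` is hypothetical and is not part of the module.

```python
from datetime import date

MDC_START = (2020, 9)  # first month with MDC metrics, per the limitation above


def mdc_months(end=None):
    """List 'YYYY-MM' strings from 2020-09 through the month of `end`.

    `end` defaults to today's date; an `end` before 2020-09 gives [].
    """
    end = end or date.today()
    year, month = MDC_START
    months = []
    while (year, month) <= (end.year, end.month):
        months.append(f"{year:04d}-{month:02d}")
        # Advance one month, rolling over to January after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return months
```

A list like this can be used to check that a time series has one row per expected month.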