Python code to collect metrics from datasets in Dataverse collections

Project description

dv-api-metrics

  • This module provides Python scripts to collect metrics about Dataverse datasets and collections in demo.dataverse.org or dataverse.harvard.edu.
  • The scripts are primarily used to support the CAFE project and will be archived when the Dataverse Hub supports additional metrics reports.

Requirements

  • Requires Python 3.10 or greater
  • Uses the command line (bash shell)

Installation

  • Create an account on dataverse.harvard.edu and/or demo.dataverse.org
  • Retrieve your API token
  • On the command line, create a new directory, such as dv_api_metrics_reports
  • Then run pip install dv-api-metrics to install the module
  • On the command line, set the DATAVERSE_API_TOKEN environment variable to your API token
  • Execute the desired script
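
As a rough illustration of the token step, the sketch below reads DATAVERSE_API_TOKEN from the environment and turns it into a request header. The function name build_auth_headers is hypothetical and is not part of dv-api-metrics; X-Dataverse-key is the header the Dataverse API accepts for API tokens, but how the module itself passes the token is not documented here.

```python
import os


def build_auth_headers(env=os.environ):
    """Return request headers carrying the Dataverse API token.

    Hypothetical helper for illustration; not part of dv-api-metrics.
    """
    token = env.get("DATAVERSE_API_TOKEN", "").strip()
    if not token:
        # Fail early with a clear message instead of sending an unauthenticated request.
        raise RuntimeError("DATAVERSE_API_TOKEN is not set; export it before running a report")
    # The Dataverse API accepts API tokens in the X-Dataverse-key header.
    return {"X-Dataverse-key": token}
```

Checking the variable up front gives a clearer error than a failed API call later in a report run.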

The reports can be run with simple commands, such as:

  • dv-collection-subjects CAFE
  • dv-collection-citations CAFE
  • dv-harvest-counts CAFE
  • dv-collection-inventory CAFE
  • dv-monthly-datasets CAFE
  • dv-harvest-views CAFE
  • dv-monthly-downloads CAFE

These commands write their reports to your current directory (e.g., ~/dv_api_metrics_reports). Note that a report may fail intermittently when accessing the REST APIs, most typically with a 403 Forbidden response that is probably due to rate limiting. If a report fails, rerunning it later usually succeeds.
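
Because failures are intermittent, one option is to wrap a report command in a retry loop with exponential backoff. The sketch below is a minimal example under two assumptions: that the commands exit with a non-zero status when they fail, and that waiting longer between attempts helps with rate limiting. The helper name run_with_retries and its defaults are hypothetical.

```python
import subprocess
import time


def run_with_retries(command, attempts=5, base_delay=30.0,
                     runner=subprocess.run, sleep=time.sleep):
    """Run a report command, retrying with exponential backoff on failure.

    Assumes the command signals failure through a non-zero exit code.
    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, attempts + 1):
        result = runner(command)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Wait 30 s, 60 s, 120 s, ... before trying again.
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"{command[0]} still failing after {attempts} attempts")


# Example call (requires dv-api-metrics to be installed):
# run_with_retries(["dv-collection-citations", "CAFE"])
```

The runner and sleep parameters exist only so the loop can be exercised without launching a real process or waiting.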

Detailed Usage

The module includes the following scripts to collect metrics about Dataverse datasets and collections in demo.dataverse.org or dataverse.harvard.edu.
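
Most of the scripts listed below share one command shape: a positional collection plus --installation, --filename, and --verbose options. The sketch below mirrors that shape with argparse to show how the installation keyword could map to a server. It is not the module's actual parser; the INSTALLATIONS mapping and the hdv default are assumptions inferred from the two servers named above.

```python
import argparse

# Assumed mapping from the installation keywords to base URLs.
INSTALLATIONS = {
    "hdv": "https://dataverse.harvard.edu",
    "demo": "https://demo.dataverse.org",
}


def parse_args(argv):
    """Parse arguments in the shape shared by most scripts (illustrative only)."""
    parser = argparse.ArgumentParser(description="Hypothetical report CLI")
    parser.add_argument("collection", help="collection alias, e.g. CAFE")
    # The default of hdv is an assumption, not documented behavior.
    parser.add_argument("--installation", choices=sorted(INSTALLATIONS), default="hdv")
    parser.add_argument("--filename", help="output file name")
    parser.add_argument("--verbose", action="store_true")
    return parser.parse_args(argv)


args = parse_args(["CAFE", "--installation", "demo", "--verbose"])
base_url = INSTALLATIONS[args.installation]
```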

get_collection_dataset_citations.py

  • Get a collection's dataset citations. Includes its subcollections.
  • % python get_collection_dataset_citations.py <collection> --installation [hdv|demo] --filename <filename> --verbose

get_collection_dataset_inventory.py

  • Get a collection's dataset inventory. Includes its subcollections.
  • % python get_collection_dataset_inventory.py <collection> --installation [hdv|demo] --filename <filename> --verbose

get_collection_datasets_per_subject_count.py

  • Get the total count of datasets per subject in a collection and its subcollections.
  • % python get_collection_datasets_per_subject_count.py <collection> --installation [hdv|demo] --filename <filename> --verbose

get_collection_harvest_dataset_counts.py

  • Get a collection's harvested dataset counts. Includes its subcollections.
  • % python get_collection_harvest_dataset_counts.py <collection> --installation [hdv|demo] --filename <filename> --verbose

get_collection_harvest_dataset_views.py

  • Get a collection's harvested dataset unique views. Includes its subcollections.
  • % python get_collection_harvest_dataset_views.py <collection> --installation [hdv|demo] --filename <filename> --verbose

get_collection_metrics.py

  • Get monthly unique dataset download metrics for a named collection. Saves metrics to a tab-delimited file.
  • % python get_collection_metrics.py <installation> <collection> --metrics [dvm|mdc] --filename <filename> --output [records|time_series] --verbose

get_collection_unique_monthly_downloads.py

  • Get monthly unique dataset download metrics for a named collection. Saves metrics to a tab-delimited file.
  • % python get_collection_unique_monthly_downloads.py <installation> <collection> --metrics [dvm|mdc] --filename <filename> --output [records|time_series] --verbose
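
Since the metrics are saved to a tab-delimited file, they can be loaded with the standard csv module. The sketch below assumes the file starts with a header row; the column names date and count in the sample are hypothetical, as the real report columns are not documented here.

```python
import csv
import io


def read_metrics(handle):
    """Parse a tab-delimited report into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Hypothetical sample data; the real column names may differ.
sample = "date\tcount\n2021-01\t12\n2021-02\t7\n"
rows = read_metrics(io.StringIO(sample))

# Sum the (assumed) count column across all months.
total = sum(int(row["count"]) for row in rows)
```

For a real report, pass an open file instead of the in-memory sample, for example open(filename, newline="").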

Limitations

  • Module uses existing Dataverse Metrics API endpoints or Native API endpoints.
  • Make Data Count (MDC) metrics range from 2020-09 to the present.
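
To make the MDC date range concrete, the sketch below lists the months for which MDC metrics could exist, from 2020-09 through a given end date. The function name mdc_months is hypothetical, and the YYYY-MM label format is an assumption taken from the way the start month is written above.

```python
from datetime import date

# First month with Make Data Count metrics, per the limitation above.
MDC_START = (2020, 9)


def mdc_months(end=None):
    """Return YYYY-MM labels from 2020-09 through the month of `end` (default: today)."""
    end = end or date.today()
    year, month = MDC_START
    months = []
    while (year, month) <= (end.year, end.month):
        months.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            # Roll over into January of the next year.
            year, month = year + 1, 1
    return months
```

A month earlier than 2020-09 yields an empty list, which is a convenient signal that no MDC data should be expected.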

Download files

Source Distribution

dv_api_metrics-0.2.9.tar.gz (7.6 MB view details)

Uploaded Source

Built Distribution

dv_api_metrics-0.2.9-py3-none-any.whl (7.6 MB view details)

Uploaded Python 3

File details

Details for the file dv_api_metrics-0.2.9.tar.gz.

File metadata

  • Download URL: dv_api_metrics-0.2.9.tar.gz
  • Upload date:
  • Size: 7.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.17 {"installer":{"name":"uv","version":"0.9.17","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for dv_api_metrics-0.2.9.tar.gz
Algorithm Hash digest
SHA256 113d62f48c9bff70afd07f126ea3d512a1e3dbf34e4234edbf84c5d1b3b2e306
MD5 c67295e95838f4fc3d9faff1d98c1be3
BLAKE2b-256 2fc6e4d9580761fcc6939f5640e57ef8af5e5f5d0227113b897dbfec87fdd858

File details

Details for the file dv_api_metrics-0.2.9-py3-none-any.whl.

File metadata

  • Download URL: dv_api_metrics-0.2.9-py3-none-any.whl
  • Upload date:
  • Size: 7.6 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.17 {"installer":{"name":"uv","version":"0.9.17","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for dv_api_metrics-0.2.9-py3-none-any.whl
Algorithm Hash digest
SHA256 96aaee344a89222425897bf1ca932a0bef5fbe1b1a3acafe6fb71e5435cfc0fd
MD5 af650c649ff351197a8af14e8aac6d55
BLAKE2b-256 27df0276f08bd4870de610fe2f193ea3756700dc8b83585ed4d2ad98df98b01f
