Python code to collect metrics from datasets in Dataverse collections

Project description

dv-api-metrics

  • This module provides Python scripts to collect metrics about Dataverse datasets and collections in demo.dataverse.org or dataverse.harvard.edu.
  • The scripts are primarily used to support the CAFE project and will be archived when the Dataverse Hub supports additional metrics reports.

Requirements

  • Requires Python 3.10 or greater
  • Uses the command line (bash shell)

Installation

  • Create an account on dataverse.harvard.edu and/or demo.dataverse.org
  • Retrieve your API token
  • On the command line, create a new directory, such as dv_api_metrics_reports
  • Then, type: pip install dv-api-metrics to install the module
  • On the command line, set $DATAVERSE_API_TOKEN to your API token
  • Run the desired script
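
Because every report depends on the API token, a quick pre-flight check can save a failed run. The following is a minimal sketch, assuming the token is read from the DATAVERSE_API_TOKEN environment variable as described in the steps above. The helper name require_api_token is hypothetical and is not part of dv-api-metrics.

```python
import os


def require_api_token(env=None):
    """Return the API token, or raise if DATAVERSE_API_TOKEN is unset or blank.

    Hypothetical helper for illustration only; not part of dv-api-metrics.
    """
    # Allow a plain dict to be passed in so the check is easy to test.
    env = os.environ if env is None else env
    token = env.get("DATAVERSE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "DATAVERSE_API_TOKEN is not set; export it before running a report."
        )
    return token
```

Calling require_api_token() before launching a report raises a RuntimeError with a readable message when the variable is missing.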

The reports can be run with simple commands, such as:

  • dv-collection-subjects CAFE
  • dv-collection-citations CAFE
  • dv-harvest-counts CAFE
  • dv-collection-inventory CAFE
  • dv-monthly-datasets CAFE
  • dv-harvest-views CAFE
  • dv-monthly-downloads CAFE

These commands write the reports to your current directory (e.g., ~/dv_api_metrics_reports). Note that a report may fail intermittently when it accesses the REST APIs, most typically with a 403 Forbidden status code that is probably due to rate limiting. If a report fails, run it again later; it should eventually succeed.
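
Since failed reports tend to succeed on a later attempt, rerunning can be automated. The following is a minimal sketch of retrying with an increasing delay. It assumes that a failed report exits with a non-zero status, which this page does not confirm. The function name run_with_retries and the default delays are illustrative choices, not part of dv-api-metrics.

```python
import subprocess
import time


def run_with_retries(cmd, attempts=4, base_delay=30.0,
                     runner=subprocess.run, sleep=time.sleep):
    """Run cmd until it exits with status 0 or the attempts are used up.

    Waits base_delay, 2*base_delay, 4*base_delay, ... seconds between tries.
    Returns True on success and False if every attempt failed.
    Assumption: a failed report exits with a non-zero status.
    """
    for attempt in range(attempts):
        result = runner(cmd)
        if result.returncode == 0:
            return True
        # Do not sleep after the final attempt.
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return False
```

For example, run_with_retries(["dv-collection-subjects", "CAFE"]) would try the report up to four times, pausing 30, 60, and 120 seconds between attempts. The runner and sleep parameters exist only so the logic can be exercised without launching a real command.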

Detailed Usage

The module includes the following scripts to collect metrics about Dataverse datasets and collections in demo.dataverse.org or dataverse.harvard.edu.

get_collection_dataset_citations.py

  • Get a collection's dataset citations. Includes its subcollections.
  • % python get_collection_dataset_citations.py <collection> --installation [hdv|demo] --filename <filename> --verbose

get_collection_dataset_inventory.py

  • Get a collection's dataset inventory. Includes its subcollections.
  • % python get_collection_dataset_inventory.py <collection> --installation [hdv|demo] --filename <filename> --verbose

get_collection_datasets_per_subject_count.py

  • Get the total count of datasets per subject in a collection and its subcollections.
  • % python get_collection_datasets_per_subject_count.py <collection> --installation [hdv|demo] --filename <filename> --verbose

get_collection_harvest_dataset_counts.py

  • Get a collection's harvested dataset counts. Includes its subcollections.
  • % python get_collection_harvest_dataset_counts.py <collection> --installation [hdv|demo] --filename <filename> --verbose

get_collection_harvest_dataset_views.py

  • Get a collection's harvested dataset unique views. Includes its subcollections.
  • % python get_collection_harvest_dataset_views.py <collection> --installation [hdv|demo] --filename <filename> --verbose

get_collection_metrics.py

  • Get monthly unique dataset download metrics for a named collection. Saves metrics to a tab-delimited file.
  • % python get_collection_metrics.py <installation> <collection> --metrics [dvm | mdc] --filename <filename> --output [records|time_series] --verbose

get_collection_unique_monthly_downloads.py

  • Get monthly unique dataset download metrics for a named collection. Saves metrics to a tab-delimited file.
  • % python get_collection_unique_monthly_downloads.py <installation> <collection> --metrics [dvm | mdc] --filename <filename> --output [records|time_series] --verbose
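
The tab-delimited files these scripts save can be loaded for further analysis with the Python standard library. The following is a minimal sketch, assuming the file starts with a header row. The column names are not documented on this page, so the reader returns whatever headers the file contains. The function name read_tsv is illustrative.

```python
import csv


def read_tsv(path):
    """Read a tab-delimited file with a header row into a list of dicts.

    Assumption: the first row holds the column names. All values are
    returned as strings; convert counts to int as needed.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

Each returned row is a dictionary keyed by the header names, so a report written with --filename <filename> can be inspected with read_tsv("<filename>").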

Limitations

  • Module uses existing Dataverse Metrics API endpoints or Native API endpoints.
  • Make Data Count (MDC) metrics range from 2020-09 to the present.
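
The MDC date limit determines which months a time series can cover. The following is a minimal sketch that lists the months from 2020-09 through a given end date, which can help check a report for missing months. The function name mdc_months is hypothetical and is not part of dv-api-metrics.

```python
from datetime import date

MDC_START = (2020, 9)  # earliest MDC month stated in the limitations above


def mdc_months(end=None):
    """List 'YYYY-MM' strings from 2020-09 through the month of `end`.

    `end` defaults to today. Returns an empty list if `end` is before 2020-09.
    """
    end = end or date.today()
    year, month = MDC_START
    months = []
    while (year, month) <= (end.year, end.month):
        months.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months
```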

Download files

Source Distribution

dv_api_metrics-0.3.0.tar.gz (7.6 MB)

Built Distribution

dv_api_metrics-0.3.0-py3-none-any.whl (7.6 MB)

File details

Details for the file dv_api_metrics-0.3.0.tar.gz.

File metadata

  • Download URL: dv_api_metrics-0.3.0.tar.gz
  • Upload date:
  • Size: 7.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.17

File hashes

Hashes for dv_api_metrics-0.3.0.tar.gz
Algorithm Hash digest
SHA256 1ba4f925d004c519849b80448bbf7cfb09ea148503eeeb61cdc2601c60b5757a
MD5 17ff8b04f81d12c52b823ed1f283ec8f
BLAKE2b-256 fed74c1eed9f9eff8631b35b812901abc715cdfe659219d24654350e1a778f59

File details

Details for the file dv_api_metrics-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: dv_api_metrics-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 7.6 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.17

File hashes

Hashes for dv_api_metrics-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 f3a1d287d741868ed3d6faa8adf1d30f654a876b027a1f3e37b1734db6372392
MD5 d4e04ff168e1fd3d42cf34236f885694
BLAKE2b-256 8cdecd28629d9884b9b70d16f09c4f101a7f1887471f03e7cb75ea85ed0b6f5e
