A Python package for interfacing with the Mozilla Data Collective's API

Mozilla Data Collective Python API Library

Python library for interfacing with the Mozilla Data Collective REST API.

Installation

pip install datacollective

Quick Start

IMPORTANT NOTE: Before trying to access any dataset, make sure you have thoroughly read and agreed to the specific dataset's conditions & licensing terms.

  1. Get your API key from the Mozilla Data Collective dashboard

  2. Set the API key as an environment variable, using either option below:

Option A: Run this command in your terminal (replace your-api-key-here with your actual API key):

export MDC_API_KEY=your-api-key-here

Option B: Create a .env file in your project directory and add this line:

MDC_API_KEY=your-api-key-here
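
Before calling the library, it can help to confirm that the key is actually visible to your Python process. The sketch below is a hypothetical pre-flight check and not part of `datacollective`: the helper name `read_api_key` and its minimal `.env` parsing are assumptions made for illustration. It only uses the variable name `MDC_API_KEY` and the `.env` location described above.

```python
import os
from pathlib import Path


def read_api_key(env_file: str = ".env", var_name: str = "MDC_API_KEY") -> str:
    """Hypothetical helper: return the key from the environment or a .env file."""
    # Option A: the variable was exported in the shell.
    value = os.environ.get(var_name, "").strip()
    if value:
        return value

    # Option B: look for a NAME=value line in a local .env file.
    path = Path(env_file)
    if path.is_file():
        for raw_line in path.read_text(encoding="utf-8").splitlines():
            line = raw_line.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, candidate = line.partition("=")
            if key.strip() == var_name:
                candidate = candidate.strip().strip("'\"")
                if candidate:
                    return candidate

    raise RuntimeError(f"{var_name} is not set; export it or add it to {env_file}")
```

Calling `read_api_key()` at the start of a script turns a missing key into an immediate, readable error instead of a failed request later on.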

  3. Get your dataset ID from the last segment of the dataset's URL on the MDC website.

[!TIP] You can find the dataset ID in the URL of the dataset's page on the MDC platform. The ID is the unique string of characters at the very end of the URL, after the /datasets/ path. For example, for the URL https://datacollective.mozillafoundation.org/datasets/cminc35no007no707hql26lzk the dataset ID is cminc35no007no707hql26lzk.
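
If you have a dataset URL rather than a bare ID, the rule in the tip above can be applied programmatically. This is a minimal sketch using only the standard library; `dataset_id_from_url` is a hypothetical helper name and is not provided by `datacollective`.

```python
from urllib.parse import urlparse


def dataset_id_from_url(url: str) -> str:
    """Hypothetical helper: return the path segment that follows /datasets/."""
    # Split the URL path into non-empty segments, ignoring any trailing slash.
    segments = [segment for segment in urlparse(url).path.split("/") if segment]
    if len(segments) < 2 or segments[-2] != "datasets":
        raise ValueError(f"not a dataset page URL: {url!r}")
    return segments[-1]


url = "https://datacollective.mozillafoundation.org/datasets/cminc35no007no707hql26lzk"
print(dataset_id_from_url(url))  # prints: cminc35no007no707hql26lzk
```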

  4. Save a dataset locally:

from datacollective import save_dataset_to_disk

dataset_path = save_dataset_to_disk("your-dataset-id")

[!TIP] Automatic Resume: If a download is interrupted (e.g., by a network error or because you stopped it manually), the next attempt to download the same dataset to the same folder automatically resumes from where the previous download left off.
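
Because an interrupted download resumes on the next attempt, a simple retry loop is one way to make long downloads more robust. The sketch below is an illustration rather than a library feature: `save_with_retries`, `attempts`, and `delay_seconds` are hypothetical names, and catching a broad `Exception` is an assumption, since the exceptions raised by `save_dataset_to_disk` are not listed here. The download function is passed in as an argument so the helper does not depend on any particular signature beyond taking a dataset ID.

```python
import time
from typing import Any, Callable


def save_with_retries(
    dataset_id: str,
    saver: Callable[[str], Any],
    attempts: int = 3,
    delay_seconds: float = 5.0,
) -> Any:
    """Hypothetical helper: call saver(dataset_id) until it succeeds."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # Each new call can pick up where the previous download stopped.
            return saver(dataset_id)
        except Exception as exc:  # assumption: the exact error types are unknown
            last_error = exc
            if attempt < attempts:
                time.sleep(delay_seconds)

    raise RuntimeError(
        f"download of {dataset_id!r} failed after {attempts} attempts"
    ) from last_error
```

A typical call would be `save_with_retries("your-dataset-id", save_dataset_to_disk)`, after importing `save_dataset_to_disk` from `datacollective` as shown above.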

  5. Get information and metadata about a dataset:

from datacollective import get_dataset_details

details = get_dataset_details("your-dataset-id")

  6. Load the dataset into a pandas DataFrame (alpha: only certain MDC datasets are currently supported):

from datacollective import load_dataset

dataset = load_dataset("your-dataset-id")
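
Since loading into a DataFrame is in alpha and only some datasets are supported, one possible pattern is to try loading first and fall back to saving the raw files. The sketch below assumes that `load_dataset` raises an exception for unsupported datasets, which is not confirmed here. `load_or_save` is a hypothetical helper, and both library functions are passed in as arguments rather than imported inside it.

```python
from typing import Any, Callable, Tuple


def load_or_save(
    dataset_id: str,
    loader: Callable[[str], Any],
    saver: Callable[[str], Any],
) -> Tuple[str, Any]:
    """Hypothetical helper: try loader first, fall back to saver on failure."""
    try:
        return "dataframe", loader(dataset_id)
    except Exception as exc:  # assumption: unsupported datasets raise an error
        print(f"Could not load {dataset_id!r} ({exc!r}); saving files instead.")
        return "path", saver(dataset_id)
```

With the library installed, this could be called as `kind, result = load_or_save("your-dataset-id", load_dataset, save_dataset_to_disk)`. Note that the broad `except` also hides unrelated failures such as an invalid API key, so narrowing it to the specific exception is advisable once you know which one the library raises.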

For more details, visit our docs

License

This project is released under the Mozilla Public License (MPL) 2.0.

