
A Python port of the TidyTuesday Downloader


PyDyTuesday

PyDyTuesday is a Python library that ports the functionality of the tidytuesdayR CRAN package to Python. It provides a suite of command-line tools for listing and downloading TidyTuesday datasets hosted on GitHub.

Features

  • Get the most recent Tuesday date: Useful for aligning with TidyTuesday releases.
  • List available datasets: Discover available TidyTuesday datasets across years.
  • Download datasets: Retrieve individual files or complete datasets.
  • Display dataset README: Open the dataset's README in your web browser.
  • Check GitHub API rate limits: Monitor your GitHub API usage.

Installation

Using uv (recommended)

This project relies on uv and its uv tool interface, which let you run the command-line scripts without managing virtual environments by hand.

Note that the package is published on PyPI as PyDyTuesday, so use that spelling when installing.

  1. Install uv.

  2. Install PyDyTuesday as a command-line tool with uv tool install, then run it:

    uv tool install PyDyTuesday
    
    pydytuesday last-tuesday
    

Alternatively, you can use uv tool run, or its shorthand uvx, to run a command without adding it to your PATH.

uv tool run PyDyTuesday last-tuesday

or using uvx:

uvx PyDyTuesday last-tuesday

Using pip

Alternatively, you can install the library directly into your environment using pip.

  1. Install the package from PyPI, or in editable mode from a local clone during development:

    pip install PyDyTuesday
    pip install -e .
    
  2. Once installed, the CLI commands defined in the package (via the [project.scripts] section in pyproject.toml) are placed in your environment's scripts directory, which is normally on your PATH. This means you can run the commands directly from your terminal. For example:

    pydytuesday last-tuesday
    pydytuesday tt-available
    

    If the commands are not directly available in your PATH, you may invoke them using Python's module execution:

    python -m pydytuesday
    

    (Consult your system's documentation on how entry points are installed if you encounter issues.)
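The fallback described above (use the console script if it is on your PATH, otherwise fall back to module execution) can be sketched in Python as follows. This is only an illustration: resolve_cli is a hypothetical helper name that is not part of PyDyTuesday, and the sketch assumes the console script is called pydytuesday, as in the uv example.

```python
import shutil
import sys


def resolve_cli(command: str = "pydytuesday", module: str = "pydytuesday") -> list:
    """Return an argv prefix that should start the CLI.

    Prefers the installed console script when it is found on PATH;
    otherwise falls back to `python -m <module>` with the current interpreter.
    """
    script_path = shutil.which(command)
    if script_path is not None:
        return [script_path]
    return [sys.executable, "-m", module]


# Example: build the full command line for the last-tuesday subcommand.
argv = resolve_cli() + ["last-tuesday"]
print(argv)
```

You can pass the resulting list to subprocess.run if you want to start the tool from a script.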

Usage

Once you have installed the library (with uv or pip), you should be able to run the commands from anywhere on your system.

  • Last Tuesday

    • Description: Prints the most recent Tuesday date relative to today's date or an optionally provided date.
    • Usage:
      pydytuesday last-tuesday
      pydytuesday last-tuesday 2025-03-10
      
      (The second example passes a specific date argument in YYYY-MM-DD format.)
  • TidyTuesday Available

    • Description: Lists all available TidyTuesday datasets.
    • Usage:
      pydytuesday tt-available
      
  • TidyTuesday Datasets

    • Description: Lists datasets for a specific year.
    • Usage:
      pydytuesday tt-datasets 2025
      
      (Example passes the year as an argument.)
  • Download Specific File

    • Description: Downloads a specified file from a TidyTuesday dataset by date.
    • Usage:
      pydytuesday tt-download-file 2025-03-11 data.csv
      
      (The example downloads the file 'data.csv' from the dataset for Tuesday, March 11, 2025.)
  • Download Dataset Files

    • Description: Downloads all or selected files from a TidyTuesday dataset by date.
    • Usage:
      pydytuesday tt-download 2025-03-11
      pydytuesday tt-download 2025-03-11 data.csv summary.json
      
      (The first example downloads all files from the dataset for Tuesday, March 11, 2025. The second example downloads only the specified files.)
  • Display Dataset README

    • Description: Opens the README for a TidyTuesday dataset in your default web browser.
    • Usage:
      pydytuesday readme 2025-03-11
      
      (The example opens the README for the dataset from Tuesday, March 11, 2025.)
  • Check GitHub Rate Limit

    • Description: Checks the remaining GitHub API rate limit.
    • Usage:
      pydytuesday rate-limit-check
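
If you want the same rate-limit information inside a Python script, the sketch below queries GitHub's public REST endpoint https://api.github.com/rate_limit directly. It does not show how PyDyTuesday implements rate-limit-check, which this README does not describe. The function names summarize_rate_limit and fetch_rate_limit and the optional token parameter are assumptions made for illustration.

```python
import json
import urllib.request
from datetime import datetime, timezone
from typing import Optional

RATE_LIMIT_URL = "https://api.github.com/rate_limit"


def summarize_rate_limit(payload: dict) -> dict:
    """Extract the core REST API quota from a /rate_limit response body."""
    core = payload["resources"]["core"]
    return {
        "limit": core["limit"],
        "remaining": core["remaining"],
        # GitHub reports the reset time as seconds since the Unix epoch (UTC).
        "reset_at": datetime.fromtimestamp(core["reset"], tz=timezone.utc),
    }


def fetch_rate_limit(token: Optional[str] = None) -> dict:
    """Query GitHub for the current rate limit (performs a network request)."""
    request = urllib.request.Request(
        RATE_LIMIT_URL, headers={"Accept": "application/vnd.github+json"}
    )
    if token:
        # An authenticated request reports the quota of that token.
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return summarize_rate_limit(json.load(response))
```

Calling fetch_rate_limit() without a token reports the quota for unauthenticated requests from your IP address.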
      

Example Workflow

Here's a complete example of how to discover, download, and explore TidyTuesday data:

# 1. Find the most recent Tuesday date
pydytuesday last-tuesday
# Output: 2025-03-11

# 2. List available datasets for a specific year
pydytuesday tt-datasets 2025
# Output: Lists all datasets for 2025 with dates and titles

# 3. Download a specific file from a dataset by date
pydytuesday tt-download-file 2025-03-11 example.csv
# Output: Successfully saved example.csv to /path/to/example.csv

# 4. After downloading, you can read the CSV file using pandas in Python:
import matplotlib.pyplot as plt
import pandas as pd

# Read the downloaded CSV file
df = pd.read_csv("example.csv")

# Display the first few rows
print(df.head())

# Print basic information about the dataset
# (df.info() prints its report itself and returns None)
df.info()

# Generate summary statistics
print(df.describe())

# Plot the data; 'category' and 'value' are placeholder column names,
# so replace them with columns that exist in your dataset
df.plot(kind='bar', x='category', y='value')
plt.title('TidyTuesday Data Analysis')
plt.show()

This workflow demonstrates how to use the command-line tools to discover and download data, and then use pandas to analyze the downloaded data.
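
The first and third steps of the workflow can also be driven from a Python script. The sketch below is a minimal illustration under stated assumptions: last_tuesday, download_command, and download are hypothetical helpers, not part of PyDyTuesday; the date logic assumes that a Tuesday counts as its own "most recent Tuesday"; and download assumes the pydytuesday command is on your PATH.

```python
import subprocess
from datetime import date, timedelta


def last_tuesday(today: date) -> date:
    """Return the most recent Tuesday on or before `today`."""
    # date.weekday() numbers Monday as 0, so Tuesday is 1.
    days_since_tuesday = (today.weekday() - 1) % 7
    return today - timedelta(days=days_since_tuesday)


def download_command(dataset_date: date, *files: str) -> list:
    """Build the argv for `pydytuesday tt-download <date> [files...]`."""
    return ["pydytuesday", "tt-download", dataset_date.isoformat(), *files]


def download(dataset_date: date, *files: str) -> None:
    """Run the CLI; raises CalledProcessError if the command fails."""
    subprocess.run(download_command(dataset_date, *files), check=True)


# Example: build the command for the dataset of the week containing 2025-03-13.
print(download_command(last_tuesday(date(2025, 3, 13)), "example.csv"))
```

Keeping the command construction separate from the subprocess call makes it possible to inspect or log the command before anything is downloaded.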

Contributing

Contributions are welcome! Here's how you can help improve PyDyTuesday:

  1. Fork the Repository:
    Click on the "Fork" button at the top right of the repository page and create your own copy.

  2. Clone Your Fork:

    git clone https://github.com/your-username/PyDyTuesday.git
    cd PyDyTuesday
    
  3. Create a New Branch:
    Start a new feature or bugfix branch:

    git checkout -b feature/your-feature-name
    
  4. Make Your Changes:
    Add new features, fix bugs, or improve documentation. Ensure your code adheres to the project's style guidelines.

  5. Commit Your Changes:
    Write clear commit messages that describe your changes:

    git add .
    git commit -m "Description of your changes"
    
  6. Push to Your Fork:

    git push origin feature/your-feature-name
    
  7. Submit a Pull Request:
    Open a pull request on the main repository. Provide a detailed description of your changes and reference any issues your PR addresses.

For larger contributions, consider discussing your ideas by opening an issue first so that we can provide guidance before you start coding.

License

This project is licensed under the MIT License; see the LICENSE file for details.
