
A Python package for accessing the Toronto Open Data Portal


TorontoOpenData Python Package

Overview

The TorontoOpenData package provides a Python interface to the Toronto Open Data portal. It lets users list, search, and download datasets, and load individual resources into Python objects.

Installation

To install the package, run:

pip install toronto-open-data

Development Installation

For development and contributing:

git clone https://github.com/alexwolson/toronto-open-data.git
cd toronto-open-data
pip install -e ".[dev]"
make pre-commit  # Install pre-commit hooks

Dependencies

  • pandas
  • requests
  • tqdm
  • ckanapi

Usage

Initialization

Initialize the TorontoOpenData class:

from toronto_open_data import TorontoOpenData

tod = TorontoOpenData()

List All Datasets

List all available datasets:

datasets = tod.list_all_datasets()

Search Datasets

Search datasets by keyword:

search_results = tod.search_datasets('parks')

Download Dataset

Download a specific dataset:

tod.download_dataset('dataset_name')

Load Dataset

Load a specific file from a dataset:

file_path = tod.load('dataset_name', 'file_name.csv', smart_return=False)

Load a specific file, returning an object if supported (default behaviour):

file_object = tod.load('dataset_name', 'file_name.csv', smart_return=True)
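With smart_return=False the call returns a filesystem path, while smart_return=True may return a parsed object for supported file types. A small check (a hypothetical helper, not part of the package) keeps downstream code explicit about which it received:

```python
from pathlib import Path

def is_file_path(result) -> bool:
    """True when load() handed back a path rather than a parsed object."""
    return isinstance(result, (str, Path))

# With smart_return=False, load() returns a path we could open manually;
# with smart_return=True, a supported type such as csv comes back parsed.
print(is_file_path(Path("./cache/dataset_name/file_name.csv")))  # True
print(is_file_path({"rows": []}))                                # False
```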

Using the Datastore API (New!)

For datasets that support CKAN's datastore, you can query data directly without downloading files:

Basic Datastore Search

# Get type-enforced data directly from the datastore
data = tod.datastore_search('resource-id-here', limit=100)
print(data.dtypes)  # Shows proper data types (dates, numbers, etc.)

Filtered Search

# Search with filters and sorting
filtered_data = tod.datastore_search(
    'resource-id-here',
    filters={'status': 'active', 'year': 2023},
    sort='date_created desc',
    limit=50
)

Get Resource Metadata

# Get field information and descriptions
info = tod.datastore_info('resource-id-here')
for field in info['fields']:
    print(f"{field['id']}: {field.get('type')} - {field.get('info', {}).get('label', 'No description')}")
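The fields payload can also be reshaped into a simple lookup table. The dictionary below is illustrative sample data mirroring the shape iterated above, not a real API response (in practice it would come from tod.datastore_info):

```python
# Illustrative payload shaped like the info['fields'] list iterated above.
sample_info = {
    "fields": [
        {"id": "date_created", "type": "timestamp", "info": {"label": "Creation date"}},
        {"id": "status", "type": "text"},
    ]
}

# Map each field id to its declared type for quick lookups.
field_types = {f["id"]: f.get("type") for f in sample_info["fields"]}
print(field_types)
# {'date_created': 'timestamp', 'status': 'text'}
```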

Custom SQL Queries

# Advanced querying with SQL
data = tod.datastore_search_sql('''
    SELECT category, COUNT(*) as count, AVG(value) as avg_value
    FROM "resource-id-here"
    WHERE status = 'active'
    GROUP BY category
    ORDER BY count DESC
    LIMIT 10
''')
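Because CKAN resource IDs contain hyphens, the ID must be double-quoted inside the SQL, as in the query above. A small helper (hypothetical, not part of the package) makes that explicit:

```python
def build_count_by_category_sql(resource_id: str, limit: int = 10) -> str:
    """Assemble a grouped-count query for a datastore resource.

    The resource ID is wrapped in double quotes because hyphenated
    IDs are not valid bare SQL identifiers.
    """
    return (
        f'SELECT category, COUNT(*) AS count '
        f'FROM "{resource_id}" '
        f'GROUP BY category ORDER BY count DESC '
        f'LIMIT {int(limit)}'
    )

sql = build_count_by_category_sql('abcd-1234-efgh', limit=5)
# The assembled string would then be passed to tod.datastore_search_sql(sql).
```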

Find Datastore Resources

# Check which resources support datastore
datastore_resources = tod.get_datastore_resources('dataset-name')
for resource in datastore_resources:
    print(f"Datastore resource: {resource['name']} (ID: {resource['id']})")

Datastore vs File Download

Feature           | File Download (load())        | Datastore API
------------------|-------------------------------|-------------------------
Data freshness    | Static files                  | Real-time data
Type enforcement  | Basic pandas inference        | CKAN-defined types
Filtering         | Client-side (after download)  | Server-side
Metadata          | Limited                       | Rich field descriptions
Query flexibility | None                          | Full SQL support
Network usage     | Downloads entire file         | Only requested data
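The trade-offs above suggest a simple rule: query the datastore when a dataset exposes datastore-enabled resources, and fall back to file download otherwise. A minimal sketch of that decision (hypothetical helper; in practice the resource list would come from tod.get_datastore_resources):

```python
def pick_access_path(datastore_resources) -> str:
    """Prefer the datastore API when any datastore-enabled resource exists."""
    return "datastore" if datastore_resources else "download"

# The list would come from tod.get_datastore_resources('dataset-name').
print(pick_access_path([{"id": "r1", "name": "parks"}]))  # datastore
print(pick_access_path([]))                               # download
```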

Methods

Basic Dataset Operations

  • list_all_datasets(as_frame=True): List all available datasets.
  • search_datasets(query, as_frame=True): Search datasets by keyword.
  • search_resources_by_name(name, as_frame=True): Get a dataset's resources by name.
  • download_dataset(name, file_path='./cache/', overwrite=False): Download a dataset's files.
  • load(name, filename, file_path='./cache/', reload=False, smart_return=True): Load a file from a dataset.

Datastore API Methods (New!)

  • datastore_search(resource_id, filters=None, q=None, limit=100, offset=0, fields=None, sort=None, as_frame=True): Search datastore records with type-enforced results and filtering.
  • datastore_info(resource_id): Get metadata about datastore resource fields, types, and descriptions.
  • datastore_search_sql(sql, as_frame=True): Execute SQL queries on datastore resources.
  • get_datastore_resources(name, as_frame=True): Get only datastore-enabled resources for a dataset.

Smart Return File Types

The package supports smart return for the following file types:

  • csv
  • docx
  • gpkg
  • geojson
  • jpeg
  • json
  • kml
  • pdf
  • sav
  • shp
  • txt
  • xlsm
  • xlsx
  • xml
  • xsd
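A quick way to check a filename against this list before calling load(). The set below is hand-copied from the list above, not an attribute exported by the package:

```python
from pathlib import Path

# Hand-copied from the list above; assumed to mirror the package's support.
SMART_RETURN_EXTENSIONS = {
    "csv", "docx", "gpkg", "geojson", "jpeg", "json", "kml",
    "pdf", "sav", "shp", "txt", "xlsm", "xlsx", "xml", "xsd",
}

def supports_smart_return(filename: str) -> bool:
    """True if the file's extension is one the package can smart-return."""
    return Path(filename).suffix.lstrip(".").lower() in SMART_RETURN_EXTENSIONS

print(supports_smart_return("parks.geojson"))  # True
print(supports_smart_return("archive.zip"))    # False
```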

Development

Running Tests

# Run all tests
make test

# Run tests with coverage
make test-cov

# Run linting checks
make lint

Code Quality

This project uses several tools to maintain code quality:

  • Black: Code formatting
  • isort: Import sorting
  • flake8: Linting
  • mypy: Type checking
  • pre-commit: Automated checks

Contributing

Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.

License

MIT License

Changelog

See CHANGELOG.md for a list of changes and version history.
