Canonada

Canonada is a data science framework that helps you build production-ready streaming pipelines for data processing in Python.

Why Canonada?

Standardized: Canonada provides a standardized way to build your data projects
Modular: Canonada is modular and allows you to build and visualize data pipelines with ease
Memory Efficient: Canonada is memory efficient and can handle large datasets by streaming data through the pipeline instead of loading it all at once
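
The streaming idea in the last point can be illustrated with plain Python generators. This is only a sketch of the general technique, not Canonada's implementation; the function names and the synthetic signals are hypothetical.

```python
from typing import Iterable, Iterator


def read_signals(n: int) -> Iterator[list[float]]:
    """Yield one synthetic signal at a time instead of building a full list."""
    for i in range(n):
        yield [float(i), float(i + 1), float(i + 2)]


def add_offset(signals: Iterable[list[float]], offset: float) -> Iterator[list[float]]:
    """Lazily add an offset to every sample of every signal."""
    for signal in signals:
        yield [sample + offset for sample in signal]


def signal_max(signals: Iterable[list[float]]) -> Iterator[float]:
    """Lazily reduce each signal to its maximum value."""
    for signal in signals:
        yield max(signal)


# Each signal flows through the whole chain before the next one is read,
# so only one signal is held in memory at a time.
maxima = list(signal_max(add_offset(read_signals(3), offset=10.0)))
print(maxima)
```

Because every stage is a generator, the memory needed depends on the size of a single signal rather than on the size of the whole dataset.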

Features

Centralized control of data sources: Manage all your data sources in one place, enabling you to keep your team in sync
Centralized control of the project configuration: Manage all your project configurations in one place
Easy data loading: Load data from sources such as CSV, JSON, and Parquet files
Use functions as nodes: Functions are the building blocks of Canonada. You can use any function as a node in your pipeline
Create streaming data pipelines: Create parallel and sequential data pipelines with ease
Visualize your data pipeline: Visualize your data pipelines, nodes and connections
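
To make the idea of a central place for data sources more concrete, here is a minimal sketch that uses only the Python standard library. It does not show Canonada's catalog format or API; CATALOG, load_dataset, and both loaders are hypothetical names, and in-memory strings stand in for real files.

```python
import csv
import io
import json
from typing import Callable, Iterator, TextIO

Loader = Callable[[TextIO], Iterator[dict]]


def load_csv(handle: TextIO) -> Iterator[dict]:
    """Stream CSV rows as dictionaries, one row at a time."""
    yield from csv.DictReader(handle)


def load_json_lines(handle: TextIO) -> Iterator[dict]:
    """Stream one JSON object per non-empty line."""
    for line in handle:
        if line.strip():
            yield json.loads(line)


# Hypothetical catalog: dataset name -> (loader, source text).
# In-memory strings stand in for files so the sketch runs anywhere.
CATALOG: dict[str, tuple[Loader, str]] = {
    "raw_signals": (load_csv, "name,value\na,1\nb,2\n"),
    "labels": (load_json_lines, '{"name": "a", "label": "ok"}\n'),
}


def load_dataset(name: str) -> Iterator[dict]:
    """Look up a dataset by name and stream its records."""
    loader, text = CATALOG[name]
    return loader(io.StringIO(text))


signals = list(load_dataset("raw_signals"))
labels = list(load_dataset("labels"))
print(signals)
print(labels)
```

Code that consumes a dataset only needs its name, so changing where or how the data is stored means editing a single catalog entry.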

Summary

The goal of Canonada is to help data scientists and engineers organize their data projects with a standardized structure that makes code more maintainable than one-off scripts and notebooks.

Canonada allows you to define data projects as graphs composed of nodes and edges. Data is streamed dynamically from your defined sources into memory, which allows you to work with datasets larger than memory. The system parallelizes the execution of your projects, so you can focus exclusively on the data processing logic you care about.
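
One way to see why a graph description makes parallel execution possible is to group nodes into batches whose dependencies are already satisfied. The sketch below uses graphlib from the Python standard library (Python 3.9+); it is an illustration of the general idea, not Canonada's scheduler, and the node names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical graph: each node maps to the set of nodes it depends on.
graph = {
    "add_offset": {"load"},
    "signal_max": {"add_offset"},
    "signal_mean": {"add_offset"},
    "stats": {"signal_max", "signal_mean"},
}

sorter = TopologicalSorter(graph)
sorter.prepare()

batches = []
while sorter.is_active():
    # Nodes returned together have no unmet dependencies,
    # so they could run in parallel with each other.
    ready = sorted(sorter.get_ready())
    batches.append(ready)
    sorter.done(*ready)

print(batches)
```

Nodes that land in the same batch do not depend on each other, which is what lets a scheduler run them at the same time.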

As a quick example, let's define a simple pipeline that transforms a few time-series signals. Once a pipeline is defined, canonada view gives you a representation of it.

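# Assumed import path for Pipeline and Node; check the Canonada documentation for the exact module
from canonada.pipeline import Node, Pipeline

# Import example functions to transform the data
from .nodes import example_nodes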

# Define the pipeline
streaming_pipe = Pipeline("streaming_pipe", [
        # Read each signal from the catalog and add an offset defined in the parameters
        Node(
            func=example_nodes.add_offset,
            input=["raw_signals", "params:section_1.offset"], # Load inputs from the catalog
            output=["offset_signals"],
            name="create_offsets",
            description="Adds parametrized offset to the signals"
        ),
        # Save the previous output to disk with a dummy module
        Node(
            func=lambda x: x, # Just pass the input to the output
            input=["offset_signals"],
            output=["offset_signals_catalog"],
            name="save_offsets",
            description="Saves the offset signals using the datahandler specified in the catalog"
        ),
        # Calculate the maximum value of each signal
        Node(
            func=example_nodes.get_signal_max,
            input=["offset_signals"],
            output=["max_values"],
            name="get_signal_max",
            description="Calculates the maximum value of the signals"
        ),
        # Calculate the mean value of each signal
        Node(
            func=example_nodes.calculate_mean,
            input=["offset_signals"],
            output=["mean_values"],
            name="calculate_mean",
            description="Calculates the mean value of the signals"
        ),
        # Save the stats of the signals in a CSV file
        Node(
            func=example_nodes.list_stats,
            input=["offset_signals", "max_values", "mean_values"],
            output=["stats"], # It will be saved in the defined file in the catalog
            name="list_stats",
            description="Returns the stats of the signals"
        )
    ],
    description="This pipeline reads signals from the catalog, adds an offset, calculates the maximum and mean values, and saves the stats to disk"
)

Done! Defining a data pipeline is as simple as that. To execute it, type canonada run pipelines streaming_pipe in your terminal or call the .run() method of your pipeline object. Canonada takes care of the rest and parallelizes the execution without any extra effort.
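
The pipeline above refers to functions in example_nodes whose bodies are not shown. Here is one possible sketch of them as ordinary Python functions. It assumes that each function receives a single signal as a list of floats, which is an assumption about how data reaches the nodes; the implementations are hypothetical and are not taken from the Canonada examples.

```python
from statistics import fmean


def add_offset(signal: list[float], offset: float) -> list[float]:
    """Add a constant offset to every sample of one signal."""
    return [sample + offset for sample in signal]


def get_signal_max(signal: list[float]) -> float:
    """Return the maximum value of one signal."""
    return max(signal)


def calculate_mean(signal: list[float]) -> float:
    """Return the mean value of one signal."""
    return fmean(signal)


def list_stats(signal: list[float], max_value: float, mean_value: float) -> dict:
    """Collect the statistics of one signal into a single record."""
    return {"n_samples": len(signal), "max": max_value, "mean": mean_value}


# Calling the functions by hand mirrors the order of the pipeline's nodes.
offset_signal = add_offset([1.0, 2.0, 3.0], 0.5)
stats = list_stats(
    offset_signal,
    get_signal_max(offset_signal),
    calculate_mean(offset_signal),
)
print(stats)
```

Because nodes are plain functions, they can be tested on their own like this, without building or running a pipeline.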

Check out the Getting Started guide for more information.

Usage

Available commands:

Usage: canonada <command> <args>
Commands:
    new <project_name> - Create a new project
    catalog [list/params] - List all available datasets or get the project parameters
    registry [pipelines/systems] - List all available pipelines or systems
    run [pipelines/systems] <name(s)> - Run a pipeline or system
    view [pipelines/systems] <name(s)> - View a pipeline or system
    version - Print the version of Canonada

Installation

Canonada is available on PyPI and can be installed using pip:

pip install canonada
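
After installing, you may want to confirm from Python which version is present. The canonada version command listed under Usage prints it on the command line; the sketch below does a similar check with importlib.metadata from the standard library. The helper name installed_version is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if Canonada is installed, otherwise None.
print(installed_version("canonada"))
```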

Check out the Getting Started guide to learn how to create a new project with Canonada.

Documentation

Check out the project's documentation here

Release history

Version	Release date
0.6.7	Mar 16, 2026
0.6.6	Mar 6, 2026 (this version)
0.6.5	Jan 8, 2026
0.6.3	Nov 11, 2025
0.6.2	Oct 26, 2025
0.6.1	Oct 8, 2025
0.6.0	Oct 7, 2025
0.5.0	Oct 4, 2025
0.4.0	Sep 5, 2025
0.3.9	Jun 17, 2025
0.3.8	Jun 14, 2025
0.3.7	Jun 5, 2025
0.3.6	Jun 3, 2025
0.3.5	Jun 3, 2025
0.3.4	Mar 30, 2025
0.3.3	Mar 23, 2025
0.3.2	Mar 15, 2025
0.3.1	Mar 15, 2025
0.3.0	Mar 8, 2025
0.2.0	Feb 16, 2025
0.1.2	Dec 16, 2024
0.1.1	Dec 10, 2024
0.1.0	Nov 1, 2024
0.0.8	Sep 16, 2024
0.0.7	Sep 13, 2024
0.0.6	Sep 10, 2024
0.0.4	Sep 2, 2024
0.0.3	Sep 1, 2024
0.0.2	Aug 29, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

canonada-0.6.6.tar.gz (26.0 kB)

Uploaded Mar 6, 2026 Source

Built Distribution

canonada-0.6.6-py3-none-any.whl (25.9 kB)

Uploaded Mar 6, 2026 Python 3

File details

Details for the file canonada-0.6.6.tar.gz.

File metadata

Download URL: canonada-0.6.6.tar.gz
Upload date: Mar 6, 2026
Size: 26.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for canonada-0.6.6.tar.gz
Algorithm	Hash digest
SHA256	`10bb44a01b8814ba199d264f03407f64e234e24eed8da2f590d6623c97d2e23d`
MD5	`e82fc913c31637bbed2e0dc9561a98d8`
BLAKE2b-256	`29973588ea1324d6d5d6b37be3272366d21fde0e26e4ba00120d038a6226e440`
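
A downloaded file can be checked against the published SHA256 digest with hashlib from the Python standard library. The following is a minimal sketch; sha256_of and matches are hypothetical helper names, and the demo runs on a throwaway file so that it works without downloading anything.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demo on a temporary file whose digest is computed independently.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.bin"
    demo.write_bytes(b"hello")
    demo_ok = matches(demo, hashlib.sha256(b"hello").hexdigest())

print(demo_ok)
```

To check the source distribution, call matches with the path to the downloaded canonada-0.6.6.tar.gz and the SHA256 value from the table above.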

File details

Details for the file canonada-0.6.6-py3-none-any.whl.

File metadata

Download URL: canonada-0.6.6-py3-none-any.whl
Upload date: Mar 6, 2026
Size: 25.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for canonada-0.6.6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b92d71f051214145dc347a4789f75eaf067c98de9632da2ac81f2745eea74454`
MD5	`e60afe579721fa4c4de80b7c42230403`
BLAKE2b-256	`303567c117d4a047f2aae3897810a46ba8b7381c56e04c08868ac1b68a920c45`