Kedro helps you build production-ready data and analytics pipelines

Project description

What is Kedro?

Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code. It borrows concepts from software engineering and applies them to machine-learning code; applied concepts include modularity, separation of concerns and versioning.

How do I install Kedro?

To install Kedro from the Python Package Index (PyPI) simply run:

pip install kedro

It is also possible to install Kedro using conda:

conda install -c conda-forge kedro

Our Get Started guide contains full installation instructions, and includes how to set up Python virtual environments.
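After installing, one quick way to confirm that the package is visible to the current interpreter is to ask the Python standard library for its version. This is a minimal sketch and not part of Kedro itself: the helper name `installed_version` is hypothetical, and it assumes Python 3.8 or later, where `importlib.metadata` is available.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "kedro") -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent.

    Hypothetical helper for illustration; it only reads package metadata
    and does not import the package.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("kedro is not installed in this environment")
    else:
        print(f"kedro {version} is installed")
```

Run it with the same interpreter (or virtual environment) you used for pip or conda, so that it inspects the environment you installed into.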

What are the main features of Kedro?

Figure: A pipeline visualisation generated using Kedro-Viz

Feature | What is this?
Project Template | A standard, modifiable and easy-to-use project template based on Cookiecutter Data Science.
Data Catalog | A series of lightweight data connectors used to save and load data across many different file formats and file systems, including local and network file systems, cloud object stores, and HDFS. The Data Catalog also includes data and model versioning for file-based systems.
Pipeline Abstraction | Automatic resolution of dependencies between pure Python functions, and data pipeline visualisation using Kedro-Viz.
Coding Standards | Test-driven development using pytest, well-documented code using Sphinx, linted code with support for flake8, isort and black, and use of the standard Python logging library.
Flexible Deployment | Deployment strategies that include single or distributed-machine deployment, as well as additional support for deploying on Argo, Prefect, Kubeflow, AWS Batch and Databricks.
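To give a feel for what "automatic resolution of dependencies between pure Python functions" means, here is a toy sketch that uses only the standard library (`graphlib`, Python 3.9+). It is not Kedro's API or implementation: the `NODES` list, the `run` helper and the dataset names are all hypothetical. The idea it illustrates is that each function declares the names of the datasets it reads and writes, and the execution order is inferred from those names.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def clean(raw):
    """Pure function: drop missing values."""
    return [x for x in raw if x is not None]


def total(cleaned):
    """Pure function: sum the cleaned values."""
    return sum(cleaned)


# Each hypothetical "node" is (function, input dataset names, output dataset name).
# The nodes are deliberately listed out of order.
NODES = [
    (total, ["cleaned"], "total"),
    (clean, ["raw"], "cleaned"),
]


def run(nodes, data):
    """Run nodes in dependency order, inferred from dataset names."""
    # Map each output dataset name to the index of the node that produces it.
    producer = {output: index for index, (_, _, output) in enumerate(nodes)}
    # For each node, collect the nodes that produce its inputs (its predecessors).
    graph = {
        index: {producer[name] for name in inputs if name in producer}
        for index, (_, inputs, _) in enumerate(nodes)
    }
    results = dict(data)
    # static_order() yields every node after all of its predecessors.
    for index in TopologicalSorter(graph).static_order():
        func, inputs, output = nodes[index]
        results[output] = func(*(results[name] for name in inputs))
    return results


if __name__ == "__main__":
    print(run(NODES, {"raw": [1, None, 2, 3]})["total"])  # prints 6
```

Because the order comes from the declared inputs and outputs rather than from the order of the list, nodes can be added or rearranged without rewiring the calls by hand.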

How do I use Kedro?

The Kedro documentation includes three examples to help you get started.

Why does Kedro exist?

Kedro is built upon our collective best practices (and mistakes) from trying to deliver real-world ML applications that involve vast amounts of raw, unvetted data. We developed Kedro to achieve the following:

  • To address the main shortcomings of Jupyter notebooks, one-off scripts and glue code by focusing on creating maintainable data science code
  • To enhance team collaboration when different team members have varied exposure to software engineering concepts
  • To increase efficiency, because applied concepts like modularity and separation of concerns inspire the creation of reusable analytics code

The humans behind Kedro

Kedro is maintained by a product team from QuantumBlack and a number of contributors from across the world.

Can I contribute?

Yes! Want to help build Kedro? Check out our guide to contributing to Kedro.

Where can I learn more?

There is a growing community around Kedro. Have a look at the Kedro FAQs to find projects using Kedro and links to articles, podcasts and talks.

Who likes Kedro?

There are Kedro users across the world, who work at start-ups, major enterprises and academic institutions like Absa, Acensi, AI Singapore, AXA UK, Caterpillar, CRIM, Dendra Systems, ElementAI, GMO, Imperial College London, Jungle Scout, Helvetas, Leapfrog, McKinsey & Company, Mercado Libre Argentina, Modec, Mosaic Data Science, NaranjaX, Open Data Science LatAm, QuantumBlack, Retrieva, Roche, Telkomsel, UrbanLogiq, Universidad Rey Juan Carlos and XP.

Kedro has also won Best Technical Tool or Framework for AI in the 2019 Awards AI competition and a merit award for the 2020 UK Technical Communication Awards. It is listed on the 2020 ThoughtWorks Technology Radar and the 2020 Data & AI Landscape.

How can I cite Kedro?

If you're an academic, Kedro can also help you, for example as a tool to support reproducible research. Find our citation reference on Zenodo.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kedro-0.17.2.tar.gz (127.8 kB view details)

Uploaded Source

Built Distribution

kedro-0.17.2-py3-none-any.whl (16.0 MB view details)

Uploaded Python 3

File details

Details for the file kedro-0.17.2.tar.gz.

File metadata

  • Download URL: kedro-0.17.2.tar.gz
  • Upload date:
  • Size: 127.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.0 importlib_metadata/3.7.2 packaging/20.9 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.6.13

File hashes

Hashes for kedro-0.17.2.tar.gz
Algorithm Hash digest
SHA256 aa0cccf265c5e95ce2c4ddf44db57d56c615a6863dfd707a0a827fc7c282c14b
MD5 fed05a5657a45abe24b515401526f15b
BLAKE2b-256 00f11f7546546eac37aab5bb283de36c18216b997ab2e8d33dd433b52ded6bf7

See more details on using hashes here.
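As an illustration of how the published digest can be used, the sketch below computes the SHA256 of a downloaded file with the standard library and compares it to the value listed above. The helper names `sha256_of` and `matches` are hypothetical, and the script assumes the archive has been saved in the current working directory under its original file name.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True if the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the file was saved.
    archive = Path("kedro-0.17.2.tar.gz")
    # SHA256 value copied from the table above.
    expected = "aa0cccf265c5e95ce2c4ddf44db57d56c615a6863dfd707a0a827fc7c282c14b"
    if archive.exists():
        print("digest matches" if matches(archive, expected) else "digest MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

pip can perform a similar check itself when a requirements file pins hashes, so a manual comparison like this is mainly useful for files downloaded by hand.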

File details

Details for the file kedro-0.17.2-py3-none-any.whl.

File metadata

  • Download URL: kedro-0.17.2-py3-none-any.whl
  • Upload date:
  • Size: 16.0 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.0 importlib_metadata/3.7.2 packaging/20.9 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.6.13

File hashes

Hashes for kedro-0.17.2-py3-none-any.whl
Algorithm Hash digest
SHA256 82005853d7e3dbc806017ff364c42016c511cbf4e393dbc5bcaa7de5470c36ee
MD5 ea746e5d8e1e9bcd68a8d1013079983b
BLAKE2b-256 b6b2218099b0d4f831ff75c3a5e46fe45e4671ce33538546ad1768b86dec3e67
