NextData is a framework for building data pipelines with a focus on simplicity and scalability.

What is NextData?

NextData answers the question: "What would a NextJS for data workflows look like?"

It is a framework that provides convention over configuration for data workflows.

A "data directory" is the single source of truth for your data. It is a directory that contains all of your data, and all of your code that operates on that data.

How to use NextData

Installing NextData

pip install nextdata

This will install the NextData framework and the ndx command line tool.

Creating a new NextData project

ndx create-ndx-app my_project

Configuring your project

NextData uses Pulumi under the hood to manage your infrastructure. Make sure you have Pulumi installed and configured. You'll need AWS credentials in your environment, but NextData will configure an IAM role specifically for Pulumi that has access only to the resources you need.
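Before running the dev server, it can help to confirm both prerequisites are in place. The following is a minimal sketch, not part of NextData: the `preflight` function is a hypothetical helper, and it only inspects environment variables (the standard `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_PROFILE`), so credentials configured elsewhere will not be detected.

```python
import os
import shutil


def preflight(env=None, which=shutil.which):
    """Return a list of problems; an empty list means the basics are in place.

    Hypothetical helper for illustration only, not a NextData API.
    """
    env = os.environ if env is None else env
    problems = []

    # The Pulumi CLI must be discoverable on PATH.
    if which("pulumi") is None:
        problems.append("pulumi CLI not found on PATH")

    # Either static keys or a named profile counts as "credentials in the environment".
    has_keys = "AWS_ACCESS_KEY_ID" in env and "AWS_SECRET_ACCESS_KEY" in env
    has_profile = "AWS_PROFILE" in env
    if not (has_keys or has_profile):
        problems.append("no AWS credentials found in the environment")

    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("preflight:", problem)
```

The `env` and `which` parameters exist only so the check can be exercised without touching the real environment.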

Running your project

ndx dev

The dev server starts four processes:

  1. A Pulumi stack that manages your infrastructure
  2. A watchdog that watches your data directory and reruns your Pulumi stack when you make changes
  3. A FastAPI server that serves the NextData API
  4. A NextJS server that serves the NextData Dashboard

In the future, this can be dockerized so you can self-host just the components you need.
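To illustrate the idea behind the second process, here is a minimal polling sketch using only the standard library. It is not NextData's implementation, which may use a different mechanism entirely; `snapshot`, `diff`, `watch`, and the `on_change` callback are hypothetical names. In this sketch, `on_change` is where a stack update would be triggered.

```python
import time
from pathlib import Path


def snapshot(root):
    """Map every file under root to its last-modified time in nanoseconds."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): p.stat().st_mtime_ns
        for p in root.rglob("*")
        if p.is_file()
    }


def diff(before, after):
    """Return (added, removed, modified) relative paths between two snapshots."""
    added = sorted(after.keys() - before.keys())
    removed = sorted(before.keys() - after.keys())
    modified = sorted(k for k in before.keys() & after.keys() if before[k] != after[k])
    return added, removed, modified


def watch(root, on_change, interval=1.0, max_polls=None):
    """Poll root and call on_change(added, removed, modified) when it differs."""
    previous = snapshot(root)
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval)
        current = snapshot(root)
        changes = diff(previous, current)
        if any(changes):
            on_change(*changes)
        previous = current
        polls += 1
```

Polling keeps the sketch dependency-free; a production watcher would more likely rely on file system events.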

Adding data to your project

Much like the app router in NextJS, NextData uses the data directory to represent your data. When you add a directory to your data directory, NextData will automatically generate an S3 table for you.

NextData also checks for certain magic files in each data table directory to determine how to process the data.
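The convention can be pictured as a scan over the data directory. The sketch below is an illustration under assumptions, not NextData's code: `discover_tables` and `MAGIC_FILES` are hypothetical names, only immediate subdirectories are treated as tables, and `etl.py` is the only magic file listed.

```python
from pathlib import Path

# Assumption: the set of magic file names to look for.
MAGIC_FILES = ("etl.py",)


def discover_tables(data_dir):
    """Map each table directory name to the magic files found inside it."""
    tables = {}
    for child in sorted(Path(data_dir).iterdir()):
        # Plain files and hidden directories are not treated as tables here.
        if child.is_dir() and not child.name.startswith("."):
            tables[child.name] = [
                name for name in MAGIC_FILES if (child / name).is_file()
            ]
    return tables
```

For a data directory containing `orders/etl.py` and an empty `customers/` directory, this returns `{"customers": [], "orders": ["etl.py"]}`.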

For example, adding an etl.py file tells NextData to configure a Glue job to process the data. You only need to provide a connection name, and NextData will use that connection to read and write data to S3 using sensible defaults. You can also customize the ETL script with the @glue_job decorator.

Speaking of connections, NextData uses the connections directory to represent your data sources. Each connection is a set of credentials and configuration for a data source. NextData currently supports arbitrary connections through JDBC and DSQL, with support for more data sources such as Snowflake and BigQuery planned.
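As a rough picture of what the configuration half of a connection might hold, here is a sketch of a JDBC-style connection record. The `JdbcConnection` class and its fields are hypothetical and do not reflect NextData's actual connection format; only the general `jdbc:<driver>://<host>:<port>/<database>` URL shape is standard. Credentials are left out on purpose and would normally come from the environment or a secrets store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JdbcConnection:
    """Hypothetical connection record, for illustration only."""

    name: str      # the connection name an ETL script would refer to
    driver: str    # JDBC subprotocol, e.g. "postgresql"
    host: str
    port: int
    database: str

    def url(self):
        """Build a JDBC URL from the non-secret parts of the connection."""
        return f"jdbc:{self.driver}://{self.host}:{self.port}/{self.database}"


warehouse = JdbcConnection(
    name="warehouse",
    driver="postgresql",
    host="db.example.com",
    port=5432,
    database="analytics",
)
print(warehouse.url())  # jdbc:postgresql://db.example.com:5432/analytics
```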

NextData Dashboard

The NextData Dashboard is a NextJS app that provides a UI for you to explore your data. It is powered by the NextData API, which is a FastAPI server that you can use to build your own custom APIs.

The Dashboard is where you can build your own queries, visualizations, and data products.
