A Python package for extracting data from Notion using Polars dataframes.

Notion ETL

A Python package for extracting, transforming, and loading data from Notion using Polars DataFrames and the Notion API Client.

The package provides a simple API for loading raw and clean data from Notion databases into Polars DataFrames, allowing for efficient data manipulation and analysis.

Installation

The package is available on PyPI and can be installed using pip:

pip install notion-etl

Usage

Authentication

Create a Notion integration and get your Notion API key. You can find instructions on how to do this in the Notion API documentation. Remember to share the pages and databases you want to access with your integration.

To authenticate, set your Notion API key as an environment variable:

export NOTION_TOKEN=secret_...

You can also pass the token explicitly when creating the loader:

import os
from notion_etl.loader import NotionDataLoader

loader = NotionDataLoader(os.environ["NOTION_TOKEN"])
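A missing environment variable otherwise surfaces as a bare KeyError, so it can help to check for the token before creating the loader. The following is a minimal sketch: `read_notion_token` is a hypothetical helper, not part of notion-etl, and it only assumes the `NOTION_TOKEN` variable name used above.

```python
import os


def read_notion_token(env=None) -> str:
    """Return the Notion token from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of notion-etl.
    """
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    token = env.get("NOTION_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "NOTION_TOKEN is not set; export it before creating the loader."
        )
    return token


# Usage (requires notion-etl and a valid token):
# from notion_etl.loader import NotionDataLoader
# loader = NotionDataLoader(read_notion_token())
```

The helper only reads the variable; it does not check that Notion accepts the token.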

Loading Data from a Notion Database

Use the NotionDataLoader class to load data from a Notion database. The get_database method retrieves the database and its records.

The database id can be found in the URL of the database page. For example, in the URL https://www.notion.so/your_workspace/Database-Name-1234567890abcdef1234567890abcdef, the database id is 1234567890abcdef1234567890abcdef.

from notion_etl.loader import NotionDataLoader

loader = NotionDataLoader()
database = loader.get_database("database_id")
database.records # List of records in the database
database.to_dataframe() # Convert to clean Polars DataFrame
database.to_dataframe(clean=False) # Convert to raw Polars DataFrame
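Copying the id out of the URL by hand is error-prone, so it may be convenient to extract it in code. The sketch below uses only the standard library; `extract_notion_id` is a hypothetical helper, not part of notion-etl, and it assumes the id is the 32 hexadecimal characters at the end of the URL path, as in the example URL above.

```python
import re
from urllib.parse import urlparse

# A Notion id as it appears at the end of a URL path: 32 hexadecimal characters.
_ID_AT_END = re.compile(r"([0-9a-fA-F]{32})$")


def extract_notion_id(url: str) -> str:
    """Return the 32-character id at the end of a Notion URL path.

    Hypothetical helper for illustration; not part of notion-etl.
    """
    # Look only at the path so that query strings (e.g. "?v=...") are ignored.
    path = urlparse(url).path.rstrip("/")
    last_segment = path.rsplit("/", 1)[-1]
    match = _ID_AT_END.search(last_segment)
    if match is None:
        raise ValueError(f"No 32-character id found in URL: {url!r}")
    return match.group(1).lower()


url = "https://www.notion.so/your_workspace/Database-Name-1234567890abcdef1234567890abcdef"
database_id = extract_notion_id(url)
print(database_id)
```

The resulting string can then be passed to `loader.get_database(database_id)`.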

Loading Data from a Notion Page

To load data from a Notion page, use the get_page_contents method. The page contents can be converted to a Polars DataFrame, plain text, or Markdown.

As with a database, the page id can be found in the URL of the page. For example, in the URL https://www.notion.so/your_workspace/Page-Name-1234567890abcdef1234567890abcdef, the page id is 1234567890abcdef1234567890abcdef.

from notion_etl.loader import NotionDataLoader

loader = NotionDataLoader()
page = loader.get_page_contents("page_id")
print(page.as_plain_text()) # Print the page content as plain text
print(page.as_markdown()) # Print the page content as markdown
page.as_dataframe() # Convert to a Polars DataFrame with one row per block in the page
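To keep a local copy of a page, the Markdown output can be written to a file. This is a minimal sketch that assumes `page.as_markdown()` returns a string, as the `print` call above suggests; `write_markdown` is a hypothetical helper, not part of notion-etl.

```python
from pathlib import Path


def write_markdown(markdown: str, path) -> Path:
    """Write Markdown text to `path`, creating parent directories if needed.

    Hypothetical helper for illustration; not part of notion-etl.
    """
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write as UTF-8 so non-ASCII page content is preserved.
    target.write_text(markdown, encoding="utf-8")
    return target


# Usage (requires notion-etl and a valid token):
# page = loader.get_page_contents("page_id")
# write_markdown(page.as_markdown(), "exports/page.md")
```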

Contributing

To set up a development environment, use uv.

First install uv with:

pip install uv

Then create the environment with:

uv sync

You can activate the virtual environment with source .venv/bin/activate (uv creates the environment in .venv by default), or run commands directly with uv run. For example:

uv run pytest tests
