A Python package for extracting data from Notion using Polars dataframes.

Notion ETL

A Python package for extracting, transforming, and loading data from Notion using Polars DataFrames and the Notion API Client.

The package provides a simple API for loading raw and clean data from Notion databases into Polars DataFrames, allowing for efficient data manipulation and analysis.

Installation

The package is available on PyPI and can be installed using pip:

pip install notion-etl

Usage

Authentication

Create a Notion integration and get your Notion API key. You can find instructions on how to do this in the Notion API documentation. Remember to share the pages and databases you want to access with your integration.

To authenticate, set your Notion API key as an environment variable:

export NOTION_TOKEN=secret_...

You can also pass the token explicitly when creating the loader:

import os
from notion_etl.loader import NotionDataLoader

loader = NotionDataLoader(os.environ["NOTION_TOKEN"])
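If NOTION_TOKEN is missing, os.environ["NOTION_TOKEN"] fails with a bare KeyError. The sketch below shows one way to fail with a clearer message instead. The helper name get_notion_token is hypothetical and not part of notion-etl; it only uses the standard library.

```python
import os


def get_notion_token(env=None):
    """Return the Notion token from the environment, or raise a clear error.

    Hypothetical helper for illustration; it is not part of notion-etl.
    `env` defaults to os.environ and can be any mapping (useful for testing).
    """
    env = os.environ if env is None else env
    token = env.get("NOTION_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "NOTION_TOKEN is not set; export it or pass a token explicitly."
        )
    return token
```

The returned value can then be passed to the loader in the same way as above, for example NotionDataLoader(get_notion_token()).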

Loading Data from a Notion Database

Use the NotionDataLoader class to load data from a Notion database. The get_database method retrieves the database and its records.

The database ID is the 32-character hexadecimal string at the end of the database page URL. For example, in the URL https://www.notion.so/your_workspace/Database-Name-1234567890abcdef1234567890abcdef, the database ID is 1234567890abcdef1234567890abcdef.

from notion_etl.loader import NotionDataLoader

loader = NotionDataLoader()
database = loader.get_database("database_id")
database.records # List of records in the database
database.to_dataframe() # Convert to clean Polars DataFrame
database.to_dataframe(clean=False) # Convert to raw Polars DataFrame
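Copying the ID out of a URL by hand is easy to get wrong, so it can help to extract it programmatically. The following is a minimal sketch, assuming the URL has the shape shown in the example above (a 32-character hexadecimal ID at the end of the last path segment). The function name extract_notion_id is hypothetical and not part of notion-etl.

```python
import re
from urllib.parse import urlparse

# A 32-character hexadecimal ID at the very end of a string.
_ID_AT_END = re.compile(r"([0-9a-fA-F]{32})$")


def extract_notion_id(url):
    """Return the 32-character ID at the end of a Notion URL path.

    Hypothetical helper for illustration; it is not part of notion-etl.
    Query strings and fragments are ignored, and hyphens in the last
    path segment are dropped before matching.
    """
    path = urlparse(url).path.rstrip("/")
    last_segment = path.rsplit("/", 1)[-1].replace("-", "")
    match = _ID_AT_END.search(last_segment)
    if match is None:
        raise ValueError(f"No 32-character ID found in URL: {url!r}")
    return match.group(1).lower()


url = (
    "https://www.notion.so/your_workspace/"
    "Database-Name-1234567890abcdef1234567890abcdef"
)
print(extract_notion_id(url))  # 1234567890abcdef1234567890abcdef
```

The result can be passed to get_database in place of the "database_id" placeholder. The same approach works for page URLs, which follow the same pattern.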

Loading Data from a Notion Page

To load data from a Notion page, use the get_page_contents method. The page contents can be converted to a Polars DataFrame, plain text, or markdown.

As with a database, the page ID can be found in the URL of the page. For example, in the URL https://www.notion.so/your_workspace/Page-Name-1234567890abcdef1234567890abcdef, the page ID is 1234567890abcdef1234567890abcdef.

from notion_etl.loader import NotionDataLoader

loader = NotionDataLoader()
page = loader.get_page_contents("page_id")
print(page.as_plain_text()) # Print the page content as plain text
print(page.as_markdown()) # Print the page content as markdown
page.as_dataframe() # Convert to Polars DataFrame, every block in the page is a row in the DataFrame

Contributing

To set up a development environment, use uv.

First install uv with:

pip install uv

Then create the environment with:

uv sync

By default, uv sync creates the virtual environment in .venv. You can activate it with source .venv/bin/activate, or you can run commands directly with uv run. For example:

uv run pytest tests
