A Python package for extracting data from Notion using Polars dataframes.
Notion ETL
A Python package for extracting, transforming, and loading data from Notion using Polars DataFrames and the Notion API Client.
The package provides a simple API for loading raw and clean data from Notion databases into Polars DataFrames, allowing for efficient data manipulation and analysis.
Installation
The package is available on PyPI and can be installed using pip:
pip install notion-etl
Usage
Authentication
Create a Notion integration and get your Notion API key. You can find instructions on how to do this in the Notion API documentation. Remember to share the pages and databases you want to access with your integration.
To authenticate, set your Notion API key as an environment variable:
export NOTION_TOKEN=secret_...
You can also pass the token explicitly when constructing the loader:
import os
from notion_etl.loader import NotionDataLoader
loader = NotionDataLoader(os.environ["NOTION_TOKEN"])
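If the variable is not set, `os.environ["NOTION_TOKEN"]` raises a bare `KeyError`. The following is a minimal sketch of a helper that fails with a clearer message. The function name `read_notion_token` is hypothetical and not part of notion-etl.

```python
import os


def read_notion_token(env=None, var_name="NOTION_TOKEN"):
    """Return the Notion token from the environment or raise a readable error.

    Hypothetical helper, not part of notion-etl.
    """
    # Allow a plain dict to be passed in, which keeps the helper easy to test.
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"{var_name} is not set; export it or pass the token explicitly."
        )
    return token
```

The result can then be passed to the loader, for example `NotionDataLoader(read_notion_token())`.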
Loading Data from a Notion Database
Use the NotionDataLoader class to load data from a Notion database. The get_database method retrieves the database and its records.
The database id can be found in the URL of the database page. For example, in the URL https://www.notion.so/your_workspace/Database-Name-1234567890abcdef1234567890abcdef, the database id is 1234567890abcdef1234567890abcdef.
from notion_etl.loader import NotionDataLoader
loader = NotionDataLoader()
database = loader.get_database("database_id")
database.records # List of records in the database
database.to_dataframe() # Convert to clean Polars DataFrame
database.to_dataframe(clean=False) # Convert to raw Polars DataFrame
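Copying the id out of the URL by hand is error-prone, so it can also be extracted in code. This is a sketch only: `extract_notion_id` is a hypothetical helper, not part of notion-etl, and it assumes the id appears as 32 hexadecimal characters at the end of the URL path, as in the example URL above.

```python
import re
from urllib.parse import urlparse

# Assumption: the id is the last 32 hexadecimal characters of the URL path.
_ID_PATTERN = re.compile(r"([0-9a-fA-F]{32})$")


def extract_notion_id(url):
    """Return the trailing 32-character hex id from a Notion URL (hypothetical helper)."""
    # urlparse keeps any query string out of .path, so it does not affect the match.
    path = urlparse(url).path.rstrip("/")
    match = _ID_PATTERN.search(path)
    if match is None:
        raise ValueError(f"No 32-character id found in URL: {url!r}")
    return match.group(1).lower()


url = "https://www.notion.so/your_workspace/Database-Name-1234567890abcdef1234567890abcdef"
print(extract_notion_id(url))  # prints 1234567890abcdef1234567890abcdef
```

The returned string can be passed to `loader.get_database(...)`. URLs whose id is written with hyphens are not handled by this sketch.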
Loading Data from a Notion Page
To load data from a Notion page, use the get_page_contents method. The page contents can be converted to a Polars DataFrame, plain text, or markdown.
As with a database, the page id can be found in the URL of the page. For example, in the URL https://www.notion.so/your_workspace/Page-Name-1234567890abcdef1234567890abcdef, the page id is 1234567890abcdef1234567890abcdef.
from notion_etl.loader import NotionDataLoader
loader = NotionDataLoader()
page = loader.get_page_contents("page_id")
print(page.as_plain_text()) # Print the page content as plain text
print(page.as_markdown()) # Print the page content as markdown
page.as_dataframe() # Convert to a Polars DataFrame with one row per block in the page
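A common follow-up is saving the markdown export to disk. The sketch below is one way to do that. `save_page_markdown` is a hypothetical helper, not part of notion-etl, and it assumes `as_markdown()` returns a string, as the `print` call above suggests.

```python
from pathlib import Path


def save_page_markdown(page, path):
    """Write page.as_markdown() to a UTF-8 file and return the path (hypothetical helper)."""
    target = Path(path)
    # Create parent folders so nested output paths work on the first run.
    target.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: as_markdown() returns a str.
    target.write_text(page.as_markdown(), encoding="utf-8")
    return target
```

For example, `save_page_markdown(loader.get_page_contents("page_id"), "export/page.md")` would write the page to `export/page.md`. The helper only relies on the `as_markdown` method, so any object that provides it will work.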
Contributing
To set up a development environment, use uv.
First install uv with:
pip install uv
Then create the environment with:
uv sync
You can activate the virtual environment with source .venv/bin/activate, or run commands directly with uv run. For example:
uv run pytest tests
File details
Details for the file notion_etl-0.1.3.tar.gz.
File metadata
- Download URL: notion_etl-0.1.3.tar.gz
- Upload date:
- Size: 31.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.7.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 604e7310a5f7e5145cb901e2d30d6567beedc106085d9961f4dcd2032750d273 |
| MD5 | 500fa1acd96e3d33358cddc14d4b8d31 |
| BLAKE2b-256 | 4560e5d371487e5bd66009411a4866cd802464e6209f90d73b0cdc33f23e0239 |
File details
Details for the file notion_etl-0.1.3-py3-none-any.whl.
File metadata
- Download URL: notion_etl-0.1.3-py3-none-any.whl
- Upload date:
- Size: 45.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.7.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0b217c581a832f188e8a5458f376c045cd47b4dac33649e9b6733f04de6a597e |
| MD5 | 86be33ce09591a13ef0825ebb329d8cb |
| BLAKE2b-256 | 59b7093fac279f778b93fb95241b102d4a5a2297e085b66d4cc2e1236d171c38 |