Skip to main content

NotionHelper is a Python library that simplifies interactions with the Notion API, enabling easy management of databases, pages, and files within Notion workspaces.

Project description

NotionHelper

NotionHelper

NotionHelper is a Python library that provides a convenient interface for interacting with the Notion API, specifically designed to leverage the Notion API Version 2025-09-03 for core objects while also supporting Notion's native 2026-03-11 markdown endpoints. It simplifies common tasks such as managing databases, data sources, pages, markdown content, and file uploads, allowing you to integrate Notion's powerful features into your applications with ease.

For help constructing the JSON for the properties, use the Notion API - JSON Builder Streamlit app.

Features

  • Synchronous Operations: Uses notion-client and requests for straightforward API interactions.
  • Type Safety: Full type hints for all methods ensuring better development experience and IDE support.
  • Error Handling: Robust error handling for API calls and file operations.
  • Database & Data Source Management: Create, retrieve, query, and update Notion databases and their associated data sources.
  • Page Operations: Add new pages to data sources, append content to existing pages, and work with Notion's native markdown endpoints.
  • File Handling: Upload files and attach them to pages or page properties with built-in validation.
  • Pandas Integration: Convert Notion data source pages into a Pandas DataFrame for easy data manipulation.
  • API Version 2025-09-03 Compliance: Fully updated to support the latest Notion API changes, including the separation of databases and data sources.
  • Native Markdown Endpoints: Create, read, and update page content through /v1/pages/{id}/markdown and markdown-backed page creation.

Installation

To install NotionHelper, you can use pip:

pip install notionhelper

This will also install all the necessary dependencies, including notion-client, pandas, and requests.

Authentication

To use the Notion API, you need to create an integration and obtain an integration token.

  1. Create an Integration: Go to My Integrations and create a new integration.
  2. Get the Token: Copy the "Internal Integration Token".
  3. Share with a Page/Database: For your integration to access a page or database, you must share it with your integration from the "Share" menu in Notion.

It is recommended to store your Notion token as an environment variable for security.

export NOTION_TOKEN="your_secret_token"

Usage

Here is an example of how to use the library:

import os
from notionhelper import NotionHelper, MLNotionHelper

Initialize the NotionHelper class

notion_token = os.getenv("NOTION_TOKEN")

# For core Notion operations
helper = NotionHelper(notion_token)

# For ML experiment tracking (includes all NotionHelper methods)
ml_helper = MLNotionHelper(notion_token)

Retrieve a Database (Container)

With API version 2025-09-03, get_database now returns the database object, which acts as a container for one or more data sources. To get the actual schema (properties), you need to retrieve a specific data source using get_data_source.

database_id = "your_database_id" # ID of the database container
database_object = helper.get_database(database_id)
print(database_object)

# To get the schema of a specific data source within this database:
data_source_id = database_object["data_sources"][0]["id"] # Get the ID of the first data source
data_source_schema = helper.get_data_source(data_source_id)
print(data_source_schema)

Create a New Page in a Data Source

With API version 2025-09-03, pages are parented by data_source_id, not database_id. When creating a new page, ensure you use the data_source_id of the specific table you want to add the page to.

Important Note on Property Definitions: When defining properties for the schema of a database or data source, use an empty object {} for the property type (e.g., "My Title Column": {"title": {}}). However, when defining properties for a new page (as shown below), you provide the actual content using rich text arrays or other specific property value objects.

data_source_id = "your_data_source_id" # The ID of the specific data source (table)

page_properties = {
    "Task Name": { # This must match a 'title' property in your data source schema
        "title": [
            {
                "text": {
                    "content": "New Task from NotionHelper"
                }
            }
        ]
    },
    "Status": { # This must match a 'select' property in your data source schema
        "select": {
            "name": "Not Started" # Must be one of the options defined in your data source
        }
    },
    "Due Date": { # This must match a 'date' property in your data source schema
        "date": {
            "start": "2025-12-31"
        }
    }
}
new_page = helper.new_page_to_data_source(data_source_id, page_properties)
print(new_page)

You can also create the page body with Notion's native markdown API:

new_page = helper.new_page_to_data_source(
    data_source_id,
    page_properties={
        "Task Name": {
            "title": [{"text": {"content": "Markdown-backed task"}}]
        }
    },
    markdown="# Task Notes\n\n- [ ] Draft proposal\n- [ ] Schedule follow-up",
)
print(new_page["id"])

Append Content to the New Page

blocks = [
    {
        "object": "block",
        "type": "heading_2",
        "heading_2": {
            "rich_text": [{"type": "text", "text": {"content": "Hello from NotionHelper!"}}]
        }
    },
    {
        "object": "block",
        "type": "paragraph",
        "paragraph": {
            "rich_text": [
                {
                    "type": "text",
                    "text": {
                        "content": "This content was appended synchronously."
                    }
                }
            ]
        }
    }
]
helper.append_page_body(page_id, blocks)
print(f"Successfully appended content to page ID: {page_id}")

You can also pass raw Markdown directly:

markdown_body = """# Hello from Markdown

This paragraph is converted to Notion blocks internally.
"""
helper.append_page_body(page_id, markdown_body)

Retrieve a Page and Convert to Markdown

NotionHelper can retrieve page content and optionally return markdown format for easy use in documents, blogs, or other applications.

Get Page as JSON (Default)

page_id = "your_page_id"
result = helper.get_page(page_id)
properties = result["properties"]  # Page properties
content = result["content"]        # List of block objects (JSON)

Get Page as Markdown

page_id = "your_page_id"
result = helper.get_page(page_id, return_markdown=True)
properties = result["properties"]  # Page properties
markdown_content = result["content"]  # String from Notion's native markdown endpoint
print(markdown_content)

Use return_markdown as the canonical parameter name. Deprecated aliases returnmarkdown and markdownformat are still accepted with warnings.

By default, get_page(..., return_markdown=True) uses Notion's native markdown API (/v1/pages/{page_id}/markdown). To force the legacy block-to-markdown conversion path, pass use_markdown_api=False.

You can also call the native endpoint directly to access Notion's markdown metadata:

markdown_result = helper.get_page_markdown(page_id, include_transcript=True)
print(markdown_result["markdown"])
print(markdown_result["truncated"])
print(markdown_result["unknown_block_ids"])

The markdown conversion supports:

  • Headings (H1, H2, H3)
  • Text formatting (bold, italic, strikethrough, code, links)
  • Lists (bulleted and numbered)
  • Code blocks with language syntax highlighting
  • Images
  • Dividers and block quotes

This is useful for:

  • Exporting Notion pages to markdown files
  • Integrating with static site generators
  • Creating blog posts from Notion content
  • Storing content in version control
  • Converting documentation to other formats

Update Page Content with Native Markdown Commands

Use update_page_markdown(...) to access Notion's native markdown update API:

helper.update_page_markdown(
    page_id,
    "update_content",
    content_updates=[
        {
            "old_str": "Draft proposal",
            "new_str": "Draft proposal (due Friday)",
        }
    ],
)

helper.update_page_markdown(
    page_id,
    "replace_content",
    new_str="# Fresh Start\n\nThis replaces all previous content.",
)

Supported commands:

  • update_content
  • replace_content
  • insert_content
  • replace_content_range

Get all pages from a Data Source as a Pandas DataFrame

data_source_id = "your_data_source_id" # The ID of the specific data source (table)
df = helper.get_data_source_pages_as_dataframe(data_source_id)
print(df.head())

Timezone-safe filtering example

import pandas as pd

data_source_id = "your_data_source_id"
df = helper.get_data_source_pages_as_dataframe(data_source_id, utc=True)

# Due values are normalized to ISO 8601; parse with utc=True for stable comparisons
due_ts = pd.to_datetime(df["Due"], utc=True, errors="coerce")
window_start = pd.Timestamp("2026-03-01T00:00:00Z")
window_end = pd.Timestamp("2026-03-08T00:00:00Z")
filtered = df[(due_ts >= window_start) & (due_ts < window_end)]
print(filtered[["Name", "Due"]])

Streaming pages before DataFrame conversion

for page in helper.iter_data_source_pages("your_data_source_id", page_size=50):
    print(page["id"])

Retry policy (global + per-call override)

from notionhelper import RetryPolicy

helper.set_retry_policy(RetryPolicy(max_retries=4, base_delay=0.5, jitter_ratio=0.25, timeout=20))

# Per-call override
rows = helper.get_data_source_pages_as_dataframe(
    "your_data_source_id",
    retry_policy=RetryPolicy(max_retries=1, base_delay=0.2, jitter_ratio=0.1, timeout=10),
)

Update a Data Source

This example demonstrates how to update the schema (properties/columns), title, icon, or other attributes of an existing data source.

data_source_id = "your_data_source_id" # The ID of the data source to update

# Example 1: Rename a property and add a new one
update_payload_1 = {
    "properties": {
        "Old Property Name": { # Existing property name or ID
            "name": "New Property Name" # New name for the property
        },
        "New Text Property": { # Add a new rich text property
            "rich_text": {}
        }
    }
}
updated_data_source_1 = helper.update_data_source(data_source_id, properties=update_payload_1["properties"])
print(f"Updated data source (rename and add): {updated_data_source_1}")

# Example 2: Update data source title and remove a property
update_payload_2 = {
    "title": [
        {
            "type": "text",
            "text": {
                "content": "Updated Data Source Title"
            }
        }
    ],
    "properties": {
        "Property To Remove": None # Set to None to remove a property
    }
}
updated_data_source_2 = helper.update_data_source(data_source_id, title=update_payload_2["title"], properties=update_payload_2["properties"])
print(f"Updated data source (title and remove): {updated_data_source_2}")

# Example 3: Update a select property's options
update_payload_3 = {
    "properties": {
        "Status": { # Assuming 'Status' is an existing select property
            "select": {
                "options": [
                    {"name": "To Do", "color": "gray"},
                    {"name": "In Progress", "color": "blue"},
                    {"name": "Done", "color": "green"},
                    {"name": "Blocked", "color": "red"} # Add a new option
                ]
            }
        }
    }
}
updated_data_source_3 = helper.update_data_source(data_source_id, properties=update_payload_3["properties"])
print(f"Updated data source (select options): {updated_data_source_3}")

Upload a File and Attach to a Page

try:
    file_path = "path/to/your/file.pdf"  # Replace with your file path
    upload_response = helper.upload_file(file_path)
    file_upload_id = upload_response["id"]
    # Replace with your page_id
    page_id = "your_page_id"
    attach_response = helper.attach_file_to_page(page_id, file_upload_id)
    print(f"Successfully uploaded and attached file: {attach_response}")
except Exception as e:
    print(f"Error uploading file: {e}")

Simplified File Operations

NotionHelper provides convenient one-step methods that combine file upload and attachment operations:

one_step_image_embed()

Uploads an image and embeds it in a Notion page in a single call, combining what would normally require:

  1. Uploading the file
  2. Embedding it in the page
page_id = "your_page_id"
image_path = "path/to/image.png"
response = helper.one_step_image_embed(page_id, image_path)
print(f"Successfully embedded image: {response}")

one_step_file_to_page()

Uploads a file and attaches it to a Notion page in one step, combining:

  1. Uploading the file
  2. Attaching it to the page
page_id = "your_page_id"
file_path = "path/to/document.pdf"
response = helper.one_step_file_to_page(page_id, file_path)
print(f"Successfully attached file: {response}")

one_step_file_to_page_property()

Uploads a file and attaches it to a specific Files & Media property on a page, combining:

  1. Uploading the file
  2. Attaching it to the page property
page_id = "your_page_id"
property_name = "Files"  # Name of your Files & Media property
file_path = "path/to/document.pdf"
file_name = "Custom Display Name.pdf"  # Optional display name
response = helper.one_step_file_to_page_property(page_id, property_name, file_path, file_name)
print(f"Successfully attached file to property: {response}")

These methods handle all the intermediate steps automatically, making file operations with Notion much simpler.

Machine Learning Experiment Tracking

NotionHelper includes specialized functions for tracking machine learning experiments, making it easy to log configurations, metrics, plots, and output files to Notion databases. These functions automatically handle leaderboard tracking and provide a structured way to organize ML workflows.

create_ml_database()

Creates a new Notion database specifically designed for ML experiment tracking by analyzing your config and metrics dictionaries to automatically generate the appropriate schema.

# Define your typical experiment configuration and metrics
config = {
    "Experiment Name": "LSTM Forecast v1",
    "model_type": "LSTM",
    "learning_rate": 0.001,
    "batch_size": 32
}

metrics = {
    "sMAPE": 12.5,
    "MAE": 0.85,
    "training_time": 45.2
}

# Create a new ML tracking database
parent_page_id = "your_parent_page_id"
data_source_id = helper.create_ml_database(
    parent_page_id=parent_page_id,
    db_title="ML Experiments - Time Series",
    config=config,
    metrics=metrics,
    file_property_name="Output Files"  # Optional, defaults to "Output Files"
)
print(f"Created ML database with data source ID: {data_source_id}")

The function automatically:

  • Maps numeric values to Number properties
  • Maps booleans to Checkbox properties
  • Maps strings to Rich Text properties
  • Uses the first config key as the Title property
  • Adds a "Run Status" property for tracking improvements
  • Adds a Files & Media property for attaching output files

log_ml_experiment()

Logs a complete ML experiment run including configuration, metrics, plots, and output files. It automatically compares metrics against previous runs to identify improvements and track the best performing models.

# Experiment configuration
config = {
    "Experiment Name": "LSTM Forecast v2",
    "model_type": "LSTM",
    "layers": 3,
    "learning_rate": 0.001,
    "dropout": 0.2
}

# Training metrics
metrics = {
    "sMAPE": 11.8,
    "MAE": 0.78,
    "RMSE": 1.23,
    "training_time": 52.1
}

# Paths to plots and output files
plots = [
    "path/to/training_loss.png",
    "path/to/predictions.png"
]

output_files = [
    "path/to/model.h5",
    "path/to/scaler.pkl",
    "path/to/results.csv"
]

# Log the experiment
page_id = helper.log_ml_experiment(
    data_source_id=data_source_id,
    config=config,
    metrics=metrics,
    plots=plots,  # Will be embedded in page body
    target_metric="sMAPE",  # Metric to track for improvements
    higher_is_better=False,  # Lower sMAPE is better
    file_paths=output_files,  # Will be attached to Files & Media property
    file_property_name="Output Files"
)
print(f"Logged experiment to page: {page_id}")

Features:

  • Automatic Leaderboard Tracking: Compares new results against previous runs
  • Champion Detection: Automatically tags new best scores with 🏆
  • Performance Comparison: Shows delta from current best when not improving
  • Plot Embedding: Embeds visualization plots directly in the page body
  • File Attachments: Attaches model files, scalers, and other outputs
  • Timestamp Tracking: Automatically adds timestamps to experiment names

Run Status Examples:

  • 🏆 NEW BEST sMAPE (Prev: 12.50) - New champion found
  • No Improvement (+0.70 sMAPE) - Score wasn't better
  • Standard Run - First run or metric tracking disabled

upload_multiple_files_to_property()

Uploads multiple files and attaches them all to a single Files & Media property on a page.

page_id = "your_page_id"
property_name = "Output Files"
file_paths = [
    "path/to/model.h5",
    "path/to/scaler.pkl",
    "path/to/predictions.csv"
]

response = helper.upload_multiple_files_to_property(page_id, property_name, file_paths)
print(f"Successfully attached {len(file_paths)} files to property")

dict_to_notion_props()

Converts a Python dictionary to Notion property format, handling type conversions automatically.

data = {
    "Experiment Name": "Model v1",
    "accuracy": 0.95,
    "epochs": 100,
    "is_best": True
}

properties = helper.dict_to_notion_props(data, title_key="Experiment Name")
# Properties are now formatted for Notion API

Example ML Workflow:

# 1. Create ML tracking database (one-time setup)
data_source_id = helper.create_ml_database(
    parent_page_id="parent_page_id",
    db_title="Computer Vision Experiments",
    config={"Model Name": "ResNet50", "dataset": "ImageNet"},
    metrics={"accuracy": 0.0, "f1_score": 0.0}
)

# 2. Run multiple experiments
for lr in [0.001, 0.01, 0.1]:
    # Train your model
    model, metrics, plots = train_model(learning_rate=lr)

    # Log to Notion
    helper.log_ml_experiment(
        data_source_id=data_source_id,
        config={"Model Name": f"ResNet50_lr{lr}", "learning_rate": lr},
        metrics=metrics,
        plots=plots,
        target_metric="accuracy",
        higher_is_better=True
    )

# 3. Review results in Notion
df = helper.get_data_source_pages_as_dataframe(data_source_id)
print(df[["Model Name", "accuracy", "Run Status"]].sort_values("accuracy", ascending=False))

Code Quality

The NotionHelper library includes several quality improvements:

  • Type Hints: All methods include comprehensive type annotations for better IDE support and code clarity
  • Error Handling: Built-in validation and exception handling for common failure scenarios
  • Clean Imports: Explicit imports with __all__ declaration for better namespace management
  • Production Ready: Removed debug output and implemented proper error reporting

Complete Function Reference

The NotionHelper class provides the following methods:

Database & Data Source Operations

  • get_database(database_id) - Retrieves the database object (container), which includes a list of its data sources.
  • get_data_source(data_source_id) - Retrieves a specific data source, including its properties (schema).
  • create_database(parent_page_id, database_title, initial_data_source_properties, initial_data_source_title=None) - Creates a new database with an initial data source.
  • update_data_source(data_source_id, properties=None, title=None, icon=None, in_trash=None, parent=None) - Updates the attributes of a specified data source.
  • notion_search_db(query="", filter_object_type="page") - Searches for pages or data sources in Notion.

Page Operations

  • new_page_to_data_source(data_source_id, page_properties=None, markdown=None) - Adds a new page to a Notion data source, optionally using Notion's native markdown page creation.
  • trash_page(page_id) - Moves a page to Notion trash.
  • restore_page(page_id) - Restores a page from Notion trash.
  • append_page_body(page_id, body=None, sanitize=True, blocks=None, batch_size=100) - Appends either Notion blocks (list[dict]) or raw Markdown (str) to a Notion page body with optional sanitization and automatic batching.
  • get_page(page_id, return_markdown=False, use_markdown_api=None, include_transcript=False) - Retrieves page properties and page content; return_markdown=True uses Notion's native markdown endpoint by default.
  • get_page_markdown(page_id, include_transcript=False) - Retrieves the raw response from Notion's native markdown endpoint, including truncation metadata.
  • update_page_markdown(page_id, command, ...) - Updates page content through Notion's markdown update API.
  • extract_page_id_from_url(page_url_or_id, with_hyphens=True) - Extracts and normalizes a Notion page ID from either a Notion URL or raw ID.

Data Retrieval & Conversion

  • get_data_source_page_ids(data_source_id) - Returns the IDs of all pages in a given data source.
  • get_data_source_pages_as_json(data_source_id, limit=None) - Returns a list of JSON objects representing all pages in the given data source, with all properties.
  • get_data_source_pages_as_dataframe(data_source_id, limit=None, include_page_ids=True, utc=True) - Retrieves all pages as a Pandas DataFrame with timezone-safe UTC ISO 8601 normalization by default.
  • iter_data_source_pages(data_source_id, ...) - Streams paginated page payloads as a generator.
  • iter_data_source_page_records(data_source_id, ...) - Streams flattened records before DataFrame conversion.
  • parse_datetime_utc(value, utc=True) - Parses datetime-like values into pandas timestamps with UTC-safe defaults.
  • normalize_datetime_iso(value, utc=True) - Normalizes datetime-like values to ISO 8601 strings (UTC by default for datetimes).
  • normalize_notion_date(date_value, utc=True) - Normalizes Notion date objects (start, end, time_zone) to consistent ISO 8601 output.
  • set_converter_adapter(converter_adapter) - Sets a converter adapter so application code can use a single wrapper for markdown <-> block conversion.
  • set_retry_policy(retry_policy) - Sets the global retry policy (max retries, timeout, backoff, jitter, retry statuses).

File Operations

  • upload_file(file_path) - Uploads a file to Notion and returns the file upload object
  • attach_file_to_page(page_id, file_upload_id) - Attaches an uploaded file to a specific page
  • embed_image_to_page(page_id, file_upload_id) - Embeds an uploaded image into a page
  • attach_file_to_page_property(page_id, property_name, file_upload_id, file_name) - Attaches a file to a Files & Media property

Error Handling

  • Structured exceptions include metadata (status_code, request_path, notion_code):
  • AuthError, RateLimitError, NotFoundError, ValidationError, TimeoutError
  • Base class: NotionAPIError

One-Step Convenience Methods

  • one_step_image_embed(page_id, file_path) - Uploads and embeds an image in one operation
  • one_step_file_to_page(page_id, file_path) - Uploads and attaches a file to a page in one operation
  • one_step_file_to_page_property(page_id, property_name, file_path, file_name) - Uploads and attaches a file to a page property in one operation

Machine Learning Experiment Tracking

  • create_ml_database(parent_page_id, db_title, config, metrics, file_property_name="Output Files") - Creates a new Notion database specifically designed for ML experiment tracking with automatic schema generation
  • log_ml_experiment(data_source_id, config, metrics, plots=None, target_metric="sMAPE", higher_is_better=False, file_paths=None, file_property_name="Output Files") - Logs a complete ML experiment run including configuration, metrics, plots, and output files with automatic leaderboard tracking
  • upload_multiple_files_to_property(page_id, property_name, file_paths) - Uploads multiple files and attaches them all to a single Files & Media property
  • dict_to_notion_props(data, title_key) - Converts a Python dictionary to Notion property format with automatic type handling

Requirements

  • Python 3.10+
  • pandas >= 2.3.1
  • requests >= 2.32.4
  • mimetype >= 0.1.5

Optional for richer markdown import:

  • notion-blockify

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

notionhelper-0.6.1.tar.gz (7.0 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

notionhelper-0.6.1-py3-none-any.whl (31.9 kB view details)

Uploaded Python 3

File details

Details for the file notionhelper-0.6.1.tar.gz.

File metadata

  • Download URL: notionhelper-0.6.1.tar.gz
  • Upload date:
  • Size: 7.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.30 {"installer":{"name":"uv","version":"0.9.30","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for notionhelper-0.6.1.tar.gz
Algorithm Hash digest
SHA256 557b9029c1ed520c2b57d08e197d22e101063686b53f01a8cbbf8e68d949f812
MD5 2dc47b5619ae42e10dd4d15497be4022
BLAKE2b-256 0507632436d414487bc5eb8467879bb092b555abc1469bfe626ef70743532e27

See more details on using hashes here.

File details

Details for the file notionhelper-0.6.1-py3-none-any.whl.

File metadata

  • Download URL: notionhelper-0.6.1-py3-none-any.whl
  • Upload date:
  • Size: 31.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.30 {"installer":{"name":"uv","version":"0.9.30","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for notionhelper-0.6.1-py3-none-any.whl
Algorithm Hash digest
SHA256 5856a386bf530a83dd4cc938598f9bcbe4a7e6dbe8ded15767c35cf64091d6c8
MD5 4307b5039f9a28ac6bf818393a95ea19
BLAKE2b-256 b71c200b2ae852febe7fdaefe7864702299147f46658219e2796dafdde9eee98

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page