notion-haystack: Export Notion pages to Haystack Document
This Python package lets you export your Notion pages to Haystack Documents by providing a Notion API token.
Because the Notion API is subject to rate limits, this component automatically retries failed requests and waits for the rate limit to reset before retrying. This is especially useful when exporting a large number of pages. The component also uses asyncio to make requests in parallel, which can significantly speed up the export process.
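The retry-and-parallelize behaviour described above can be illustrated with a small standard-library sketch. This is not the package's actual implementation: `RateLimited`, `with_retries`, `fetch_all`, and the fake `fetch_page` are hypothetical names introduced only for illustration, and the sketch assumes the server tells the client how long to wait before retrying.

```python
import asyncio


class RateLimited(Exception):
    """Hypothetical error standing in for a rate-limit response."""

    def __init__(self, retry_after: float):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


async def with_retries(make_request, max_attempts: int = 5):
    """Await make_request(), sleeping and retrying whenever it is rate limited."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_request()
        except RateLimited as err:
            if attempt == max_attempts:
                raise
            # Wait for the period the server asked for before retrying.
            await asyncio.sleep(err.retry_after)


async def fetch_all(page_ids, fetch_page, max_concurrency: int = 3):
    """Fetch many pages concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(page_id):
        async with semaphore:
            return await with_retries(lambda: fetch_page(page_id))

    # gather returns results in the same order as page_ids.
    return await asyncio.gather(*(fetch_one(pid) for pid in page_ids))


def make_flaky_fetcher():
    """Build a fake fetch_page that is rate limited on the first call per page."""
    seen = set()

    async def fetch_page(page_id):
        if page_id not in seen:
            seen.add(page_id)
            raise RateLimited(retry_after=0.01)
        return f"content of {page_id}"

    return fetch_page


if __name__ == "__main__":
    print(asyncio.run(fetch_all(["a", "b", "c"], make_flaky_fetcher())))
```

The semaphore caps how many requests run at once, so parallelism speeds up the export without making the rate limiting worse than necessary.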
Installation
pip install notion-haystack
Usage
The following minimal example demonstrates how to export a list of pages to Haystack Documents:
from notion_haystack import NotionExporter
exporter = NotionExporter(api_token="<your-token>")
exported_pages = exporter.run(file_paths=["<list-of-page-ids>"])
# exported_pages will be a list of Haystack Documents where each Document corresponds to a Notion page
The following example shows how to use the NotionExporter inside an indexing pipeline:
from notion_haystack import NotionExporter
from haystack.document_stores import InMemoryDocumentStore
from haystack import Pipeline
document_store = InMemoryDocumentStore()
exporter = NotionExporter(api_token="<your-token>")
indexing_pipeline = Pipeline()
indexing_pipeline.add_node(component=exporter, name="exporter", inputs=["File"])
indexing_pipeline.add_node(component=document_store, name="document_store", inputs=["exporter"])
indexing_pipeline.run(file_paths=["<list-of-page-ids>"])
# The pages will now be indexed in the document store
The NotionExporter class takes the following arguments:
api_token: Your Notion API token. You can find information on how to get an API token in Notion's documentation.
export_child_pages: Whether to recursively export all child pages of the provided page ids. Defaults to False.
extract_page_metadata: Whether to extract metadata from the page and add it as frontmatter to the markdown. Extracted metadata includes title, author, path, URL, last editor, and last editing time of the page. Defaults to False.
exclude_title_containing: If specified, pages with titles containing this string will be excluded. This can be useful, for example, to exclude pages that are archived. Defaults to None.
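To make the effect of extract_page_metadata and exclude_title_containing more concrete, here is a rough sketch of the two ideas using only the standard library. The helper names `should_export` and `add_frontmatter` are hypothetical, and the exact frontmatter layout and case-sensitive matching are assumptions rather than the package's documented behaviour.

```python
def should_export(title, exclude_title_containing=None):
    """Return False when the title contains the exclusion string."""
    if exclude_title_containing is None:
        return True
    # Assumption: a plain, case-sensitive substring match.
    return exclude_title_containing not in title


def add_frontmatter(markdown, metadata):
    """Prepend a YAML-style frontmatter block built from a metadata dict."""
    lines = ["---"]
    for key, value in metadata.items():
        # No quoting or escaping is done here; this is only a sketch.
        lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + markdown


if __name__ == "__main__":
    # Hypothetical pages, each a (title, markdown) pair.
    pages = [("Roadmap", "# Roadmap"), ("Archived: Old plan", "# Old plan")]
    for title, body in pages:
        if should_export(title, exclude_title_containing="Archived"):
            print(add_frontmatter(body, {"title": title}))
```

With these inputs only the "Roadmap" page is printed, preceded by its frontmatter block.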
The NotionExporter.run method takes the following arguments:
file_paths: A list of page ids to export. If export_child_pages is True, all child pages of these pages will be exported as well.
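The interaction between file_paths and export_child_pages can be pictured as a tree walk. The sketch below is only an illustration under stated assumptions: `collect_page_ids` and the in-memory `CHILDREN` mapping are hypothetical stand-ins for whatever the package does when it queries Notion for child pages.

```python
def collect_page_ids(root_ids, children_of, export_child_pages=False):
    """Return the page ids to export, depth-first and without duplicates."""
    ordered = []
    seen = set()

    def visit(page_id):
        if page_id in seen:
            return
        seen.add(page_id)
        ordered.append(page_id)
        # Only descend into children when recursive export is requested.
        if export_child_pages:
            for child_id in children_of.get(page_id, []):
                visit(child_id)

    for root_id in root_ids:
        visit(root_id)
    return ordered


# Hypothetical page tree: "home" has two children, one of which has its own child.
CHILDREN = {"home": ["docs", "blog"], "docs": ["api"]}

if __name__ == "__main__":
    print(collect_page_ids(["home"], CHILDREN))
    print(collect_page_ids(["home"], CHILDREN, export_child_pages=True))
```

With export_child_pages left at its default, only the ids passed in are returned; with it enabled, every descendant is included as well.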
Hashes for notion_haystack-0.1.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 1b23e72528d7a738c8e3d33fd7666bf38e5c842a8d60be4cfc9de1a5b1256d96
MD5 | 97a1a1379c1239828b2ce166200e8935
BLAKE2b-256 | 47648eb272ffbae00321739ca9f7d35ae76ddeb5dfc84c5d614ad66304654b1d