Input document loading utilities for GraphRAG

GraphRAG Inputs

This package provides input document loading utilities for GraphRAG, supporting multiple file formats including CSV, JSON, JSON Lines, and plain text.

Supported File Types

The following four standard file formats are supported out of the box:

CSV - Tabular data with configurable column mappings
JSON - JSON files with configurable property paths
JSON Lines - Line-delimited JSON records
Text - Plain text files
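
As a rough illustration of how JSON and JSON Lines differ as raw input, the sketch below parses a small sample of each with the Python standard library. It does not use this package, and the sample records and their title and content fields are made up for the example.

```python
import json

# Made-up sample data: the same kind of record in two of the supported formats.
json_text = '[{"title": "Doc A", "content": "First"}, {"title": "Doc B", "content": "Second"}]'
jsonl_text = (
    '{"title": "Doc A", "content": "First"}\n'
    '{"title": "Doc B", "content": "Second"}\n'
)

# JSON: the whole file is a single JSON value (here, a list of objects).
json_records = json.loads(json_text)

# JSON Lines: each non-empty line is parsed as its own JSON object.
jsonl_records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]

print(json_records == jsonl_records)
```

Both inputs yield the same two records; the difference is only whether the file is parsed as one value or line by line.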

MarkItDown Support

Additionally, we support the InputType.MarkItDown format, which uses the MarkItDown library to import any supported file type. The MarkItDown converter can handle a wide variety of file formats including Office documents, PDFs, HTML, and more.

Note: Additional optional dependencies may need to be installed depending on the file type you're processing. The choice of converter is determined by MarkItDown's processing logic, which primarily uses the file extension to select the appropriate converter. Please refer to the MarkItDown repository for installation instructions and detailed information about supported formats.
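
To make extension-based selection concrete, here is a minimal sketch of the general idea using only the standard library. It is not MarkItDown's actual code: the CONVERTERS table, its labels, and the pick_converter function are hypothetical names invented for this example, and the real library applies its own, more involved logic.

```python
from pathlib import Path

# Hypothetical mapping from file extension to a converter label.
CONVERTERS = {
    ".pdf": "pdf",
    ".docx": "word",
    ".html": "html",
}


def pick_converter(filename: str, default: str = "plain-text") -> str:
    """Choose a converter label from the file extension (case-insensitive)."""
    suffix = Path(filename).suffix.lower()
    return CONVERTERS.get(suffix, default)


print(pick_converter("report.PDF"))
print(pick_converter("notes.txt"))
```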

Examples

Basic usage with the factory:

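import asyncio

from graphrag_input import create_input_reader, InputConfig, InputType
from graphrag_storage import StorageConfig, create_storage


async def main():
    # Read CSV files, taking each document's text and title from the named columns.
    config = InputConfig(
        type=InputType.Csv,
        text_column="content",
        title_column="title",
    )
    storage = create_storage(StorageConfig(base_dir="./input"))
    reader = create_input_reader(config, storage)
    return await reader.read_files()


# read_files() is a coroutine, so run it inside an event loop.
documents = asyncio.run(main())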
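
The text_column and title_column settings name the CSV columns that supply each document's text and title. The sketch below shows that kind of mapping with the standard csv module only; rows_to_documents and the output dictionary keys are hypothetical and are not the package's document model.

```python
import csv
import io


def rows_to_documents(csv_text: str, text_column: str, title_column: str) -> list[dict]:
    """Map each CSV row to a dict holding the configured title and text columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {"title": row[title_column], "text": row[text_column]}
        for row in reader
    ]


# Made-up sample with an extra column that the mapping ignores.
sample = "title,content,author\nIntro,Hello world,Alice\nNotes,More text,Bob\n"
docs = rows_to_documents(sample, text_column="content", title_column="title")
print(docs[0])
```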

Import a PDF with MarkItDown:

pip install 'markitdown[pdf]' # required dependency for pdf processing

import asyncio

from graphrag_input import create_input_reader, InputConfig, InputType
from graphrag_storage import StorageConfig, create_storage


async def main():
    config = InputConfig(
        type=InputType.MarkItDown,
        # Regular expression selecting which files to read: names ending in .pdf.
        file_pattern=".*\\.pdf$",
    )
    storage = create_storage(StorageConfig(base_dir="./input"))
    reader = create_input_reader(config, storage)
    return await reader.read_files()


documents = asyncio.run(main())

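YAML config equivalent of the example above (the doubled $$ appears to be an escape for a single literal $ when the settings file is loaded):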

input:
  type: markitdown
  file_pattern: ".*\\.pdf$$"
input_storage:
  type: file
  base_dir: "input"
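
The file_pattern value is a regular expression. The sketch below checks which file names such a pattern would select using Python's re module. Whether the reader applies the pattern with match or search, or against a full path rather than a bare name, is not stated here, so using re.search on bare file names is an assumption.

```python
import re

# Same pattern as in the Python example above: any name ending in ".pdf".
pattern = re.compile(r".*\.pdf$")

# Made-up file names to test against the pattern.
names = ["report.pdf", "notes.txt", "archive.pdf.bak", "scan.PDF"]
selected = [name for name in names if pattern.search(name)]
print(selected)  # ['report.pdf']
```

The pattern is case-sensitive and anchored at the end, so scan.PDF and archive.pdf.bak are not selected.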

Note that when specifying column names for data extraction, we can handle nested objects (e.g., in JSON) with dot notation:

from graphrag_input import get_property

data = {"user": {"profile": {"name": "Alice"}}}
name = get_property(data, "user.profile.name")  # Returns "Alice"
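
For intuition, dot-notation lookup can be sketched as a walk through nested dictionaries, one key per path segment. The lookup function below is a hypothetical stand-in written for this example, not the package's get_property implementation, and its behavior for missing keys (raising KeyError) is an assumption.

```python
from typing import Any


def lookup(data: dict, path: str) -> Any:
    """Follow a dot-separated path through nested dicts, one key per segment."""
    current: Any = data
    for key in path.split("."):
        # A missing key in a dict raises KeyError here.
        current = current[key]
    return current


data = {"user": {"profile": {"name": "Alice"}}}
print(lookup(data, "user.profile.name"))  # Alice
```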

Release history

3.0.9 - Apr 13, 2026
3.0.8 - Mar 27, 2026
3.0.7 - Mar 24, 2026
3.0.6 - Mar 6, 2026
3.0.5 - Feb 27, 2026
3.0.4 - Feb 24, 2026
3.0.3 - Feb 24, 2026
3.0.2 - Feb 13, 2026
3.0.1 - Jan 28, 2026
3.0.0 - Jan 27, 2026 (this version)

Download files

Source Distribution

graphrag_input-3.0.0.tar.gz (7.8 kB)

Uploaded Jan 27, 2026 Source

Built Distribution

graphrag_input-3.0.0-py3-none-any.whl (14.0 kB)

Uploaded Jan 27, 2026 Python 3

File details

Details for the file graphrag_input-3.0.0.tar.gz.

File metadata

Download URL: graphrag_input-3.0.0.tar.gz
Upload date: Jan 27, 2026
Size: 7.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.4

File hashes

Hashes for graphrag_input-3.0.0.tar.gz
Algorithm	Hash digest
SHA256	`e6293e66a1272795dc4556d45c0aa1ecb73d706803481153e0dc14e5c576c460`
MD5	`0a07ab75997846bff983b4ceced36a38`
BLAKE2b-256	`6fd38fb99d5121c054ef7a977cb4fb182fb13d86063b35d2a20cd754e1fdb05e`

File details

Details for the file graphrag_input-3.0.0-py3-none-any.whl.

File metadata

Download URL: graphrag_input-3.0.0-py3-none-any.whl
Upload date: Jan 27, 2026
Size: 14.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.4

File hashes

Hashes for graphrag_input-3.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`070823bc8c14717effb969f1666aba4eb719db5e78f22c83c2a3f28e0f6afcff`
MD5	`7ee597404a4cb54f4948f7fdc6513e93`
BLAKE2b-256	`bdbdd1248bec704108ff8d5c3b98c725ac0500c634af04143ec0db286dea3db6`
