Chunking utilities for GraphRAG

Project description

GraphRAG Chunking

This package contains a collection of text chunkers, a core config model, and a factory for acquiring instances.

Examples

Basic sentence chunking with nltk

The SentenceChunker class splits text into individual sentences by identifying sentence boundaries. It takes input text and returns a list where each element is a separate sentence, making it easy to process text at the sentence level.
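To make the idea of sentence-boundary splitting concrete, here is a minimal sketch that uses only the standard library. It is not the package's SentenceChunker and does not call nltk; the function name `split_sentences` and the regex rule are assumptions made for illustration, and a trained tokenizer handles abbreviations and similar edge cases far better.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Split text into sentences with a naive rule (hypothetical helper).

    A boundary is assumed wherever '.', '!' or '?' is followed by whitespace.
    This will mis-split abbreviations such as "e.g. this".
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Drop the empty string produced when the input is empty.
    return [part for part in parts if part]


sentences = split_sentences(
    "GraphRAG builds a graph. It needs chunks first! Does this work?"
)
for sentence in sentences:
    print(sentence)
```

Each list element is one sentence, which mirrors the input and output shape described above.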

Open the notebook to explore the basic sentence example code

Token chunking

The TokenChunker splits text into fixed-size chunks based on token count rather than sentence boundaries. It uses a tokenizer to encode text into tokens, then creates chunks of a specified size with configurable overlap between chunks.
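The fixed-size window with overlap can be sketched as follows. This is an illustration of the technique rather than the package's TokenChunker: the names `chunk_tokens` and `token_chunk_text` are hypothetical, and whitespace splitting stands in for a real tokenizer's encode and decode steps.

```python
def chunk_tokens(tokens: list[str], size: int, overlap: int) -> list[list[str]]:
    """Slide a window of `size` tokens, stepping by `size - overlap`."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap
    chunks: list[list[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        # Stop once a window reaches the end, so no chunk is a pure repeat.
        if start + size >= len(tokens):
            break
    return chunks


def token_chunk_text(text: str, size: int, overlap: int) -> list[str]:
    """Whitespace 'tokenizer' stand-in: encode with split, decode with join."""
    tokens = text.split()
    return [" ".join(chunk) for chunk in chunk_tokens(tokens, size, overlap)]


chunks = token_chunk_text("a b c d e f g", size=4, overlap=2)
print(chunks)
```

With a size of 4 and an overlap of 2, each chunk repeats the last two tokens of the previous one, and the final chunk may be shorter than the requested size.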

Open the notebook to explore the token chunking example code

Using the factory via helper util

The create_chunker factory function provides a configuration-driven way to instantiate chunkers: it accepts a ChunkingConfig object that specifies the chunking strategy and its parameters. Separating chunker configuration from direct instantiation keeps calling code flexible and easier to maintain.
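The configuration-driven pattern can be sketched with a small registry. Everything below is hypothetical and self-contained: `SketchChunkingConfig`, `make_chunker`, the strategy names, and the field names are assumptions chosen for illustration, not the package's actual ChunkingConfig fields or create_chunker signature.

```python
import re
from dataclasses import dataclass
from typing import Callable

Chunker = Callable[[str], list[str]]


@dataclass(frozen=True)
class SketchChunkingConfig:
    """Hypothetical config: a strategy name plus its parameters."""
    strategy: str = "tokens"
    size: int = 4
    overlap: int = 1


def _build_sentence_chunker(cfg: SketchChunkingConfig) -> Chunker:
    def chunk(text: str) -> list[str]:
        # Naive boundary rule; size and overlap are ignored for sentences.
        return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return chunk


def _build_token_chunker(cfg: SketchChunkingConfig) -> Chunker:
    if not 0 <= cfg.overlap < cfg.size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = cfg.size - cfg.overlap

    def chunk(text: str) -> list[str]:
        tokens = text.split()  # whitespace stand-in for a real tokenizer
        out: list[str] = []
        for start in range(0, len(tokens), step):
            out.append(" ".join(tokens[start:start + cfg.size]))
            if start + cfg.size >= len(tokens):
                break
        return out
    return chunk


# Registry mapping strategy names to builder functions.
_BUILDERS: dict[str, Callable[[SketchChunkingConfig], Chunker]] = {
    "sentence": _build_sentence_chunker,
    "tokens": _build_token_chunker,
}


def make_chunker(cfg: SketchChunkingConfig) -> Chunker:
    """Look up the builder for cfg.strategy and return a ready chunker."""
    try:
        builder = _BUILDERS[cfg.strategy]
    except KeyError:
        raise ValueError(f"unknown strategy: {cfg.strategy!r}") from None
    return builder(cfg)


chunker = make_chunker(SketchChunkingConfig(strategy="tokens", size=2, overlap=0))
print(chunker("a b c"))
```

Callers only construct a config and call the factory, so switching strategies becomes a change to configuration data rather than to the code that does the chunking.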

Open the notebook to explore the factory helper util example code

Download files

Source Distribution

graphrag_chunking-3.1.0.tar.gz (7.1 kB view details)

Uploaded Source

Built Distribution

graphrag_chunking-3.1.0-py3-none-any.whl (9.7 kB view details)

Uploaded Python 3

File details

Details for the file graphrag_chunking-3.1.0.tar.gz.

File metadata

  • Download URL: graphrag_chunking-3.1.0.tar.gz
  • Upload date:
  • Size: 7.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.11

File hashes

Hashes for graphrag_chunking-3.1.0.tar.gz
Algorithm Hash digest
SHA256 c7cf0168dcc24287806c3b8f2e7d39ba64eae2a82b4d0d9ed8ca15bd5d73a5c3
MD5 f03a53cce82d3559c8d972645cfa61a9
BLAKE2b-256 d06df691fb5afd3f418be70cd5db2c8ed94a7ba8ba0b9d9c61e7992d80acf3e3

File details

Details for the file graphrag_chunking-3.1.0-py3-none-any.whl.

File metadata

  • Download URL: graphrag_chunking-3.1.0-py3-none-any.whl
  • Upload date:
  • Size: 9.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.11

File hashes

Hashes for graphrag_chunking-3.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 ca25a1e79f91b8af33b3fc6d5288f55ad7728ad85bef6ba5e40f97d0ffb7d806
MD5 8832eb4379f2325ba523a24f6dbb0a56
BLAKE2b-256 29758c8c2bf049e02f38f84e89fa3818eebd71a5b9c687d9db6657259d37bd79
