AST-based code chunking with dynamic multi-language support (fork of astchunk)
astchunk-extended
Drop-in replacement for astchunk with dynamic multi-language support.
What's different
- 15 languages instead of 4 (C, C++, Go, Rust, Ruby, Bash, Kotlin, Scala, Lua, PHP, Zig + original Python, Java, C#, TypeScript)
- Dynamic parser discovery: if a tree-sitter parser is installed, it works automatically
- No hardcoded imports: parsers are loaded via importlib on demand
- Optional extras: install only the languages you need
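The discovery idea above can be sketched with the standard library alone. This is a minimal illustration, not the package's actual code: the helper name `discover_parsers` and the `tree_sitter_<language>` module names in the candidate map are assumptions made for the example.

```python
import importlib.util


def discover_parsers(candidates):
    """Return the sorted language names whose parser module is importable.

    `candidates` maps a language name to the top-level module expected to
    provide its grammar. Nothing is imported here: find_spec only checks
    whether the module could be imported.
    """
    available = []
    for language, module_name in candidates.items():
        if importlib.util.find_spec(module_name) is not None:
            available.append(language)
    return sorted(available)


# Hypothetical candidate map; the module names are assumptions.
CANDIDATES = {
    "python": "tree_sitter_python",
    "rust": "tree_sitter_rust",
    "zig": "tree_sitter_zig",
}

if __name__ == "__main__":
    # Prints only the languages whose parser packages are installed.
    print(discover_parsers(CANDIDATES))
```

Checking availability without importing keeps startup cheap, since a grammar is only loaded once a language is actually requested.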
Installation
# All languages (quote the extras so shells such as zsh do not expand the brackets)
pip install "astchunk-extended[all]"
# Or pick what you need
pip install "astchunk-extended[core]" # Python, Java, C#, TypeScript
pip install "astchunk-extended[systems]" # C, C++, Go, Rust, Zig
pip install "astchunk-extended[scripting]" # Ruby, Bash, Lua, PHP
pip install "astchunk-extended[jvm]" # Java, Kotlin, Scala
With LEANN:
uv tool install leann-core --with leann --with "astchunk-extended[all]"
python -c "from astchunk.patch_leann import apply; apply()"
Usage
from astchunk import ASTChunkBuilder, get_available_languages
# See what's installed
print(get_available_languages())
# ['bash', 'c', 'cpp', 'go', 'java', 'kotlin', 'lua', 'php', 'python', 'rust', ...]
# Chunk C++ code
code = "int add(int a, int b) { return a + b; }"  # any C++ source string
builder = ASTChunkBuilder(max_chunk_size=300, language="cpp", metadata_template="default")
chunks = builder.chunkify(code)
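When chunking a mixed-language repository, the `language` argument has to be chosen per file. The sketch below shows one way to do that, assuming a hand-written extension map; `EXTENSION_TO_LANGUAGE` and `pick_language` are hypothetical names for this example and are not part of the package.

```python
from pathlib import Path

# Hypothetical extension map for illustration; extend it as needed.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".java": "java",
    ".c": "c",
    ".cpp": "cpp",
    ".hpp": "cpp",
    ".go": "go",
    ".rs": "rust",
    ".kt": "kotlin",
    ".lua": "lua",
    ".php": "php",
    ".sh": "bash",
}


def pick_language(path, available):
    """Return the language for `path`, or None if no installed parser fits."""
    language = EXTENSION_TO_LANGUAGE.get(Path(path).suffix.lower())
    return language if language in available else None
```

With the package installed, `available` would be the result of `get_available_languages()`, and a non-None return value can be passed as `language=` to `ASTChunkBuilder`. Files that map to None can be skipped or handled by a plain-text fallback.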
Custom languages
from astchunk import register_language
register_language("haskell", "tree_sitter_haskell")
builder = ASTChunkBuilder(max_chunk_size=300, language="haskell", metadata_template="default")
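How `register_language` works internally is not shown here. As a rough mental model only, a registry with on-demand loading could look like the sketch below; the names `register`, `load`, `_REGISTRY`, and `_LOADED` are assumptions for illustration, not the package's API.

```python
import importlib

_REGISTRY = {}  # language name -> module name
_LOADED = {}    # language name -> imported module (cache)


def register(language, module_name):
    """Record which module provides `language`; nothing is imported yet."""
    _REGISTRY[language] = module_name
    _LOADED.pop(language, None)  # drop any stale cached module


def load(language):
    """Import the module for `language` on first use and cache it."""
    if language not in _REGISTRY:
        raise KeyError(f"no parser registered for {language!r}")
    if language not in _LOADED:
        _LOADED[language] = importlib.import_module(_REGISTRY[language])
    return _LOADED[language]
```

In this model, registering a language costs nothing until a builder for it is created, and a missing parser package surfaces as an `ImportError` at first use rather than at import time.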
License
MIT (same as upstream astchunk)