Break down docs, build up knowledge.
Break down your docs. Build up your knowledge.
A Markdown text splitter for modular docs and maximum flexibility.
What is SplitmeAI?
SplitmeAI is a Python module for managing large Markdown files, particularly when creating and maintaining structured static documentation websites such as those built with MkDocs.
Key Features:
- Section Splitting: Breaks down large Markdown files into smaller, manageable sections based on specified heading levels.
- Filename Sanitization: Generates clean, unique filenames for each section, ensuring compatibility and readability.
- Reference Link Management: Extracts and appends reference-style links used within each section.
- Hierarchy Preservation: Maintains parent heading context within each split file.
- Thematic Break Handling: Recognizes thematic breaks (---, ***, ___) and uses them for intelligent content segmentation.
- MkDocs Integration: Automatically generates an mkdocs.yml configuration file based on the split sections.
- CLI Support: Provides a user-friendly Command-Line Interface for seamless operation.
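To make the first two features concrete, here is a minimal sketch of heading-based splitting and filename sanitization. It is not SplitmeAI's implementation: the function names split_by_heading and sanitize_filename are hypothetical, and the sketch ignores edge cases such as headings inside fenced code blocks and any text before the first matching heading.

```python
import re


def split_by_heading(text, marker="##"):
    """Return (title, body) pairs, one per heading that uses exactly `marker`.

    Hypothetical helper for illustration; not part of the splitme-ai API.
    """
    # Matches "## Title" but not "### Title": the marker must be followed
    # by whitespace, so deeper headings stay inside their parent section.
    heading = re.compile(
        rf"^{re.escape(marker)}[ \t]+(.+?)[ \t]*$", re.MULTILINE
    )
    matches = list(heading.finditer(text))
    sections = []
    for i, match in enumerate(matches):
        # A section runs from its heading to the next heading of the same level.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections.append((match.group(1), text[match.start():end].strip()))
    return sections


def sanitize_filename(title, used):
    """Turn a heading title into a unique, filesystem-friendly .md filename."""
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-") or "section"
    name, counter = slug, 2
    # Append a numeric suffix until the name no longer collides.
    while name in used:
        name = f"{slug}-{counter}"
        counter += 1
    used.add(name)
    return f"{name}.md"


doc = """# Guide

Intro text.

## Quick Start

Install it.

### Details

More.

## Quick Start

Duplicate title.
"""

sections = split_by_heading(doc)
used_names = set()
filenames = [sanitize_filename(title, used_names) for title, _ in sections]
print(filenames)
```

The duplicate "Quick Start" title shows why uniqueness matters: without the numeric suffix, the second section would overwrite the first on disk.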
Quick Start
Installation
Install from PyPI using any of the package managers listed below.
pip
Use pip (recommended for most users):
pip install -U splitme-ai
pipx
Install in an isolated environment with pipx:
pipx install splitme-ai
uv
For the fastest installation, use uv:
uv tool install splitme-ai
Usage
Using the CLI
Let's take a look at some examples of how to use the splitme-ai CLI.
Example 1: Split a Markdown file on heading level 2 (default setting):
splitme-ai \
--split.i examples/data/README-AI.md \
--split.settings.o examples/output-h2
Example 2: Split on heading level 2 and generate an mkdocs.yml configuration file:
splitme-ai \
--split.i examples/data/README-AI.md \
--split.settings.o examples/output-h2 \
--split.settings.mkdocs
View the output generated for splitting on heading level 2 here.
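For readers unfamiliar with MkDocs, the sketch below shows roughly what a generated configuration could contain: a site_name plus a nav list with one entry per split file. The helper build_mkdocs_yaml and the sample titles are hypothetical, and the file that splitme-ai actually writes may include other settings.

```python
import json


def build_mkdocs_yaml(site_name, pages):
    """Render a minimal mkdocs.yml from (title, filename) pairs.

    Hypothetical helper for illustration; not part of the splitme-ai API.
    """
    # JSON strings are valid YAML double-quoted scalars, so json.dumps
    # safely quotes titles that contain characters such as ':'.
    lines = [f"site_name: {json.dumps(site_name)}", "nav:"]
    for title, filename in pages:
        lines.append(f"  - {json.dumps(title)}: {json.dumps(filename)}")
    return "\n".join(lines) + "\n"


config = build_mkdocs_yaml(
    "Docs",
    [
        ("Quick Start", "quick-start.md"),
        ("API: Reference", "api-reference.md"),
    ],
)
print(config)
```

Quoting every scalar is a conservative choice: it keeps the output valid even when a heading contains YAML-significant punctuation.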
Example 3: Split on heading level 3:
splitme-ai \
--split.i examples/data/README-AI.md \
--split.settings.o examples/output-h3 \
--split.settings.hl "###"
View the output generated for splitting on heading level 3 here.
Example 4: Split on heading level 4:
splitme-ai \
--split.i examples/data/README-AI.md \
--split.settings.o examples/output-h4 \
--split.settings.hl "####"
View the output generated for splitting on heading level 4 here.
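To run the three heading-level examples in one pass, the commands can be assembled from Python. This sketch uses only the flags shown in the examples above; the helper build_command is hypothetical, and the commands are printed rather than executed so the snippet works without splitme-ai installed.

```python
import shlex
from typing import Optional


def build_command(input_file, output_dir, heading_level: Optional[str] = None):
    """Build the splitme-ai argument list for a single run (hypothetical helper)."""
    cmd = ["splitme-ai", "--split.i", input_file, "--split.settings.o", output_dir]
    # Omitting the flag relies on the default (heading level 2), as in Example 1.
    if heading_level is not None:
        cmd += ["--split.settings.hl", heading_level]
    return cmd


# Output directory -> heading marker, mirroring Examples 1, 3, and 4.
runs = {
    "examples/output-h2": None,
    "examples/output-h3": "###",
    "examples/output-h4": "####",
}

commands = [
    build_command("examples/data/README-AI.md", out_dir, level)
    for out_dir, level in runs.items()
]

for cmd in commands:
    # shlex.join quotes arguments such as "###" so the line is safe to paste
    # into a shell. To execute instead: subprocess.run(cmd, check=True)
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with the # characters, which a shell would otherwise treat as the start of a comment.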
[!NOTE] The official documentation site, with extensive examples and usage instructions, is under development. Stay tuned for updates!
Roadmap
- Enhance CLI usability and user experience.
- Integrate AI-powered content analysis and segmentation.
- Add robust chunking and splitting algorithms for LLM applications.
- Add support for additional static site generators.
- Add support for additional input and output formats.
Contributing
Contributions are welcome! For bug reports, feature requests, or questions, please open an issue or submit a pull request on GitHub.
License
Copyright © 2024 splitme-ai.
Released under the MIT license.