A tool to combine multiple text files into one for LLM training and prompts
TextMeld
TextMeld is a CLI tool that merges the text files in a directory into a single file, which is useful for preparing LLM training data and for prompt engineering.
Features
- Combine multiple text files into a single file
- Automatic recognition of .gitignore patterns
- Automatic skipping of binary files and hidden files
- Option to limit output character count
- Flexible file exclusion patterns
Installation
Using pip:
pip install textmeld
Using Poetry:
poetry add textmeld
Usage
Basic Usage
# Basic usage (outputs to stdout)
textmeld /path/to/your/directory
# Specify output file
textmeld /path/to/your/directory -o output.txt
# Limit maximum character count
textmeld /path/to/your/directory --max-chars 100000
Available Options
usage: textmeld [-h] [-o OUTPUT] [-e EXCLUDE] [-m MAX_CHARS] directory

A tool to merge multiple text files into one file

positional arguments:
  directory             Target directory path

options:
  -h, --help            Show help message and exit
  -o OUTPUT, --output OUTPUT
                        Output file path (if not specified, outputs to stdout)
  -e EXCLUDE, --exclude EXCLUDE
                        File patterns to exclude (can specify multiple)
  -m MAX_CHARS, --max-chars MAX_CHARS
                        Maximum character count for output
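When TextMeld is driven from a script, the options listed above can be assembled into an argument list before the command is run. The following is a minimal sketch; `build_textmeld_command` is a hypothetical helper written for this example and is not part of TextMeld. It only uses the flags shown in the help text.

```python
import shlex
from typing import List, Optional, Sequence


def build_textmeld_command(
    directory: str,
    output: Optional[str] = None,
    exclude: Sequence[str] = (),
    max_chars: Optional[int] = None,
) -> List[str]:
    """Build an argv list for the textmeld CLI (hypothetical helper)."""
    cmd = ["textmeld", directory]
    if output is not None:
        cmd += ["-o", output]
    for pattern in exclude:
        # -e is repeated once per pattern, as in the README examples.
        cmd += ["-e", pattern]
    if max_chars is not None:
        cmd += ["--max-chars", str(max_chars)]
    return cmd


cmd = build_textmeld_command(
    "docs", output="out.txt", exclude=["*.log", "venv/"], max_chars=100000
)
# Print a shell-quoted form of the command for inspection.
print(shlex.join(cmd))
# To execute it (requires textmeld on PATH): subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string keeps glob patterns such as `*.log` from being expanded by the shell.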
Using Exclusion Patterns
To exclude specific files or directories:
# Exclude specific extensions
textmeld /path/to/your/directory -e "*.log" -e "*.tmp"
# Exclude specific directories
textmeld /path/to/your/directory -e "node_modules/" -e "venv/"
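The README does not spell out how these patterns are matched internally. As an illustration only, the sketch below shows one plausible interpretation using Python's standard `fnmatch` module. The function `is_excluded` and its matching rules are assumptions for this example: a pattern ending in `/` is treated as a directory name, and any other pattern is treated as a glob on the file name.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import List


def is_excluded(relative_path: str, patterns: List[str]) -> bool:
    """Return True if relative_path matches any exclusion pattern.

    Assumed semantics (not confirmed by TextMeld's documentation):
    - a pattern ending in "/" excludes files below a directory of that name
    - any other pattern is a glob matched against the file name
    """
    path = PurePosixPath(relative_path)
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: compare against every parent directory name.
            if pattern.rstrip("/") in path.parts[:-1]:
                return True
        elif fnmatchcase(path.name, pattern):
            return True
    return False


patterns = ["*.log", "*.tmp", "node_modules/", "venv/"]
for candidate in ["src/app.py", "logs/server.log", "node_modules/lib/index.js"]:
    print(candidate, is_excluded(candidate, patterns))
```

TextMeld's actual matching may differ, for example by following .gitignore semantics, so check the tool's behavior on a small directory before relying on a pattern.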
Output Format
TextMeld's output consists of two parts:
- Directory Structure: A tree view of the target directory
- Merged Content: Combined contents of all text files (each file has a header)
Directory Structure:
====================
└── project/
    ├── README.md
    ├── main.py
    └── utils/
        └── helper.py
Merged Content:
====================
==========
File: project/README.md
==========
# Project Documentation
...
==========
File: project/main.py
==========
def main():
    print("Hello World")
...
==========
File: project/utils/helper.py
==========
def helper_function():
    return True
...
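Because every file in the merged content is introduced by the same three-line header, the output can be split back into per-file chunks, for example to count characters per file before building a prompt. The sketch below assumes the header layout is exactly as in the sample above (a line of ten `=` characters, a `File: <path>` line, and another line of ten `=` characters). `split_merged_content` is a hypothetical helper, not part of TextMeld.

```python
import re
from typing import Dict

# Header layout assumed from the sample output in this README.
HEADER = re.compile(r"^={10}\nFile: (?P<path>.+)\n={10}\n", re.MULTILINE)


def split_merged_content(text: str) -> Dict[str, str]:
    """Map each file path to its content (hypothetical helper)."""
    matches = list(HEADER.finditer(text))
    files = {}
    for i, match in enumerate(matches):
        # A file's content runs until the next header or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        files[match.group("path")] = text[match.end():end].strip("\n")
    return files


sample = (
    "==========\nFile: project/main.py\n==========\n"
    'def main():\n    print("Hello World")\n'
    "==========\nFile: project/utils/helper.py\n==========\n"
    "def helper_function():\n    return True\n"
)
for path, content in split_merged_content(sample).items():
    print(path, len(content))
```

A file whose own content contains lines that look like a header would be split incorrectly, so this approach is only suitable for quick inspection.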
Supported File Formats
TextMeld automatically detects text files. Generally supported file formats include:
- Markdown (.md)
- Text (.txt)
- YAML (.yaml, .yml)
- JSON (.json)
- Python (.py)
- JavaScript (.js)
- TypeScript (.ts)
- JSX/TSX (.jsx, .tsx)
- HTML (.html)
- CSS (.css)
- Other text-based file formats
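The README does not describe how text files are detected. A common heuristic, shown here purely as an illustration and not as TextMeld's actual method, is to read the first few kilobytes of a file and treat it as text if the sample contains no NUL byte and decodes as UTF-8. The name `looks_like_text` and the 8192-byte sample size are assumptions for this sketch.

```python
import codecs
import tempfile
from pathlib import Path


def looks_like_text(path: Path, sample_size: int = 8192) -> bool:
    """Heuristic text check: no NUL byte and valid UTF-8 in the first bytes."""
    with path.open("rb") as handle:
        chunk = handle.read(sample_size)
    if b"\x00" in chunk:
        return False
    # An incremental decoder tolerates a multi-byte character that is cut
    # off at the end of the sample, but still rejects invalid byte sequences.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(chunk)
    except UnicodeDecodeError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    text_file = Path(tmp) / "notes.md"
    text_file.write_text("# Notes\n", encoding="utf-8")
    binary_file = Path(tmp) / "image.bin"
    binary_file.write_bytes(b"\x89PNG\r\n\x1a\n\x00\x00")
    results = (looks_like_text(text_file), looks_like_text(binary_file))

print(results)
```

A content-based check like this does not depend on file extensions, which is consistent with the "other text-based file formats" entry in the list above, but text stored in encodings other than UTF-8 would be rejected by this particular sketch.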
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.