
Automatically aggregate source code files into a single text file for LLM context


Introduction

sync2llmtxt is a script that aggregates the source code files in a specified directory into a single text file at a target location, so that an entire project can be supplied as context to a large language model (LLM). It can monitor the directory and regenerate the aggregated document whenever files change, or be run manually for one-time generation.


Use Cases

You can point the output path at a folder synced with Google Drive, so that the aggregated project code lands in Google Drive and is easy to use with Gemini.

Key Features

  • Multi-file Type Support: Aggregates .py, .ts, .tsx, .js, .json, .md, and other common code and text files.
  • Smart Ignore:
    • Automatically applies .gitignore rules.
    • Supports custom IGNORE_PATTERNS.
    • Excludes binary files, images, and other irrelevant content.
  • Directory Structure Export: Generates a clean tree-like directory structure (with markers for ignored entries).
  • Flexible Configuration:
    • Supports YAML configuration files.
    • Command-line parameter overrides.
  • Advanced Filtering:
    • Filters by file size (--max-size).
    • Filters by modification time (--since-days).
  • Detailed Logging: Multi-level logging for debugging.
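
The pattern-based selection described above can be sketched roughly as follows. This is not the project's actual implementation: the function name `is_included` is hypothetical, .gitignore handling is left out, and the matching rules (ignore patterns tested against each path component, include patterns against the file name) are assumptions made for illustration.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def is_included(rel_path, include_patterns, ignore_patterns):
    """Return True if rel_path matches an include pattern and no ignore pattern.

    Hypothetical helper: the real tool may match patterns differently.
    """
    path = PurePosixPath(rel_path)
    # Assumption: an ignore pattern is tested against every path component,
    # so "node_modules" excludes anything beneath a node_modules directory.
    for part in path.parts:
        if any(fnmatch(part, pattern) for pattern in ignore_patterns):
            return False
    # Assumption: include patterns are tested against the file name only.
    return any(fnmatch(path.name, pattern) for pattern in include_patterns)
```

With include patterns `['*.py', '*.md']` and ignore pattern `node_modules`, a path such as `src/main.py` would be kept while `node_modules/pkg/index.py` would be skipped.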

Installation

Option 1: Install from PyPI (Recommended)

pip install sync2llmtxt

Option 2: Install from Source

# Clone the repository
git clone https://github.com/yourusername/sync2llmtxt.git
cd sync2llmtxt

# Install in development mode
pip install -e .

# For development dependencies (optional)
pip install -r requirements-test.txt

Configuration

1. Configuration File (YAML)

MONITORED_CODE_DIR: /path/to/code
OUTPUT_DOCUMENT_PATH: /path/to/output.txt
CODE_FILE_PATTERNS: 
  - '*.py'
  - '*.md'
IGNORE_PATTERNS:
  - node_modules
ENABLE_AUTOMATIC_MONITORING: true
DEBOUNCE_TIME: 2.0
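
To illustrate what a setting like DEBOUNCE_TIME typically controls, here is a minimal debounce sketch: a burst of file-change events triggers a single regeneration once no new event has arrived for the configured delay. The `Debouncer` class is hypothetical and built on the standard library's `threading.Timer`; how sync2llmtxt actually debounces events is not stated in this document.

```python
import threading


class Debouncer:
    """Run callback once, `delay` seconds after the most recent trigger().

    Hypothetical illustration, not the project's own class.
    """

    def __init__(self, delay, callback):
        self.delay = delay
        self.callback = callback
        self._timer = None
        self._lock = threading.Lock()

    def trigger(self):
        with self._lock:
            # Cancel the pending run so only the latest event counts.
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.delay, self.callback)
            self._timer.daemon = True
            self._timer.start()
```

A file watcher would call `trigger()` on every change event, and the callback would rebuild the output document.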

2. Command-Line Arguments

Argument        Description                     Example
-s/--src        Source code directory           -s ./src
-o/--out        Output file path                -o output.txt
-c/--config     Configuration file path         -c config.yaml
--max-size      Maximum file size (MB)          --max-size 2
--since-days    Days since last modification    --since-days 7
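
One way command-line arguments could override values from the configuration file is sketched below with `argparse`. The flag names come from the table above; the argument types, the `merge_settings` helper, and the `MAX_SIZE_MB` and `SINCE_DAYS` keys are assumptions for illustration, not the tool's documented internals.

```python
import argparse


def build_parser():
    """Parser mirroring the flags in the table above (types are assumed)."""
    parser = argparse.ArgumentParser(prog="sync2llmtxt")
    parser.add_argument("-s", "--src", help="Source code directory")
    parser.add_argument("-o", "--out", help="Output file path")
    parser.add_argument("-c", "--config", help="Configuration file path")
    parser.add_argument("--max-size", type=float, help="Maximum file size (MB)")
    parser.add_argument("--since-days", type=int, help="Days since last modification")
    return parser


def merge_settings(file_config, args):
    """Overlay CLI values on top of file values; unset flags change nothing."""
    settings = dict(file_config)
    overrides = {
        "MONITORED_CODE_DIR": args.src,
        "OUTPUT_DOCUMENT_PATH": args.out,
        "MAX_SIZE_MB": args.max_size,   # hypothetical key name
        "SINCE_DAYS": args.since_days,  # hypothetical key name
    }
    settings.update({k: v for k, v in overrides.items() if v is not None})
    return settings
```

Flags left off the command line stay `None` and are dropped, so the values loaded from the YAML file survive unless explicitly overridden.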

Usage Examples

# Basic usage
sync2llmtxt -s ./project -o output.txt

# Use config file + filter large files
sync2llmtxt -c config.yaml --max-size 1.5

# Sync only recently modified files
sync2llmtxt -s ./src -o out.txt --since-days 3
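
The effect of --max-size and --since-days can be pictured with a small filter like the one below. It is a sketch under stated assumptions: the `passes_filters` name is hypothetical, 1 MB is taken to be 1024 * 1024 bytes, and "recently modified" is judged from the file's mtime. The real tool may define these differently.

```python
import os
import time


def passes_filters(path, max_size_mb=None, since_days=None, now=None):
    """Return True if the file is small enough and recent enough.

    Hypothetical helper; a filter set to None is not applied.
    """
    stat = os.stat(path)
    # Assumption: 1 MB = 1024 * 1024 bytes.
    if max_size_mb is not None and stat.st_size > max_size_mb * 1024 * 1024:
        return False
    if since_days is not None:
        now = time.time() if now is None else now
        # Reject files last modified before the cutoff.
        if stat.st_mtime < now - since_days * 86400:
            return False
    return True
```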

Testing (Pending)

# Run tests
pytest --cov=.

# Generate test coverage report
pytest --cov=. --cov-report=html

Output Format Example

--- Project Code Context (Manual Run @ 2023-11-15 10:00:00) ---

Included Files (2):
- main.py
- utils/helper.py

--- File: main.py ---

import utils

if __name__ == "__main__":
    print("Hello")

--- File: utils/helper.py ---

def help():
    return "Help message"

--- Directory Structure ---
project/
├── main.py
└── utils/
    └── helper.py
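
A document in the layout shown above could be assembled along these lines. The `render_context` function and its parameters are hypothetical; the sketch only reproduces the section order and separators visible in the example, and is not taken from the project's source.

```python
from datetime import datetime


def render_context(files, tree_text, mode="Manual Run", now=None):
    """Build the aggregated text from (relative_path, content) pairs.

    Hypothetical helper mirroring the example layout above.
    """
    now = now or datetime.now()
    lines = [f"--- Project Code Context ({mode} @ {now:%Y-%m-%d %H:%M:%S}) ---", ""]
    lines.append(f"Included Files ({len(files)}):")
    lines.extend(f"- {path}" for path, _ in files)
    lines.append("")
    for path, content in files:
        lines.append(f"--- File: {path} ---")
        lines.append("")
        lines.append(content.rstrip("\n"))
        lines.append("")
    lines.append("--- Directory Structure ---")
    lines.append(tree_text.rstrip("\n"))
    return "\n".join(lines) + "\n"
```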

Development Guide

Code Structure

  • src/sync2llmtxt/sync2llmtxt.py: Main program.
  • src/sync2llmtxt/directory_tree.py: Directory tree generation module.
  • tests/: Unit tests.
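
As a rough idea of what a directory tree module does, the sketch below renders a tree with box-drawing connectors. It is not the contents of directory_tree.py: `render_tree` is a hypothetical name, entries are simply sorted by name, and the ignored-entry markers mentioned under Key Features are omitted.

```python
import os


def render_tree(root):
    """Return a tree-style listing of root (hypothetical illustration)."""
    lines = [os.path.basename(os.path.abspath(root)) + "/"]

    def walk(directory, prefix):
        entries = sorted(os.scandir(directory), key=lambda entry: entry.name)
        for index, entry in enumerate(entries):
            last = index == len(entries) - 1
            connector = "└── " if last else "├── "
            suffix = "/" if entry.is_dir() else ""
            lines.append(prefix + connector + entry.name + suffix)
            if entry.is_dir():
                # Children of the last entry need no vertical guide line.
                walk(entry.path, prefix + ("    " if last else "│   "))

    walk(root, "")
    return "\n".join(lines)
```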

Extension Suggestions

  1. Add support for more file types.
  2. Implement incremental update mode.
  3. Support remote storage for output.

License

MIT License
