A tool to serialize repository contents into a single file

Repo Serializer

A Python utility that serializes a local Git repository into a single structured text file: an ASCII directory tree, the file names, and the contents of source files. The result is a compact snapshot of the repository for code review, documentation, or use as context for large language models (LLMs).

Installation

# Install from PyPI
pip install repo-serializer

Usage

Command Line

# Basic usage
repo-serializer /path/to/repository

# Specify output file
repo-serializer /path/to/repository -o output.txt

# Copy to clipboard in addition to saving to file
repo-serializer /path/to/repository -c

# Use structure-only mode to output only the directory structure and filenames
repo-serializer /path/to/repository -s

# Skip specific directories (can be used multiple times or as a comma-separated list)
repo-serializer /path/to/repository --skip-dir build,dist
repo-serializer /path/to/repository --skip-dir build --skip-dir dist

# Only include Python files (.py, .ipynb)
repo-serializer /path/to/repository --python

# Only include JavaScript/TypeScript files
repo-serializer /path/to/repository --javascript

# Combine with other options
repo-serializer /path/to/repository --python -s -c  # Python files, structure only, copy to clipboard
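
The two `--skip-dir` forms above (repeated flag and comma-separated list) can be accepted by one parser. The following is a minimal sketch of how that might be done with `argparse`; it is not taken from the repo-serializer source, and the function name `parse_skip_dirs` and the program name `skip-dir-demo` are made up for illustration.

```python
import argparse


def parse_skip_dirs(argv):
    """Collect --skip-dir values given repeatedly and/or comma-separated.

    Hypothetical helper for illustration; not repo-serializer's own code.
    """
    parser = argparse.ArgumentParser(prog="skip-dir-demo")
    parser.add_argument("repo_path")
    # action="append" gathers every occurrence of the flag into a list.
    parser.add_argument("--skip-dir", action="append", dest="skip_dir", default=None)
    args = parser.parse_args(argv)

    skip_dirs = []
    for value in args.skip_dir or []:
        # Each occurrence may itself be a comma-separated list.
        for name in value.split(","):
            name = name.strip()
            if name and name not in skip_dirs:
                skip_dirs.append(name)
    return args.repo_path, skip_dirs


# Both invocation styles yield the same list of directories.
print(parse_skip_dirs(["/path/to/repository", "--skip-dir", "build,dist"]))
print(parse_skip_dirs(["/path/to/repository", "--skip-dir", "build", "--skip-dir", "dist"]))
```

Normalizing both forms into one de-duplicated list keeps the rest of the program independent of how the user typed the flag.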

Python API

from repo_serializer import serialize

# Serialize a repository, skipping specific directories
serialize("/path/to/repository", "output.txt", skip_dirs=["build", "dist"])

Features

  • Directory Structure: Clearly visualize repository structure in ASCII format.
  • Structure-Only Mode: Option to output only the directory structure and filenames without file contents.
  • File Filtering: Excludes common binary files, cache directories, hidden files, and irrelevant artifacts to keep outputs concise and focused.
  • Smart Content Handling:
    • Parses Jupyter notebooks to extract markdown and code cells with sample outputs
    • Limits CSV files to first 5 lines
    • Truncates large text files after 1000 lines
    • Handles non-UTF-8 and binary files gracefully
  • Extensive Filtering: Skips common configuration files, build artifacts, test directories, and more.
  • Clipboard Integration: Option to copy output directly to clipboard.
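
To make the features above concrete, here is a rough, standard-library-only sketch of two of them: rendering an ASCII directory tree while skipping hidden entries and named directories, and reading a file with the stated limits (5 lines for CSV, 1000 lines for other text). It is not the package's implementation. The function names, the tree glyphs, the truncation marker, and the NUL-byte test for binary files are all assumptions chosen for illustration, and notebook parsing is left out.

```python
import os
import tempfile

MAX_TEXT_LINES = 1000  # limit stated in the feature list
MAX_CSV_LINES = 5      # limit stated in the feature list


def render_tree(root, skip_dirs=(), prefix=""):
    """Return ASCII tree lines for root, skipping hidden entries and skip_dirs."""
    entries = sorted(
        name for name in os.listdir(root)
        if not name.startswith(".") and name not in skip_dirs
    )
    lines = []
    for index, name in enumerate(entries):
        last = index == len(entries) - 1
        lines.append(prefix + ("`-- " if last else "|-- ") + name)
        path = os.path.join(root, name)
        if os.path.isdir(path):
            # Keep the vertical bar only while siblings remain below.
            extension = "    " if last else "|   "
            lines.extend(render_tree(path, skip_dirs, prefix + extension))
    return lines


def read_limited(path):
    """Return a file's text, truncated to the limits above."""
    limit = MAX_CSV_LINES if path.lower().endswith(".csv") else MAX_TEXT_LINES
    with open(path, "rb") as handle:
        raw = handle.read()
    if b"\x00" in raw:
        # Heuristic (an assumption): a NUL byte suggests binary content.
        return "[binary file skipped]"
    # errors="replace" substitutes invalid bytes instead of raising.
    lines = raw.decode("utf-8", errors="replace").splitlines()
    if len(lines) > limit:
        omitted = len(lines) - limit
        lines = lines[:limit] + [f"[... {omitted} more lines truncated]"]
    return "\n".join(lines)


# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as repo:
    os.makedirs(os.path.join(repo, "src"))
    os.makedirs(os.path.join(repo, "build"))
    with open(os.path.join(repo, "src", "main.py"), "w") as f:
        f.write("print('hi')\n")
    with open(os.path.join(repo, "data.csv"), "w") as f:
        f.write("\n".join(f"{i},{i * i}" for i in range(10)))
    print("\n".join(render_tree(repo, skip_dirs=("build",))))
    print(read_limited(os.path.join(repo, "data.csv")))
```

Reading the file as bytes first lets one function handle the binary check, the non-UTF-8 fallback, and the line limits without opening the file twice.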

Example

# Create a serialized snapshot of your project
repo-serializer /Users/example_user/projects/my_repo -o repo_snapshot.txt

Contributing

Pull requests and improvements are welcome! Please ensure your contributions are clearly documented and tested.

Development

Quick Testing

For quick testing during development:

# Install in development mode
pip install -e .

# Now any changes to the source code take effect immediately
repo-serializer /path/to/test/repo -o test_output.txt

Full Test Suite

Run the test script:

./dev/test_dev.py

This will:

  1. Install the package in development mode
  2. Run multiple test scenarios
  3. Generate test outputs for review


Download files

Source Distribution

repo_serializer-1.1.2.tar.gz (9.6 kB)

Uploaded Source

Built Distribution

repo_serializer-1.1.2-py3-none-any.whl (8.9 kB)

Uploaded Python 3

File details

Details for the file repo_serializer-1.1.2.tar.gz.

File metadata

  • Download URL: repo_serializer-1.1.2.tar.gz
  • Upload date:
  • Size: 9.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for repo_serializer-1.1.2.tar.gz
Algorithm Hash digest
SHA256 b2998d4983c8bada57cee033dde780537e2afd07a226ceb8cf3b86757bc0eb04
MD5 9face640bfcdc678ea80d72f84fa9ad3
BLAKE2b-256 df08103b8cf533a629bf63f576090b9644e313d21c0af136268cca58fa06697b

File details

Details for the file repo_serializer-1.1.2-py3-none-any.whl.

File hashes

Hashes for repo_serializer-1.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 29a41c9c16658813a3351eba1cb3d17bb47b4ee4e9402369e80e08c717cb35db
MD5 c7f2fccc3e4c15255fc4f0b7b3481180
BLAKE2b-256 89f673cfa7bedd62fba5f69254f8001025d27b8e367e8da00303de916f052b81
