A tool for converting git repositories into documents

git2doc 📚

A Python library for converting Git repositories into documents. git2doc extracts code from GitHub repositories so that large codebases are easier to search, analyze, and understand.

Why git2doc? 🚀

Working with large repositories can be overwhelming, especially when trying to understand the structure and content of the code. git2doc simplifies this process by converting repositories into documents, allowing you to easily search, analyze, and understand the codebase.

Installation 💻

pip install git2doc
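
To confirm that the install succeeded, one option is to query the installed version through the standard library. This is a minimal sketch; `installed_version` is a hypothetical helper and is not part of git2doc.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The package is not installed in the current environment.
        return None
```

Calling `installed_version("git2doc")` returns the version string when the package is installed and `None` otherwise.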

Usage 🛠️

Fetching Repositories

from git2doc import get_repos_orchestrator

# Request 10 Python repositories from the last 30 days
repos = get_repos_orchestrator(
    n_repos=10,
    last_n_days=30,
    language="Python",
)
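
This README does not say how `last_n_days` is interpreted. One plausible reading, shown here purely as an assumption, is a cutoff date: anything older than today minus `last_n_days` falls outside the window. `cutoff_date` is a hypothetical helper for reasoning about that window and is not part of git2doc.

```python
from datetime import date, timedelta
from typing import Optional


def cutoff_date(last_n_days: int, today: Optional[date] = None) -> date:
    """Return the earliest date inside a "last N days" window.

    Assumption: the window ends today and reaches `last_n_days` days back.
    """
    if last_n_days < 0:
        raise ValueError("last_n_days must be non-negative")
    # Allow the reference date to be injected so results are reproducible.
    today = today or date.today()
    return today - timedelta(days=last_n_days)
```

Under this reading, `cutoff_date(30, today=date(2024, 1, 31))` gives `date(2024, 1, 1)`.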

Loading Repository Data

from git2doc import pull_code_from_repo

# Pull the code from the main branch of a single repository
repo_data = pull_code_from_repo(
    repo="https://github.com/voynow/git2doc",
    branch="main",
)
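
Because `pull_code_from_repo` takes a repository URL, it can help to check the URL before making the call. The sketch below uses only the standard library. `is_github_repo_url` is a hypothetical helper, and the accepted shape (`https://github.com/<owner>/<name>`) is an assumption based on the example above, not a documented requirement of git2doc.

```python
from urllib.parse import urlparse


def is_github_repo_url(url: str) -> bool:
    """Check that `url` looks like https://github.com/<owner>/<name>."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or parsed.netloc.lower() != "github.com":
        return False
    # Split the path into non-empty segments: expect exactly owner and name.
    segments = [part for part in parsed.path.split("/") if part]
    return len(segments) == 2
```

A caller could run this check first and only pass the URL on to `pull_code_from_repo` when it returns `True`.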

Writing Data to Parquet Files

from git2doc import pipeline_fetch_and_load

# Fetch repositories and write the results to Parquet files in batches of 100
pipeline_fetch_and_load(
    n_repos=1000,
    last_n_days=365,
    language="Python",
    write_batch_size=100,
    delete=True,
)
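
The `write_batch_size` parameter suggests that results are written in fixed-size batches rather than all at once. How git2doc implements this is not shown in this README; the sketch below only illustrates the general batching idea, using a hypothetical `batched` helper.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        # Take the next chunk; an empty chunk means the input is exhausted.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Under this scheme, 1000 items with a batch size of 100 would be split into ten batches, each of which could be written to its own file.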

Contributing 🤝

Contributions are welcome! Please feel free to submit a pull request or open an issue on GitHub.

License 📄

This project is licensed under the MIT License. See the LICENSE file for more details.

Download files

Source Distribution

git2doc-0.2.3.tar.gz (6.1 kB)

Uploaded Source

Built Distribution

git2doc-0.2.3-py3-none-any.whl (10.8 kB)

Uploaded Python 3

File details

Details for the file git2doc-0.2.3.tar.gz.

File metadata

  • Download URL: git2doc-0.2.3.tar.gz
  • Size: 6.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.1

File hashes

Hashes for git2doc-0.2.3.tar.gz
Algorithm Hash digest
SHA256 e6a9d25da0a1b6ae940dc45183333489347cf33c9d99df39bdb25d42c2ccfc21
MD5 189f976ca4e0ae7d0089b7977c60aa61
BLAKE2b-256 bed9815d50cc4f5e4518332438e926373529a45cbba01f8b1eef9a2a80719bcb

File details

Details for the file git2doc-0.2.3-py3-none-any.whl.

File metadata

  • Download URL: git2doc-0.2.3-py3-none-any.whl
  • Size: 10.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.1

File hashes

Hashes for git2doc-0.2.3-py3-none-any.whl
Algorithm Hash digest
SHA256 46cb2813bad39c33f7cbed7394bb16816207e4070eb4e208e5c7e72949ca1661
MD5 066b774df4d9a05d49c0856927256182
BLAKE2b-256 a5dc4571065865d7db022c9e22fca92c6c9ca77d0b5d0a6cf0605b56754c08b6
