A tool for converting git repositories into documents

git2doc 📚

A Python library for converting git repositories into documents. git2doc extracts code from GitHub repositories so that large codebases are easier to search, analyze, and understand.

Why git2doc? 🚀

Working with large repositories can be overwhelming, especially when trying to understand the structure and content of the code. git2doc simplifies this process by converting repositories into documents, allowing you to easily search, analyze, and understand the codebase.

Installation 💻

pip install git2doc

Usage 🛠️

Fetching Repositories

from git2doc import get_repos_orchestrator

repos = get_repos_orchestrator(
    n_repos=10,
    last_n_days=30,
    language="Python"
)

Loading Repository Data

from git2doc import pull_code_from_repo

repo_data = pull_code_from_repo(
    repo="https://github.com/voynow/git2doc",
    branch="main"
)
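
The shape of the value returned by pull_code_from_repo is not documented here. As a minimal sketch, assuming it is (or can be converted to) a mapping from file path to file text, the helpers below show how such data could be summarized and searched. The helper names and the sample data are hypothetical and not part of git2doc.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import List, Mapping


def summarize_repo_data(repo_data: Mapping[str, str]) -> Counter:
    """Count files per extension in a {path: text} mapping (assumed shape)."""
    counts: Counter = Counter()
    for path in repo_data:
        # Files without an extension are grouped under "(none)".
        suffix = PurePosixPath(path).suffix or "(none)"
        counts[suffix] += 1
    return counts


def search_repo_data(repo_data: Mapping[str, str], needle: str) -> List[str]:
    """Return the sorted paths whose text contains `needle`."""
    return sorted(path for path, text in repo_data.items() if needle in text)


# Hypothetical stand-in data; swap in the real result if it has this shape.
sample = {
    "README.md": "# example project\n",
    "src/app.py": "def main():\n    return 0\n",
    "src/utils.py": "VERSION = '0.1'\n",
}

print(summarize_repo_data(sample))
print(search_repo_data(sample, "def "))
```

If the real return value has a different structure, only the two loops over the mapping would need to change.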

Writing Data to Parquet Files

from git2doc import pipeline_fetch_and_load

pipeline_fetch_and_load(
    n_repos=1000,
    last_n_days=365,
    language="Python",
    write_batch_size=100,
    delete=True,
)
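
The write_batch_size parameter suggests that results are written in groups rather than all at once. git2doc's internals are not shown here, so the following is only an illustration of the general batching pattern, not the library's implementation. The batched helper and the repository names are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        # islice consumes the next `batch_size` items from the shared iterator.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical stand-in for 1000 fetched repositories.
repo_names = [f"repo-{i}" for i in range(1000)]
batches = list(batched(repo_names, 100))
print(len(batches), len(batches[0]))  # prints "10 100"
```

Under this assumption, n_repos=1000 with write_batch_size=100 would correspond to ten write steps, which keeps memory use bounded by one batch at a time.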

Contributing 🤝

Contributions are welcome! Please feel free to submit a pull request or open an issue on GitHub.

License 📄

This project is licensed under the MIT License. See the LICENSE file for more details.
