A tool for converting Git repositories into documents

Git2Doc

Git2Doc is a Python package for loading and processing Git repositories. It supports concurrent file loading for improved performance and is available on PyPI.

Installation

To install Git2Doc, run the following command:

pip install git2doc

Setup

Before using Git2Doc, make sure the following dependencies are installed:

  • langchain
  • tiktoken
  • gitpython
  • python-dotenv
  • pandas

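You can install them from the project's requirements.txt. Run the command from a directory that contains that file, such as a clone of the repository: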

pip install -r requirements.txt
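
To confirm that the dependencies listed above are importable, a small check along these lines can help. It is only a sketch and not part of Git2Doc: missing_dependencies is a hypothetical helper, and the only assumption is the usual mapping from distribution name to import name (gitpython is imported as git, python-dotenv as dotenv).

```python
from importlib.util import find_spec

# Distribution name -> import name. gitpython and python-dotenv are
# installed under one name and imported under another.
DEPENDENCIES = {
    "langchain": "langchain",
    "tiktoken": "tiktoken",
    "gitpython": "git",
    "python-dotenv": "dotenv",
    "pandas": "pandas",
}


def missing_dependencies(dependencies=DEPENDENCIES):
    """Return the distribution names whose import module cannot be found."""
    return [
        dist for dist, module in dependencies.items()
        if find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```

An empty result means every listed package can be imported in the current environment.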

Usage

Loading Git Repositories

The main functionality of Git2Doc lives in the git2doc.loader module (loader.py). Here's an example of how to use the pull_code_from_repo function to load a Git repository:

from git2doc.loader import pull_code_from_repo

# Clone URL and branch of the repository to load
repo_url = "https://github.com/username/repo.git"
branch = "main"

repo_data = pull_code_from_repo(repo_url, branch)
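
The concurrent file loading mentioned in the description can be pictured with a small standard-library sketch. This is not Git2Doc's implementation: read_text_file and load_files_concurrently are hypothetical names, and the sketch assumes the repository has already been cloned to a local directory.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_text_file(path):
    """Return (path, text), or (path, None) if the file cannot be read as UTF-8."""
    try:
        return path, path.read_text(encoding="utf-8")
    except (UnicodeDecodeError, OSError):
        return path, None


def load_files_concurrently(root, max_workers=8):
    """Read every file under root (skipping .git) using a thread pool."""
    root = Path(root)
    paths = [
        p for p in root.rglob("*")
        if p.is_file() and ".git" not in p.relative_to(root).parts
    ]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and runs the reads in parallel threads
        results = pool.map(read_text_file, paths)
        return {
            str(path.relative_to(root)): text
            for path, text in results
            if text is not None
        }
```

Threads suit this job because reading files is I/O-bound. Binary files that fail to decode are dropped rather than raising an error.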

Getting Top Repositories

You can use the get_top_repos function to fetch the top repositories, given a repository count, a recent time window in days, a language, a sort field, and a sort order:

from git2doc.loader import get_top_repos

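n_repos = 10         # number of repositories to fetch
last_n_days = 30     # time window, in days
language = "Python"  # programming language filter
sort = "stars"       # field to sort by
order = "desc"       # descending order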

top_repos = get_top_repos(n_repos, last_n_days, language, sort, order)
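
For intuition, the same five parameters map naturally onto a GitHub repository search request. The sketch below only builds the request URL and makes no network call. It is not taken from Git2Doc: build_search_url is a hypothetical helper, and filtering on the created date is an assumption (get_top_repos may use a different qualifier or a different data source).

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# GitHub REST API endpoint for repository search
SEARCH_URL = "https://api.github.com/search/repositories"


def build_search_url(n_repos, last_n_days, language, sort, order, today=None):
    """Build a repository search URL for repos created in the last N days."""
    today = today or date.today()
    since = today - timedelta(days=last_n_days)
    params = {
        # Assumption: "recent" means created after the cutoff date
        "q": f"language:{language} created:>{since.isoformat()}",
        "sort": sort,
        "order": order,
        "per_page": n_repos,  # the API allows at most 100 results per page
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url(10, 30, "Python", "stars", "desc"))
```

Passing today explicitly keeps the function deterministic, which makes it easy to test.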

Pipeline Fetch and Load

The pipeline_fetch_and_load function can be used to fetch and load repositories in a single step:

from git2doc.loader import pipeline_fetch_and_load

# Same arguments as in the get_top_repos example above
n_repos = 10
last_n_days = 30
language = "Python"
sort = "stars"
order = "desc"

github_data = pipeline_fetch_and_load(n_repos, last_n_days, language, sort, order)
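
Conceptually, a fetch-and-load pipeline chains the two previous steps. The sketch below shows one way such a composition could look. It is an illustration under stated assumptions, not Git2Doc's code: fetch_and_load is a hypothetical name, and both steps are passed in as caller-supplied callables, so nothing here depends on the package's real signatures or return types.

```python
def fetch_and_load(fetch_repos, load_repo):
    """Fetch a list of repository URLs, then load each one.

    Both arguments are hypothetical stand-ins supplied by the caller:
    fetch_repos() returns an iterable of URLs, and load_repo(url) returns
    the loaded data for one repository.
    """
    loaded, failed = {}, {}
    for url in fetch_repos():
        try:
            loaded[url] = load_repo(url)
        except Exception as exc:  # keep going if one repository fails
            failed[url] = str(exc)
    return loaded, failed
```

Collecting failures separately lets a long run finish even when a single repository cannot be loaded.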

Contributing

If you'd like to contribute to Git2Doc, feel free to fork the repository and submit a pull request. If you have any questions or issues, please open an issue on the GitHub repository.

License

Git2Doc is released under the MIT License.

Download files

Source Distribution

git2doc-0.1.8.tar.gz (5.0 kB)

Uploaded Source

Built Distribution

git2doc-0.1.8-py3-none-any.whl (9.7 kB)

Uploaded Python 3

File details

Details for the file git2doc-0.1.8.tar.gz.

File metadata

  • Download URL: git2doc-0.1.8.tar.gz
  • Upload date:
  • Size: 5.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.1

File hashes

Hashes for git2doc-0.1.8.tar.gz
  • SHA256: 60cb45cb8c0deac98515e4099c7c3a9d84f04067905e1fd86e3fd4145b246350
  • MD5: 7581c8b766119bd4d3fc6a77f4d34a1d
  • BLAKE2b-256: cdb7d926d67a34ab9b423e1eefda9937fef6e3d4445f9661721dcc68632a94d0

File details

Details for the file git2doc-0.1.8-py3-none-any.whl.

File metadata

  • Download URL: git2doc-0.1.8-py3-none-any.whl
  • Upload date:
  • Size: 9.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.1

File hashes

Hashes for git2doc-0.1.8-py3-none-any.whl
  • SHA256: 22fe0057444694ef3e76023d8c8cbc6e5b0a0865ef61868c8513a36454f842b8
  • MD5: 1e8ecf8b35c3e13c360d2c2d250cc1f2
  • BLAKE2b-256: 6ced58047df774aae6bbfccdea172f63533a1f4d04e85dec48fce9957b4fa455
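
A downloaded file can be checked against the SHA256 digests listed above with the standard library. This is a minimal sketch: sha256_of_file and matches_published_hash are hypothetical helper names, and the file path passed in is whatever location you saved the download to.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_published_hash("git2doc-0.1.8.tar.gz", "60cb45cb8c0deac98515e4099c7c3a9d84f04067905e1fd86e3fd4145b246350") should return True for an intact copy of the source distribution.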
