
A useful module for handling Git data.


Git2Vec

Git2Vec is a Python package for loading and processing Git repositories. It loads repository files concurrently to improve performance, and it is distributed on PyPI.

Installation

To install Git2Vec, run the following command:

pip install git2vec

Setup

Before using Git2Vec, make sure the following dependencies are installed:

  • langchain
  • pinecone-client
  • tiktoken
  • gitpython
  • python-dotenv
  • pandas

If you are working from a clone of the repository, you can install them from its requirements.txt file:

pip install -r requirements.txt

Usage

Loading Git Repositories

The main functionality of Git2Vec lives in the loader module (git2vec.loader). Use the pull_code_from_repo function to load a Git repository from its URL and branch:

from git2vec.loader import pull_code_from_repo

repo_url = "https://github.com/username/repo.git"
branch = "main"

repo_data = pull_code_from_repo(repo_url, branch)
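
The concurrent file loading mentioned in the introduction can be pictured with a thread pool. The sketch below is not Git2Vec's implementation: load_files_concurrently, its max_workers parameter, and the returned dictionary shape are all hypothetical names chosen for illustration. It assumes the repository has already been cloned to a local directory.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_files_concurrently(root, max_workers=8):
    """Read every file under `root` (skipping .git) with a thread pool.

    Hypothetical helper for illustration only; Git2Vec's own loader may
    work differently and return a different structure.
    """
    root_path = Path(root)
    # Collect regular files, ignoring anything inside a .git directory.
    paths = [
        p for p in root_path.rglob("*")
        if p.is_file() and ".git" not in p.relative_to(root_path).parts
    ]

    def read_one(path):
        # Key each file by its path relative to the repository root.
        # Undecodable bytes are replaced instead of raising an error.
        relative = path.relative_to(root_path).as_posix()
        return relative, path.read_text(encoding="utf-8", errors="replace")

    # File reads are I/O-bound, so threads can overlap the waiting time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(read_one, paths))
```

A local clone to pass as root could be created with GitPython, for example git.Repo.clone_from(repo_url, "some/local/dir", branch=branch). Whether pull_code_from_repo does this internally is an assumption.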

Getting Top Repositories

Use the get_top_repos function to fetch the top n_repos repositories for a given language and time window (last_n_days), ranked by the sort key and order you pass:

from git2vec.loader import get_top_repos

n_repos = 10
last_n_days = 30
language = "Python"
sort = "stars"
order = "desc"

top_repos = get_top_repos(n_repos, last_n_days, language, sort, order)
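
To make the parameters concrete, here is a minimal sketch of how they could map onto a GitHub repository search request. This is an assumption about how such a lookup might be built, not a description of what get_top_repos does internally. The build_search_url helper and its today parameter are hypothetical, and treating last_n_days as a filter on the repository creation date is a guess.

```python
from datetime import date, timedelta
from urllib.parse import urlencode


def build_search_url(n_repos, last_n_days, language,
                     sort="stars", order="desc", today=None):
    """Build a GitHub repository search URL (hypothetical helper).

    Assumption: `last_n_days` is interpreted as "created within the
    last N days". Git2Vec may use a different qualifier.
    """
    today = today or date.today()
    since = today - timedelta(days=last_n_days)
    # GitHub search qualifiers: language:<name> and created:>=<YYYY-MM-DD>
    query = f"language:{language} created:>={since.isoformat()}"
    params = {
        "q": query,
        "sort": sort,         # e.g. "stars"
        "order": order,       # "desc" or "asc"
        "per_page": n_repos,  # GitHub caps a page at 100 results
    }
    return "https://api.github.com/search/repositories?" + urlencode(params)
```

The today argument exists only so that the function is deterministic when tested; in normal use it can be omitted.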

Pipeline Fetch and Load

The pipeline_fetch_and_load function takes the same arguments as get_top_repos and fetches and loads the matching repositories in a single step:

from git2vec.loader import pipeline_fetch_and_load

n_repos = 10
last_n_days = 30
language = "Python"
sort = "stars"
order = "desc"

github_data = pipeline_fetch_and_load(n_repos, last_n_days, language, sort, order)
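
Conceptually, a fetch-and-load pipeline chains the two previous steps. The sketch below shows one way to compose them while isolating per-repository failures. The fetch_and_load function, its callable arguments, and the (loaded, failed) return value are hypothetical; the structure returned by pipeline_fetch_and_load itself is not documented here.

```python
def fetch_and_load(fetch, load):
    """Run `fetch` to get repository URLs, then `load` each one.

    Hypothetical composition for illustration only:
      fetch() -> iterable of repository URLs
      load(url) -> loaded data for that repository
    Returns (loaded, failed) so that one bad repository does not abort the run.
    """
    loaded, failed = {}, {}
    for url in fetch():
        try:
            loaded[url] = load(url)
        except Exception as exc:  # record the failure and keep going
            failed[url] = repr(exc)
    return loaded, failed
```

With Git2Vec, fetch and load would wrap get_top_repos and pull_code_from_repo respectively, assuming the former yields something from which repository URLs can be derived.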

Contributing

To contribute to Git2Vec, fork the repository and submit a pull request. If you have questions or run into problems, please open an issue on the GitHub repository.

License

Git2Vec is released under the MIT License.
