Read Git repository data source
data-source-git
Extracts content from Git repositories along with associated file metadata.
Usage
pip install vdk-data-source-git
Extracted Data Schema
The extracted data is returned in a `DataSourcePayload` object with two main components: `content` and `metadata`.
content
The `content` field contains the actual content of the file as a string.
metadata
The `metadata` field contains a dictionary with the following schema:
Key | Description | Data Type | Example |
---|---|---|---|
size | The size of the file in bytes | Integer | 12345 |
path | The path of the file in the repository | String | "src/main.py" |
num_lines | The number of lines in the file | Integer | 678 |
file_extension | The file extension | String | ".py" |
programming_language | The detected programming language of the file | String | "Python" |
is_likely_test_file | Flag indicating if the file is likely a test file | Boolean | false |
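
To make the schema above concrete, here is a minimal sketch of how a metadata dictionary of this shape could be built for a single file. It is not the plugin's implementation: the `build_metadata` helper, the extension-to-language map, and the test-file heuristic are all hypothetical and only illustrate the keys and data types listed in the table.

```python
import os

# Hypothetical extension-to-language map; the plugin's real
# language detection may work differently.
_LANGUAGE_BY_EXTENSION = {
    ".py": "Python",
    ".java": "Java",
    ".js": "JavaScript",
    ".go": "Go",
}


def build_metadata(path: str, content: str) -> dict:
    """Return a dictionary with the same keys as the metadata schema."""
    _, extension = os.path.splitext(path)
    file_name = os.path.basename(path).lower()
    return {
        # Size in bytes, assuming the content is stored as UTF-8.
        "size": len(content.encode("utf-8")),
        "path": path,
        "num_lines": len(content.splitlines()),
        "file_extension": extension,
        "programming_language": _LANGUAGE_BY_EXTENSION.get(
            extension.lower(), "Unknown"
        ),
        # Assumed heuristic: a "test_" file name prefix or a "tests" directory.
        "is_likely_test_file": file_name.startswith("test_")
        or "/tests/" in f"/{path}",
    }


example = build_metadata("src/main.py", "print('hello')\n")
test_example = build_metadata("tests/test_git.py", "")
```

Here `example` describes a regular source file and `test_example` a file that the assumed heuristic flags as a test; in both cases the file content itself would travel separately in the payload's `content` field.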
Configuration
(`vdk config-help` is a useful command for browsing all the config options of your vdk installation.)
Name | Description | (example) Value |
---|---|---|
git_url | URL of the Git repository to be cloned. | "https://github.com/user/repo" |
Build and testing
pip install -r requirements.txt
pip install -e .
pytest
Within the VDK repo, the ../build-plugin.sh script can also be used.
Note about the CI/CD:
.plugin-ci.yaml is needed only for plugins that are part of the Versatile Data Kit plugin repo.
The CI/CD is separated into two stages: a build stage and a release stage. The build stage is made up of a few jobs, all of which inherit from the same job configuration and differ only in the Python version they use (3.7, 3.8, 3.9 and 3.10). They run according to rules, which are ordered so that changes to a plugin's directory trigger that plugin's CI, but changes to a different plugin do not.