
Read Git repository data source

data-source-git

Extracts content from Git repositories along with associated file metadata.

Usage

pip install vdk-data-source-git

Extracted Data Schema

The extracted data is returned in a DataSourcePayload object with two main components: content and metadata.

content

The content field contains the actual content of the file as a string.

metadata

The metadata field contains a dictionary with the following schema:

| Key | Description | Data Type | Example |
|-----|-------------|-----------|---------|
| size | The size of the file in bytes | Integer | 12345 |
| path | The path of the file in the repository | String | "src/main.py" |
| num_lines | The number of lines in the file | Integer | 678 |
| file_extension | The file extension | String | ".py" |
| programming_language | The detected programming language of the file | String | "Python" |
| is_likely_test_file | Flag indicating if the file is likely a test file | Boolean | false |
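
To make the schema concrete, here is a minimal sketch that builds a dictionary with the same keys for a single file. It is not the plugin's implementation: the function name `build_metadata`, the extension-to-language mapping, and the test-file heuristic are all assumptions made for illustration.

```python
import posixpath
from typing import Dict

# Illustrative mapping only (assumption); the plugin's actual language
# detection is not described in this README.
_LANGUAGE_BY_EXTENSION = {".py": "Python", ".java": "Java", ".go": "Go"}


def build_metadata(path: str, content: str) -> Dict[str, object]:
    """Build a metadata dictionary with the documented keys for one file.

    Hypothetical helper: the heuristics below are assumptions.
    """
    extension = posixpath.splitext(path)[1]
    file_name = posixpath.basename(path).lower()
    return {
        "size": len(content.encode("utf-8")),  # size in bytes, not characters
        "path": path,
        "num_lines": len(content.splitlines()),
        "file_extension": extension,
        "programming_language": _LANGUAGE_BY_EXTENSION.get(extension, "Unknown"),
        # Assumed heuristic: a "test_" file name prefix or a "tests" directory.
        "is_likely_test_file": file_name.startswith("test_") or "/tests/" in f"/{path}",
    }


example = build_metadata("src/main.py", "print('hello')\n")
```

The resulting dictionary has the same shape as the metadata component of a payload, so it can serve as a fixture when testing code that consumes this data source.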

Configuration

(`vdk config-help` is a useful command for browsing all the config options of your vdk installation.)

| Name | Description | (example) Value |
|------|-------------|-----------------|
| git_url | URL of the Git repository to be cloned. | "https://github.com/user/repo" |
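
As a rough illustration of how the `git_url` option might be checked before use, the sketch below validates a configuration mapping. The helper `validate_git_config` and its accepted URL schemes are assumptions for this example; the plugin does its own configuration handling, which may differ.

```python
from urllib.parse import urlparse


def validate_git_config(config: dict) -> str:
    """Return git_url from a config mapping, or raise ValueError.

    Hypothetical helper (assumption), not part of the plugin's API.
    """
    git_url = config.get("git_url")
    if not git_url:
        raise ValueError("git_url is required")
    parsed = urlparse(git_url)
    # Assumed set of schemes; scp-style URLs such as git@host:user/repo
    # are not handled by this sketch.
    if parsed.scheme not in ("http", "https", "ssh", "git") or not parsed.netloc:
        raise ValueError(f"git_url does not look like a repository URL: {git_url!r}")
    return git_url


url = validate_git_config({"git_url": "https://github.com/user/repo"})
```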

Build and testing

pip install -r requirements.txt
pip install -e .
pytest

In the VDK repo, the ../build-plugin.sh script can also be used.

Note about the CI/CD:

.plugin-ci.yaml is needed only for plugins that are part of the Versatile Data Kit plugin repo.

The CI/CD is separated into two stages: a build stage and a release stage. The build stage is made up of a few jobs, all of which inherit from the same job configuration and differ only in the Python version they use (3.7, 3.8, 3.9 and 3.10). They run according to rules, which are ordered such that changes to a plugin's directory trigger that plugin's CI, but changes to a different plugin do not.
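
The triggering rule described above can be sketched in a few lines. This is only a model of the behaviour, not the actual CI configuration: the directory prefix `projects/vdk-plugins/`, the function names, and the plugin name `vdk-other` are assumptions for illustration.

```python
from typing import Iterable, List, Set, Tuple

# Assumed layout: each plugin lives in its own directory under this prefix.
PLUGINS_ROOT = "projects/vdk-plugins/"
# Python versions used by the build jobs, as listed above.
PYTHON_VERSIONS = ("3.7", "3.8", "3.9", "3.10")


def plugins_to_build(changed_paths: Iterable[str]) -> Set[str]:
    """Return the plugins whose own directory contains a changed file."""
    triggered = set()
    for path in changed_paths:
        if not path.startswith(PLUGINS_ROOT):
            continue  # changes outside the plugins tree trigger no plugin CI
        plugin, sep, _ = path[len(PLUGINS_ROOT):].partition("/")
        if sep:  # only count files inside a plugin directory
            triggered.add(plugin)
    return triggered


def build_jobs(changed_paths: Iterable[str]) -> List[Tuple[str, str]]:
    """Expand each triggered plugin into one build job per Python version."""
    return [
        (plugin, version)
        for plugin in sorted(plugins_to_build(changed_paths))
        for version in PYTHON_VERSIONS
    ]


changed = [
    "projects/vdk-plugins/vdk-data-source-git/setup.py",
    "README.md",
]
jobs = build_jobs(changed)
```

In this model, a change confined to `vdk-data-source-git` yields four build jobs for that plugin and none for any other plugin.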
