A Python client for interacting with ScribeHub.

Scribe Python Client

The Scribe Python Client is a library for interacting with the ScribeHub API. It provides a simple interface for accessing datasets, querying vulnerabilities, and managing products.

Installation

Install the package using pip:

pip install scribe-python-client

Installation Options

The Scribe Python Client supports several installation options for additional features:

  • Base install: Installs the core client.
    pip install scribe-python-client
    
  • MCP support: Installs the client with Model Context Protocol (MCP) server support and its dependencies.
    pip install "scribe-python-client[mcp]"
    
  • Graph support: Installs the client with graph and visualization dependencies (for lineage graph and related features).
    pip install "scribe-python-client[graph]"
    
  • All features: Installs the client with all optional dependencies (including MCP, graph, and any future extras).
    pip install "scribe-python-client[all]"
    

Use the appropriate option depending on your needs:

  • If you only require the basic API/CLI, the base install is sufficient.
  • For advanced features like the MCP server, use the [mcp] or [all] options.
  • For graph/visualization features (e.g., lineage graph), use the [graph] or [all] options.

Usage

The client requires an API token for authentication. You can obtain your API token from the ScribeHub dashboard. The CLI accepts the token through the --api-token argument. Alternatively, set the SCRIBE_TOKEN environment variable to avoid passing --api-token on every call:

export SCRIBE_TOKEN=YOUR_API_TOKEN
scribe-client --api-call get-products
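A script that wraps the client can follow the same idea: use an explicitly supplied token if there is one, otherwise fall back to SCRIBE_TOKEN. The sketch below is illustrative only. resolve_token is a hypothetical helper, not part of the package, and the precedence order is an assumption.

```python
import os


def resolve_token(explicit_token=None, environ=None):
    """Return the explicit token if given, else SCRIBE_TOKEN from the environment.

    Hypothetical helper for illustration; not part of scribe-python-client.
    """
    env = os.environ if environ is None else environ
    token = explicit_token or env.get("SCRIBE_TOKEN")
    if not token:
        raise ValueError("No API token: pass one explicitly or set SCRIBE_TOKEN")
    return token


# Demonstration with a stand-in environment instead of the real os.environ.
token = resolve_token(environ={"SCRIBE_TOKEN": "YOUR_API_TOKEN"})
print(token)
```

The resolved value can then be passed to ScribeClient(api_token=...) as shown under Library Usage.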

CLI Usage

The package includes a CLI tool for quick interactions. After installation, you can use the scribe-client command. Below are examples for all supported commands:

Examples

Get Products

Retrieve a list of products managed in Scribe:

scribe-client --api-call get-products --api-token YOUR_API_TOKEN

Get Product Vulnerabilities

Retrieve vulnerabilities for a specific product:

scribe-client --api-call get-product-vulnerabilities --product-name YOUR_PRODUCT_NAME --api-token YOUR_API_TOKEN

Get Policy Results

Retrieve policy results for a specific product:

scribe-client --api-call get-policy-results --product-name YOUR_PRODUCT_NAME --api-token YOUR_API_TOKEN

Get Datasets

Retrieve all datasets:

scribe-client --api-call get-datasets --api-token YOUR_API_TOKEN

List Attestations

List all attestations:

scribe-client --api-call list-attestations --api-token YOUR_API_TOKEN

Get Attestation

Retrieve a specific attestation by ID:

scribe-client --api-call get-attestation --attestation-id YOUR_ATTESTATION_ID --api-token YOUR_API_TOKEN

Attestation IDs can be obtained from the list of attestations: search for 'id' in the output.

Get Latest Attestation

Retrieve the latest attestation for a specific product:

scribe-client --api-call get-latest-attestation --product-name YOUR_PRODUCT_NAME --api-token YOUR_API_TOKEN
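The commands above all share one shape: --api-call followed by optional arguments. When driving the CLI from Python, building the argument list programmatically avoids shell-quoting mistakes. This is a minimal sketch; build_command is a hypothetical helper, and it only produces flags that appear in this README.

```python
def build_command(api_call, **options):
    """Build the scribe-client argument list for one API call.

    Keyword names become flags, e.g. product_name -> --product-name.
    Hypothetical helper; not part of scribe-python-client.
    """
    argv = ["scribe-client", "--api-call", api_call]
    for name, value in options.items():
        if value is None:
            continue  # skip options that were not provided
        argv += ["--" + name.replace("_", "-"), str(value)]
    return argv


cmd = build_command("get-product-vulnerabilities", product_name="YOUR_PRODUCT_NAME")
print(cmd)
# To actually run it (requires the package installed and SCRIBE_TOKEN set):
#   import subprocess
#   result = subprocess.run(cmd, capture_output=True, text=True, check=True)
```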

Specific Dataset Commands

The Scribe Python Client allows you to interact with specific datasets for advanced queries and data retrieval. Below are details about these commands and examples of how to use them.

Querying Specific Datasets

You can query specific datasets such as vulnerabilities, products, policies, and lineage. These commands allow you to run custom queries and retrieve detailed information.

Query Vulnerabilities Dataset

Run a custom query on the vulnerabilities dataset:

scribe-client --api-call query-vulnerabilities --query "{\"columns\": [\"vulnerability_id\", \"severity\"], \"filters\": [{\"col\": \"severity\", \"op\": \"==\", \"val\": \"High\"}], \"orderby\": [], \"row_limit\": 10}"

Query Products Dataset

Run a custom query on the products dataset:

scribe-client --api-call query-products --query "{\"columns\": [\"logical_app\", \"logical_app_version\"], \"filters\": [{\"col\": \"logical_app\", \"op\": \"like\", \"val\": \"%example%\"}], \"orderby\": [], \"row_limit\": 5}"

Query Policy Results Dataset

Run a custom query on the policy results dataset:

scribe-client --api-call query-policy-results --query "{\"columns\": [\"status\", \"time_evaluated\"], \"filters\": [{\"col\": \"status\", \"op\": \"==\", \"val\": \"Passed\"}], \"orderby\": [], \"row_limit\": 10}"

Query Lineage Dataset

Run a custom query on the lineage dataset:

scribe-client --api-call query-lineage --query "{\"columns\": [\"asset_name\", \"asset_type\"], \"filters\": [{\"col\": \"asset_type\", \"op\": \"==\", \"val\": \"repo\"}], \"orderby\": [], \"row_limit\": 10}"

Run a custom query on the lineage dataset and create a graph of the lineage:

scribe-client --api-call query-lineage --query "{\"columns\": [\"asset_name\", \"asset_type\", \"parent_name\", \"parent_type\", \"external_id\", \"parent_external_id\", \"uri\"], \"filters\": [{\"col\": \"logical_app\", \"op\": \"==\", \"val\": \"Astro-Analytics-Discovery\"}, {\"col\": \"logical_app_version\", \"op\": \"==\", \"val\": \"36\"}], \"orderby\": []}" --lineage-graph-file lineage-graph.html

Note that the columns in the query are the minimal set required to create a lineage graph.
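To illustrate why identifier and parent columns are needed, here is a sketch of how lineage rows could be turned into parent-to-child edges. It assumes the query result is available as a list of dicts keyed by column name and that external_id and parent_external_id identify the nodes; the actual return format may differ. The sample rows are made up, and this is not the package's own graph code.

```python
from collections import defaultdict


def build_edges(rows):
    """Map each parent's external_id to the external_ids of its child assets."""
    children = defaultdict(list)
    for row in rows:
        parent = row.get("parent_external_id")
        if parent:  # assets without a parent are roots and add no edge
            children[parent].append(row["external_id"])
    return dict(children)


# Made-up sample rows for illustration only.
rows = [
    {"asset_name": "org/repo", "asset_type": "repo", "external_id": "r1",
     "parent_name": None, "parent_type": None, "parent_external_id": None},
    {"asset_name": "app:1.0", "asset_type": "image", "external_id": "i1",
     "parent_name": "org/repo", "parent_type": "repo", "parent_external_id": "r1"},
    {"asset_name": "app-pod", "asset_type": "pod", "external_id": "p1",
     "parent_name": "app:1.0", "parent_type": "image", "parent_external_id": "i1"},
]
edges = build_edges(rows)
print(edges)  # {'r1': ['i1'], 'i1': ['p1']}
```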

Notes

  • Replace the --query argument with your desired query in JSON format.
  • Ensure that the query structure matches the dataset schema for accurate results.
  • Use the --api-token argument or set the SCRIBE_TOKEN environment variable for authentication.
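Hand-escaping the JSON passed to --query is error-prone. The sketch below generates the query string with json.dumps and quotes it for a POSIX shell with shlex.quote. The field names (columns, filters, orderby, row_limit) are taken from the examples above; build_query is a hypothetical helper, not part of the package.

```python
import json
import shlex


def build_query(columns, filters=(), orderby=(), row_limit=None):
    """Return a JSON query string in the shape used by the examples above.

    filters is an iterable of (column, operator, value) tuples.
    """
    query = {
        "columns": list(columns),
        "filters": [{"col": c, "op": op, "val": v} for c, op, v in filters],
        "orderby": list(orderby),
    }
    if row_limit is not None:
        query["row_limit"] = row_limit
    return json.dumps(query)


query = build_query(
    ["vulnerability_id", "severity"],
    filters=[("severity", "==", "High")],
    row_limit=10,
)
# shlex.quote makes the JSON safe to paste into a POSIX shell command line.
print("scribe-client --api-call query-vulnerabilities --query " + shlex.quote(query))
```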

Library Usage

You can also use the library programmatically in your Python code:

from scribe_python_client.client import ScribeClient

# Initialize the client
client = ScribeClient(api_token="YOUR_API_TOKEN")

# Get products
products = client.get_products()
print(products)

# Get datasets
datasets = client.get_datasets()
print(datasets)

Features

  • Get Products: Retrieve a list of products managed in Scribe.
  • Query Datasets: Query datasets for vulnerabilities, policy results, and more.
  • CLI Support: Use the scribe-client command for quick API interactions.

Function Groups

The library provides the following hierarchical function groups:

1. Product Management

  • Get Products: Retrieve a list of products managed in Scribe.
  • Get Product Vulnerabilities: Retrieve vulnerabilities for a specific product.

2. Dataset Management

  • Get Datasets: Retrieve all datasets.
  • Query Datasets: Query datasets for vulnerabilities, policy results, and more.

3. Policy Management

  • Get Policy Results: Retrieve policy results for a specific product.

4. Attestation Management

  • List Attestations: List all attestations.
  • Get Attestation: Retrieve a specific attestation by ID.
  • Get Latest Attestation: Retrieve the latest attestation for a specific product.

Tables Description

This description is informative only; it may be partial and may change as the API evolves.

query_vulnerabilities Columns

Column Name                      Description
advisory_justification           Justification for advisory decision
advisory_modified                Advisory creation timestamp
advisory_status                  Advisory decision status
advisory_text                    Additional advisory information
attestation_ids                  IDs for SBOM attestations
attestation_name                 SBOM attestation name
base_score                       CVSS base score
component_id                     Dependency ID
component_locations              Dependency locations in the product
component_name                   Dependency name
component_purl                   Dependency Package URL
component_version                Dependency version
cvss_score                       CVSS score
epssProbability                  Exploitability probability
final_severity                   Updated severity by user
has_fix                          Is a patch available?
has_kev                          Known Exploited Vulnerability?
id                               ID
is_latest_logical_version        Is this the latest product version?
labels                           User-defined labels for SBOM
logical_app                      Product name
logical_app_version              Product version
severity                         Original severity (integer, cvss score)
source_layer                     Image layer source of vulnerability
targetName                       Container/component name
vector                           CVSS vector
version_timestamp                Timestamp of version
vul_component_created            Dependency creation date
vul_component_fixed_in_versions  Fixed versions for the vulnerability
vul_published_on                 Vulnerability publication date
vulnerability_id                 Vulnerability ID (e.g., CVE-2024-5535)

query_products Columns

Terminology: when users say "component" they usually mean a container, and when they say "dependency" they mean what this table calls a component. Put all conditions in the filter part of the query, not in the group-by.

Column Name            Description
base_layer             "TRUE" if the dependency is part of the base layer, otherwise "FALSE"
component_name         Dependency name
component_purl         Dependency URL
component_version      Dependency version
license_expression     License information
logical_app            Product name
logical_app_version    Product version
high_severity_cves     Count of critical/high vulnerabilities
labels                 User-defined labels
version_is_up_to_date  Is the dependency version up-to-date?
targetName             The name of a part of a product (a high-level component, such as a Docker image)
tag                    The tag/version of a part of a product (a high-level component, such as a Docker image)

query_policy_results Columns

Column Name          Description
time_evaluated       Timestamp when the policy was evaluated.
logical_app          Product name.
logical_app_version  Product version.
initiative_id        Identifier for the specific initiative associated with the policy.
version_id           Identifier for the version of the initiative or rule.
gen_rule_id          Unique identifier for the general rule.
gen_rule_name        Name of the general rule.
status               Rule result (e.g., pass, fail).
status_string        Detailed textual description of the rule result status.
targetName           Name of the specific target being evaluated (component).
gate                 Checkpoint where the rule was evaluated.
count                Number of results.
more                 Additional information or metadata about the evaluation (if available).

query_lineage Columns

When asked about products, use the logical_app and logical_app_version columns, not parent_name.

Column Name          Description
asset_name           Name of the asset.
asset_type           Type of the asset (e.g., repo, image, pod).
external_id          External identifier for the asset.
logical_app          Product name.
logical_app_version  Product version.
owner                Owner of the asset, if applicable.
parent_external_id   External identifier of the parent asset.
parent_id            Unique identifier of the parent asset.
parent_name          Name of the parent asset.
parent_type          Type of the parent asset.
path                 Relative or absolute path to the asset.
platform_name        Name of the platform hosting the asset.
platform_type        Type of platform (e.g., SCM, namespace).
product_id           Unique identifier for the product.
properties           Additional properties of the asset, as a JSON string.
timestamp            Timestamp when the asset was recorded.
uri                  URI linking to the asset, if available.

Contributing

Contributions are welcome! Please submit a pull request or open an issue for any bugs or feature requests.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Superset Table and Metadata Export

The following --api-call commands export metadata from Superset:

  • get-dataset-links
  • get-dataset-data
  • get-dataset-tables-dict
  • get-dataset-tables-md

These commands require a Superset username and password, passed via CLI arguments. Example:

python -m scribe_python_client.cli --api-call get-dataset-tables-md --env dev --username 'your_username' --password 'your_password'
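When these exports are scripted, the same invocation can be assembled in Python so that credentials come from a prompt or a secrets store instead of being typed into the shell. The sketch below only builds the argument list shown above. superset_command is a hypothetical helper, and restricting it to the four listed calls is a design choice of this example, not something the CLI is known to enforce.

```python
import sys

# The four Superset export calls listed in this section.
SUPERSET_CALLS = {
    "get-dataset-links",
    "get-dataset-data",
    "get-dataset-tables-dict",
    "get-dataset-tables-md",
}


def superset_command(api_call, env, username, password):
    """Build the argument list for one of the Superset export calls."""
    if api_call not in SUPERSET_CALLS:
        raise ValueError(f"Not a Superset export call: {api_call}")
    return [
        sys.executable, "-m", "scribe_python_client.cli",
        "--api-call", api_call,
        "--env", env,
        "--username", username,
        "--password", password,
    ]


# In a real script the password could come from getpass.getpass() rather than
# a literal; placeholders are used here so the example runs without prompting.
cmd = superset_command("get-dataset-tables-md", "dev", "your_username", "your_password")
print(cmd[1:])
```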

Upload/Update dataset data using yaml_export.py

Log in to Scribe Superset (dev or prod). Go to the Datasets tab and search for the dataset you want to export. Click the dataset name, then select Export Dataset to download a .zip file. Extract the archive and open the datasets/ folder inside.

Find the .yaml file for your dataset.
Example path:
dataset_export_20250729/datasets/your_dataset.yaml
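Locating the file can also be scripted. The sketch below uses pathlib and assumes the extracted export contains a datasets/ folder as described above; it searches subfolders too, in case the export nests datasets one level deeper. find_dataset_yaml is a hypothetical helper, not part of the package.

```python
from pathlib import Path


def find_dataset_yaml(export_dir):
    """Return all .yaml files under <export_dir>/datasets, sorted by path."""
    datasets_dir = Path(export_dir) / "datasets"
    if not datasets_dir.is_dir():
        return []  # nothing extracted here, or a different layout
    return sorted(datasets_dir.rglob("*.yaml"))


if __name__ == "__main__":
    # Folder name taken from the example path above.
    for path in find_dataset_yaml("dataset_export_20250729"):
        print(path)
```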

Open the script at:
scribe-python-client/scribe_python_client/yaml_export.py

Locate the line that sets the yaml_input_path variable and update it to point to your YAML file. Also update the path of the CSV file you want to extract from and the path of the output file. Then run the script using this command:

python scribe-python-client/scribe_python_client/yaml_export.py

The dataset metadata is maintained in a Google Sheet. Download it as a CSV file for the update process.

Scribe Python Client MCP

Installation

Install the package with MCP support using pip:

pip install "scribe-python-client[mcp]"

Usage

The client requires two API tokens for authentication:

  • Scribe API token — Obtain it from the ScribeHub dashboard.
  • OpenAI API token — Get it from the OpenAI website.

CLI Usage

The package includes a CLI tool for quick interactions. After installation, you can use the scribe-client command with the --mcp flag to start the MCP server:

scribe-client --mcp

Tool Usage

Once the server is started via the CLI:

  1. Open the mcp.json configuration file.
  2. Start the ScribeMCPLocal server.
  3. The tools should then become active and accessible in Copilot.

The MCP server supports all methods through both:

  • Direct REST queries
  • Query generation from natural language questions

Documentation

Comprehensive documentation is available for all POST endpoints, including:

  • Valid query parameters
  • Allowed columns
  • Query examples
  • Function descriptions
