
AppThreat's vulnerability database and package search library with a built-in sqlite based storage. OSV, CVE, GitHub, npm are the primary sources of vulnerabilities.

Introduction

This repo is a vulnerability database and package search for sources such as AppThreat vuln-list, OSV, NVD, and GitHub. Vulnerability data are downloaded from the sources and stored in a sqlite based storage with indexes to allow offline access and efficient searches.

Why vulnerability db?

A good vulnerability database must have the following properties:

  • Accuracy: Multiple upstream sources are used by vdb to improve accuracy and reduce false negatives.
  • Ease of download: A SQLite database containing data in the CVE 5.2 schema format is precompiled and distributed as files via ghcr to simplify download.
  • Flexible search: With automatic purl prefix generation, even for git repos, searches on the database can be performed with a purl, cpe, or even an http git url string.
  • Open specification: Every row in the database uses an open specification such as CVE 5.2 or Package URL (purl and vers), which prevents vendor lock-in.

Vulnerability Data sources

  • Linux vuln-list (Forked from AquaSecurity)
  • OSV (1)
  • NVD
  • GitHub

1 - We exclude Linux and oss-fuzz feeds by default. Set the environment variable OSV_INCLUDE_FUZZ=true to include them.

2 - Malware feeds are included by default, thus increasing the db size slightly. Set the environment variable OSV_EXCLUDE_MALWARE=true to exclude them.

Linux distros

  • AlmaLinux
  • Debian
  • Alpine
  • Amazon Linux
  • Arch Linux
  • RHEL/CentOS
  • Rocky Linux
  • Ubuntu
  • OpenSUSE
  • Photon
  • Chainguard
  • Wolfi OS

Installation

pip install "appthreat-vulnerability-db>=6.7.0"

To install vdb with optional dependencies such as oras use the [oras] or [all] dependency group.

pip install "appthreat-vulnerability-db[all]"

NOTE: VDB v6 is a major rewrite to use SQLite database. Current users of depscan v5 must continue using version 5.8.x

pip install appthreat-vulnerability-db==5.8.0

Usage

This package is best used as a library for managing vulnerabilities. It is used by owasp-dep-scan, a free open-source dependency audit tool. A limited cli with a few features is also available for testing the library directly.

[!IMPORTANT] The AppThreat-hosted database images and workflows are best treated as bootstrap or evaluation defaults. For production use, especially when you need larger variants such as app + OS, we strongly recommend creating and publishing your own pre-built database versions with your own CI/CD workflows and storage.

Why:

  • Security / provenance: Your team controls when data is built, where it is published, and which upstream sources and retention windows are allowed.
  • Performance: You can publish smaller, faster-to-download databases that match your environment instead of pulling a one-size-fits-all image.
  • Cost control: Large variants such as app + OS require significant compute, disk, and network bandwidth. Running scheduled builds on self-hosted infrastructure lets you scale them intentionally and budget for them explicitly.

Option 1: Download AppThreat pre-built database (Quick start)

Download a pre-built SQLite database (refreshed every 12 hours) containing all application vulnerabilities (~ 700MB). This is the fastest way to evaluate vdb, bootstrap a workstation, or validate an integration.

# pip install appthreat-vulnerability-db[all]
vdb --download-image

You can execute this command daily or when a fresh database is required. For long-running production workflows, prefer mirroring or rebuilding this database inside your own environment and then distributing it from your own registry, object store, or artifact repository.
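A scheduled refresh along these lines could be scripted in Python. This is a minimal sketch, assuming VDB_HOME is set explicitly and that the CLI honours it for downloads; the helper names is_stale and refresh_if_stale and the one-day threshold are illustrative and not part of vdb.

```python
import os
import subprocess
import time
from pathlib import Path


def is_stale(db_file, max_age_days=1.0, now=None):
    """Return True when db_file is missing or older than max_age_days."""
    path = Path(db_file)
    if not path.is_file():
        return True
    now = time.time() if now is None else now
    age_seconds = now - path.stat().st_mtime
    return age_seconds > max_age_days * 86400


def refresh_if_stale(vdb_home, max_age_days=1.0):
    """Run `vdb --download-image` when data.vdb6 under vdb_home looks stale."""
    if not is_stale(Path(vdb_home) / "data.vdb6", max_age_days):
        return False
    # Assumption: passing VDB_HOME makes the CLI write to the directory we checked.
    env = dict(os.environ, VDB_HOME=str(vdb_home))
    subprocess.run(["vdb", "--download-image"], check=True, env=env)
    return True
```

Calling refresh_if_stale from a daily job keeps the download idempotent: a fresh database is left untouched.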

Metadata searches such as full-text, alias, reference, package-name, symbol, source, severity, date, and malware-aware filters require an extended database. To download the app-only extended artifact with the same command, override the app-only URL:

export VDB_APP_ONLY_DATABASE_URL=ghcr.io/appthreat/vdbxz-app-extended:v6.7.x
vdb --download-image

To scan containers and operating system packages, download the full image (~ 7.5GB), which includes all application and OS vulnerabilities.

vdb --download-full-image

For the app+OS extended artifact, override the full database URL:

export VDB_DATABASE_URL=ghcr.io/appthreat/vdbxz-extended:v6.7.x
vdb --download-full-image

Because the full image is substantially larger and more expensive to build, test, and distribute, teams scanning containers or operating system packages should strongly prefer their own scheduled workflow that produces a tailored variant for the distros and time windows they actually support.

Use any sqlite browser or cli tools to load and query the two databases.

data.index.vdb6 - index db with purl prefix and vers

data.vdb6 - Contains CVE 5.2 source records normalized into cve_source_data and package/version locator rows in cve_data.

Database layout note for v6.7+

VDB 6.7 normalizes repeated CVE 5.2 source blobs into a hash-keyed cve_source_data table and stores package/version locator rows separately in cve_data. It also keeps extended search metadata opt-in: default public databases leave cve_metadata and cve_metadata_text empty to minimize data.index.vdb6 size, while *-extended database variants populate those tables for text, alias, reference, symbol, severity, source, and date-aware searches. Public database artifacts are rebuilt from scratch by the release workflows, so schema migrations for older .vdb6 files are not maintained.
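To tell whether a downloaded index database is an extended variant, one option is to probe the two metadata tables named above. The sketch below uses Python's standard sqlite3 module rather than vdb itself; the function names are hypothetical, and it assumes the tables can be read with a plain SELECT.

```python
import sqlite3
from pathlib import Path

# Table names taken from the layout note above.
METADATA_TABLES = ("cve_metadata", "cve_metadata_text")


def populated_tables(db_path, tables=METADATA_TABLES):
    """Return the tables from `tables` that exist in db_path and hold at least one row."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"  # open read-only
    conn = sqlite3.connect(uri, uri=True)
    try:
        existing = {row[0] for row in conn.execute("SELECT name FROM sqlite_master")}
        found = []
        for table in tables:
            if table not in existing:
                continue
            # Probing a single row avoids a full COUNT(*) on large tables.
            if conn.execute(f'SELECT 1 FROM "{table}" LIMIT 1').fetchone():
                found.append(table)
        return found
    finally:
        conn.close()


def has_extended_metadata(index_db_path):
    """True when at least one metadata table is populated."""
    return bool(populated_tables(index_db_path))
```

Pointing has_extended_metadata at data.index.vdb6 before calling the metadata search APIs gives an early, explicit signal instead of silently empty results.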

Option 2: Download pre-built database (ORAS)

Using ORAS cli might be slightly faster.

export VDB_HOME=$HOME/vdb
mkdir -p "$VDB_HOME"
oras pull ghcr.io/appthreat/vdbxz:v6.7.x -o "$VDB_HOME"
# Extract inside VDB_HOME, where oras placed the archive
cd "$VDB_HOME"
tar -xvf *.tar.xz
rm *.tar.xz

Use the matching *-extended image, for example ghcr.io/appthreat/vdbxz-app-extended:v6.7.x or ghcr.io/appthreat/vdbxz-extended:v6.7.x, when metadata search APIs are required.

Option 3: Use HuggingFace cli

Download one of the databases.

pip install -U "huggingface_hub[cli]"

app only database

export VDB_HOME=$(pwd)/app
huggingface-cli download AppThreat/vdb --include "app/*.vdb6" --repo-type dataset --local-dir .

app only 10 year database

export VDB_HOME=$(pwd)/app-10y
huggingface-cli download AppThreat/vdb --include "app-10y/*.vdb6" --repo-type dataset --local-dir .

app and os database

export VDB_HOME=$(pwd)/app-os
huggingface-cli download AppThreat/vdb --include "app-os/*.vdb6" --repo-type dataset --local-dir .

app and os 10 year database

export VDB_HOME=$(pwd)/app-os-10y
huggingface-cli download AppThreat/vdb --include "app-os-10y/*.vdb6" --repo-type dataset --local-dir .

Extended variants use the same names as the database variation table, such as app-extended/, app-2y-extended/, app-10y-extended/, app-os-extended/, and app-os-10y-extended/:

export VDB_HOME=$(pwd)/app-extended
huggingface-cli download AppThreat/vdb --include "app-extended/*.vdb6" --repo-type dataset --local-dir .

Citation

Use the citation below in your research.

@misc{vdb,
  author = {Team AppThreat},
  month = Feb,
  title = {{AppThreat vulnerability-db}},
  howpublished = {{https://huggingface.co/datasets/AppThreat/vdb}},
  year = {2025}
}

Option 4: Build and publish your own database workflow (Recommended for production)

If you depend on vdb regularly, build your own pre-built databases and publish them internally. This is the recommended approach for enterprises, security teams, and integrators.

Typical reasons to own the workflow:

  • Publish from infrastructure you trust and control.
  • Reduce supply-chain and availability dependencies on third-party hosted refresh jobs.
  • Tune the database scope for your environment to reduce artifact size and download time.
  • Use self-hosted runners or dedicated build machines for larger app + OS datasets, where compute, storage, and transfer costs are significant.

At a high level, the workflow is:

  1. Set the retention and distro selection environment variables for your environment.
  2. Run vdb --cache or vdb --cache-os on scheduled infrastructure.
  3. Package the resulting .vdb6 files.
  4. Publish them to your own OCI registry, object store, file share, or artifact repository.
  5. Point clients and integrations to your published URL instead of the AppThreat default.
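Step 3 of the workflow above might look like the following in Python. This is a sketch under stated assumptions: the archive name is chosen by the caller, the files are stored with flat names, and including vdb.meta alongside the .vdb6 files is a guess at what downstream clients need. Publishing (step 4) is left to your registry or artifact tooling.

```python
import tarfile
from pathlib import Path


def package_vdb_files(vdb_home, archive_path):
    """Bundle every .vdb6 file (plus vdb.meta when present) into a .tar.xz archive."""
    home = Path(vdb_home)
    members = sorted(home.glob("*.vdb6"))
    if not members:
        raise FileNotFoundError(f"no .vdb6 files found in {home}")
    meta = home / "vdb.meta"
    if meta.is_file():
        members.append(meta)
    with tarfile.open(archive_path, "w:xz") as archive:
        for member in members:
            # Store flat names so the archive extracts straight into a VDB_HOME directory.
            archive.add(member, arcname=member.name)
    return [member.name for member in members]
```

Run it after vdb --cache or vdb --cache-os has finished, pointing vdb_home at the directory that holds the freshly built files.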

Cache application vulnerabilities

vdb --cache

Build an extended database with metadata search support:

vdb --cache --include-metadata

The equivalent environment toggle is VDB_INCLUDE_METADATA=true.

To remove any existing databases:

vdb --clean

The typical size of this database is over 700 MB.

Cache from just OSV

vdb --cache --only-osv

You can extend the historical period covered by the cache by setting the following environment variables. See the environment variables reference for the complete list.

  • NVD_START_YEAR - Default: 2020. Can be set as early as 2002.
  • GITHUB_PAGE_COUNT - Default: 2. Can be raised up to 20.

Cache application and OS vulnerabilities

vdb --cache-os

Build an extended application and OS database with metadata search support:

vdb --cache-os --include-metadata

Cache builds print minimal progress updates, such as the source year being fetched or the latest CVE stored. Use --quiet to suppress the logo, logs, and progress output:

vdb --cache-os --quiet

Note that the database with OS vulnerabilities is over 7.5 GB in size. Specific OS distros can be ignored or included using environment variables.

For example, to exclude almalinux and ubuntu data, set the following environment variables:

export VDB_IGNORE_ALMALINUX=true
export VDB_IGNORE_UBUNTU=true

Refer to the variable LINUX_DISTRO_VULN_LIST_PATHS in config.py for the full list of distro strings supported.

For example, a team that only scans modern application dependencies can build a much smaller artifact by using a recent NVD_START_YEAR and sticking to vdb --cache. A platform team that only supports a subset of Linux distros can use VDB_IGNORE_* or VDB_INCLUDE_* environment variables before running vdb --cache-os to avoid paying the build and distribution cost for irrelevant data.
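A build script can assemble these variables in one place before launching the cache command. The helper below is a hypothetical sketch, not part of vdb: the lower bound of 1 for GITHUB_PAGE_COUNT is an assumption, and distro names are simply upper-cased into VDB_IGNORE_* and VDB_INCLUDE_* keys.

```python
import os


def build_env(base_env=None, nvd_start_year=None, github_page_count=None,
              ignore_distros=(), include_distros=()):
    """Return a copy of base_env with vdb build-scope variables applied."""
    env = dict(os.environ if base_env is None else base_env)
    if nvd_start_year is not None:
        if nvd_start_year < 2002:
            raise ValueError("NVD_START_YEAR cannot be earlier than 2002")
        env["NVD_START_YEAR"] = str(nvd_start_year)
    if github_page_count is not None:
        # Upper bound from the text above; the lower bound is an assumption.
        if not 1 <= github_page_count <= 20:
            raise ValueError("GITHUB_PAGE_COUNT must be between 1 and 20")
        env["GITHUB_PAGE_COUNT"] = str(github_page_count)
    for distro in ignore_distros:
        env[f"VDB_IGNORE_{distro.upper()}"] = "true"
    for distro in include_distros:
        env[f"VDB_INCLUDE_{distro.upper()}"] = "true"
    return env
```

The resulting dictionary can be passed to subprocess.run(["vdb", "--cache-os"], env=env, check=True), which keeps the build scope out of the parent process environment.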

Environment variables

Most boolean toggles treat true or 1 as enabled. Set path and build-scope variables before importing vdb.lib.config in long-running Python processes, because those values are read at import time.
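Because these values are read at import time, ordering matters in long-running processes. The guard below is an illustrative sketch (the function name is hypothetical) that sets the path variables and fails loudly if vdb.lib.config has already been imported.

```python
import os
import sys


def configure_vdb_paths(vdb_home, temp_dir=None):
    """Set VDB_HOME / VDB_TEMP_DIR, refusing to run once vdb.lib.config is imported."""
    if "vdb.lib.config" in sys.modules:
        # The values were already read at import time; changing them now is too late.
        raise RuntimeError("configure paths before importing vdb.lib.config")
    os.environ["VDB_HOME"] = str(vdb_home)
    if temp_dir is not None:
        os.environ["VDB_TEMP_DIR"] = str(temp_dir)


# Typical ordering in an application entry point:
# configure_vdb_paths("/srv/vdb", temp_dir="/srv/vdb-tmp")
# from vdb.lib import config, search  # import only after the environment is set
```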

Paths and local storage

Variable | Default | Used by | Description
VDB_HOME | Platform user data directory for vdb | CLI and library | Directory containing data.vdb6, data.index.vdb6, and vdb.meta.
VDB_CACHE | Platform user cache directory for vdb | vdb --cache-os / AquaSource | Cache directory. If $VDB_CACHE/vuln-list.zip exists, it is used instead of downloading AppThreat vuln-list.
VDB_TEMP_DIR | System temp directory | SQLite setup | Directory for SQLite temporary files during large builds and VACUUM operations. Use a partition with enough free space for app+OS builds.

Pre-built database downloads

Variable | Default | Used by | Description
VDB_APP_ONLY_DATABASE_URL | ghcr.io/appthreat/vdbxz-app:v6.7.x | vdb --download-image, MCP auto-download, integrations | OCI image URL for the default app-only database. Override this to consume an internally published artifact.
VDB_DATABASE_URL | ghcr.io/appthreat/vdbxz:v6.7.x, or ghcr.io/appthreat/vdbxz-10y:v6.7.x when USE_VDB_10Y=true | vdb --download-full-image, integrations | OCI image URL for the app+OS database.
VDB_EXTENDED_DATABASE_URL | ghcr.io/appthreat/vdbxz-extended:v6.7.x | Integrations | Config constant for the app+OS extended database with metadata tables populated.
VDB_APP_ONLY_EXTENDED_DATABASE_URL | ghcr.io/appthreat/vdbxz-app-extended:v6.7.x | Integrations | Config constant for the app-only extended database with metadata tables populated.
USE_VDB_10Y | unset | VDB_DATABASE_URL default selection | When enabled, changes the default app+OS download URL to the 10-year image.

Source and feed selection for builds

Variable | Default | Used by | Description
NVD_START_YEAR | 2020 | NVD/AppThreat vuln-list conversion | Start year for NVD-style CVE data. Older years increase coverage, build time, and database size.
GITHUB_TOKEN | unset | GitHub advisory ingestion | Token used for the GitHub GraphQL API. Avoid printing this value in logs.
GITHUB_GRAPHQL_URL | https://api.github.com/graphql | GitHub advisory ingestion | Alternate GitHub GraphQL endpoint, useful for testing or enterprise proxies.
GITHUB_PAGE_COUNT | 2 | GitHub advisory ingestion | Number of GitHub advisory GraphQL pages to fetch during a full refresh.
NPM_PAGE_COUNT | 2 | npm advisory configuration | Number of npm advisory pages to fetch where npm advisory ingestion is used.
OSV_INCLUDE_FUZZ | unset | OSV ingestion | Include Linux, OSS-Fuzz, and Android OSV feeds that are excluded by default to reduce false positives. Any non-empty value enables this.
OSV_EXCLUDE_MALWARE | unset | OSV conversion | Exclude OSV malware advisories whose identifiers start with MAL. Any non-empty value enables this.
VDB_OSV_STORE_BATCH_SIZE | 100 | OSV ingestion | Number of converted OSV records to store per database batch when building directly from OSV feeds. Invalid values fall back to 100; minimum is 1.
VDB_INCLUDE_METADATA | unset | CLI, storage, search metadata indexes | Populate extended metadata tables for full-text, alias, reference, package-name, symbol, severity, source, and date-aware searches. Equivalent to --include-metadata.
VDB_METADATA_* | unset | vdb.meta creation | Adds custom build metadata to vdb.meta; the prefix is stripped and the key is lowercased. Values true/1 and false/0 are stored as booleans.

OS distro filtering for builds

Variable | Default | Used by | Description
VDB_IGNORE_OS | unset | OSV source configuration | Skip OSV operating-system feeds added by default. Use app-only workflows for the smallest app database.
VDB_IGNORE_<DISTRO> | unset | AppThreat vuln-list and selected OSV distro feeds | Exclude distro-specific data. Static OSV toggles are VDB_IGNORE_ALMALINUX, VDB_IGNORE_ALPINE, VDB_IGNORE_REDHAT, VDB_IGNORE_DEBIAN, VDB_IGNORE_ROCKYLINUX, VDB_IGNORE_MAGEIA, VDB_IGNORE_ALPAQUITA, and VDB_IGNORE_MINIMOS; AppThreat vuln-list also supports the distro keys in LINUX_DISTRO_VULN_LIST_PATHS.
VDB_EXCLUDE_<DISTRO> | unset | AppThreat vuln-list filtering | Alias for excluding AppThreat vuln-list distro paths.
VDB_INCLUDE_<DISTRO> | unset | AppThreat vuln-list filtering | Force-include distro paths from LINUX_DISTRO_VULN_LIST_PATHS, such as VDB_INCLUDE_ALPINE=true or VDB_INCLUDE_SUSE=true.
VDB_INCLUDE_SUSE | unset | AppThreat vuln-list filtering | Include SUSE vuln-list data, which is ignored by default for performance. This is also covered by VDB_INCLUDE_<DISTRO>.

Output and SQLite tuning

Variable | Default | Used by | Description
VDB_QUIET | unset | CLI | Suppress logo, logs, and cache progress output. Equivalent to --quiet.
VDB_PROGRESS_INTERVAL | 10000 | Source progress callbacks | Minimum number of records between progress messages. Invalid values fall back to 10000; minimum is 1.
VDB_SQLITE_IMMUTABLE | unset | Search database connections | Open file-backed search databases with SQLite's immutable URI option for read-only deployments where .vdb6 files are not modified while the process is running.
VDB_SQLITE_CACHE_SIZE | -65536 | SQLite setup | Value passed to PRAGMA cache_size. The default is approximately 64 MiB using SQLite's negative KiB convention.
VDB_SQLITE_JOURNAL_MODE | DELETE | SQLite setup | Value passed to PRAGMA journal_mode. Valid values are DELETE, TRUNCATE, PERSIST, MEMORY, WAL, and OFF; invalid values fall back to DELETE.
VDB_SQLITE_SYNCHRONOUS | OFF | SQLite setup | Value passed to PRAGMA synchronous. Valid values are OFF, NORMAL, FULL, EXTRA, or 0-3; invalid values fall back to OFF.
PYTHONIOENCODING | unset | Windows CLI/MCP startup | If unset on Windows, VDB reconfigures standard streams to UTF-8.
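To make the SQLite settings above concrete, the sketch below reproduces them with Python's standard sqlite3 module. It is not vdb's own setup code (vdb's connection handling may differ); the function names are hypothetical, and the 4096-byte page size is an assumed default used only for positive cache_size values.

```python
import sqlite3
from pathlib import Path

SYNCHRONOUS_MODES = {"OFF", "NORMAL", "FULL", "EXTRA"}


def cache_size_bytes(cache_size, page_size=4096):
    """Translate a PRAGMA cache_size value into bytes.

    Negative values are KiB; positive values are a page count.
    """
    if cache_size < 0:
        return -cache_size * 1024
    return cache_size * page_size


def open_immutable(db_path):
    """Open a SQLite file with the immutable URI option (read-only, no locking)."""
    uri = Path(db_path).resolve().as_uri() + "?immutable=1"
    return sqlite3.connect(uri, uri=True)


def apply_pragmas(conn, cache_size=-65536, synchronous="OFF"):
    """Apply cache_size and synchronous, then return the values SQLite reports back."""
    if synchronous.upper() not in SYNCHRONOUS_MODES:
        raise ValueError(f"unsupported synchronous mode: {synchronous}")
    conn.execute(f"PRAGMA cache_size = {int(cache_size)}")
    conn.execute(f"PRAGMA synchronous = {synchronous.upper()}")
    return (
        conn.execute("PRAGMA cache_size").fetchone()[0],
        conn.execute("PRAGMA synchronous").fetchone()[0],
    )
```

With the default of -65536, cache_size_bytes returns 67108864 bytes, which is the 64 MiB figure quoted in the table. The immutable option is only safe when nothing modifies the file while it is open.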

MCP server

Variable | Default | Used by | Description
VDB_AGE_DAYS | 2 | MCP server | Number of days before the MCP server considers the local database stale and attempts an app-only ORAS download. Non-numeric values are passed through to the freshness check, so prefer an integer string.

Available Database Variations

VDB provides multiple pre-built databases optimized for different use cases, balancing data depth and file size. Both ORAS (ghcr.io) and HuggingFace datasets are updated every 12 hours.

Treat the variants below as reference baselines. They are useful defaults, but many teams should create their own equivalents with narrower scope, longer retention, or distro-specific filtering and publish them through their own delivery pipeline.

Note for AI Agents: Use this table to decide which database URL to pass to the download_image() function based on the user's requirements.

Database Scope | Time Context | ORAS Image URL (v6.7.x) | HuggingFace Path | Uncompressed data.vdb6 (GiB) | Uncompressed index (GiB) | .tar.xz data (GiB) | .tar.xz index (GiB) | .zst data (GiB) | .zst index (GiB) | Recommended Use Case
App Only | 2 Years (2024+) | ghcr.io/appthreat/vdbxz-app-2y:v6.7.x | app-2y/ | 2.05 | 0.26 | 0.13 | 0.03 | 0.19 | 0.04 | Fast, lightweight scans for very modern applications.
App Only | 2 Years Extended (2024+) | ghcr.io/appthreat/vdbxz-app-2y-extended:v6.7.x | app-2y-extended/ | 2.05 | 0.73 | 0.13 | 0.07 | 0.19 | 0.10 | App-only scans that need metadata search APIs.
App Only | Default (2020+) | ghcr.io/appthreat/vdbxz-app:v6.7.x | app/ | 2.96 | 0.43 | 0.21 | 0.05 | 0.31 | 0.07 | Standard application dependency scanning.
App Only | Default Extended (2020+) | ghcr.io/appthreat/vdbxz-app-extended:v6.7.x | app-extended/ | 2.96 | 1.05 | 0.21 | 0.11 | 0.31 | 0.15 | Default app database with all metadata search APIs.
App Only | 10 Years (2016+) | ghcr.io/appthreat/vdbxz-app-10y:v6.7.x | app-10y/ | 3.52 | 0.55 | 0.24 | 0.06 | 0.37 | 0.08 | Deep auditing of legacy application software.
App Only | 10 Years Extended (2016+) | ghcr.io/appthreat/vdbxz-app-10y-extended:v6.7.x | app-10y-extended/ | 3.52 | 1.26 | 0.24 | 0.13 | 0.37 | 0.17 | Legacy app audits that need metadata search APIs.
App + OS | Default (2020+) | ghcr.io/appthreat/vdbxz:v6.7.x | app-os/ | 42.36 | 4.66 | 3.42 | 0.20 | 5.23 | 0.30 | Standard container and OS-level package scanning.
App + OS | Default Extended (2020+) | ghcr.io/appthreat/vdbxz-extended:v6.7.x | app-os-extended/ | 42.36 | 8.14 | 3.42 | 0.37 | 5.23 | 0.56 | Container and OS scans that need metadata search APIs.
App + OS | 10 Years (2016+) | ghcr.io/appthreat/vdbxz-10y:v6.7.x | app-os-10y/ | 47.55 | 5.28 | 3.74 | 0.23 | 5.71 | 0.35 | Deep auditing of legacy Linux containers/VMs.
App + OS | 10 Years Extended (2016+) | ghcr.io/appthreat/vdbxz-10y-extended:v6.7.x | app-os-10y-extended/ | 47.55 | 9.19 | 3.74 | 0.43 | 5.71 | 0.65 | Legacy OS audits that need metadata search APIs.

Note: The ORAS URLs above use .tar.xz compression. Replace vdbxz with vdbzst in the URL if you prefer Zstandard compression.
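The image names in the table follow a regular pattern, so a script or agent can compose the URL from the requirements instead of hard-coding it. The helper below is a hypothetical sketch that encodes only the combinations listed in the table, with the tag string kept exactly as written there.

```python
TAG = "v6.7.x"  # tag string as written in the table above


def image_url(scope="app", window="default", extended=False, compression="xz"):
    """Compose the ORAS image URL for a variant listed in the table."""
    if scope not in ("app", "app-os"):
        raise ValueError("scope must be 'app' or 'app-os'")
    if window not in ("default", "2y", "10y"):
        raise ValueError("window must be 'default', '2y', or '10y'")
    if scope == "app-os" and window == "2y":
        raise ValueError("the table lists no 2-year app + OS image")
    if compression not in ("xz", "zst"):
        raise ValueError("compression must be 'xz' or 'zst'")
    name = "vdb" + compression
    if scope == "app":
        name += "-app"
    if window != "default":
        name += "-" + window
    if extended:
        name += "-extended"
    return f"ghcr.io/appthreat/{name}:{TAG}"
```

For example, image_url("app-os", "10y", extended=True) yields the 10-year extended app + OS image from the last table row.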

If you operate your own workflow, keep the same naming pattern internally if it helps downstream tooling, but publish from infrastructure you control. This lets you swap in smaller app-only images, distro-restricted OS images, or longer-retention images without waiting on shared hosted workflows.


Custom Vulnerability Data

VDB supports loading custom vulnerability data from a local directory at runtime. This allows you to:

  1. Add Private Vulnerabilities: Include internal CVEs that are not public.
  2. Override False Positives: Correct data returned by the official database by marking specific versions as unaffected.

Custom data must follow the CVE 5.2 JSON Schema. Supported file extensions are .json, .yaml, .yml, and .toml.

To use custom data, pass the directory path to the --custom-data argument.

vdb --search pkg:npm/my-lib@1.0.0 --custom-data /path/to/custom/vulns

Example 1: Adding a Private Vulnerability

Create a file private-vuln.yaml. Since you are defining a new vulnerability record, you use the cna container.

dataType: CVE_RECORD
dataVersion: "5.2"
cveMetadata:
  cveId: PRIVATE-2025-001
  assignerOrgId: 00000000-0000-4000-8000-000000000000
  state: PUBLISHED
  datePublished: "2025-01-01T00:00:00Z"
  dateUpdated: "2025-01-01T00:00:00Z"
containers:
  cna:
    providerMetadata:
      orgId: 00000000-0000-4000-8000-000000000000
    descriptions:
      - lang: en
        value: "Private vulnerability in internal library"
    affected:
      - vendor: internal
        product: my-lib
        packageName: my-lib
        packageURL: pkg:npm/my-lib
        versions:
          - version: "1.0.0"
            status: affected
            versionType: semver
            lessThan: "2.0.0"

Example 2: Overriding a False Positive

If the official database reports CVE-2023-9999 for pkg:pypi/requests but you have determined it is a false positive for your specific version, you can override it using an ADP (Authorized Data Publisher) container. This is the recommended way to append or dispute existing vulnerability data.

Logic: If a CVE ID and Package URL combination exists in your custom data, VDB will ignore the entry from the official database and use yours instead.

Create override.yaml:

dataType: CVE_RECORD
dataVersion: "5.2"
cveMetadata:
  cveId: CVE-2023-9999
  assignerOrgId: 00000000-0000-4000-8000-000000000000
  state: PUBLISHED
containers:
  # Use 'adp' to append/modify existing vulnerability data
  adp:
    - providerMetadata:
        orgId: 00000000-0000-4000-8000-000000000000
        shortName: "MySecTeam"
      descriptions:
        - lang: en
          value: "Override to mark specific version as unaffected"
      affected:
        - product: requests
          packageName: requests
          packageURL: pkg:pypi/requests
          versions:
            # Explicitly mark your version as unaffected
            - version: "2.31.0"
              status: unaffected
              versionType: semver
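The override rule described above, where a custom entry replaces the official one for the same CVE ID and package URL, can be pictured as a keyed merge. The sketch below is an illustration of that rule only, not vdb's implementation; the record shape (plain dictionaries with cve_id, purl, and status keys) is hypothetical.

```python
def apply_overrides(official, custom):
    """Merge two lists of records, letting custom entries win per (cve_id, purl) pair."""

    def key(record):
        return (record["cve_id"], record["purl"])

    merged = {key(record): record for record in official}
    # Custom entries replace official ones with the same key and add new ones.
    merged.update({key(record): record for record in custom})
    return list(merged.values())
```

Entries for other packages under the same CVE ID are untouched, because the key includes the package URL.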

CLI Usage

usage: vdb [-h] [--clean] [--cache] [--cache-os] [--only-osv] [--only-aqua] [--only-ghsa] [--include-metadata] [--quiet] [--search SEARCH] [--search-text SEARCH_TEXT] [--search-alias SEARCH_ALIAS]
           [--search-reference SEARCH_REFERENCE] [--search-package-name SEARCH_PACKAGE_NAME] [--search-symbol SEARCH_SYMBOL] [--search-packages SEARCH_PACKAGES]
           [--batch-size BATCH_SIZE] [--list-malware] [--bom BOM_FILE] [--download-image] [--download-full-image] [--print-vdb-metadata]
           [--custom-data CUSTOM_DATA]
AppThreat's vulnerability database and package search library with a sqlite storage.

options:
  -h, --help            show this help message and exit
  --clean               Clear the vulnerability database cache from platform specific user_data_dir.
  --cache               Cache vulnerability information in platform specific user_data_dir.
  --cache-os            Cache OS vulnerability information in platform specific user_data_dir.
  --only-osv            Use only OSV as the source. Use with --cache.
  --only-aqua           Use only Aqua vuln-list as the source. Use with --cache.
  --only-ghsa           Use only recent ghsa as the source. Use with --cache.
  --include-metadata    Populate extended metadata tables for text, alias, reference, symbol, severity, and source searches. Increases index database size.
  --quiet               Suppress logo, logs, and cache progress output.
  --search SEARCH       Search for the package or vulnerability ID (CVE, GHSA, ALSA, DSA, etc.) in the database. Use purl, cpe, or git http url.
  --search-text SEARCH_TEXT
                        Perform metadata/full-text search across vulnerability descriptions, aliases, references, and affected symbols.
  --search-alias SEARCH_ALIAS
                        Search vulnerability aliases such as GHSA, OSV, or vendor advisory identifiers.
  --search-reference SEARCH_REFERENCE
                        Search reference URLs and reference text in vulnerability metadata.
  --search-package-name SEARCH_PACKAGE_NAME
                        Search vulnerability metadata by package name or namespace/name.
  --search-symbol SEARCH_SYMBOL
                        Search affected functions or modules captured in vulnerability metadata.
  --search-packages SEARCH_PACKAGES
                        Path to a JSON file containing a list of package locators to search in bulk. Each item may include purl, cpe, url, alias, package_name, or search.
  --batch-size BATCH_SIZE
                        Batch size to use with --search-packages.
  --list-malware        List latest malwares with CVE ID beginning with MAL-.
  --bom BOM_FILE        Search for packages in the CycloneDX BOM file.
  --download-image      Downloaded pre-created vdb image to platform specific user_data_dir. Application vulnerabilities only.
  --download-full-image
                        Downloaded pre-created vdb image to platform specific user_data_dir. All vulnerabilities including OS.
  --print-vdb-metadata  Display metadata about the current vdb in user_data_dir.
  --custom-data CUSTOM_DATA
                        Path to directory containing custom vulnerability data (JSON/YAML/TOML) to override/augment results.

CLI search

It is possible to perform a range of searches using the cli.

vdb --search pkg:pypi/xml2dict@0.2.2

# Search based on a purl prefix
vdb --search pkg:pypi/xml2dict

# Full url and short form for swift
vdb --search "pkg:swift/github.com/vapor/vapor@4.39.0"

vdb --search "pkg:swift/vapor/vapor@4.89.0"

# Search by cpe
vdb --search "cpe:2.3:a:npm:gitblame:*:*:*:*:*:*:*:*"

# Search by colon separated values
vdb --search "npm:gitblame:0.0.1"

# Search by vulnerability id (CVE, GHSA, ALSA, DSA, etc.)
vdb --search CVE-2024-25169

# Search with wildcard for CVE
vdb --search CVE-2025-%

# Search by git url
vdb --search "https://github.com/electron/electron"

# Search by CycloneDX SBOM
vdb --bom bom.json

# Full-text search across descriptions, aliases, references, and symbols.
# These metadata searches require an extended database built with
# --include-metadata or downloaded from a *-extended artifact.
vdb --search-text "deserialization parser"

# Search explicit alias/reference/package metadata indexes
vdb --search-alias GHSA-gh-9999
vdb --search-reference github.com/example/widget-ui
vdb --search-package-name widget-ui
vdb --search-symbol parse_payload

# Bulk search from a JSON file containing package locator objects
vdb --search-packages packages.json

# Stream bulk lookups in smaller batches
vdb --search-packages packages.json --batch-size 25

Example packages.json for --search-packages:

[
  { "purl": "pkg:pypi/requests@2.31.0" },
  { "url": "https://github.com/pallets/flask" },
  { "alias": "GHSA-xxxx-yyyy-zzzz" },
  { "package_name": "flask" }
]
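A packages.json file like this can be generated from an inventory rather than written by hand. The sketch below uses only the keys named in the --search-packages help text; rejecting any other key is a stricter assumption of this example, and the function name is hypothetical.

```python
import json

# Keys the --search-packages help text lists for each item.
ALLOWED_KEYS = {"purl", "cpe", "url", "alias", "package_name", "search"}


def write_packages_file(locators, path):
    """Validate locator objects and write them to path as a JSON list."""
    items = list(locators)
    for index, item in enumerate(items):
        if not isinstance(item, dict) or not item:
            raise ValueError(f"item {index} must be a non-empty object")
        unknown = set(item) - ALLOWED_KEYS
        if unknown:
            raise ValueError(f"item {index} has unsupported keys: {sorted(unknown)}")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(items, handle, indent=2)
    return len(items)
```

The written file can then be passed to vdb --search-packages, optionally with --batch-size.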

List recent malware

To list malware entries with the MAL- prefix, use the following command.

vdb --list-malware

For Integrators

The vulnerability-db is designed to be embedded into other Python applications.

Basic Integration

### AI/Agent Integration Script
# AI Agents: Use this script pattern to query vdb and get structured JSON output.

import json
import os
from vdb.lib import config, db6 as db_lib, search
from vdb.lib.orasclient import download_image

# 1. Prefer an organization-controlled URL. Fall back to the upstream default
# only for local testing or bootstrap scenarios.
DB_URL = os.getenv("VDB_APP_ONLY_DATABASE_URL", config.VDB_APP_ONLY_DATABASE_URL)

# 2. Download the database if missing or stale (older than 1 day)
if db_lib.needs_update(days=1):
    download_image(DB_URL, config.DATA_DIR)

# 3. Load Custom Data (Optional)
# This will override DB results with local YAML/JSON definitions
# search.load_custom_data("/path/to/custom_vulns")

# 4. Perform the search
target = "pkg:pypi/xml2dict@0.2.2"
results = search.search_by_any(target, with_data=True)

# 5. Extract and parse the Pydantic CVE 5.2 models into standard JSON
output = []
for res in results:
    vuln = {
        "cve_id": res['cve_id'],
        "fixed_in": res['fix_version'],
    }
    # res['source_data'] is a Pydantic model. Use model_dump to serialize.
    if res.get('source_data'):
        vuln['cve_data'] = res['source_data'].model_dump(mode='json')
    output.append(vuln)

# Print standard JSON for the agent to read via stdout
print(json.dumps(output, indent=2))

For production deployments, point VDB_APP_ONLY_DATABASE_URL or VDB_DATABASE_URL at the artifacts produced by your own workflow so application instances do not depend directly on AppThreat-hosted refresh jobs. If an integration needs metadata search APIs, use the corresponding extended artifact URL, such as VDB_APP_ONLY_EXTENDED_DATABASE_URL / VDB_EXTENDED_DATABASE_URL, or override the app/full download URL to your own *-extended image.

Advanced Usage

Batching and Generators: When processing large SBOMs, search_by_cdx_bom returns a generator that yields results in batches to reduce memory usage.

results_generator = search.search_by_cdx_bom("bom.json", with_data=True)
for result_batch in results_generator:
    for res in result_batch:
        # Process individual vulnerability result
        pass

Custom Database Locations: If you are managing the database files manually or in a custom location, ensure config.DATA_DIR is set via the environment variable VDB_HOME before importing the library, or update the vdb.lib.config paths dynamically.

For read-only deployments where .vdb6 files are not modified while the process is running, set VDB_SQLITE_IMMUTABLE=true to open search databases with SQLite's immutable URI option.

Result Structure: The results returned by search functions are dictionaries containing:

  • cve_id: The vulnerability identifier.
  • source_data: A Pydantic model (vdb.lib.cve_model.CVE) of the CVE 5.2 data.
  • vers: The version range string from the index.
  • fix_version: The specific version where the issue is resolved (if applicable).
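Given that structure, a caller can reduce batched results to a compact remediation view. The sketch below relies only on the cve_id and fix_version keys listed above and works on any iterable of batches; the function name is hypothetical, and the version strings are sorted lexically, not by semver.

```python
from collections import defaultdict


def fix_versions_by_cve(result_batches):
    """Group known fix versions by cve_id across batches of result dictionaries."""
    grouped = defaultdict(set)
    for batch in result_batches:
        for res in batch:
            # Touch the key even when no fix is known so the CVE still appears.
            versions = grouped[res["cve_id"]]
            fix = res.get("fix_version")
            if fix:
                versions.add(fix)
    return {cve_id: sorted(versions) for cve_id, versions in grouped.items()}
```

Because it consumes batches lazily, it can be fed the generator returned by search_by_cdx_bom without holding every result in memory at once.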

Bulk and Filtered Searches

VDB includes higher-level APIs for bulk package searches, BOM summaries, and filter-aware lookups.

from vdb.lib import search

filters = {
    "severity_threshold": "HIGH",
    "sources": ["osv", "github"],
    "exclude_malware": True,
    "package_ecosystem": "pypi",
    "page_size": 25,
}

package_results = search.search_packages(
    [
        {"purl": "pkg:pypi/requests@2.31.0"},
        {"url": "https://github.com/pallets/flask"},
        {"cpe": "cpe:2.3:a:npm:lodash:4.17.20:*:*:*:*:*:*:*"},
    ],
    with_data=False,
    filters=filters,
)

for pkg in package_results:
    print(pkg["locator"], pkg["result_count"], pkg["max_severity"])

For very large package sets, stream them in batches:

# packages: a list of locator dicts, as passed to search_packages above
for batch in search.search_packages_batched(packages, batch_size=100, with_data=False):
    for pkg in batch:
        print(pkg["locator"], pkg["result_count"])

CycloneDX BOMs can be summarized or expanded into detailed package findings:

summary = search.search_bom_summary("bom.json", filters={"severity_threshold": "MEDIUM"})
detailed = search.search_bom_detailed("bom.json", with_data=True)

Text, Alias, Reference, and Symbol Search

In addition to locator-based lookups, VDB supports metadata and full-text search over aliases, references, descriptions, package names, and affected symbols. Default v6.7+ databases omit extended metadata to reduce index size, so text, alias, reference, symbol, source, severity, malware, and date-aware metadata searches require a database built with --include-metadata or downloaded from a *-extended artifact. When metadata is absent, metadata-specific APIs fail safe by returning no metadata matches.

search.search_by_alias("GHSA-xxxx-yyyy-zzzz", with_data=False)
search.search_by_reference("github.com/pallets/flask", with_data=False)
search.search_by_package_name("flask", with_data=False)
search.search_by_symbol("parse_payload", with_data=False)
search.search_full_text("deserialization parser", with_data=False)

These APIs use the SQLite metadata tables in data.index.vdb6. Normal locator searches still work without metadata; choose an extended database when your integration depends on these APIs or metadata filters.

Troubleshooting

Database Locked Errors

VDB uses SQLite. If you encounter apsw.BusyError or "database is locked":

  • Ensure you are not running multiple vdb --cache processes simultaneously.
  • If using VDB in a multi-threaded application, ensure you are treating the database connections as read-only where possible.
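For code that queries .vdb6 files directly from several threads, one way to follow the second point is to give each thread its own read-only connection. This sketch uses Python's standard sqlite3 module, not the apsw bindings referenced in the error above, and does not change how vdb opens its own connections; the function name and the five-second timeout are illustrative.

```python
import sqlite3
import threading
from pathlib import Path

_local = threading.local()


def read_only_connection(db_path, busy_timeout_seconds=5.0):
    """Return this thread's read-only connection to db_path, creating it on first use."""
    connections = getattr(_local, "connections", None)
    if connections is None:
        connections = _local.connections = {}
    key = str(Path(db_path).resolve())
    if key not in connections:
        uri = Path(key).as_uri() + "?mode=ro"
        # timeout makes SQLite wait for a lock to clear instead of failing immediately.
        connections[key] = sqlite3.connect(uri, uri=True, timeout=busy_timeout_seconds)
    return connections[key]
```

Connections opened this way reject writes, so a stray INSERT or UPDATE fails fast instead of contending for the write lock.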

Disk Space Issues

The full OS vulnerability database is large (~7.5GB). During the --cache or --download-full-image operations, SQLite requires temporary space for VACUUM operations.

  • Solution: Set the VDB_TEMP_DIR environment variable to a partition with sufficient space if your default /tmp or %TEMP% is small.
export VDB_TEMP_DIR=/mnt/large_volume/vdb_temp
vdb --cache-os
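
Before starting a long cache run, it can help to confirm that the chosen temporary location actually has room. The following sketch reads `VDB_TEMP_DIR` (falling back to the system temp directory) and compares free space against a threshold. The 15 GB threshold is an illustrative assumption, roughly twice the database size quoted above, not a documented vdb requirement, and `has_free_space` is a hypothetical helper.

```python
import os
import shutil
import tempfile


def has_free_space(path: str, required_gb: float) -> bool:
    """Return True if the filesystem holding path has at least required_gb free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3


# Use VDB_TEMP_DIR when it points at an existing directory, else the system default.
candidate = os.environ.get("VDB_TEMP_DIR", "")
temp_dir = candidate if os.path.isdir(candidate) else tempfile.gettempdir()

REQUIRED_GB = 15  # illustrative assumption, not a vdb-documented figure
if has_free_space(temp_dir, REQUIRED_GB):
    print(f"{temp_dir}: at least {REQUIRED_GB} GB free")
else:
    print(f"{temp_dir}: under {REQUIRED_GB} GB free; consider pointing VDB_TEMP_DIR elsewhere")
```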

ORAS Download Failures

If vdb --download-image fails:

  1. Ensure you installed the package with the extra: pip install appthreat-vulnerability-db[oras].
  2. Firewalls may block ghcr.io. Try downloading manually using the Alternative Download Options.

If your environment restricts outbound access or you need stronger provenance guarantees, this is another sign that you should publish pre-built databases from your own network and update clients to consume those internal artifacts.

Encoding Errors on Windows

If you see UnicodeEncodeError in your console output:

  • VDB attempts to force UTF-8 encoding for stdout/stderr.
  • Ensure your terminal (PowerShell/CMD) is configured for UTF-8 (chcp 65001).
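
If you call the library from your own script and still see encoding errors, you can apply the same idea yourself. The sketch below is not vdb's implementation; it is a minimal, assumed approach using the standard `reconfigure` method on text streams, and `force_utf8_console` is a hypothetical helper.

```python
import sys
from typing import List


def force_utf8_console() -> List[str]:
    """Switch stdout/stderr to UTF-8 where supported; return the stream names changed."""
    changed = []
    for name in ("stdout", "stderr"):
        stream = getattr(sys, name, None)
        # Redirected or replaced streams may not offer reconfigure(); skip those.
        reconfigure = getattr(stream, "reconfigure", None)
        if callable(reconfigure):
            # errors="replace" substitutes unencodable characters instead of raising.
            reconfigure(encoding="utf-8", errors="replace")
            changed.append(name)
    return changed


changed_streams = force_utf8_console()
print("reconfigured:", changed_streams)
```

Setting the environment variable `PYTHONUTF8=1` before launching Python is an alternative that needs no code changes.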

Model Context Protocol (MCP) server

Refer to the readme

The MCP server now exposes structured tool results, resource templates such as cve://{id} and purl://{purl}, concrete resources like vdb://metadata and vdb://health, bulk SBOM tools, richer prompts, and filter-aware search options such as severity thresholds, source filters, malware flags, package scope, and pagination.

Read .vdb6 files in other languages

.vdb6 files are standard SQLite database files. Use any modern SQLite library to read and query them. There are simple Node.js and Deno examples in this repo for demonstration.
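
The schema is not described in this section, so a reasonable first step in any language is to list the tables and their columns before writing queries. The sketch below does this with Python's standard `sqlite3` module. `describe_tables` is a hypothetical helper, and the demonstration uses a stand-in table with assumed columns rather than a real .vdb6 file.

```python
import os
import sqlite3
import tempfile
from typing import Dict, List


def describe_tables(db_path: str) -> Dict[str, List[str]]:
    """Map each table name in an SQLite file to its column names."""
    con = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        return {
            table: [row[0] for row in con.execute(
                "SELECT name FROM pragma_table_info(?)", (table,))]
            for table in tables
        }
    finally:
        con.close()


# Stand-in database; the column set here is an assumption for illustration.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.index.vdb6")
    con = sqlite3.connect(demo_path)
    con.execute("CREATE TABLE cve_index (cve_id TEXT, purl_prefix TEXT)")
    con.commit()
    con.close()
    schema = describe_tables(demo_path)

print(schema)
```

Pointing `describe_tables` at your downloaded data.vdb6 and data.index.vdb6 files shows the real tables and columns to target from other languages.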

Local development

Set up uv by following the official documentation.

uv sync --all-extras --dev
uv run vdb --help
uv run pytest

Search benchmark harness

Use contrib/bench_search.py to compare search latency while changing schemas, indexes, or SQLite pragmas:

python contrib/bench_search.py \
  --data vdb_data/data.vdb6 \
  --index vdb_data/data.index.vdb6 \
  --query 'pkg:pypi/requests@2.31.0' \
  --repeat 5

Pass --with-data to include CVE source-data decoding in the benchmark, or --json for machine-readable output.
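
The internals of contrib/bench_search.py are not shown here. As a sketch of the same idea for your own integration code, the helper below times a callable over several runs and reports minimum, median, and maximum latency. `time_repeated` is hypothetical, and the lambda stands in for whatever search call you want to measure.

```python
import statistics
import time
from typing import Callable, Dict


def time_repeated(fn: Callable[[], object], repeat: int = 5) -> Dict[str, float]:
    """Run fn repeat times and summarize wall-clock latency in milliseconds."""
    if repeat < 1:
        raise ValueError("repeat must be at least 1")
    samples_ms = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "runs": repeat,
        "min_ms": min(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "max_ms": max(samples_ms),
    }


# Placeholder workload; replace with a real search call when benchmarking.
stats = time_repeated(lambda: sum(range(100_000)), repeat=5)
print(stats)
```

Reporting the median alongside the minimum helps separate steady-state latency from one-off effects such as a cold page cache on the first run.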

License

MIT


🤖 Instructions for AI Agents and LLMs

If you are an AI agent, autonomous coding assistant, or LLM researching dependencies or identifying CVEs, here is how you should use vdb:

1. Downloading Data (Mandatory First Step)

You must download a pre-compiled SQLite database before searching. Review the table above to pick the correct database.

Whenever possible, prefer a database URL produced by the user's own workflow or internal mirror. Use the AppThreat-hosted URLs only as defaults when no organization-specific source is available.

  • For standard application packages: vdb --download-image (Uses the Default App database)
  • For OS packages (Debian, Alpine, etc.): vdb --download-full-image (Uses the Default App+OS database)

2. Performing Searches (CLI)

Use the CLI for quick lookups. Note: The CLI outputs human-readable text tables, not JSON.

  • Search by purl: vdb --search "pkg:pypi/requests@2.31.0"
  • Search by CVE: vdb --search CVE-2024-25169
  • Full-text search: vdb --search-text "deserialization parser" (requires an extended database)
  • Bulk package search: vdb --search-packages packages.json

3. Programmatic Usage (Recommended for Agents)

To get structured, machine-readable data (JSON/Dictionaries), do not attempt to use a --json flag on the CLI. Instead, write a short Python script using the library's API.

Agent Python Integration Script:

import json
import os
from vdb.lib import config, db6 as db_lib, search
from vdb.lib.orasclient import download_image

# 1. Choose an organization-controlled database URL when available.
# Fall back to the upstream defaults only when necessary.
DB_URL = os.getenv("VDB_APP_ONLY_DATABASE_URL", config.VDB_APP_ONLY_DATABASE_URL)

# 2. Download the database if missing or stale (older than 1 day)
if db_lib.needs_update(days=1):
    download_image(DB_URL, config.DATA_DIR)

# 3. Perform the search
target = "pkg:pypi/xml2dict@0.2.2"
results = search.search_by_any(target, with_data=True)

# 4. Extract and parse the Pydantic CVE 5.2 models into standard JSON
output = []
for res in results:
    vuln = {
        "cve_id": res['cve_id'],
        "fixed_in": res['fix_version'],
    }
    # res['source_data'] is a Pydantic model. Use model_dump to serialize.
    if res.get('source_data'):
        vuln['cve_data'] = res['source_data'].model_dump(mode='json')
    output.append(vuln)

# Print standard JSON for the agent to read via stdout
print(json.dumps(output, indent=2))

💡 Pro-Tip for AI Agents: Direct SQLite Querying

The .vdb6 files downloaded to your user_data_dir are standard SQLite databases. If you are an AI agent needing to perform complex aggregations, bulk exports, or custom filtering, you can query the database directly using sqlite3:

# Example: Find all entries for a specific purl prefix directly in the index
sqlite3 ~/.local/share/vdb/data.index.vdb6 "SELECT * FROM cve_index WHERE purl_prefix LIKE 'pkg:npm/react%';"
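
The same query can be issued from Python with a bound parameter, which avoids quoting problems when the prefix comes from user input. This is a sketch under assumptions: it reuses only the `cve_index` table and `purl_prefix` column named in the command above, `find_by_purl_prefix` is a hypothetical helper, and the demonstration rows are placeholders in a throwaway database.

```python
import os
import sqlite3
import tempfile
from typing import Any, Dict, List


def find_by_purl_prefix(db_path: str, prefix: str, limit: int = 50) -> List[Dict[str, Any]]:
    """Return cve_index rows whose purl_prefix starts with prefix."""
    # Escape LIKE wildcards so characters such as "_" in a purl match literally.
    escaped = prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    con = sqlite3.connect(db_path)
    con.row_factory = sqlite3.Row
    try:
        rows = con.execute(
            "SELECT * FROM cve_index WHERE purl_prefix LIKE ? ESCAPE '\\' "
            "ORDER BY purl_prefix LIMIT ?",
            (escaped + "%", limit),
        ).fetchall()
        return [dict(row) for row in rows]
    finally:
        con.close()


# Throwaway database with placeholder rows; real column sets may differ.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.index.vdb6")
    con = sqlite3.connect(demo_path)
    con.execute("CREATE TABLE cve_index (cve_id TEXT, purl_prefix TEXT)")
    con.executemany(
        "INSERT INTO cve_index VALUES (?, ?)",
        [
            ("EXAMPLE-1", "pkg:npm/react"),
            ("EXAMPLE-2", "pkg:npm/react-dom"),
            ("EXAMPLE-3", "pkg:npm/redux"),
            ("EXAMPLE-4", "pkg:pypi/axb"),
        ],
    )
    con.commit()
    con.close()
    react_rows = find_by_purl_prefix(demo_path, "pkg:npm/react")
    literal_rows = find_by_purl_prefix(demo_path, "pkg:pypi/a_b")

print([row["purl_prefix"] for row in react_rows])
```

Note that a prefix match on `pkg:npm/react` also returns `pkg:npm/react-dom`; append the separator you expect (for example `@`) or compare names exactly if you need a single package.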
