AppThreat's vulnerability database and package search library with a built-in sqlite based storage. OSV, CVE, GitHub, npm are the primary sources of vulnerabilities.
Introduction
This repo provides a vulnerability database and package search library for sources such as AppThreat vuln-list, OSV, NVD, and GitHub. Vulnerability data is downloaded from these sources and stored in SQLite-based storage with indexes, allowing offline access and efficient searches.
Why vulnerability db?
A good vulnerability database must have the following properties:
- Multiple upstream sources are used by vdb to improve accuracy and reduce false negatives.
- SQLite database containing data in CVE 5.2 schema format is precompiled and distributed as files via ghcr to simplify download.
- With automatic purl prefix generation even for git repos, searches on the database can be performed with purl, cpe, or even http git url string.
- Every row in the database uses an open specification such as CVE 5.2 or Package URL (purl and vers) thus preventing the possibility of vendor lock-in.
Vulnerability Data sources
- Linux vuln-list (Forked from AquaSecurity)
- OSV (1)
- NVD
- GitHub
1 - We exclude Linux and oss-fuzz feeds by default. Set the environment variable OSV_INCLUDE_FUZZ=true to include them.
2 - Malware feeds are included by default, thus increasing the db size slightly. Set the environment variable OSV_EXCLUDE_MALWARE=true to exclude them.
Linux distros
- AlmaLinux
- Debian
- Alpine
- Amazon Linux
- Arch Linux
- RHEL/CentOS
- Rocky Linux
- Ubuntu
- OpenSUSE
- Photon
- Chainguard
- Wolfi OS
Installation
pip install "appthreat-vulnerability-db>=6.7.0"
To install vdb with optional dependencies such as oras use the [oras] or [all] dependency group.
pip install "appthreat-vulnerability-db[all]"
NOTE: VDB v6 is a major rewrite that uses a SQLite database. Current users of depscan v5 must continue using version 5.8.x.
pip install appthreat-vulnerability-db==5.8.0
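To confirm which release line is installed before choosing between the v6 and v5 instructions, a small standard-library check can help. This is a minimal sketch; the helper names `installed_version` and `major_version` are illustrative and are not part of vdb.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "appthreat-vulnerability-db") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def major_version(version: Optional[str]) -> Optional[int]:
    """Extract the leading major number from a version string such as '6.7.0'."""
    if not version:
        return None
    head = version.split(".", 1)[0]
    return int(head) if head.isdigit() else None
```

Calling `major_version(installed_version())` returns 6 for a v6 install, 5 for the 5.8.x line, and None when the package is not installed.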
Usage
This package is best used as a library for managing vulnerabilities. It is used by owasp-dep-scan, a free open-source dependency audit tool. A limited CLI is also available for testing the library directly.
[!IMPORTANT] The AppThreat-hosted database images and workflows are best treated as bootstrap or evaluation defaults. For production use, especially when you need larger variants such as app + OS, we strongly recommend creating and publishing your own pre-built database versions with your own CI/CD workflows and storage.
Why:
- Security / provenance: Your team controls when data is built, where it is published, and which upstream sources and retention windows are allowed.
- Performance: You can publish smaller, faster-to-download databases that match your environment instead of pulling a one-size-fits-all image.
- Cost control: Large variants such as app + OS require significant compute, disk, and network bandwidth. Running scheduled builds on self-hosted infrastructure lets you scale them intentionally and budget for them explicitly.
Option 1: Download AppThreat pre-built database (Quick start)
Download a pre-built SQLite database (refreshed every 12 hours) containing all application vulnerabilities (~ 700MB). This is the fastest way to evaluate vdb, bootstrap a workstation, or validate an integration.
# pip install appthreat-vulnerability-db[all]
vdb --download-image
You can execute this command daily or when a fresh database is required. For long-running production workflows, prefer mirroring or rebuilding this database inside your own environment and then distributing it from your own registry, object store, or artifact repository.
Metadata searches such as full-text, alias, reference, package-name, symbol, source, severity, date, and malware-aware filters require an extended database. To download the app-only extended artifact with the same command, override the app-only URL:
export VDB_APP_ONLY_DATABASE_URL=ghcr.io/appthreat/vdbxz-app-extended:v6.7.x
vdb --download-image
To perform container and OS scans, download the full image (~ 7.5GB), which includes all application and OS vulnerabilities.
vdb --download-full-image
For the app+OS extended artifact, override the full database URL:
export VDB_DATABASE_URL=ghcr.io/appthreat/vdbxz-extended:v6.7.x
vdb --download-full-image
Because the full image is substantially larger and more expensive to build, test, and distribute, teams scanning containers or operating system packages should strongly prefer their own scheduled workflow that produces a tailored variant for the distros and time windows they actually support.
Use any sqlite browser or cli tools to load and query the two databases.
- data.index.vdb6 - Index db with purl prefix and vers.
- data.vdb6 - Contains CVE 5.2 source records normalized into cve_source_data and package/version locator rows in cve_data.
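Because both files are ordinary SQLite databases, they can also be inspected from Python. The sketch below only lists table names, so it makes no assumption about column layouts. It uses the standard-library `sqlite3` module, and the helper name `list_tables` is illustrative rather than part of vdb.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: str) -> list[str]:
    """Return the table names in a SQLite file, opened read-only."""
    # mode=ro prevents accidental writes to a downloaded database.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

For example, `list_tables(os.path.join(os.environ["VDB_HOME"], "data.vdb6"))` should include cve_data and cve_source_data for a v6.7+ database.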
Database layout note for v6.7+
VDB 6.7 normalizes repeated CVE 5.2 source blobs into a hash-keyed cve_source_data table and stores package/version locator rows separately in cve_data. It also keeps extended search metadata opt-in: default public databases leave cve_metadata and cve_metadata_text empty to minimize data.index.vdb6 size, while *-extended database variants populate those tables for text, alias, reference, symbol, severity, source, and date-aware searches. Public database artifacts are rebuilt from scratch by the release workflows, so schema migrations for older .vdb6 files are not maintained.
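Since default and extended databases share the same file names, it can be useful to check which kind is present before calling metadata search APIs. The following is a minimal sketch that relies only on the statement above that default databases leave cve_metadata empty. The helper name `has_extended_metadata` is hypothetical, and the check uses the standard-library `sqlite3` module rather than any vdb API.

```python
import sqlite3
from pathlib import Path


def has_extended_metadata(index_db_path: str, table: str = "cve_metadata") -> bool:
    """Return True when the given table exists and holds at least one row."""
    uri = Path(index_db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        exists = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        if not exists:
            return False
        # The name was matched against sqlite_master above, so quoting it here is safe.
        row = conn.execute(f'SELECT 1 FROM "{table}" LIMIT 1').fetchone()
        return row is not None
    finally:
        conn.close()
```

Pass the path to data.index.vdb6; a False result suggests a default (non-extended) database.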
Option 2: Download pre-built database (ORAS)
Using ORAS cli might be slightly faster.
export VDB_HOME=$HOME/vdb
oras pull ghcr.io/appthreat/vdbxz:v6.7.x -o $VDB_HOME
cd $VDB_HOME
tar -xvf *.tar.xz
rm *.tar.xz
Use the matching *-extended image, for example ghcr.io/appthreat/vdbxz-app-extended:v6.7.x or ghcr.io/appthreat/vdbxz-extended:v6.7.x, when metadata search APIs are required.
Option 3: Use HuggingFace cli
Download one of the databases.
pip install -U "huggingface_hub[cli]"
app only database
export VDB_HOME=$(pwd)/app
huggingface-cli download AppThreat/vdb --include "app/*.vdb6" --repo-type dataset --local-dir .
app only 10 year database
export VDB_HOME=$(pwd)/app-10y
huggingface-cli download AppThreat/vdb --include "app-10y/*.vdb6" --repo-type dataset --local-dir .
app and os database
export VDB_HOME=$(pwd)/app-os
huggingface-cli download AppThreat/vdb --include "app-os/*.vdb6" --repo-type dataset --local-dir .
app and os 10 year database
export VDB_HOME=$(pwd)/app-os-10y
huggingface-cli download AppThreat/vdb --include "app-os-10y/*.vdb6" --repo-type dataset --local-dir .
Extended variants use the same names as the database variation table, such as app-extended/, app-2y-extended/, app-10y-extended/, app-os-extended/, and app-os-10y-extended/:
export VDB_HOME=$(pwd)/app-extended
huggingface-cli download AppThreat/vdb --include "app-extended/*.vdb6" --repo-type dataset --local-dir .
Citation
Use the below citation in your research.
@misc{vdb,
author = {Team AppThreat},
month = Feb,
title = {{AppThreat vulnerability-db}},
howpublished = {{https://huggingface.co/datasets/AppThreat/vdb}},
year = {2025}
}
Option 4: Build and publish your own database workflow (Recommended for production)
If you depend on vdb regularly, build your own pre-built databases and publish them internally. This is the recommended approach for enterprises, security teams, and integrators.
Typical reasons to own the workflow:
- Publish from infrastructure you trust and control.
- Reduce supply-chain and availability dependencies on third-party hosted refresh jobs.
- Tune the database scope for your environment to reduce artifact size and download time.
- Use self-hosted runners or dedicated build machines for larger app + OS datasets, where compute, storage, and transfer costs are significant.
At a high level, the workflow is:
- Set the retention and distro selection environment variables for your environment.
- Run vdb --cache or vdb --cache-os on scheduled infrastructure.
- Package the resulting .vdb6 files.
- Publish them to your own OCI registry, object store, file share, or artifact repository.
- Point clients and integrations to your published URL instead of the AppThreat default.
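The packaging step in the workflow above can be sketched with the standard library. This is an illustration under stated assumptions: the helper name `package_vdb` is hypothetical, and the flat archive layout (files stored at the archive root) is a choice made for this example, not a documented requirement of the AppThreat artifacts.

```python
import tarfile
from pathlib import Path


def package_vdb(vdb_home: str, archive_path: str) -> list[str]:
    """Bundle the .vdb6 files (and vdb.meta, if present) into a .tar.xz archive."""
    home = Path(vdb_home)
    members = sorted(home.glob("*.vdb6"))
    meta = home / "vdb.meta"
    if meta.is_file():
        members.append(meta)
    if not members:
        raise FileNotFoundError(f"no .vdb6 files found in {home}")
    with tarfile.open(archive_path, "w:xz") as tar:
        for path in members:
            # Store files at the archive root so extraction into VDB_HOME stays flat.
            tar.add(path, arcname=path.name)
    return [p.name for p in members]
```

The resulting archive can then be pushed with whatever tooling your registry or artifact store provides.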
Cache application vulnerabilities
vdb --cache
Build an extended database with metadata search support:
vdb --cache --include-metadata
The equivalent environment toggle is VDB_INCLUDE_METADATA=true.
To remove any existing databases:
vdb --clean
The typical size of this database is over 700 MB.
Cache from just OSV
vdb --cache --only-osv
To cache a longer period of historic data, set the following environment variables. See the environment variables reference for the complete list.
- NVD_START_YEAR - Default: 2020. Can be set as early as 2002.
- GITHUB_PAGE_COUNT - Default: 2. Supports up to 20.
Cache application and OS vulnerabilities
vdb --cache-os
Build an extended application and OS database with metadata search support:
vdb --cache-os --include-metadata
Cache builds print minimal progress updates, such as the source year being fetched or the latest CVE stored. Use --quiet to suppress the logo, logs, and progress output:
vdb --cache-os --quiet
Note the size of the database with OS vulnerabilities is over 7.5 GB. It is possible to ignore or include specific OS distros using environment variables.
For example, to exclude AlmaLinux and Ubuntu data, set the following environment variables:
export VDB_IGNORE_ALMALINUX=true
export VDB_IGNORE_UBUNTU=true
Refer to the variable LINUX_DISTRO_VULN_LIST_PATHS in config.py for the full list of distro strings supported.
For example, a team that only scans modern application dependencies can build a much smaller artifact by using a recent NVD_START_YEAR and sticking to vdb --cache. A platform team that only supports a subset of Linux distros can use VDB_IGNORE_* or VDB_INCLUDE_* environment variables before running vdb --cache-os to avoid paying the build and distribution cost for irrelevant data.
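For scheduled builds driven from Python, the same scoping can be expressed as an environment dictionary passed to the CLI. This is a minimal sketch: `build_env` and `run_cache_os` are hypothetical helper names, while NVD_START_YEAR, VDB_IGNORE_<DISTRO>, and vdb --cache-os come from this README.

```python
import os
import subprocess
from typing import Optional


def build_env(start_year: int, ignored_distros: list[str],
              base_env: Optional[dict] = None) -> dict:
    """Return a copy of the environment with build-scope variables applied."""
    env = dict(os.environ if base_env is None else base_env)
    env["NVD_START_YEAR"] = str(start_year)
    for distro in ignored_distros:
        env[f"VDB_IGNORE_{distro.upper()}"] = "true"
    return env


def run_cache_os(env: dict) -> int:
    """Run the OS cache build; requires the vdb CLI to be installed and on PATH."""
    return subprocess.run(["vdb", "--cache-os"], env=env, check=False).returncode
```

For instance, `run_cache_os(build_env(2022, ["almalinux", "ubuntu"]))` would start a build that skips those two distros.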
Environment variables
Most boolean toggles treat true or 1 as enabled. Set path and build-scope variables before importing vdb.lib.config in long-running Python processes, because those values are read at import time.
Paths and local storage
| Variable | Default | Used by | Description |
|---|---|---|---|
| VDB_HOME | Platform user data directory for vdb | CLI and library | Directory containing data.vdb6, data.index.vdb6, and vdb.meta. |
| VDB_CACHE | Platform user cache directory for vdb | vdb --cache-os / AquaSource | Cache directory. If $VDB_CACHE/vuln-list.zip exists, it is used instead of downloading AppThreat vuln-list. |
| VDB_TEMP_DIR | System temp directory | SQLite setup | Directory for SQLite temporary files during large builds and VACUUM operations. Use a partition with enough free space for app+OS builds. |
Pre-built database downloads
| Variable | Default | Used by | Description |
|---|---|---|---|
| VDB_APP_ONLY_DATABASE_URL | ghcr.io/appthreat/vdbxz-app:v6.7.x | vdb --download-image, MCP auto-download, integrations | OCI image URL for the default app-only database. Override this to consume an internally published artifact. |
| VDB_DATABASE_URL | ghcr.io/appthreat/vdbxz:v6.7.x, or ghcr.io/appthreat/vdbxz-10y:v6.7.x when USE_VDB_10Y=true | vdb --download-full-image, integrations | OCI image URL for the app+OS database. |
| VDB_EXTENDED_DATABASE_URL | ghcr.io/appthreat/vdbxz-extended:v6.7.x | Integrations | Config constant for the app+OS extended database with metadata tables populated. |
| VDB_APP_ONLY_EXTENDED_DATABASE_URL | ghcr.io/appthreat/vdbxz-app-extended:v6.7.x | Integrations | Config constant for the app-only extended database with metadata tables populated. |
| USE_VDB_10Y | unset | VDB_DATABASE_URL default selection | When enabled, changes the default app+OS download URL to the 10-year image. |
Source and feed selection for builds
| Variable | Default | Used by | Description |
|---|---|---|---|
| NVD_START_YEAR | 2020 | NVD/AppThreat vuln-list conversion | Start year for NVD-style CVE data. Older years increase coverage, build time, and database size. |
| GITHUB_TOKEN | unset | GitHub advisory ingestion | Token used for the GitHub GraphQL API. Avoid printing this value in logs. |
| GITHUB_GRAPHQL_URL | https://api.github.com/graphql | GitHub advisory ingestion | Alternate GitHub GraphQL endpoint, useful for testing or enterprise proxies. |
| GITHUB_PAGE_COUNT | 2 | GitHub advisory ingestion | Number of GitHub advisory GraphQL pages to fetch during a full refresh. |
| NPM_PAGE_COUNT | 2 | npm advisory configuration | Number of npm advisory pages to fetch where npm advisory ingestion is used. |
| OSV_INCLUDE_FUZZ | unset | OSV ingestion | Include Linux, OSS-Fuzz, and Android OSV feeds that are excluded by default to reduce false positives. Any non-empty value enables this. |
| OSV_EXCLUDE_MALWARE | unset | OSV conversion | Exclude OSV malware advisories whose identifiers start with MAL. Any non-empty value enables this. |
| VDB_OSV_STORE_BATCH_SIZE | 100 | OSV ingestion | Number of converted OSV records to store per database batch when building directly from OSV feeds. Invalid values fall back to 100; minimum is 1. |
| VDB_INCLUDE_METADATA | unset | CLI, storage, search metadata indexes | Populate extended metadata tables for full-text, alias, reference, package-name, symbol, severity, source, and date-aware searches. Equivalent to --include-metadata. |
| VDB_METADATA_* | unset | vdb.meta creation | Adds custom build metadata to vdb.meta; the prefix is stripped and the key is lowercased. Values true/1 and false/0 are stored as booleans. |
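The VDB_METADATA_* rule in the table above (strip the prefix, lowercase the key, convert true/1 and false/0 to booleans) can be illustrated as follows. This is a sketch of the described behavior, not vdb's actual code: the function name `collect_build_metadata` is hypothetical, and treating the boolean strings case-insensitively is an assumption.

```python
def collect_build_metadata(environ: dict, prefix: str = "VDB_METADATA_") -> dict:
    """Derive custom build metadata entries from prefixed environment variables."""
    meta = {}
    for key, value in environ.items():
        if not key.startswith(prefix) or key == prefix:
            continue
        name = key[len(prefix):].lower()
        lowered = value.strip().lower()  # assumption: case-insensitive booleans
        if lowered in ("true", "1"):
            meta[name] = True
        elif lowered in ("false", "0"):
            meta[name] = False
        else:
            meta[name] = value
    return meta
```

With VDB_METADATA_TEAM=platform and VDB_METADATA_NIGHTLY=true set, the sketch yields a `team` string entry and a `nightly` boolean entry.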
OS distro filtering for builds
| Variable | Default | Used by | Description |
|---|---|---|---|
| VDB_IGNORE_OS | unset | OSV source configuration | Skip OSV operating-system feeds added by default. Use app-only workflows for the smallest app database. |
| VDB_IGNORE_<DISTRO> | unset | AppThreat vuln-list and selected OSV distro feeds | Exclude distro-specific data. Static OSV toggles are VDB_IGNORE_ALMALINUX, VDB_IGNORE_ALPINE, VDB_IGNORE_REDHAT, VDB_IGNORE_DEBIAN, VDB_IGNORE_ROCKYLINUX, VDB_IGNORE_MAGEIA, VDB_IGNORE_ALPAQUITA, and VDB_IGNORE_MINIMOS; AppThreat vuln-list also supports the distro keys in LINUX_DISTRO_VULN_LIST_PATHS. |
| VDB_EXCLUDE_<DISTRO> | unset | AppThreat vuln-list filtering | Alias for excluding AppThreat vuln-list distro paths. |
| VDB_INCLUDE_<DISTRO> | unset | AppThreat vuln-list filtering | Force-include distro paths from LINUX_DISTRO_VULN_LIST_PATHS, such as VDB_INCLUDE_ALPINE=true or VDB_INCLUDE_SUSE=true. |
| VDB_INCLUDE_SUSE | unset | AppThreat vuln-list filtering | Include SUSE vuln-list data, which is ignored by default for performance. This is also covered by VDB_INCLUDE_<DISTRO>. |
Output and SQLite tuning
| Variable | Default | Used by | Description |
|---|---|---|---|
| VDB_QUIET | unset | CLI | Suppress logo, logs, and cache progress output. Equivalent to --quiet. |
| VDB_PROGRESS_INTERVAL | 10000 | Source progress callbacks | Minimum number of records between progress messages. Invalid values fall back to 10000; minimum is 1. |
| VDB_SQLITE_IMMUTABLE | unset | Search database connections | Open file-backed search databases with SQLite's immutable URI option for read-only deployments where .vdb6 files are not modified while the process is running. |
| VDB_SQLITE_CACHE_SIZE | -65536 | SQLite setup | Value passed to PRAGMA cache_size. The default is approximately 64 MiB using SQLite's negative KiB convention. |
| VDB_SQLITE_JOURNAL_MODE | DELETE | SQLite setup | Value passed to PRAGMA journal_mode. Valid values are DELETE, TRUNCATE, PERSIST, MEMORY, WAL, and OFF; invalid values fall back to DELETE. |
| VDB_SQLITE_SYNCHRONOUS | OFF | SQLite setup | Value passed to PRAGMA synchronous. Valid values are OFF, NORMAL, FULL, EXTRA, or 0-3; invalid values fall back to OFF. |
| PYTHONIOENCODING | unset | Windows CLI/MCP startup | If unset on Windows, VDB reconfigures standard streams to UTF-8. |
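The negative KiB convention used by VDB_SQLITE_CACHE_SIZE can be checked with a short calculation. The sketch below uses the standard-library `sqlite3` module (vdb itself may use a different SQLite binding); the helper names are illustrative, and the 4096-byte page size used for positive values is SQLite's common default, assumed here.

```python
import sqlite3


def cache_size_mib(cache_size: int, page_size: int = 4096) -> float:
    """Convert a PRAGMA cache_size value into MiB.

    Negative values are sizes in KiB; positive values are page counts.
    """
    if cache_size < 0:
        return abs(cache_size) * 1024 / (1024 * 1024)
    return cache_size * page_size / (1024 * 1024)


def apply_cache_size(conn: sqlite3.Connection, cache_size: int) -> int:
    """Set PRAGMA cache_size and return the value SQLite reports back."""
    conn.execute(f"PRAGMA cache_size = {int(cache_size)}")
    return conn.execute("PRAGMA cache_size").fetchone()[0]
```

For the default of -65536, `cache_size_mib` returns 64.0, matching the approximate 64 MiB stated in the table.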
MCP server
| Variable | Default | Used by | Description |
|---|---|---|---|
| VDB_AGE_DAYS | 2 | MCP server | Number of days before the MCP server considers the local database stale and attempts an app-only ORAS download. Non-numeric values are passed through to the freshness check, so prefer an integer string. |
Available Database Variations
VDB provides multiple pre-built databases optimized for different use cases, balancing data depth and file size. Both ORAS (ghcr.io) and HuggingFace datasets are updated every 12 hours.
Treat the variants below as reference baselines. They are useful defaults, but many teams should create their own equivalents with narrower scope, longer retention, or distro-specific filtering and publish them through their own delivery pipeline.
Note for AI Agents: Use this table to decide which database URL to pass to the download_image() function based on the user's requirements.
| Database Scope | Time Context | ORAS Image URL (v6.7.x) | HuggingFace Path | Uncompressed data.vdb6 GiB | Uncompressed index GiB | .tar.xz data GiB | .tar.xz index GiB | .zst data GiB | .zst index GiB | Recommended Use Case |
|---|---|---|---|---|---|---|---|---|---|---|
| App Only | 2 Years (2024+) | ghcr.io/appthreat/vdbxz-app-2y:v6.7.x | app-2y/ | 2.05 | 0.26 | 0.13 | 0.03 | 0.19 | 0.04 | Fast, lightweight scans for very modern applications. |
| App Only | 2 Years Extended (2024+) | ghcr.io/appthreat/vdbxz-app-2y-extended:v6.7.x | app-2y-extended/ | 2.05 | 0.73 | 0.13 | 0.07 | 0.19 | 0.10 | App-only scans that need metadata search APIs. |
| App Only | Default (2020+) | ghcr.io/appthreat/vdbxz-app:v6.7.x | app/ | 2.96 | 0.43 | 0.21 | 0.05 | 0.31 | 0.07 | Standard application dependency scanning. |
| App Only | Default Extended (2020+) | ghcr.io/appthreat/vdbxz-app-extended:v6.7.x | app-extended/ | 2.96 | 1.05 | 0.21 | 0.11 | 0.31 | 0.15 | Default app database with all metadata search APIs. |
| App Only | 10 Years (2016+) | ghcr.io/appthreat/vdbxz-app-10y:v6.7.x | app-10y/ | 3.52 | 0.55 | 0.24 | 0.06 | 0.37 | 0.08 | Deep auditing of legacy application software. |
| App Only | 10 Years Extended (2016+) | ghcr.io/appthreat/vdbxz-app-10y-extended:v6.7.x | app-10y-extended/ | 3.52 | 1.26 | 0.24 | 0.13 | 0.37 | 0.17 | Legacy app audits that need metadata search APIs. |
| App + OS | Default (2020+) | ghcr.io/appthreat/vdbxz:v6.7.x | app-os/ | 42.36 | 4.66 | 3.42 | 0.20 | 5.23 | 0.30 | Standard container and OS-level package scanning. |
| App + OS | Default Extended (2020+) | ghcr.io/appthreat/vdbxz-extended:v6.7.x | app-os-extended/ | 42.36 | 8.14 | 3.42 | 0.37 | 5.23 | 0.56 | Container and OS scans that need metadata search APIs. |
| App + OS | 10 Years (2016+) | ghcr.io/appthreat/vdbxz-10y:v6.7.x | app-os-10y/ | 47.55 | 5.28 | 3.74 | 0.23 | 5.71 | 0.35 | Deep auditing of legacy Linux containers/VMs. |
| App + OS | 10 Years Extended (2016+) | ghcr.io/appthreat/vdbxz-10y-extended:v6.7.x | app-os-10y-extended/ | 47.55 | 9.19 | 3.74 | 0.43 | 5.71 | 0.65 | Legacy OS audits that need metadata search APIs. |
(Note: The ORAS URLs above use .tar.xz compression. You can replace vdbxz with vdbzst in the URL if you prefer Zstandard compression).
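The naming pattern in the table is regular enough to generate URLs programmatically. The helper below is a sketch derived only from the rows and the compression note above; `image_url` is a hypothetical name, and whether every compression and variant combination is actually published should be verified against the registry.

```python
from typing import Optional

REGISTRY = "ghcr.io/appthreat"
TAG = "v6.7.x"
# Time windows listed in the table for each scope (None means the default 2020+ window).
WINDOWS = {"app": (None, "2y", "10y"), "app-os": (None, "10y")}


def image_url(scope: str, window: Optional[str] = None,
              extended: bool = False, compression: str = "xz") -> str:
    """Build an ORAS image URL following the naming pattern in the table."""
    if scope not in WINDOWS:
        raise ValueError(f"unknown scope: {scope}")
    if window not in WINDOWS[scope]:
        raise ValueError(f"window {window!r} is not listed for scope {scope}")
    if compression not in ("xz", "zst"):
        raise ValueError(f"unknown compression: {compression}")
    parts = [f"vdb{compression}"]
    if scope == "app":
        parts.append("app")
    if window:
        parts.append(window)
    if extended:
        parts.append("extended")
    name = "-".join(parts)
    return f"{REGISTRY}/{name}:{TAG}"
```

For example, `image_url("app", "10y", extended=True)` produces the App Only 10 Years Extended URL from the table.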
If you operate your own workflow, keep the same naming pattern internally if it helps downstream tooling, but publish from infrastructure you control. This lets you swap in smaller app-only images, distro-restricted OS images, or longer-retention images without waiting on shared hosted workflows.
Custom Vulnerability Data
VDB supports loading custom vulnerability data from a local directory at runtime. This allows you to:
- Add Private Vulnerabilities: Include internal CVEs that are not public.
- Override False Positives: Correct data returned by the official database by marking specific versions as unaffected.
Custom data must follow the CVE 5.2 JSON Schema. Supported file extensions are .json, .yaml, .yml, and .toml.
To use custom data, pass the directory path to the --custom-data argument.
vdb --search pkg:npm/my-lib@1.0.0 --custom-data /path/to/custom/vulns
Example 1: Adding a Private Vulnerability
Create a file private-vuln.yaml. Since you are defining a new vulnerability record, you use the cna container.
dataType: CVE_RECORD
dataVersion: "5.2"
cveMetadata:
  cveId: PRIVATE-2025-001
  assignerOrgId: 00000000-0000-4000-8000-000000000000
  state: PUBLISHED
  datePublished: "2025-01-01T00:00:00Z"
  dateUpdated: "2025-01-01T00:00:00Z"
containers:
  cna:
    providerMetadata:
      orgId: 00000000-0000-4000-8000-000000000000
    descriptions:
      - lang: en
        value: "Private vulnerability in internal library"
    affected:
      - vendor: internal
        product: my-lib
        packageName: my-lib
        packageURL: pkg:npm/my-lib
        versions:
          - version: "1.0.0"
            status: affected
            versionType: semver
            lessThan: "2.0.0"
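Since .json is also a supported extension, the same private record can be generated from Python instead of written by hand. This is a minimal sketch that mirrors the fields of the YAML example above; the helper names `private_record` and `write_record`, the placeholder org ID, and the output file naming are assumptions for illustration.

```python
import json
from pathlib import Path

ORG_ID = "00000000-0000-4000-8000-000000000000"  # placeholder, as in the YAML example


def private_record(cve_id: str, package_url: str, affected_from: str,
                   fixed_in: str, description: str, published: str,
                   vendor: str = "internal") -> dict:
    """Build a CVE-record-shaped dict with a single affected package range."""
    name = package_url.rsplit("/", 1)[-1]
    return {
        "dataType": "CVE_RECORD",
        "dataVersion": "5.2",
        "cveMetadata": {
            "cveId": cve_id,
            "assignerOrgId": ORG_ID,
            "state": "PUBLISHED",
            "datePublished": published,
            "dateUpdated": published,
        },
        "containers": {
            "cna": {
                "providerMetadata": {"orgId": ORG_ID},
                "descriptions": [{"lang": "en", "value": description}],
                "affected": [{
                    "vendor": vendor,
                    "product": name,
                    "packageName": name,
                    "packageURL": package_url,
                    "versions": [{
                        "version": affected_from,
                        "status": "affected",
                        "versionType": "semver",
                        "lessThan": fixed_in,
                    }],
                }],
            }
        },
    }


def write_record(record: dict, directory: str) -> Path:
    """Write the record as <cveId>.json inside the custom data directory."""
    target = Path(directory) / f"{record['cveMetadata']['cveId']}.json"
    target.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return target
```

The directory passed to `write_record` is the one you would later supply to --custom-data.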
Example 2: Overriding a False Positive
If the official database reports CVE-2023-9999 for pkg:pypi/requests but you have determined it is a false positive for your specific version, you can override it using an ADP (Authorized Data Publisher) container. This is the recommended way to append or dispute existing vulnerability data.
Logic: If a CVE ID and Package URL combination exists in your custom data, VDB will ignore the entry from the official database and use yours instead.
Create override.yaml:
dataType: CVE_RECORD
dataVersion: "5.2"
cveMetadata:
  cveId: CVE-2023-9999
  assignerOrgId: 00000000-0000-4000-8000-000000000000
  state: PUBLISHED
containers:
  # Use 'adp' to append/modify existing vulnerability data
  adp:
    - providerMetadata:
        orgId: 00000000-0000-4000-8000-000000000000
        shortName: "MySecTeam"
      descriptions:
        - lang: en
          value: "Override to mark specific version as unaffected"
      affected:
        - product: requests
          packageName: requests
          packageURL: pkg:pypi/requests
          versions:
            # Explicitly mark your version as unaffected
            - version: "2.31.0"
              status: unaffected
              versionType: semver
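The override rule described above (a custom entry replaces the official entry with the same CVE ID and Package URL) can be pictured with plain dictionaries. This is only a conceptual sketch, not vdb's implementation: the function name `apply_overrides` and the flat record shape with `cve_id` and `purl` keys are assumptions.

```python
def apply_overrides(official: list[dict], custom: list[dict]) -> list[dict]:
    """Return official records with any (cve_id, purl) matches replaced by custom ones."""
    def key(record: dict) -> tuple:
        return (record["cve_id"], record["purl"])

    custom_keys = {key(record) for record in custom}
    # Drop official entries shadowed by custom data, then append all custom entries.
    kept = [record for record in official if key(record) not in custom_keys]
    return kept + list(custom)
```

Custom records that match nothing in the official data are simply appended, which corresponds to the private vulnerability case in Example 1.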
CLI Usage
usage: vdb [-h] [--clean] [--cache] [--cache-os] [--only-osv] [--only-aqua] [--only-ghsa] [--include-metadata] [--quiet] [--search SEARCH] [--search-text SEARCH_TEXT] [--search-alias SEARCH_ALIAS]
[--search-reference SEARCH_REFERENCE] [--search-package-name SEARCH_PACKAGE_NAME] [--search-symbol SEARCH_SYMBOL] [--search-packages SEARCH_PACKAGES]
[--batch-size BATCH_SIZE] [--list-malware] [--bom BOM_FILE] [--download-image] [--download-full-image] [--print-vdb-metadata]
[--custom-data CUSTOM_DATA]
AppThreat's vulnerability database and package search library with a sqlite storage.
options:
-h, --help show this help message and exit
--clean Clear the vulnerability database cache from platform specific user_data_dir.
--cache Cache vulnerability information in platform specific user_data_dir.
--cache-os Cache OS vulnerability information in platform specific user_data_dir.
--only-osv Use only OSV as the source. Use with --cache.
--only-aqua Use only Aqua vuln-list as the source. Use with --cache.
--only-ghsa Use only recent ghsa as the source. Use with --cache.
--include-metadata Populate extended metadata tables for text, alias, reference, symbol, severity, and source searches. Increases index database size.
--quiet Suppress logo, logs, and cache progress output.
--search SEARCH Search for the package or vulnerability ID (CVE, GHSA, ALSA, DSA, etc.) in the database. Use purl, cpe, or git http url.
--search-text SEARCH_TEXT
Perform metadata/full-text search across vulnerability descriptions, aliases, references, and affected symbols.
--search-alias SEARCH_ALIAS
Search vulnerability aliases such as GHSA, OSV, or vendor advisory identifiers.
--search-reference SEARCH_REFERENCE
Search reference URLs and reference text in vulnerability metadata.
--search-package-name SEARCH_PACKAGE_NAME
Search vulnerability metadata by package name or namespace/name.
--search-symbol SEARCH_SYMBOL
Search affected functions or modules captured in vulnerability metadata.
--search-packages SEARCH_PACKAGES
Path to a JSON file containing a list of package locators to search in bulk. Each item may include purl, cpe, url, alias, package_name, or search.
--batch-size BATCH_SIZE
Batch size to use with --search-packages.
--list-malware List latest malwares with CVE ID beginning with MAL-.
--bom BOM_FILE Search for packages in the CycloneDX BOM file.
--download-image Downloaded pre-created vdb image to platform specific user_data_dir. Application vulnerabilities only.
--download-full-image
Downloaded pre-created vdb image to platform specific user_data_dir. All vulnerabilities including OS.
--print-vdb-metadata Display metadata about the current vdb in user_data_dir.
--custom-data CUSTOM_DATA
Path to directory containing custom vulnerability data (JSON/YAML/TOML) to override/augment results.
CLI search
It is possible to perform a range of searches using the cli.
vdb --search pkg:pypi/xml2dict@0.2.2
# Search based on a purl prefix
vdb --search pkg:pypi/xml2dict
# Full url and short form for swift
vdb --search "pkg:swift/github.com/vapor/vapor@4.39.0"
vdb --search "pkg:swift/vapor/vapor@4.89.0"
# Search by cpe
vdb --search "cpe:2.3:a:npm:gitblame:*:*:*:*:*:*:*:*"
# Search by colon separated values
vdb --search "npm:gitblame:0.0.1"
# Search by vulnerability id (CVE, GHSA, ALSA, DSA, etc.)
vdb --search CVE-2024-25169
# Search with wildcard for CVE
vdb --search CVE-2025-%
# Search by git url
vdb --search "https://github.com/electron/electron"
# Search by CycloneDX SBOM
vdb --bom bom.json
# Full-text search across descriptions, aliases, references, and symbols.
# These metadata searches require an extended database built with
# --include-metadata or downloaded from a *-extended artifact.
vdb --search-text "deserialization parser"
# Search explicit alias/reference/package metadata indexes
vdb --search-alias GHSA-gh-9999
vdb --search-reference github.com/example/widget-ui
vdb --search-package-name widget-ui
vdb --search-symbol parse_payload
# Bulk search from a JSON file containing package locator objects
vdb --search-packages packages.json
# Stream bulk lookups in smaller batches
vdb --search-packages packages.json --batch-size 25
Example packages.json for --search-packages:
[
{ "purl": "pkg:pypi/requests@2.31.0" },
{ "url": "https://github.com/pallets/flask" },
{ "alias": "GHSA-xxxx-yyyy-zzzz" },
{ "package_name": "flask" }
]
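A file in this shape can be generated from an inventory rather than edited by hand. The sketch below writes a locator list and rejects keys outside the set named in the --search-packages help text; the helper name `write_locators` and the strict validation are choices made for this example, and vdb itself may be more lenient.

```python
import json

# Keys listed in the --search-packages help text.
ALLOWED_KEYS = {"purl", "cpe", "url", "alias", "package_name", "search"}


def write_locators(locators: list[dict], path: str) -> int:
    """Validate package locator objects and write them as a JSON list."""
    for item in locators:
        unknown = set(item) - ALLOWED_KEYS
        if not item or unknown:
            raise ValueError(f"invalid locator {item!r}; unknown keys: {sorted(unknown)}")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(locators, handle, indent=2)
    return len(locators)
```

The written file can then be passed to vdb --search-packages, optionally with --batch-size.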
List recent malware
To list malware entries with the MAL- prefix, use the following command.
vdb --list-malware
For Integrators
The vulnerability-db is designed to be embedded into other Python applications.
Basic Integration
### AI/Agent Integration Script
# AI Agents: Use this script pattern to query vdb and get structured JSON output.
import json
import os
from vdb.lib import config, db6 as db_lib, search
from vdb.lib.orasclient import download_image
# 1. Prefer an organization-controlled URL. Fall back to the upstream default
# only for local testing or bootstrap scenarios.
DB_URL = os.getenv("VDB_APP_ONLY_DATABASE_URL", config.VDB_APP_ONLY_DATABASE_URL)
# 2. Download the database if missing or stale (older than 1 day)
if db_lib.needs_update(days=1):
    download_image(DB_URL, config.DATA_DIR)
# 3. Load Custom Data (Optional)
# This will override DB results with local YAML/JSON definitions
# search.load_custom_data("/path/to/custom_vulns")
# 4. Perform the search
target = "pkg:pypi/xml2dict@0.2.2"
results = search.search_by_any(target, with_data=True)
# 5. Extract and parse the Pydantic CVE 5.2 models into standard JSON
output = []
for res in results:
    vuln = {
        "cve_id": res['cve_id'],
        "fixed_in": res['fix_version'],
    }
    # res['source_data'] is a Pydantic model. Use model_dump to serialize.
    if res.get('source_data'):
        vuln['cve_data'] = res['source_data'].model_dump(mode='json')
    output.append(vuln)
# Print standard JSON for the agent to read via stdout
print(json.dumps(output, indent=2))
For production deployments, point VDB_APP_ONLY_DATABASE_URL or VDB_DATABASE_URL at the artifacts produced by your own workflow so application instances do not depend directly on AppThreat-hosted refresh jobs. If an integration needs metadata search APIs, use the corresponding extended artifact URL, such as VDB_APP_ONLY_EXTENDED_DATABASE_URL / VDB_EXTENDED_DATABASE_URL, or override the app/full download URL to your own *-extended image.
Advanced Usage
Batching and Generators
When processing large SBOMs, search_by_cdx_bom returns a generator that yields results in batches to reduce memory usage.
results_generator = search.search_by_cdx_bom("bom.json", with_data=True)
for result_batch in results_generator:
    for res in result_batch:
        # Process individual vulnerability result
        pass
Custom Database Locations
If you are managing the database files manually or in a custom location, ensure config.DATA_DIR is set via environment variable VDB_HOME before importing the library, or update the vdb.lib.config paths dynamically.
For read-only deployments where .vdb6 files are not modified while the process is running, set VDB_SQLITE_IMMUTABLE=true to open search databases with SQLite's immutable URI option.
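For readers unfamiliar with SQLite's immutable option, the sketch below shows the equivalent URI when opening a .vdb6 file directly. It uses the standard-library `sqlite3` module for illustration and does not show how vdb opens its own connections; `open_immutable` is a hypothetical helper name.

```python
import sqlite3
from pathlib import Path


def open_immutable(db_path: str) -> sqlite3.Connection:
    """Open a SQLite file read-only with the immutable URI option set."""
    # immutable=1 tells SQLite the file cannot change, so it skips locking
    # and change detection. Only use it when nothing else modifies the file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro&immutable=1"
    return sqlite3.connect(uri, uri=True)
```

Writes through such a connection fail, which makes this a reasonable fit for container images or read-only volumes that ship a fixed database.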
Result Structure
The results returned by search functions are dictionaries containing:
- cve_id: The vulnerability identifier.
- source_data: A Pydantic model (vdb.lib.cve_model.CVE) of the CVE 5.2 data.
- vers: The version range string from the index.
- fix_version: The specific version where the issue is resolved (if applicable).
Bulk and Filtered Searches
VDB includes higher-level APIs for bulk package searches, BOM summaries, and filter-aware lookups.
from vdb.lib import search
filters = {
    "severity_threshold": "HIGH",
    "sources": ["osv", "github"],
    "exclude_malware": True,
    "package_ecosystem": "pypi",
    "page_size": 25,
}
package_results = search.search_packages(
    [
        {"purl": "pkg:pypi/requests@2.31.0"},
        {"url": "https://github.com/pallets/flask"},
        {"cpe": "cpe:2.3:a:npm:lodash:4.17.20:*:*:*:*:*:*:*"},
    ],
    with_data=False,
    filters=filters,
)
for pkg in package_results:
    print(pkg["locator"], pkg["result_count"], pkg["max_severity"])
For very large package sets, stream them in batches:
for batch in search.search_packages_batched(packages, batch_size=100, with_data=False):
    for pkg in batch:
        print(pkg["locator"], pkg["result_count"])
CycloneDX BOMs can be summarized or expanded into detailed package findings:
summary = search.search_bom_summary("bom.json", filters={"severity_threshold": "MEDIUM"})
detailed = search.search_bom_detailed("bom.json", with_data=True)
Text, Alias, Reference, and Symbol Search
In addition to locator-based lookups, VDB supports metadata and full-text search over aliases, references, descriptions, package names, and affected symbols.
Default v6.7+ databases omit extended metadata to reduce index size, so text, alias, reference, symbol, source, severity, malware, and date-aware metadata searches require a database built with --include-metadata or downloaded from a *-extended artifact. When metadata is absent, metadata-specific APIs fail safe by returning no metadata matches.
search.search_by_alias("GHSA-xxxx-yyyy-zzzz", with_data=False)
search.search_by_reference("github.com/pallets/flask", with_data=False)
search.search_by_package_name("flask", with_data=False)
search.search_by_symbol("parse_payload", with_data=False)
search.search_full_text("deserialization parser", with_data=False)
These APIs use the SQLite metadata tables in data.index.vdb6. Normal locator searches still work without metadata; choose an extended database when your integration depends on these APIs or metadata filters.
Troubleshooting
Database Locked Errors
VDB uses SQLite. If you encounter apsw.BusyError or "database is locked":
- Ensure you are not running multiple vdb --cache processes simultaneously.
- If using VDB in a multi-threaded application, ensure you are treating the database connections as read-only where possible.
Disk Space Issues
The full OS vulnerability database is large (~7.5GB). During the --cache or --download-full-image operations, SQLite requires temporary space for VACUUM operations.
- Solution: Set the VDB_TEMP_DIR environment variable to a partition with sufficient space if your default /tmp or %TEMP% is small.
export VDB_TEMP_DIR=/mnt/large_volume/vdb_temp
vdb --cache-os
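A build script can check free space up front rather than failing midway through a VACUUM. The following is a minimal standard-library sketch: the helper names are illustrative, the fallback to the system temp directory reflects the default documented for VDB_TEMP_DIR, and the threshold is something you would choose from the sizes in the database variations table.

```python
import os
import shutil
import tempfile


def temp_dir_for_build() -> str:
    """Return VDB_TEMP_DIR when set, otherwise the system temp directory."""
    return os.environ.get("VDB_TEMP_DIR") or tempfile.gettempdir()


def has_free_space(directory: str, required_gib: float) -> bool:
    """Return True when the partition holding `directory` has enough free space."""
    free_bytes = shutil.disk_usage(directory).free
    return free_bytes >= required_gib * 1024 ** 3
```

For example, a scheduled job could abort early when `has_free_space(temp_dir_for_build(), 50)` is False before starting vdb --cache-os.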
ORAS Download Failures
If vdb --download-image fails:
- Ensure you installed the package with the extra: pip install appthreat-vulnerability-db[oras].
- Firewalls may block ghcr.io. Try downloading manually using the Alternative Download Options.
If your environment restricts outbound access or you need stronger provenance guarantees, this is another sign that you should publish pre-built databases from your own network and update clients to consume those internal artifacts.
Encoding Errors on Windows
If you see UnicodeEncodeError in your console output:
- VDB attempts to force utf-8 encoding for stdout/stderr.
- Ensure your terminal (PowerShell/CMD) is configured for UTF-8 (chcp 65001).
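If your own script that wraps VDB prints non-ASCII text, you can apply the same idea to its output streams. The sketch below shows the general technique with TextIOWrapper.reconfigure; it is not VDB's implementation, and the helper name force_utf8 is hypothetical. The demo uses an in-memory ASCII stream so the effect is visible without touching the real console.

```python
import io


def force_utf8(stream):
    """Switch a text stream to UTF-8 if it supports reconfigure(); report success."""
    reconfigure = getattr(stream, "reconfigure", None)
    if callable(reconfigure):
        # errors="replace" keeps output flowing even for unencodable characters
        reconfigure(encoding="utf-8", errors="replace")
        return True
    return False


# Demo: a stream that starts as ASCII would fail on these characters.
raw = io.BytesIO()
ascii_stream = io.TextIOWrapper(raw, encoding="ascii")
force_utf8(ascii_stream)
ascii_stream.write("caf\u00e9 \u2713")
ascii_stream.flush()
demo_bytes = raw.getvalue()
print(demo_bytes)
```

In a real script you would call force_utf8(sys.stdout) and force_utf8(sys.stderr) once at startup, before printing anything.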
Model Context Protocol (MCP) server
Refer to the readme
The MCP server now exposes structured tool results, resource templates such as cve://{id} and purl://{purl}, concrete resources like vdb://metadata and vdb://health, bulk SBOM tools, richer prompts, and filter-aware search options such as severity thresholds, source filters, malware flags, package scope, and pagination.
Read .vdb6 files in other languages
.vdb6 files are standard SQLite database files. Use any modern SQLite library to read and query them. This repo includes simple Node.js and Deno examples for demonstration.
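When reading a .vdb6 file from another tool, a useful first step is to list its tables and columns instead of assuming a schema. The following sketch uses Python's standard sqlite3 module; the in-memory table is only a stand-in so the snippet runs anywhere, and its columns are illustrative, not the actual VDB schema.

```python
import sqlite3


def describe_schema(conn):
    """Map each table name in the database to its list of column names."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        quoted = '"' + table.replace('"', '""') + '"'  # quote the identifier safely
        schema[table] = [col[1] for col in conn.execute(f"PRAGMA table_info({quoted})")]
    return schema


# Stand-in database; point sqlite3.connect() at a real .vdb6 file instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cve_index (cve_id TEXT, purl_prefix TEXT)")
print(describe_schema(conn))
```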
Local development
Set up uv by following the official documentation.
uv sync --all-extras --dev
uv run vdb --help
uv run pytest
Search benchmark harness
Use contrib/bench_search.py to compare search latency while changing schemas, indexes, or SQLite pragmas:
python contrib/bench_search.py \
--data vdb_data/data.vdb6 \
--index vdb_data/data.index.vdb6 \
--query 'pkg:pypi/requests@2.31.0' \
--repeat 5
Pass --with-data to include CVE source-data decoding in the benchmark, or --json for machine-readable output.
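The idea behind such a harness can be sketched in a few lines: run the same query several times and compare timings before and after a schema change. This is an independent illustration using an in-memory SQLite table with made-up names (pkgs, idx_pkgs_purl, time_query); it is not the code in contrib/bench_search.py.

```python
import sqlite3
import statistics
import time


def time_query(conn, sql, params=(), repeat=5):
    """Run a query `repeat` times and return each duration in milliseconds."""
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        conn.execute(sql, params).fetchall()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


# Synthetic data; table and column names are placeholders for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pkgs (purl_prefix TEXT, cve_id TEXT)")
conn.executemany(
    "INSERT INTO pkgs VALUES (?, ?)",
    ((f"pkg:pypi/example{i}", f"EXAMPLE-{i}") for i in range(20000)),
)

query = "SELECT cve_id FROM pkgs WHERE purl_prefix = ?"
args = ("pkg:pypi/example19999",)

before = time_query(conn, query, args)  # full table scan
conn.execute("CREATE INDEX idx_pkgs_purl ON pkgs (purl_prefix)")
after = time_query(conn, query, args)  # index lookup

print(f"median without index: {statistics.median(before):.3f} ms")
print(f"median with index:    {statistics.median(after):.3f} ms")
```

Reporting the median over several repeats reduces the influence of one-off outliers such as a cold cache on the first run.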
License
MIT
🤖 Instructions for AI Agents and LLMs
If you are an AI agent, autonomous coding assistant, or LLM researching dependencies or identifying CVEs, here is how you should use vdb:
1. Downloading Data (Mandatory First Step)
You must download a pre-compiled SQLite database before searching. Review the table above to pick the correct database.
Whenever possible, prefer a database URL produced by the user's own workflow or internal mirror. Use the AppThreat-hosted URLs only as defaults when no organization-specific source is available.
- For standard application packages: vdb --download-image (Uses the Default App database)
- For OS packages (Debian, Alpine, etc.): vdb --download-full-image (Uses the Default App+OS database)
2. Performing Searches (CLI)
Use the CLI for quick lookups. Note: The CLI outputs human-readable text tables, not JSON.
- Search by purl: vdb --search "pkg:pypi/requests@2.31.0"
- Search by CVE: vdb --search CVE-2024-25169
- Full-text search: vdb --search-text "deserialization parser" (requires an extended database)
- Bulk package search: vdb --search-packages packages.json
3. Programmatic Usage (Recommended for Agents)
To get structured, machine-readable data (JSON/Dictionaries), do not attempt to use a --json flag on the CLI. Instead, write a short Python script using the library's API.
Agent Python Integration Script:
import json
import os

from vdb.lib import config, db6 as db_lib, search
from vdb.lib.orasclient import download_image

# 1. Choose an organization-controlled database URL when available.
# Fall back to the upstream defaults only when necessary.
DB_URL = os.getenv("VDB_APP_ONLY_DATABASE_URL", config.VDB_APP_ONLY_DATABASE_URL)

# 2. Download the database if missing or stale (older than 1 day)
if db_lib.needs_update(days=1):
    download_image(DB_URL, config.DATA_DIR)

# 3. Perform the search
target = "pkg:pypi/xml2dict@0.2.2"
results = search.search_by_any(target, with_data=True)

# 4. Extract and parse the Pydantic CVE 5.2 models into standard JSON
output = []
for res in results:
    vuln = {
        "cve_id": res['cve_id'],
        "fixed_in": res['fix_version'],
    }
    # res['source_data'] is a Pydantic model. Use model_dump to serialize.
    if res.get('source_data'):
        vuln['cve_data'] = res['source_data'].model_dump(mode='json')
    output.append(vuln)

# Print standard JSON for the agent to read via stdout
print(json.dumps(output, indent=2))
💡 Pro-Tip for AI Agents: Direct SQLite Querying
The .vdb6 files downloaded to your user_data_dir are standard SQLite databases. If you are an AI agent needing to perform complex aggregations, bulk exports, or custom filtering, you can query the database directly using sqlite3:
# Example: Find all entries for a specific purl prefix directly in the index
sqlite3 ~/.local/share/vdb/data.index.vdb6 "SELECT * FROM cve_index WHERE purl_prefix LIKE 'pkg:npm/react%';"
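The same lookup can be done from Python with a parameterized query, which avoids quoting problems when the prefix comes from user input. This is a sketch under assumptions: the table and column names cve_index and purl_prefix come from the command above, while the cve_id column, the sample rows, and the helper name find_by_purl_prefix are illustrative only.

```python
import sqlite3


def find_by_purl_prefix(conn, prefix):
    """Return rows from cve_index whose purl_prefix starts with `prefix`."""
    # Escape LIKE wildcards so the prefix is matched literally.
    escaped = prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    sql = (
        "SELECT * FROM cve_index "
        "WHERE purl_prefix LIKE ? ESCAPE '\\' "
        "ORDER BY purl_prefix"
    )
    return conn.execute(sql, (escaped + "%",)).fetchall()


# Stand-in data; connect to your downloaded data.index.vdb6 file instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cve_index (cve_id TEXT, purl_prefix TEXT)")
conn.executemany(
    "INSERT INTO cve_index VALUES (?, ?)",
    [
        ("EXAMPLE-1", "pkg:npm/react"),
        ("EXAMPLE-2", "pkg:npm/react-dom"),
        ("EXAMPLE-3", "pkg:npm/redux"),
    ],
)

rows = find_by_purl_prefix(conn, "pkg:npm/react")
print(rows)
```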