
Generate datasets and models based on vulnerability data from Vulnerability-Lookup.

VulnTrain

VulnTrain offers a suite of commands to generate diverse AI datasets and train models using comprehensive vulnerability data from Vulnerability-Lookup. It harnesses over one million JSON records from all supported advisory sources to build high-quality, domain-specific models.

Additionally, data from the vulnerability-lookup:meta container, including enrichment sources such as vulnrichment and Fraunhofer FKIE, is incorporated to enhance model quality.

The resulting datasets and models are published on Hugging Face.

For more information about the use of AI in Vulnerability-Lookup, please refer to the user manual.

Usage

Install VulnTrain:

$ pipx install VulnTrain

Three types of commands are available:

  • Dataset generation: Create and prepare datasets.
  • Model training: Train models using the prepared datasets.
    • Train a model to classify vulnerabilities by severity (model available on Hugging Face).
    • Train a model for text generation to assist in writing vulnerability descriptions (model available on Hugging Face).
  • Model validation: Assess the performance of trained models (validations, benchmarks, etc.).

Check out the documentation for more information.
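
As an illustration of how a trained severity classifier might be consumed, here is a minimal sketch using the Hugging Face `transformers` library. The model identifier, the helper names `load_classifier` and `top_severity`, and the sample description are assumptions made for this example; they are not taken from the VulnTrain documentation, so confirm the exact model name on the project's Hugging Face page before relying on it.

```python
from typing import Any, Callable, Dict, List

# Assumption: identifier of the published severity model on Hugging Face.
# Verify it on the project's Hugging Face page; it is not stated in this README.
MODEL_ID = "CIRCL/vulnerability-severity-classification-roberta-base"


def load_classifier(model_id: str = MODEL_ID) -> Callable[[str], List[Dict[str, Any]]]:
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so the rest of this module works without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def top_severity(
    description: str,
    classifier: Callable[[str], List[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Return the highest-scoring label for one vulnerability description."""
    # A text-classification pipeline returns a list of {"label", "score"} dicts
    # when called with a single string.
    predictions = classifier(description)
    best = max(predictions, key=lambda p: p["score"])
    return {"label": best["label"], "score": float(best["score"])}


# Example usage (requires network access and `pip install transformers torch`):
#   clf = load_classifier()
#   print(top_severity("A heap-based buffer overflow allows remote code execution.", clf))
```

Passing the classifier in as an argument keeps `top_severity` independent of any particular model, so the same helper can be reused with a locally trained checkpoint or replaced by a stub in tests.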

How to cite

Bonhomme, C., & Dulaunoy, A. (2025). VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification (Version 1.4.0) [Computer software]. https://doi.org/10.48550/arXiv.2507.03607

@misc{bonhomme2025vlai,
    title={VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification},
    author={Cédric Bonhomme and Alexandre Dulaunoy},
    year={2025},
    eprint={2507.03607},
    archivePrefix={arXiv},
    primaryClass={cs.CR}
}

License

VulnTrain is licensed under the GNU General Public License version 3.

Copyright (c) 2025-2026 Computer Incident Response Center Luxembourg (CIRCL)
Copyright (C) 2025-2026 Cédric Bonhomme - https://github.com/cedricbonhomme
Copyright (C) 2025 Léa Ulusan - https://github.com/3LS3-1F
