AI-powered CVE prioritizer

HAIstings

HAIstings is an AI-powered companion that helps you assess and prioritize Common Vulnerabilities and Exposures (CVEs) in your Kubernetes infrastructure. Named after Arthur Hastings, Agatha Christie's character and the crime-solving partner of Hercule Poirot, HAIstings works alongside you to decide which vulnerabilities in your Kubernetes environments to address first.

Overview

HAIstings analyzes vulnerability reports from tools like trivy-operator, generates prioritized reports, and engages in an interactive conversation to refine its recommendations based on your specific context and requirements.

Features

  • Vulnerability Prioritization: Automatically prioritizes vulnerabilities based on severity, impact, and context
  • Interactive Refinement: Engages in a conversation to gather more context and refine prioritization
  • Infrastructure Context: Ingests infrastructure repository information to provide more relevant recommendations
  • Persistent Memory: Maintains conversation history across sessions using checkpoints
  • Customizable Output: Adjusts recommendations based on user-provided context
  • Retrieval-Augmented Generation (RAG): Selectively includes only relevant infrastructure files in the context, reducing overall context size and improving performance

Installation

Prerequisites

  • Python 3.12
  • Kubernetes cluster with trivy-operator installed
  • Properly configured kubeconfig file

Using Poetry

# Clone the repository
git clone https://github.com/stacklok/HAIstings.git
cd HAIstings

# Install dependencies
poetry install

Using pip

pip install haistings

Usage

Basic Usage

Generate a vulnerability report showing the top 25 most critical vulnerabilities:

haistings

Customizing Output

Specify the number of vulnerabilities to show:

haistings --top 30

Providing Context

Provide additional context to improve prioritization:

haistings --notes usercontext.txt

Where usercontext.txt contains information about your infrastructure, such as:

example-service is a very critical service that is internet-facing. We should assign more priority to it.

Flux is critical to our infrastructure, so if it has a vulnerability on anything related to how it processes git requests, then we should assign it very high priority.

Ingesting Infrastructure Repository

Provide your infrastructure repository for additional context:

haistings --infra-repo https://github.com/yourusername/infra-repo --gh-token YOUR_GITHUB_TOKEN

For a specific subdirectory:

haistings --infra-repo https://github.com/yourusername/infra-repo --infra-repo-subdir kubernetes --gh-token YOUR_GITHUB_TOKEN

RAG Configuration

Control the Retrieval-Augmented Generation functionality:

# Disable RAG (use traditional approach)
haistings --use-vectordb false

# Specify maximum number of relevant files per component
haistings --max-relevant-files 10
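
To illustrate what a per-component file limit means in a retrieval step, here is a minimal sketch of top-k similarity search. It is not HAIstings' implementation: the `embed`, `cosine`, and `relevant_files` helpers are hypothetical, and a toy word-count vector stands in for the embeddings a real vector database would store. Only the `max_relevant_files` parameter name is borrowed from the `--max-relevant-files` option.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def relevant_files(
    query: str, files: dict[str, str], max_relevant_files: int = 5
) -> list[str]:
    """Return the paths most similar to the query, best match first."""
    q = embed(query)
    ranked = sorted(
        files, key=lambda path: cosine(q, embed(files[path])), reverse=True
    )
    return ranked[:max_relevant_files]


# Hypothetical repository contents, reduced to a few keywords per file.
files = {
    "apps/example-service/deployment.yaml": "kind deployment name example-service image example-service",
    "flux/gotk-components.yaml": "kind deployment name source-controller flux git",
    "monitoring/grafana.yaml": "kind deployment name grafana dashboards",
}

best = relevant_files("example-service deployment image", files, max_relevant_files=1)
print(best)
```

A lower limit keeps the prompt small at the cost of possibly missing a relevant manifest; a higher limit does the opposite.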

Persistent Conversations

Use SQLite to persist conversation history:

haistings --checkpoint-saver-driver sqlite
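
As a rough picture of what persisting a conversation in SQLite involves, the sketch below stores and reloads messages with Python's standard `sqlite3` module. It does not reproduce HAIstings' checkpoint format: the `messages` table, the `thread_id` column, and the helper functions are all assumptions made for illustration.

```python
import sqlite3


def open_history(path: str) -> sqlite3.Connection:
    """Open (or create) a database holding conversation messages."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "thread_id TEXT NOT NULL, role TEXT NOT NULL, content TEXT NOT NULL)"
    )
    return conn


def append_message(conn: sqlite3.Connection, thread_id: str, role: str, content: str) -> None:
    """Append one message; the `with` block commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO messages (thread_id, role, content) VALUES (?, ?, ?)",
            (thread_id, role, content),
        )


def load_history(conn: sqlite3.Connection, thread_id: str) -> list[tuple[str, str]]:
    """Return (role, content) pairs for a thread in insertion order."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE thread_id = ? ORDER BY id",
        (thread_id,),
    )
    return rows.fetchall()


# ":memory:" keeps the demo self-contained; a file path would survive restarts.
conn = open_history(":memory:")
append_message(conn, "session-1", "user", "example-service is internet-facing.")
append_message(conn, "session-1", "assistant", "Noted; I will raise its priority.")
print(load_history(conn, "session-1"))
```

With the default `memory` driver, by contrast, history would be expected to last only for the lifetime of the process.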

Full Example

haistings --top 30 --notes usercontext.txt --infra-repo https://github.com/yourusername/infra-repo --max-relevant-files 8 --checkpoint-saver-driver sqlite

How It Works

  1. Vulnerability Collection: HAIstings connects to your Kubernetes cluster and collects vulnerability reports from trivy-operator.
  2. Prioritization: Vulnerabilities are prioritized based on severity (critical vulnerabilities are weighted 10x more than high vulnerabilities).
  3. Repository Ingestion: Infrastructure repository files are ingested and stored in a vector database for efficient retrieval.
  4. Relevant File Retrieval: Using RAG (Retrieval-Augmented Generation), only the most relevant files for each vulnerability are retrieved based on similarity search.
  5. Context Integration: User-provided context and relevant infrastructure files are integrated into the analysis.
  6. Report Generation: A prioritized report is generated in a conversational style inspired by Arthur Hastings.
  7. Interactive Refinement: HAIstings engages in a conversation to gather more context and refine its recommendations.
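
The severity weighting in step 2 can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the `ComponentSummary` class and `rank` function are hypothetical, and the only fact taken from the description above is that a critical vulnerability counts 10 times as much as a high one. The sketch ranks components rather than individual CVEs, which is also an assumption.

```python
from dataclasses import dataclass

# Assumed weights: only the 10:1 ratio comes from the description above.
HIGH_WEIGHT = 1
CRITICAL_WEIGHT = 10 * HIGH_WEIGHT


@dataclass
class ComponentSummary:
    """Hypothetical per-component tally of findings."""

    name: str
    critical: int
    high: int

    @property
    def score(self) -> int:
        """Weighted severity score: critical findings dominate."""
        return self.critical * CRITICAL_WEIGHT + self.high * HIGH_WEIGHT


def rank(summaries: list[ComponentSummary], top: int = 25) -> list[ComponentSummary]:
    """Return the `top` highest-scoring components, highest first."""
    return sorted(summaries, key=lambda s: s.score, reverse=True)[:top]


summaries = [
    ComponentSummary("internal-batch-job", critical=0, high=12),
    ComponentSummary("flux", critical=2, high=5),
    ComponentSummary("example-service", critical=3, high=7),
]

for s in rank(summaries, top=2):
    print(s.name, s.score)
```

Note that under this weighting a component with 12 high findings (score 12) still ranks below one with 2 critical and 5 high findings (score 25), which is the behavior the 10x factor is meant to produce.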

Command Line Options

Option                     Description                                               Default
--top                      Number of vulnerabilities to show                         25
--notes                    Path to a file containing additional context              None
--infra-repo               URL to your infrastructure repository                     None
--infra-repo-subdir        Subdirectory in the repository to ingest                  None
--gh-token                 GitHub Personal Access Token for private repositories     None
--checkpoint-saver-driver  Memory persistence driver (memory or sqlite)              memory
--use-vectordb             Use vector database for repository ingestion              true
--max-relevant-files       Maximum number of relevant files per component            5
--debug                    Enable debug mode                                         False
--model                    LLM model to use (when not using CodeGate)                this-makes-no-difference-to-codegate
--model-provider           Model provider                                            openai
--api-key                  API key for the model provider (when not using CodeGate)  fake-api-key
--base-url                 Base URL for the model provider                           http://127.0.0.1:8989/v1/mux

Example Output

# HAIsting's Security Report

## Introduction

Good day! Arthur Hastings at your service. I've meticulously examined the vulnerability reports from your Kubernetes infrastructure and prepared a prioritized assessment of the security concerns that require your immediate attention.

## Summary

After careful analysis, I've identified several critical vulnerabilities that demand prompt remediation:

1. **example-service (internet-facing service)**
   - Critical vulnerabilities: 3
   - High vulnerabilities: 7
   - Most concerning: CVE-2023-1234 (Remote code execution)
   
   This service is particularly concerning due to its internet-facing nature, as mentioned in your notes. I recommend addressing these vulnerabilities with the utmost urgency.

2. **Flux (GitOps controller)**
   - Critical vulnerabilities: 2
   - High vulnerabilities: 5
   - Most concerning: CVE-2023-5678 (Git request processing vulnerability)
   
   As you've noted, Flux is critical to your infrastructure, and this Git request processing vulnerability aligns with your specific concerns.

[Additional entries...]

## Conclusion

I say, these vulnerabilities require prompt attention, particularly the ones affecting your internet-facing services and deployment controllers. I recommend addressing the critical vulnerabilities in example-service and Flux as your top priorities. Should you require any further assistance or have additional context to share, I remain at your service.

Development

Setting Up Development Environment

# Clone the repository
git clone https://github.com/stacklok/HAIstings.git
cd HAIstings

# Install dependencies including development dependencies
poetry install

# Run tests
poetry run pytest

Code Style

This project uses:

  • Black for code formatting
  • isort for import sorting
  • mypy for type checking
  • flake8 for linting

# Format code
poetry run black .
poetry run isort .

# Type check
poetry run mypy .

# Lint
poetry run flake8

Future Improvements / TODO

  • Custom Vulnerability Scoring: Add support for custom vulnerability scoring based on user-defined criteria beyond just severity.
  • Integration with More Scanners: Extend beyond trivy-operator to support other vulnerability scanners.
  • Visualization Dashboard: Create a web interface to visualize vulnerability reports and trends over time.
  • Automated Remediation Suggestions: Provide specific remediation steps for common vulnerabilities.
  • Multi-Cluster Support: Add support for analyzing vulnerabilities across multiple Kubernetes clusters.

License

Apache-2.0
