LLM-powered cybersecurity news aggregator.

🔍 Lightman AI

LLM-Powered Cybersecurity News Intelligence Platform


Lightman AI is an intelligent cybersecurity news aggregation and risk assessment platform that helps organizations stay ahead of potential security threats. By leveraging advanced AI agents, it automatically monitors cybersecurity news sources, analyzes content for relevance, and integrates with service desk systems for streamlined threat intelligence workflows.

✨ Key Features

  • 🤖 AI-Powered Classification: Uses OpenAI GPT and Google Gemini models to intelligently classify cybersecurity news
  • 📰 Automated News Aggregation: Monitors cybersecurity news sources (currently TheHackerNews only)
  • 🎯 Risk Scoring: Configurable relevance scoring to filter noise and focus on critical threats
  • 🔗 Service Desk Integration: Automatically creates tickets for identified security risks
  • 📊 Evaluation Framework: Built-in tools to test and optimize AI agent performance
  • ⚙️ Flexible Configuration: TOML-based configuration with multiple prompt templates
  • 🚀 CLI Interface: Simple command-line interface for automation and scripting

🚀 Quick Start

Installation

pip

  1. Install Lightman AI:

    pip install lightman_ai
    
  2. Configure your AI agent (OpenAI or Gemini):

    export OPENAI_API_KEY="your-api-key"
    # or
    export GOOGLE_API_KEY="your-api-key"
    

    or store your API keys in a .env file:

    OPENAI_API_KEY="your-api-key"
    # or
    GOOGLE_API_KEY="your-api-key"
    
  3. Run the scanner:

    lightman run --agent openai --score 7
    

    or let it pick up the default values from your lightman.toml file:

    lightman run
    

Docker

  1. Pull the image:

    docker pull elementsinteractive/lightman-ai:latest
    
  2. Create a configuration file:

    echo '[default]
    agent = "openai"
    score_threshold = 8
    prompt = "development"
    
    [prompts]
    development = "Analyze cybersecurity news for relevance to our organization."' > lightman.toml
    
  3. Run with Docker:

    docker run --rm \
      -v "$(pwd)/lightman.toml:/app/lightman.toml" \
      -e OPENAI_API_KEY="your-api-key" \
      elementsinteractive/lightman-ai:latest \
      run --config-file /app/lightman.toml --score 8 --agent openai
    

    You can also use a .env file instead of setting the environment variables through the CLI:

    cp .env.example .env
    

    Fill it with your values and run:

    docker run --rm \
      -v "$(pwd)/lightman.toml:/app/lightman.toml" \
      --env-file .env \
      elementsinteractive/lightman-ai:latest \
      run --config-file /app/lightman.toml --score 8 --agent openai
    

🔧 Usage

CLI Options

Option         Description                             Default
--agent        AI agent to use (openai, gemini)        From config file
--score        Minimum relevance score (1-10)          From config file
--prompt       Prompt template name                    From config file
--config-file  Path to configuration file              lightman.toml
--config       Configuration section to use            default
--env-file     Path to environment variables file      .env
--dry-run      Preview results without taking action   False
--prompt-file  File containing prompt templates        lightman.toml
--start-date   Start date to retrieve articles         False
--today        Retrieve articles from today            False
--yesterday    Retrieve articles from yesterday        False
-v             Be more verbose on output               False
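
The options above can be combined, so a wrapper script can assemble the command line programmatically. The following is a minimal sketch and not part of Lightman AI: build_run_command is a hypothetical helper that only emits flags listed in the table, and it assumes the lightman executable is on your PATH if you pass the resulting list to subprocess.run.

```python
import shlex


def build_run_command(agent=None, score=None, prompt=None,
                      config_file=None, dry_run=False):
    """Assemble a `lightman run` argument list from optional overrides.

    Hypothetical helper: options left as None are omitted, so the CLI
    falls back to the values in the configuration file.
    """
    cmd = ["lightman", "run"]
    if agent is not None:
        cmd += ["--agent", agent]
    if score is not None:
        # The documented score range is 1-10.
        if not 1 <= score <= 10:
            raise ValueError("score must be between 1 and 10")
        cmd += ["--score", str(score)]
    if prompt is not None:
        cmd += ["--prompt", prompt]
    if config_file is not None:
        cmd += ["--config-file", config_file]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


# Show the shell form of a dry run with two overrides.
print(shlex.join(build_run_command(agent="openai", score=9, dry_run=True)))
# prints: lightman run --agent openai --score 9 --dry-run
```

Returning a list rather than a string keeps the arguments safe to hand to subprocess.run without shell quoting issues.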

Environment Variables:

lightman-ai uses the following environment variables:

  • OPENAI_API_KEY - Your OpenAI API key
  • GOOGLE_API_KEY - Your Google Gemini API key
  • SERVICE_DESK_URL - Service desk instance URL (optional)
  • SERVICE_DESK_USER - Service desk username (optional)
  • SERVICE_DESK_TOKEN - Service desk API token (optional)
  • TIME_ZONE - Your time zone (optional, defaults to UTC), e.g. "Europe/Amsterdam"

By default, it will try to load a .env file. You can also specify a different path with the --env-file option.
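
Before a scheduled run it can help to verify that the variables listed above are present. This is a minimal sketch under stated assumptions, not Lightman AI's own validation: check_environment and time_zone are hypothetical helpers, and the rule that the three service desk variables should be set together is an assumption made for illustration.

```python
import os

API_KEY_VARS = ("OPENAI_API_KEY", "GOOGLE_API_KEY")
SERVICE_DESK_VARS = ("SERVICE_DESK_URL", "SERVICE_DESK_USER", "SERVICE_DESK_TOKEN")


def check_environment(env):
    """Return a list of problems found in the mapping `env`."""
    problems = []
    # At least one provider key is needed to run an agent.
    if not any(env.get(name) for name in API_KEY_VARS):
        problems.append("set OPENAI_API_KEY or GOOGLE_API_KEY")
    # Assumption: the service desk variables are only useful as a complete set.
    present = [name for name in SERVICE_DESK_VARS if env.get(name)]
    if present and len(present) < len(SERVICE_DESK_VARS):
        missing = [name for name in SERVICE_DESK_VARS if name not in present]
        problems.append(
            "incomplete service desk settings, missing: " + ", ".join(missing)
        )
    return problems


def time_zone(env):
    """TIME_ZONE is optional and falls back to UTC."""
    return env.get("TIME_ZONE") or "UTC"


# Report problems in the current process environment (an empty list means OK).
print(check_environment(os.environ))
```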

⚙️ Configuration

Lightman AI uses TOML configuration files for flexible setup. Create a lightman.toml file:

[default]
agent = 'openai'              # AI agent to use (openai, gemini)
score_threshold = 8           # Minimum relevance score (1-10)
prompt = 'development'        # Prompt template to use

# Optional: Service desk integration
service_desk_project_key = "SEC"
service_desk_request_id_type = "incident"

# alternative configuration
[malware]
agent = 'openai'              # AI agent to use (openai, gemini)
score_threshold = 8           # Minimum relevance score (1-10)
prompt = 'malware'            # Prompt template to use

# Optional: Service desk integration
service_desk_project_key = "SEC"
service_desk_request_id_type = "incident"

[prompts]
development = """
Analyze the following cybersecurity news articles and determine their relevance to our organization.
Rate each article from 1-10 based on potential impact and urgency.
Focus on vulnerabilities."""

malware = """
Analyze the following cybersecurity news articles and determine their relevance to our organization.
Rate each article from 1-10 based on potential impact and urgency.
Focus on malware."""

custom_prompt = """
Your custom analysis prompt here...
"""

Note that a single file can hold several configuration sections (here default and malware) and several prompts. Select a section with the --config option.

It also supports keeping your prompts and your configuration settings in separate files. Specify the path to the prompts file with --prompt-file.

lightman.toml

[default]
agent = 'openai'              # AI agent to use (openai, gemini)
score_threshold = 8           # Minimum relevance score (1-10)
prompt = 'development'        # Prompt template to use

# Optional: Service desk integration
service_desk_project_key = "SEC"
service_desk_request_id_type = "incident"

prompts.toml

[prompts]
development = """
Analyze the following cybersecurity news articles and determine their relevance to our organization.
Rate each article from 1-10 based on potential impact and urgency.
Focus on: data breaches, malware, vulnerabilities, and threat intelligence.
"""

custom_prompt = """
Your custom analysis prompt here...
"""

Examples

# Run with default settings
lightman run

# Use specific AI agent and score threshold
lightman run --agent gemini --score 7

# Use custom prompt template
lightman run --prompt custom_prompt --config-file ./my-config.toml

# Use custom environment file
lightman run --env-file production.env --agent openai --score 8

# Dry run (preview results without creating service desk tickets)
lightman run --dry-run --agent openai --score 9

# Retrieve all the news from today
lightman run --agent openai --score 8 --prompt security_critical --today

# Retrieve all the news from yesterday
lightman run --agent openai --score 8 --prompt security_critical --yesterday

Development Installation

To fully use the provided setup for local development and testing, this project requires additional tooling, including the just command runner used in the commands below.

Then simply:

git clone git@github.com:elementsinteractive/lightman-ai.git
cd lightman-ai
just venv  # Creates virtual environment and installs dependencies
just test  # Runs the tests
just eval  # Runs the evaluation framework

📊 Evaluation & Testing

Lightman AI includes a comprehensive evaluation framework to test and optimize AI agent performance:

Running Evaluations

# Evaluate agent performance
just eval --agent openai --samples 3 --score 7

# Compare different agents
just eval --agent gemini --samples 5 

# Add tags to differentiate runs from one another
just eval --agent gemini --samples 5 --tag "first-run"
just eval --agent gemini --samples 5 --tag "second-run"

# Test custom prompts
just eval --prompt custom_security --samples 10

# Use custom environment file for evaluation
python -m eval.cli --env-file production.env --agent openai --samples 3

You can also provide defaults for eval in a TOML file:

[eval]
agent = 'openai'
score_threshold = 8
prompt = 'classify'
samples = 3

Evaluation Metrics

The evaluation system measures:

  • Precision: Accuracy of threat identification
  • Recall: Coverage of actual security threats
  • F1 Score: Balanced performance metric
  • Score Distribution: Analysis of relevance scoring patterns
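
For reference, the first three metrics follow their standard definitions. This is a minimal sketch of those definitions, not the evaluation framework's own code: precision_recall_f1 is a hypothetical helper, and articles are represented by arbitrary identifiers.

```python
def precision_recall_f1(predicted, relevant):
    """Compute precision, recall and F1 for two collections of article IDs.

    predicted: articles the agent flagged as relevant
    relevant:  articles known to be relevant (ground truth)
    """
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    # Guard against division by zero when either set is empty.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# The agent flags a and b; a and c are truly relevant.
print(precision_recall_f1({"a", "b"}, {"a", "c"}))
# prints: (0.5, 0.5, 0.5)
```

Precision drops when the agent flags irrelevant articles, recall drops when it misses relevant ones, and F1 is the harmonic mean of the two.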

Evaluation Dataset

For precision evaluation, Lightman AI uses a curated set of unclassified cybersecurity articles that serve as ground truth data. These articles include:

  • Real-world news articles from various cybersecurity sources
  • Mixed relevance levels - both highly relevant and irrelevant security news
  • Diverse threat categories - malware, data breaches, vulnerabilities, policy changes
  • Pre-validated classifications by security experts for accuracy benchmarking

The evaluation framework compares the AI agent's classifications against these known classifications to measure:

  • How accurately the agent identifies truly relevant threats (precision)
  • How well it avoids false positives from irrelevant news
  • Consistency across different types of security content

This approach ensures that performance metrics reflect real-world usage scenarios where the AI must distinguish between various types of cybersecurity news content.

Make sure to fill in RELEVANT_ARTICLES with the articles you classify as relevant, so that you can compare the accuracy after running the eval script.

Sentry

Sentry is optional: the application does not require it to function, and all features will work even if Sentry is not configured or fails to start. If you install the project via pip and want Sentry installed, run:

   pip install lightman-ai[sentry]

Sentry comes by default with the Docker image. If you don't want to use it, simply do not set the SENTRY_DSN environment variable.

The application automatically picks up environment variables from your environment or .env file. To enable Sentry, set the SENTRY_DSN environment variable; if it is not set, Sentry is skipped and the application runs normally. If Sentry fails to initialize for any reason (e.g., network issues or an invalid DSN), the application logs a warning and continues running without error monitoring, while still logging to stdout.
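
The behaviour described above follows a common "optional dependency" pattern. The sketch below illustrates that pattern under stated assumptions and is not Lightman AI's actual code: init_sentry and the logger name are hypothetical, and the only Sentry call used is sentry_sdk.init(dsn=...).

```python
import logging
import os

logger = logging.getLogger("lightman_sentry_sketch")  # hypothetical logger name


def init_sentry(env=os.environ):
    """Return True if Sentry was initialized, False if it was skipped."""
    dsn = env.get("SENTRY_DSN")
    if not dsn:
        # Not configured: run normally without error monitoring.
        return False
    try:
        # Imported lazily because the package is an optional extra.
        import sentry_sdk

        sentry_sdk.init(dsn=dsn)
    except Exception as exc:  # missing package, invalid DSN, ...
        logger.warning("Sentry disabled: %s", exc)
        return False
    return True


# With no DSN set, initialization is skipped.
print(init_sentry({}))
# prints: False
```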

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • TheHackerNews for providing cybersecurity news data
