Command-line tool for evaluating AI agents in two ways: code scanning (`flintai scan`, whitebox testing) and runtime evaluation (`flintai eval`, blackbox testing).

Flint AI

Ship AI agents with confidence

One CLI to analyze agent code and runtime behavior, any framework.

| | Flint AI Scan | Flint AI Eval |
| Command | flintai scan | flintai eval |
| What | AI-powered security analysis of your agent's code (whitebox testing) | Runtime behavioral evaluation with adversarial prompts (blackbox testing) |
| Output | Security findings mapped to OWASP Top 10 for LLM with CVSS severity scores | Evaluation scores (0-100%) mapped to OWASP Top 10 for LLM |

Why Flint AI?

  • AI-powered analysis — Contextual code understanding, not just pattern matching
  • OWASP ASI mapped — Findings aligned to Top 10 for Agentic Applications
  • 100% free — First results in minutes

Try it now - 5 minute Quickstart

Requirements

  • Python 3.13 or later
  • OpenGrep (required for Flint AI Scan)
  • A running agent accessible via HTTP (required for Flint AI Eval)
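A quick preflight check for the first two requirements can be sketched as below. This is not part of Flint AI: the function names are hypothetical, and the sketch assumes that OpenGrep is installed as an executable named `opengrep` and that the interpreter is reachable as `python3`.

```shell
# Return 0 if the given version string ("major.minor" or "major.minor.patch")
# is at least 3.13.
python_version_ok() {
  local major minor
  major="${1%%.*}"
  minor="${1#*.}"
  minor="${minor%%.*}"
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 13 ]; }
}

# Print a message for each requirement that appears to be missing.
check_flintai_requirements() {
  local version
  version="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null)"
  if [ -z "$version" ] || ! python_version_ok "$version"; then
    echo "Python 3.13 or later not found (got: ${version:-none})"
  fi
  # Assumption: the OpenGrep executable is called "opengrep".
  if ! command -v opengrep >/dev/null 2>&1; then
    echo "opengrep not found on PATH (needed for flintai scan)"
  fi
}
```

Running `check_flintai_requirements` prints nothing when both checks pass.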

Supported frameworks: Google ADK, Google GenAI, Anthropic, OpenAI, OpenAI Agents SDK, LangGraph, CrewAI, AutoGen, HuggingFace Transformers, HuggingFace smolagents

Step 1: Install Flint AI

pip install flintai-cli

Step 2: Configure your LLM provider

Flint AI uses AI to analyze agent code and score reliability. Run the interactive setup:

flintai init

You'll be prompted to select a provider (Gemini, OpenAI, Anthropic, or LiteLLM), choose a model, and enter your API key.

Where to get API keys

Run into issues? See installation troubleshooting

Step 3: Try the example agents

To demonstrate the CLI's capabilities, this tool ships with two example agents. You can get them here.

Both agents work with both flintai scan and flintai eval:

| Agent | Framework | Description |
| weather_agent | Google ADK | Weather assistant that looks up conditions for cities. Should refuse off-topic requests. |
| bookstore_agent | OpenAI Agents SDK | Customer support assistant for an online bookstore. Searches books, checks orders, and processes returns. |

The included examples/config.json configures both agents with built-in evaluations (OWASP LLM01–LLM09, PII, secrets) and custom tests.


flintai scan finds security issues in the code without running the agent. We'll scan the bookstore agent to see what issues Flint AI can find:

flintai scan examples/bookstore_agent/
Example: the scan found 2 security issues, a High-severity missing authentication finding and a Medium-severity unbounded execution loop.


flintai eval tests runtime behavior, so the agent needs to be running. Start the bookstore agent:

# Start the bookstore agent (serves on http://localhost:8010)
uvx --with openai-agents,fastapi --from uvicorn uvicorn examples.bookstore_agent.agent:app --port 8010 --host 0.0.0.0

In a new terminal, run evaluations:

flintai eval run --model model-bookstore-agent --config examples/config.json
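If you script these two steps, the evaluation can start before the agent is listening. A minimal sketch of a wait loop follows, assuming `curl` is installed; `wait_for_http` is a hypothetical helper, not a Flint AI command.

```shell
# Poll a URL until the server answers or the timeout (in seconds) expires.
# Any HTTP response counts as "up" (even a 404), because curl without --fail
# only returns non-zero here on connection-level errors.
wait_for_http() {
  local url="$1" timeout="${2:-30}" waited=0
  until curl --silent --output /dev/null --max-time 2 "$url"; do
    waited=$((waited + 1))
    if [ "$waited" -ge "$timeout" ]; then
      echo "gave up waiting for $url after ${timeout}s" >&2
      return 1
    fi
    sleep 1
  done
}

# Usage with the bookstore agent from the step above:
#   wait_for_http http://localhost:8010 60 && \
#     flintai eval run --model model-bookstore-agent --config examples/config.json
```

The timeout is approximate: it counts failed attempts with a one-second pause between them, not wall-clock time.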

Step 4: Test your own agents

See our documentation to configure, scan, and evaluate your agents.

Ship with confidence. Validate behavior, catch risks, prove readiness.

Commands

init

Setup wizard that configures Flint AI for first use. Creates the ~/.flintai directory with a .env file (LLM provider, API key, runtime settings) and a config.json skeleton.

Runs automatically on first use in non-CI environments. You can re-run it at any time to reconfigure.

flintai init

scan

AI-powered security analysis of agent source code. Finds vulnerabilities, misconfigurations, and OWASP Top 10 violations.

# Scan a directory
flintai scan /path/to/agent/code

# Scan a single file
flintai scan agent.py

# Specify output file
flintai scan /path/to/code --output results.json
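To scan several agents in one go, the directory form and the `--output` option shown above can be combined in a small loop. This is only a sketch: `scan_agents` and the `scan-<name>.json` naming scheme are illustrative choices, not something the CLI prescribes.

```shell
# Scan each given directory and write one results file per agent,
# named after the last path component of the directory.
scan_agents() {
  local dir name
  for dir in "$@"; do
    name="$(basename "$dir")"
    flintai scan "$dir" --output "scan-${name}.json"
  done
}

# Usage (the second path is a placeholder for your own agent):
#   scan_agents examples/bookstore_agent/ /path/to/agent/code
```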

Full scan guide

eval

Test agent behavior at runtime. Get an evaluation score that demonstrates production readiness.

# List all available evaluations
flintai eval evaluations list

# List your agents and models
flintai eval models list

# Attach an evaluation to your agent
flintai eval model-evaluations attach \
  --model my-agent \
  --eval eval-llm01-adversarial

# Run all evaluations for an agent
flintai eval run --model my-agent

The flintai eval command requires configuration. See Configuration to:

  1. Define your models (agents to test)
  2. View available evaluations
  3. Attach evaluations to models
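Once a model is defined, attaching evaluations and running them can be wrapped in one helper. The sketch below uses only the `attach` and `run` subcommands shown above; `attach_and_run` is a hypothetical name, and it assumes the CLI exits with a non-zero status when an attach fails.

```shell
# Attach each listed evaluation to a model, then run all of its evaluations.
# Stops at the first attach that fails (assumes a non-zero exit status on failure).
attach_and_run() {
  local model="$1" eval_name
  shift
  for eval_name in "$@"; do
    flintai eval model-evaluations attach --model "$model" --eval "$eval_name" || return 1
  done
  flintai eval run --model "$model"
}

# Usage: pick evaluation names from the output of `flintai eval evaluations list`.
#   attach_and_run my-agent eval-llm01-adversarial
```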

Full eval guide

Documentation

Complete guides and reference:

Data privacy

Flint AI runs on your machine, but several features can call external LLM providers. Which LLM is used is configured via GENERATOR_MODEL (located in ~/.flintai/.env, created by flintai init). You can set this to a remote managed LLM (e.g. gemini, openai, anthropic) or a locally hosted LLM (e.g. litellm or ollama).
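To check which model is currently configured before running a scan on sensitive code, you can read the setting back from the file. The sketch below assumes the file uses plain `KEY=value` lines, which is the usual dotenv convention but is not confirmed here; `env_value` is a hypothetical helper.

```shell
# Print the value of KEY from a dotenv-style file (assumes lines of the form KEY=value).
# If the key appears more than once, the last assignment is printed.
# Quotes around the value, if any, are printed as-is.
env_value() {
  local key="$1" file="${2:-$HOME/.flintai/.env}"
  sed -n "s/^${key}=//p" "$file" | tail -n 1
}

# Usage:
#   env_value GENERATOR_MODEL
```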

Read more.

Contributing

See CONTRIBUTING.md for details.

License

Free to use - full license.

Contact


Download files

Source Distribution

flintai_cli-1.0.0.tar.gz (1.7 MB)

Uploaded Source

Built Distribution

flintai_cli-1.0.0-py3-none-any.whl (1.8 MB)

Uploaded Python 3

File details

Details for the file flintai_cli-1.0.0.tar.gz.

File metadata

  • Download URL: flintai_cli-1.0.0.tar.gz
  • Upload date:
  • Size: 1.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for flintai_cli-1.0.0.tar.gz

| Algorithm | Hash digest |
| SHA256 | 87093fd40103e5fb661f486617eb7a38b00e46ac2609c21b948b27035ad3b763 |
| MD5 | a184cbde49150d79d63d0198774d267c |
| BLAKE2b-256 | 8e93c349ce24e6c1d5e771eea12b8ab2019ec626a07ffc245663d43953bed733 |

See more details on using hashes here.
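A downloaded file can be compared against its published SHA256 digest before installing. The sketch below is a generic check, not a Flint AI feature; `verify_sha256` is a hypothetical helper that uses `sha256sum` where available and falls back to `shasum -a 256`.

```shell
# Return 0 if the SHA256 digest of FILE equals EXPECTED (lowercase hex).
verify_sha256() {
  local file="$1" expected="$2" actual
  if command -v sha256sum >/dev/null 2>&1; then
    actual="$(sha256sum "$file" | cut -d ' ' -f 1)"
  else
    actual="$(shasum -a 256 "$file" | cut -d ' ' -f 1)"
  fi
  [ "$actual" = "$expected" ]
}

# Usage with the source distribution digest listed on this page:
#   verify_sha256 flintai_cli-1.0.0.tar.gz \
#     87093fd40103e5fb661f486617eb7a38b00e46ac2609c21b948b27035ad3b763 \
#     && echo "digest matches"
```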

File details

Details for the file flintai_cli-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: flintai_cli-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 1.8 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for flintai_cli-1.0.0-py3-none-any.whl

| Algorithm | Hash digest |
| SHA256 | f35b2aa588185452956f717e6ee15e0ca0edcd366a9894671c156aaab5e3ad37 |
| MD5 | 53e13093d5279aef0520da8e187b34ed |
| BLAKE2b-256 | 5b82e8c1db56ae94f3ea20c86fb5488b4a8bce7e9565a10055376869dd668e1d |
