FilterChatTag

Powered by LangChain — works with OpenAI, Google Gemini, Anthropic Claude, and Ollama. Pick the provider by changing one env var.

FilterChatTag is an OpenFilter that sends each video/image frame to a multimodal chat model and attaches structured annotations ({present, confidence} per label) to the frame metadata. Built on top of LangChain's init_chat_model, so any LangChain-supported chat model with vision can be plugged in.

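Breaking change in v0.3.0: this filter was previously published as filter-chatgpt-annotator / FilterChatgptAnnotator. The import package is now filter_chattag, while the PyPI distribution name remains filter-chatgpt-annotator (see Project layout). See MIGRATION.md for the rename map.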

Features

Multi-provider — OpenAI, Gemini, Claude, Ollama via LangChain. Same code, change FILTER_CHATTAG_MODEL.
Structured output enforcement — annotations are validated by a Pydantic schema generated from FILTER_OUTPUT_SCHEMA; LangChain picks the best mechanism per provider (tool-calling, JSON-mode).
Standardized output contract — versioned JSON payload on the frame (meta.chattag) and per-line in labels.jsonl. See docs/output_contract.md.
Image optimization — optional resize/quality settings to control cost.
Dataset generators on shutdown — binary classification datasets (balanced + unbalanced) and COCO multilabel export when output_schema has more than one label.
Topic filtering / forwarding, no-ops mode, frame persistence — pipeline-friendly.
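
The structured-output feature can be pictured with a small validator. The sketch below is a plain-Python stand-in for the Pydantic validation the filter performs, not the filter's actual code. `validate_annotations` is a hypothetical helper, and the assumption that confidence must lie in [0.0, 1.0] is taken from the prompt example in this README.

```python
import json

def validate_annotations(schema: dict, annotations: dict) -> dict:
    """Check a model response against the label schema (hypothetical helper)."""
    missing = set(schema) - set(annotations)
    extra = set(annotations) - set(schema)
    if missing or extra:
        raise ValueError(f"label mismatch: missing={sorted(missing)} extra={sorted(extra)}")
    clean = {}
    for label, value in annotations.items():
        if not isinstance(value, dict):
            raise ValueError(f"{label} must be an object")
        present = value.get("present")
        confidence = value.get("confidence")
        if not isinstance(present, bool):
            raise ValueError(f"{label}.present must be a bool")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
            raise ValueError(f"{label}.confidence must be a number")
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"{label}.confidence must be within [0.0, 1.0]")
        clean[label] = {"present": present, "confidence": float(confidence)}
    return clean

# Same shape as the FILTER_OUTPUT_SCHEMA example in the Configuration section.
schema = json.loads(
    '{"lettuce":{"present":false,"confidence":0.0},'
    '"tomato":{"present":false,"confidence":0.0}}'
)
result = validate_annotations(
    schema,
    {
        "lettuce": {"present": True, "confidence": 0.8},
        "tomato": {"present": False, "confidence": 0.1},
    },
)
```

A response with an unknown label, a missing label, or an out-of-range confidence raises ValueError instead of reaching the frame metadata.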

Quick start

make install
cp env.example .env
# Edit .env: pick a provider + set the credential
make run

Pick a provider

Set FILTER_CHATTAG_MODEL to a LangChain provider:model string. All four providers are installed by default — no extra install step.

Provider	`FILTER_CHATTAG_MODEL` example	Credential env var
OpenAI	`openai:gpt-4o-mini`	`OPENAI_API_KEY`
Google Gemini	`google_genai:gemini-2.0-flash`	`GOOGLE_API_KEY`
Anthropic Claude	`anthropic:claude-3-5-sonnet-latest`	`ANTHROPIC_API_KEY`
Ollama (local)	`ollama:llava`	`OLLAMA_HOST`

Any other LangChain-supported chat model with vision works too — just install the matching langchain-* package and use its provider prefix.
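
The table above maps each provider prefix to one environment variable. As a rough pre-flight check, a sketch like the following could split the provider:model string and report which variable is unset. `split_model_string` and `missing_env_var` are hypothetical helpers, not part of the filter; only the prefixes and variable names come from the table.

```python
import os
from typing import Mapping, Optional, Tuple

# Provider prefix -> environment variable, copied from the table above.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "google_genai": "GOOGLE_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "ollama": "OLLAMA_HOST",
}

def split_model_string(model: str) -> Tuple[str, str]:
    """Split 'provider:model' on the first colon only."""
    provider, sep, name = model.partition(":")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider:model', got {model!r}")
    return provider, name

def missing_env_var(model: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the name of the unset variable for this provider, or None."""
    env = os.environ if env is None else env
    provider, _ = split_model_string(model)
    var = PROVIDER_ENV_VARS.get(provider)
    if var is None or env.get(var):
        return None  # unknown provider, or the variable is set
    return var

# Example: nothing is set in this fake environment.
print(missing_env_var("openai:gpt-4o-mini", env={}))
```

Splitting on the first colon only keeps model names that themselves contain a colon intact.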

Configuration

# Required
FILTER_CHATTAG_MODEL=openai:gpt-4o-mini
OPENAI_API_KEY=sk-...
FILTER_PROMPT=./prompts/food_annotation_prompt.txt

# Optional — LLM
FILTER_MAX_TOKENS=1000
FILTER_TEMPERATURE=0.1

# Optional — image processing
FILTER_MAX_IMAGE_SIZE=0    # 0 = original
FILTER_IMAGE_QUALITY=85

# Optional — output
FILTER_SAVE_FRAMES=true
FILTER_OUTPUT_DIR=./output_frames
FILTER_OUTPUT_SCHEMA={"lettuce":{"present":false,"confidence":0.0},"tomato":{"present":false,"confidence":0.0}}
FILTER_CONFIDENCE_THRESHOLD=0.9    # cutoff for the dataset generators

# Optional — topic filtering / forwarding
FILTER_TOPIC_PATTERN=.*
FILTER_EXCLUDE_TOPICS=debug,test
FILTER_FORWARD_MAIN=false

# Optional — testing
FILTER_NO_OPS=false

Configuration matrix

Variable	Type	Default	Required	Notes
`chattag_model`	string	`openai:gpt-4o-mini`	Yes	LangChain `provider:model` string
`prompt`	string	`""`	Yes	Path to prompt file (.txt)
`output_schema`	dict	`{}`	No	Labels + defaults; enforced via Pydantic
`max_tokens`	int	`1000`	No	Max response tokens
`temperature`	float	`0.1`	No	Controls randomness
`max_image_size`	int	`0`	No	Max image side in px (0 = original)
`image_quality`	int	`85`	No	JPEG quality (1–100)
`save_frames`	bool	`true`	No	Persist per-frame results
`output_dir`	string	`./output_frames`	No	Where to save results
`forward_main`	bool	`false`	No	Forward main topic to output
`no_ops`	bool	`false`	No	Skip LLM calls (testing)
`confidence_threshold`	float	`0.9`	No	Positive-class threshold for dataset generators

Credentials are NOT a config field — set the provider's native env var (OPENAI_API_KEY, GOOGLE_API_KEY, ANTHROPIC_API_KEY, OLLAMA_HOST). LangChain reads them automatically.
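
To make the relationship between the matrix and the environment block concrete, here is a minimal sketch of how the fields could be read from the environment. It assumes each field maps to FILTER_ plus the upper-cased field name, as the examples in this README suggest, and that booleans accept the strings listed in `_parse_bool`; both are assumptions for illustration, and `load_config` is a hypothetical helper rather than the filter's real loader.

```python
import copy
import json
import os

def _parse_bool(raw: str) -> bool:
    # Accepted spellings are an assumption for this sketch.
    return raw.strip().lower() in {"1", "true", "yes", "on"}

# field name -> (parser, default), taken from the configuration matrix above.
FIELDS = {
    "chattag_model": (str, "openai:gpt-4o-mini"),
    "prompt": (str, ""),
    "output_schema": (json.loads, {}),
    "max_tokens": (int, 1000),
    "temperature": (float, 0.1),
    "max_image_size": (int, 0),
    "image_quality": (int, 85),
    "save_frames": (_parse_bool, True),
    "output_dir": (str, "./output_frames"),
    "forward_main": (_parse_bool, False),
    "no_ops": (_parse_bool, False),
    "confidence_threshold": (float, 0.9),
}

def load_config(env=None) -> dict:
    """Read FILTER_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    config = {}
    for name, (parse, default) in FIELDS.items():
        raw = env.get(f"FILTER_{name.upper()}")
        config[name] = copy.deepcopy(default) if raw is None else parse(raw)
    return config

config = load_config({"FILTER_MAX_IMAGE_SIZE": "512", "FILTER_NO_OPS": "true"})
```

Credentials are deliberately absent from `FIELDS`, matching the note above that they are read from the provider's own variable.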

Architecture

The filter follows the standard OpenFilter setup → process → shutdown lifecycle:

Stage	Responsibility
`setup()`	Validate config; build the LangChain chat model via `init_chat_model`; wrap it with `with_structured_output(Pydantic)` derived from `FILTER_OUTPUT_SCHEMA`; load prompt file
`process()`	For each frame: BGR→base64, build a multimodal `HumanMessage`, invoke the chain, normalize annotations, attach to `frame.data["meta"]["chattag"]`
`shutdown()`	Generate binary + balanced + COCO multilabel datasets from `labels.jsonl`
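
The process() row can be illustrated with the message-building step alone. The sketch below assumes the frame has already been encoded to JPEG bytes and uses the OpenAI-style `image_url` content block that LangChain chat models accept; whether the filter builds exactly this layout is an assumption, and `build_message_content` is a hypothetical helper.

```python
import base64

def build_message_content(prompt: str, jpeg_bytes: bytes) -> list:
    """Return a text block plus a base64 data-URL image block."""
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return [
        {"type": "text", "text": prompt},
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{encoded}"},
        },
    ]

# Placeholder bytes stand in for a real encoded frame.
fake_jpeg = b"\xff\xd8\xff\xe0 placeholder"
content = build_message_content("Is there a cat in this image?", fake_jpeg)
```

In a LangChain pipeline a list like this would typically be passed as the `content` of a `HumanMessage`; the sketch stops short of that so it runs without any provider package installed.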

Data signature

Processed frames carry results under frame.data["meta"]["chattag"] (see docs/output_contract.md):

schema_version — contract version (e.g. "1.0")
annotations — {label_name: {present: bool, confidence: float}}
usage — {input_tokens, output_tokens, total_tokens} (from AIMessage.usage_metadata)
processing_time, timestamp, model, frame_id
error — present when processing failed
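
A downstream consumer might read these fields as follows. This is a sketch that treats `frame.data` as a plain dict with the keys listed above; `present_labels` and its 0.5 default cutoff are hypothetical and chosen only for illustration.

```python
def present_labels(frame_data: dict, min_confidence: float = 0.5) -> list:
    """Return labels marked present with enough confidence, sorted by name."""
    chattag = frame_data.get("meta", {}).get("chattag")
    if not chattag or chattag.get("error"):
        return []  # frame was not processed, or processing failed
    return sorted(
        label
        for label, value in chattag.get("annotations", {}).items()
        if value["present"] and value["confidence"] >= min_confidence
    )

frame_data = {
    "meta": {
        "chattag": {
            "schema_version": "1.0",
            "annotations": {
                "cat": {"present": True, "confidence": 0.92},
                "dog": {"present": False, "confidence": 0.15},
            },
            "usage": {"input_tokens": 10, "output_tokens": 5, "total_tokens": 15},
        }
    }
}
labels = present_labels(frame_data)
```

Checking `error` first keeps failed frames from being mistaken for frames where nothing was detected.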

Topic forwarding

forward_main=True preserves the original main topic alongside processed topics in the output dict — useful when downstream filters need the unmodified frame.
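
Before forwarding comes topic selection, configured by FILTER_TOPIC_PATTERN and FILTER_EXCLUDE_TOPICS. A minimal sketch of that selection is shown below. It assumes the pattern must match the whole topic name and that exclusions are exact names in a comma-separated list; the filter's real matching rules may differ, and both function names are hypothetical.

```python
import re

def parse_exclude(raw: str) -> list:
    """Turn 'debug,test' into ['debug', 'test'], ignoring empty entries."""
    return [name.strip() for name in raw.split(",") if name.strip()]

def select_topics(topics, pattern=".*", exclude=()):
    """Keep topics that fully match the pattern and are not excluded."""
    regex = re.compile(pattern)
    excluded = set(exclude)
    return [t for t in topics if t not in excluded and regex.fullmatch(t)]

selected = select_topics(
    ["main", "debug", "camera_1", "test"],
    pattern=".*",
    exclude=parse_exclude("debug,test"),
)
```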

Output structure (with `save_frames=true`)

./output_frames/
├── data/                       # Processed images
├── labels.jsonl                # One JSON line per frame (see docs/output_contract.md)
├── binary_datasets/            # Generated on shutdown
├── binary_datasets_balanced/   # Balanced (equal class) variant
└── multilabel_datasets/        # COCO export when schema has >1 label

Binary datasets are overwritten on each run; labels.jsonl and data/ are append-only.

Confidence threshold

FILTER_CONFIDENCE_THRESHOLD sets the cutoff used by the dataset generators on shutdown: a label with confidence ≥ threshold goes to the positive class, and anything below it is treated as absent. The default of 0.9 favors precision; lower it for higher recall.
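
The cutoff rule can be sketched in a few lines. The example applies the rule exactly as stated (confidence compared against the threshold) and then balances the two classes by downsampling the larger one. The record shape, the downsampling strategy, and the function names are assumptions for illustration; the README does not say how the balanced variant is built or whether `present` is also consulted.

```python
import random

def split_by_threshold(records, label, threshold=0.9):
    """Partition records into (positive, negative) for one label."""
    positive, negative = [], []
    for record in records:
        confidence = record["annotations"][label]["confidence"]
        (positive if confidence >= threshold else negative).append(record)
    return positive, negative

def balance(positive, negative, seed=0):
    """Downsample the larger class so both classes have equal size."""
    n = min(len(positive), len(negative))
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    return rng.sample(positive, n), rng.sample(negative, n)

# Illustrative records; the image names and values are made up.
records = [
    {"image": "a.png", "annotations": {"cat": {"present": True, "confidence": 0.95}}},
    {"image": "b.png", "annotations": {"cat": {"present": True, "confidence": 0.40}}},
    {"image": "c.png", "annotations": {"cat": {"present": True, "confidence": 0.91}}},
    {"image": "d.png", "annotations": {"cat": {"present": False, "confidence": 0.10}}},
    {"image": "e.png", "annotations": {"cat": {"present": True, "confidence": 0.90}}},
]
positive, negative = split_by_threshold(records, "cat", threshold=0.9)
balanced_pos, balanced_neg = balance(positive, negative)
```

Note that b.png is reported as present by the model but still lands in the negative class, which is the precision-over-recall trade-off the default threshold makes.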

No-ops mode

export FILTER_NO_OPS=true

Wires up the pipeline without making any LLM calls: images are still processed and saved, and default annotations are emitted. Use it for plumbing and integration tests without burning credits.
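
Conceptually, no-ops mode is a short-circuit in front of the model call. The sketch below assumes the "default annotations" are the default values carried in the output schema; `annotate` and `call_model` are hypothetical names, not the filter's API.

```python
import copy

def annotate(image_bytes, output_schema, no_ops, call_model=None):
    """Return schema defaults in no-ops mode, otherwise ask the model."""
    if no_ops:
        # No provider call: emit a copy of the schema defaults.
        return copy.deepcopy(output_schema)
    return call_model(image_bytes)

schema = {
    "cat": {"present": False, "confidence": 0.0},
    "dog": {"present": False, "confidence": 0.0},
}
calls = []
result = annotate(b"frame", schema, no_ops=True, call_model=calls.append)
```

The deep copy keeps one frame's annotations from aliasing the shared schema object.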

Usage scenarios

# Food annotation
export FILTER_PROMPT="./prompts/food_annotation_prompt.txt"
export FILTER_OUTPUT_SCHEMA='{"lettuce":{"present":false,"confidence":0.0},"tomato":{"present":false,"confidence":0.0}}'
python scripts/filter_food_annotation.py

# Pet classification (Gemini)
export FILTER_CHATTAG_MODEL=google_genai:gemini-2.0-flash
export FILTER_PROMPT="./prompts/pet_classification_prompt.txt"
export FILTER_OUTPUT_SCHEMA='{"cat":{"present":false,"confidence":0.0},"dog":{"present":false,"confidence":0.0}}'
python scripts/filter_pet_classification.py

# Multilabel (Claude, with COCO export on shutdown)
export FILTER_CHATTAG_MODEL=anthropic:claude-3-5-sonnet-latest
python scripts/filter_multilabel.py

Prompt format

Prompts should clearly describe the task and the expected labels. Because LangChain enforces the output structure via Pydantic, prompts no longer need to insist as heavily on "return only valid JSON" — the provider's tool-calling layer handles that — but it remains a good idea to include the expected label list and rules for uncertainty.

You are a vision analyst. Given an image, decide whether each of the
following items is visibly present:

ITEMS = ["cat", "dog"]

For each item, return:
  present: true if you can see it in the image, else false
  confidence: 0.0–1.0 reflecting your certainty
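
Since the label list in the prompt should agree with FILTER_OUTPUT_SCHEMA, one option is to generate the prompt file from the schema. The following is a sketch of that idea, not something the filter does itself; `render_prompt` is a hypothetical helper and the template simply mirrors the example above.

```python
import json

PROMPT_TEMPLATE = """You are a vision analyst. Given an image, decide whether each of the
following items is visibly present:

ITEMS = {items}

For each item, return:
  present: true if you can see it in the image, else false
  confidence: 0.0-1.0 reflecting your certainty
"""

def render_prompt(output_schema: dict) -> str:
    """Fill the template with the schema's label names, in schema order."""
    return PROMPT_TEMPLATE.format(items=json.dumps(list(output_schema)))

schema = {
    "cat": {"present": False, "confidence": 0.0},
    "dog": {"present": False, "confidence": 0.0},
}
prompt_text = render_prompt(schema)
```

The result could be written to a .txt file and referenced through FILTER_PROMPT.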

Output example

{
  "schema_version": "1.0",
  "image": "001.png",
  "labels": {
    "cat": {"present": true,  "confidence": 0.92},
    "dog": {"present": false, "confidence": 0.15}
  },
  "usage": {"input_tokens": 26288, "output_tokens": 414, "total_tokens": 26702},
  "prompt_used": "pet_classification_prompt.txt"
}

Full contract: docs/output_contract.md.
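
For quick inspection of a run, records shaped like the example above can be tallied with the standard library. This sketch assumes each line of labels.jsonl is one JSON object with a `labels` mapping as in that example; check docs/output_contract.md for the authoritative field names. Both functions are hypothetical helpers.

```python
import io
import json

def iter_records(lines):
    """Yield one parsed record per non-empty line."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {number} is not valid JSON") from exc

def count_present(records) -> dict:
    """Count, per label, how many records mark it as present."""
    counts = {}
    for record in records:
        for label, value in record.get("labels", {}).items():
            counts[label] = counts.get(label, 0) + int(value["present"])
    return counts

# In-memory stand-in for a labels.jsonl file.
sample = io.StringIO(
    '{"image": "001.png", "labels": {"cat": {"present": true, "confidence": 0.92}, '
    '"dog": {"present": false, "confidence": 0.15}}}\n'
    '{"image": "002.png", "labels": {"cat": {"present": false, "confidence": 0.05}, '
    '"dog": {"present": true, "confidence": 0.88}}}\n'
)
counts = count_present(iter_records(sample))
```

With a real run, `sample` would be replaced by an open file handle on the labels.jsonl inside the configured output directory.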

Project layout

filter-chatgpt-annotator/                   # repo root, also the PyPI distribution name
├── filter_chattag/                         # import package (renamed in v0.3.0)
│   └── filter.py                           # Main filter implementation (LangChain)
├── scripts/                                # Example pipelines
├── prompts/                                # Example prompt files
├── tests/
├── schemas/chattag_output.schema.json      # JSON Schema for the output contract
├── docs/                                   # Output contract, usage guide, examples, providers
├── env.example
└── pyproject.toml

Key dependencies

langchain>=0.3,<0.4 + langchain-openai, langchain-google-genai, langchain-anthropic, langchain-ollama
pydantic>=2.0,<3.0
openfilter[all]>=1.1.0,<2.0.0
opencv-python>=4.8.0, pillow>=9.0.0, python-dotenv>=1.0.0

Testing

make test
make test-coverage

The offline test suite mocks LangChain and never hits a real provider. Integration tests against real providers are not run by default.
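
The mocking approach can be illustrated with `unittest.mock`. This is a generic sketch of testing against a fake chain, not a copy of the project's tests; `annotate_frame` is a hypothetical wrapper, and the only behavior assumed of the chain is an `invoke` method.

```python
from unittest.mock import MagicMock

def annotate_frame(chain, message) -> dict:
    """Hypothetical wrapper: invoke the chain and return plain dicts."""
    result = chain.invoke([message])
    return {label: dict(value) for label, value in result.items()}

# The fake chain returns a canned response and records how it was called.
fake_chain = MagicMock()
fake_chain.invoke.return_value = {"cat": {"present": True, "confidence": 0.92}}

annotations = annotate_frame(fake_chain, "message placeholder")
fake_chain.invoke.assert_called_once_with(["message placeholder"])
```

Because the fake never leaves the process, a test like this needs no API key and no network access.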

Troubleshooting

Authentication/401 errors — the provider's native env var (OPENAI_API_KEY etc.) isn't reaching the process. make run and docker-compose need it exported in the parent shell or set under environment: in the compose file.
Provider 'X' not supported — the matching langchain-X package isn't installed; the four officially supported ones ship by default. To use something else, pip install it and use its provider prefix.
Garbled annotations from Ollama — vision-capable Ollama models (e.g. llava, llama3.2-vision) are required; text-only models will refuse the image content.
Slow processing — set FILTER_MAX_IMAGE_SIZE=512 and use a smaller model (gpt-4o-mini, gemini-2.0-flash, claude-3-5-haiku-latest).
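
To see what FILTER_MAX_IMAGE_SIZE=512 means for a given frame, the target dimensions can be worked out by hand. The sketch assumes the setting caps the longer side and preserves the aspect ratio, which is the usual reading of "max image side"; `fit_within` is a hypothetical helper, not the filter's resize code.

```python
def fit_within(width: int, height: int, max_side: int):
    """Return (width, height) scaled so the longer side is at most max_side."""
    if max_side <= 0 or max(width, height) <= max_side:
        return width, height  # 0 means "keep original"; small images are untouched
    scale = max_side / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))

full_hd = fit_within(1920, 1080, 512)
unchanged = fit_within(640, 480, 0)
```

Under these assumptions a 1920x1080 frame shrinks to 512x288, roughly 7% of the original pixel count, which is where most of the cost and latency savings would come from.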

License

See LICENSE for details.

Release history

0.3.0 (this version): May 26, 2026
0.2.3: Apr 23, 2026
0.1.1: Sep 30, 2025
