Skip to main content

Shared foundation library for the prompttools suite

Project description

prompttools-core

Shared foundation library for the prompttools suite.

PyPI License: MIT Python 3.9+

What It Provides

prompttools-core is the shared library used by all tools in the prompttools suite (promptfmt, promptcost, prompttest, promptlint). It provides:

  • Multi-format prompt parsing -- YAML, JSON, Markdown, and plain text
  • Tokenization engine -- tiktoken-based token counting with model-aware encoding selection
  • Model profiles -- Built-in profiles for OpenAI, Anthropic, and Google models with pricing and context window data
  • Configuration system -- Hierarchical config file discovery and merging
  • Content caching -- SHA256-based file-system cache to avoid redundant computation
  • Plugin system -- Discover and load custom plugin classes from directories
  • Template variable extraction -- Detects {{var}}, {var}, and <var> syntax across all formats

Installation

pip install prompttools-core-ai

Dependencies: pydantic >= 2.0, pyyaml >= 6.0, tiktoken >= 0.7

Supported Formats

Extension Format Description
.yaml, .yml YAML Structured messages with front-matter metadata
.json JSON Structured messages with metadata
.md Markdown Heading-delimited message sections
.txt Text Plain text, single or --- delimited messages

API Reference

Parsing

from prompttools_core import parse_file, parse_directory, parse_stdin, parse_pipeline

# Parse a single prompt file (auto-detects format from extension)
prompt = parse_file("prompts/greeting.yaml")

# Parse from stdin
prompt = parse_stdin(content, format="yaml")  # format: text, md, yaml, json

# Parse all prompt files in a directory
prompts = parse_directory("prompts/")

# Parse a pipeline manifest
pipeline = parse_pipeline("pipeline.yaml")

parse_file(path, config=None) -> PromptFile Parse a prompt file, auto-detecting format by extension. Raises ParseError for unsupported formats and FileNotFoundError if the file does not exist.

parse_directory(path, config=None, patterns=None) -> list[PromptFile] Recursively parse all prompt files in a directory. Skips files that fail to parse. Respects config.exclude patterns.

parse_stdin(content, format) -> PromptFile Parse prompt content from stdin. The format parameter must be one of: text, md, yaml, json.

parse_pipeline(manifest_path, config=None) -> PromptPipeline Parse a pipeline manifest YAML and all referenced prompt files. The manifest defines stages with name, file, depends_on, expected_output_tokens, and persona fields.

Tokenization

from prompttools_core import Tokenizer, count_tokens, get_encoding

# Quick token count
tokens = count_tokens("Hello, world!", encoding="cl100k_base")

# Model-aware tokenizer
tokenizer = Tokenizer.for_model("gpt-4o")
tokens = tokenizer.count("Some text")

# Count tokens for a full prompt file (populates token_count on each message)
total = tokenizer.count_file(prompt)

# Count tokens across messages including role overhead
total = tokenizer.count_messages(prompt.messages)

Tokenizer(encoding="cl100k_base", provider="default") Create a tokenizer with a specific encoding. The provider parameter controls per-message role overhead (OpenAI: 4 tokens, Anthropic: 3, Google: 3).

Tokenizer.for_model(model_name) -> Tokenizer Factory method that looks up the model profile and creates a tokenizer with the correct encoding and provider.

count_tokens(text, encoding="cl100k_base") -> int Standalone convenience function to count tokens in a string.

Model Profiles

from prompttools_core import get_profile, list_profiles, register_profile, ModelProfile

# Look up a built-in profile
profile = get_profile("gpt-4o")
print(f"Context: {profile.context_window}, Encoding: {profile.encoding}")
print(f"Input: ${profile.input_price_per_mtok}/Mtok")

# List all profiles
for name, p in list_profiles().items():
    print(f"{name}: {p.provider}, {p.context_window} tokens")

# Register a custom profile
register_profile(ModelProfile(
    name="my-model",
    context_window=32000,
    encoding="cl100k_base",
    provider="custom",
    input_price_per_mtok=1.0,
    output_price_per_mtok=3.0,
))

Built-in Profiles

Model Provider Context Encoding Input $/Mtok Output $/Mtok
gpt-4 openai 8,192 cl100k_base $30.00 $60.00
gpt-4-turbo openai 128,000 cl100k_base $10.00 $30.00
gpt-4o openai 128,000 o200k_base $2.50 $10.00
gpt-4o-mini openai 128,000 o200k_base $0.15 $0.60
claude-3-haiku anthropic 200,000 cl100k_base $0.25 $1.25
claude-3-sonnet anthropic 200,000 cl100k_base $3.00 $15.00
claude-3-opus anthropic 200,000 cl100k_base $15.00 $75.00
claude-4-sonnet anthropic 200,000 cl100k_base $3.00 $15.00
gemini-1.5-pro google 1,000,000 cl100k_base $1.25 $5.00
gemini-2.0-flash google 1,048,576 cl100k_base $0.10 $0.40

Non-OpenAI models use cl100k_base as an approximation (indicated by approximate_tokenizer=True on the profile).

Configuration

from prompttools_core import load_config, find_config_file

# Auto-discover and load config
config = load_config(tool_name="fmt", start_dir=Path("."))

# Load with explicit path and CLI overrides
config = load_config(
    tool_name="cost",
    config_path=Path(".prompttools.yaml"),
    cli_overrides={"model": "gpt-4o"},
)

Config file discovery order (at each directory level, walking upward):

  1. .prompt{tool_name}.yaml (e.g., .promptfmt.yaml)
  2. .prompttools.yaml
  3. .promptlint.yaml (backward compatibility)

Merge priority (highest to lowest):

  1. CLI overrides
  2. Config file
  3. Model profile defaults
  4. Built-in defaults

Configuration File Format

# .prompttools.yaml
model: gpt-4o
exclude:
  - "vendor/**"
  - "*.generated.*"
plugins:
  - "./my-plugins"
cache:
  enabled: true
  dir: .prompttools-cache
tokenizer:
  encoding: cl100k_base

# Tool-specific sections
fmt:
  delimiter_style: "###"
  variable_style: double_brace
cost:
  default_output_tokens: 1000

Caching

from prompttools_core import PromptCache

cache = PromptCache(cache_dir=Path(".prompttools-cache"))

# Generate a content-based key
key = PromptCache.content_key(content="Hello", encoding="cl100k_base")

# Store and retrieve values (with optional TTL in seconds)
cache.set(key, 42, ttl=3600)
value = cache.get(key)  # Returns None if expired

# Invalidate a single key or clear all
cache.invalidate(key)
cache.clear()

Plugin System

from prompttools_core import discover_plugins, load_plugin

# Load plugin classes from a single file
classes = load_plugin(Path("my_plugin.py"), base_class=MyBaseClass)

# Discover all plugins in directories
classes = discover_plugins(
    plugin_dirs=[Path("./plugins"), Path("./custom-rules")],
    base_class=MyBaseClass,
)

Plugins are Python files containing classes that subclass a given base class. The plugin system is used by promptlint for custom lint rules and by promptfmt for custom formatters.

Data Models

PromptFile -- Represents a parsed prompt file.

  • path: Path -- Source file path
  • format: PromptFormat -- Detected format (TEXT, MARKDOWN, YAML, JSON)
  • raw_content: str -- Original file content
  • messages: list[Message] -- Parsed messages
  • variables: dict[str, str] -- Template variables mapped to syntax style
  • variable_defaults: dict[str, str] -- Default values from metadata
  • metadata: dict[str, Any] -- Front-matter or top-level metadata
  • total_tokens: Optional[int] -- Populated by tokenizer
  • content_hash: str -- SHA256 hash of raw_content

Message -- A single turn in a prompt conversation.

  • role: Literal["system", "user", "assistant", "tool"]
  • content: str
  • line_start: Optional[int], line_end: Optional[int]
  • token_count: Optional[int]
  • metadata: dict[str, Any]

ModelProfile -- Configuration for a specific LLM model.

  • name, context_window, encoding, provider
  • input_price_per_mtok, output_price_per_mtok
  • max_output_tokens, supports_system_message, supports_tools
  • approximate_tokenizer: bool

ToolConfig -- Base configuration shared by all tools.

  • model, tokenizer_encoding, exclude, plugins
  • cache_enabled, cache_dir, extra

PromptPipeline / PipelineStage -- Multi-stage pipeline support.

Error Hierarchy

All exceptions inherit from PromptToolsError:

  • ParseError -- File cannot be parsed (also a ValueError)
  • ConfigError -- Configuration is invalid
  • TokenizerError -- Tokenization fails
  • ProfileNotFoundError -- Unknown model profile
  • PluginError -- Plugin loading or execution fails
  • CacheError -- Cache read/write fails

Template Variable Extraction

from prompttools_core.parser import extract_variables

vars = extract_variables("Hello {{name}}, your order {id} is in <status>")
# {'name': 'jinja', 'id': 'fstring', 'status': 'xml'}

Common HTML tags (div, span, p, etc.) are automatically excluded from XML-style variable detection.

License

MIT License. Author: Scott Converse.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

prompttools_core_ai-1.0.0.tar.gz (24.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

prompttools_core_ai-1.0.0-py3-none-any.whl (23.2 kB view details)

Uploaded Python 3

File details

Details for the file prompttools_core_ai-1.0.0.tar.gz.

File metadata

  • Download URL: prompttools_core_ai-1.0.0.tar.gz
  • Upload date:
  • Size: 24.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for prompttools_core_ai-1.0.0.tar.gz
Algorithm Hash digest
SHA256 1dea58a2ec36e5221d9e868962f6243a553ea313f0f5b50bc27e610cfaac5c30
MD5 f9ce854c3a74fefebd069d49ff7fcd8a
BLAKE2b-256 7d53fe370f5f5a62d31fd0c954fafbcaf180fe4d1c7b60d0aaa1e166aebf888d

See more details on using hashes here.

File details

Details for the file prompttools_core_ai-1.0.0-py3-none-any.whl.

File metadata

File hashes

Hashes for prompttools_core_ai-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 2d398af2dc682025f4ebb0f862ea092645dcad9f47bc759c390bab20c8aac43d
MD5 b2dd490435aa3a35d993761ebe7b03ed
BLAKE2b-256 40ba290d79a9b6d2c4583df269dd7aebcc64fb9fb0530005c07df4d9465e7b76

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page