fmu - Front Matter Utils
A Python library and CLI tool for parsing and searching front matter in files.
Features
- Library Mode: Reusable API for parsing and searching frontmatter
- CLI Mode: Command-line interface for batch operations
- YAML Support: Parse YAML frontmatter (default format)
- Flexible Search: Search by field name and optionally by value
- Array Search: Search within array/list frontmatter values
- Regex Support: Use regular expressions for value matching
- Validation Engine: Validate frontmatter fields against custom rules
- Update Engine: Transform, replace, and remove frontmatter values (New in v0.4.0)
- Case Transformations: Six different case conversion types (New in v0.4.0)
- Value Deduplication: Automatic removal of duplicate array values (New in v0.4.0)
- Template Output: Export content and frontmatter using custom templates (New in v0.9.0)
- Character Escaping: Escape special characters in output (New in v0.9.0)
- File Output: Save command output directly to files (New in v0.10.0)
- Case Sensitivity: Support for case-sensitive or case-insensitive matching
- Multiple Output Formats: Console output or CSV export
- Glob Pattern Support: Process multiple files using glob patterns
Installation
From Source
git clone https://github.com/geraldnguyen/frontmatter-utils.git
cd frontmatter-utils
pip install -e .
Dependencies
- Python 3.7+
- PyYAML>=6.0
Getting Started
Library Usage
from fmu import parse_file, search_frontmatter, validate_frontmatter, update_frontmatter
# Parse a single file
frontmatter, content = parse_file('example.md')
print(f"Title: {frontmatter.get('title')}")
print(f"Content: {content}")
# Search for frontmatter across multiple files
results = search_frontmatter(['*.md'], 'author', 'John Doe')
for file_path, field_name, field_value in results:
    print(f"{file_path}: {field_name} = {field_value}")
# Search within array values
results = search_frontmatter(['*.md'], 'tags', 'python')
# Validate frontmatter fields
validations = [
    {'type': 'exist', 'field': 'title'},
    {'type': 'eq', 'field': 'status', 'value': 'published'},
    {'type': 'contain', 'field': 'tags', 'value': 'tech'}
]
failures = validate_frontmatter(['*.md'], validations)
for file_path, field_name, field_value, reason in failures:
    print(f"Validation failed in {file_path}: {reason}")
# Update frontmatter fields (New in v0.4.0)
operations = [
    {'type': 'case', 'case_type': 'lower'},
    {'type': 'replace', 'from': 'python', 'to': 'programming', 'ignore_case': False, 'regex': False},
    {'type': 'remove', 'value': 'deprecated', 'ignore_case': False, 'regex': False}
]
results = update_frontmatter(['*.md'], 'tags', operations, deduplication=True)
for result in results:
    if result['changes_made']:
        print(f"Updated {result['file_path']}: {result['reason']}")
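To make the (frontmatter, content) pair returned by parse_file more concrete, here is a minimal standalone sketch of how a front matter block can be separated from the body. It is not fmu's implementation: the function name split_front_matter is hypothetical, it assumes the common convention of a block delimited by lines containing only ---, and it returns the raw YAML text rather than a parsed mapping.

```python
from typing import Tuple


def split_front_matter(text: str) -> Tuple[str, str]:
    """Split a document into (raw_front_matter, content).

    Assumption: front matter is delimited by lines containing only '---'.
    Returns ('', text) when no complete front matter block is found.
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return "", text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            raw = "".join(lines[1:index])
            body = "".join(lines[index + 1:])
            return raw, body
    # Opening delimiter without a closing one: treat everything as content.
    return "", text


sample = "---\ntitle: Hello\ntags: [a, b]\n---\nBody text\n"
raw, body = split_front_matter(sample)
print(repr(raw))
print(repr(body))
```

The raw block could then be handed to a YAML parser such as PyYAML's yaml.safe_load to obtain a dictionary.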
CLI Usage
Basic Commands
# Show version
fmu version
# Show help
fmu help
# Parse files and show both frontmatter and content
fmu read "*.md"
# Parse files and show only frontmatter
fmu read "*.md" --output frontmatter
# Parse files and show only content
fmu read "*.md" --output content
# Skip section headings
fmu read "*.md" --skip-heading
# Escape special characters in output (New in v0.9.0)
fmu read "*.md" --escape
# Use template output for custom formatting (New in v0.9.0)
fmu read "*.md" --output template --template '{ "title": "$frontmatter.title", "file": "$filename" }'
# Save output to file (New in v0.10.0)
fmu read "*.md" --file output.txt
# Save template output to JSON file (New in v0.10.0)
fmu read "*.md" --output template --template '{ "title": "$frontmatter.title" }' --file output.json
File Output (New in v0.10.0)
The --file option allows you to save command output directly to a file instead of displaying it in the console:
# Save standard output to file
fmu read "*.md" --file output.txt
# Save template output to file
fmu read "*.md" --output template --template '{ "title": "$frontmatter.title" }' --file output.json
# Combine with escape for JSON-safe file output
fmu read "*.md" --output template --template '{ "content": "$content" }' --escape --file data.json
# Works with specs files - different commands can output to different files
fmu execute commands.yaml # Each command can specify its own --file destination
Use Cases:
- Export metadata to JSON files for further processing
- Generate data files for static site generators
- Create batch processing pipelines with file-based workflows
- Archive frontmatter and content in structured formats
Template Output (New in v0.9.0)
The --output template option allows you to export content and frontmatter in custom formats:
# Export as JSON-like format
fmu read "*.md" --output template --template '{ "title": "$frontmatter.title", "content": "$content" }'
# Access array elements by index
fmu read "*.md" --output template --template '{ "first_tag": "$frontmatter.tags[0]", "second_tag": "$frontmatter.tags[1]" }'
# Include file metadata
fmu read "*.md" --output template --template '{ "path": "$filepath", "name": "$filename" }'
# Combine with escape option for JSON-safe output
fmu read "*.md" --output template --template '{ "content": "$content" }' --escape
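The --escape flag used in the last command applies a small character mapping, listed under Escape Option below. The following sketch shows that mapping in plain Python. It is an illustration only: escape_text is a hypothetical name, and because backslashes are not in the documented list, the sketch leaves them unchanged.

```python
# Mapping taken from the Escape Option list in this README.
ESCAPES = {
    "\n": "\\n",
    "\r": "\\r",
    "\t": "\\t",
    "'": "\\'",
    '"': '\\"',
}


def escape_text(text: str) -> str:
    """Replace each listed character with its backslash-escaped form."""
    return "".join(ESCAPES.get(char, char) for char in text)


print(escape_text('He said "hi"\n\tbye'))
```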
Template Placeholders:
- $filename: Base filename (e.g., "post.md")
- $filepath: Full file path
- $content: Content after frontmatter
- $frontmatter.fieldname: Access frontmatter field (single value or full array as JSON)
- $frontmatter.fieldname[N]: Access array element by index (0-based)
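As a rough illustration of how these placeholders could be expanded, the sketch below substitutes them with a regular expression. It is not fmu's template engine: render_template and PLACEHOLDER are hypothetical names, field names are assumed to be simple word characters, and missing fields or out-of-range indexes are assumed to render as an empty string.

```python
import json
import os
import re
from typing import Any, Dict

# $frontmatter.name, $frontmatter.name[N], or one of the fixed placeholders.
PLACEHOLDER = re.compile(
    r"\$frontmatter\.(?P<field>\w+)(?:\[(?P<index>\d+)\])?"
    r"|\$(?P<builtin>filename|filepath|content)"
)


def render_template(template: str, filepath: str, content: str,
                    frontmatter: Dict[str, Any]) -> str:
    """Expand placeholders in template (illustrative sketch only)."""
    builtins = {
        "filename": os.path.basename(filepath),
        "filepath": filepath,
        "content": content,
    }

    def substitute(match):
        builtin = match.group("builtin")
        if builtin:
            return builtins[builtin]
        value = frontmatter.get(match.group("field"))
        index = match.group("index")
        if index is not None:
            # Indexed access: only meaningful for lists (assumption).
            if isinstance(value, list) and int(index) < len(value):
                return str(value[int(index)])
            return ""
        if value is None:
            return ""  # assumption: missing fields render as empty text
        if isinstance(value, list):
            return json.dumps(value)  # whole arrays are emitted as JSON
        return str(value)

    return PLACEHOLDER.sub(substitute, template)


template = ('{ "title": "$frontmatter.title", "first_tag": "$frontmatter.tags[0]", '
            '"tags": $frontmatter.tags, "file": "$filename" }')
rendered = render_template(
    template, "posts/post.md", "Body",
    {"title": "Guide", "tags": ["python", "tech"]},
)
print(rendered)
```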
Escape Option:
When --escape is used, the following characters are escaped:
- Newline: \n
- Carriage return: \r
- Tab: \t
- Single quote: ' → \'
- Double quote: " → \"
Search Commands
# Search for posts with 'author' field
fmu search "*.md" --name author
# Search for posts by specific author
fmu search "*.md" --name author --value "John Doe"
# Case-insensitive search
fmu search "*.md" --name author --value "john doe" --ignore-case
# Search within array values
fmu search "*.md" --name tags --value python
# Use regex for pattern matching
fmu search "*.md" --name title --value "^Guide.*" --regex
# Output results to CSV file
fmu search "*.md" --name category --csv results.csv
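The search options above combine name-only matching, exact values, array membership, case-insensitivity, and regex. The standalone sketch below shows one way those rules can fit together. It does not reproduce fmu's internals: value_matches is a hypothetical helper, and using re.search (rather than a full match) for --regex is an assumption.

```python
import re
from typing import Any, Optional


def value_matches(field_value: Any, wanted: Optional[str] = None,
                  ignore_case: bool = False, regex: bool = False) -> bool:
    """Decide whether a frontmatter value satisfies a search (sketch)."""
    if wanted is None:
        return True  # name-only search: the field merely has to exist
    # Treat scalars as one-element lists so arrays and scalars share a path.
    candidates = field_value if isinstance(field_value, list) else [field_value]
    flags = re.IGNORECASE if ignore_case else 0
    for candidate in candidates:
        text = str(candidate)
        if regex:
            if re.search(wanted, text, flags):
                return True
        elif ignore_case:
            if text.lower() == wanted.lower():
                return True
        elif text == wanted:
            return True
    return False


print(value_matches("John Doe", "john doe", ignore_case=True))
print(value_matches(["python", "web"], "python"))
print(value_matches("Guide to fmu", "^Guide.*", regex=True))
```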
Validation Commands
# Validate that required fields exist
fmu validate "*.md" --exist title --exist author
# Validate that certain fields don't exist
fmu validate "*.md" --not draft --not private
# Validate field values
fmu validate "*.md" --eq status published --ne category "deprecated"
# Validate array contents
fmu validate "*.md" --contain tags "tech" --not-contain tags "obsolete"
# Validate using regex patterns
fmu validate "*.md" --match title "^[A-Z].*" --not-match content "TODO"
# Case-insensitive validation
fmu validate "*.md" --eq STATUS "published" --ignore-case
# Output validation failures to CSV
fmu validate "*.md" --exist title --csv validation_report.csv
# Complex validation with multiple rules
fmu validate "blog/*.md" \
    --exist title \
    --exist author \
    --eq status "published" \
    --contain tags "tech" \
    --match date "^\d{4}-\d{2}-\d{2}$" \
    --csv blog_validation.csv
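To clarify what the eight validation types check, here is a small standalone evaluator that uses the same rule dictionaries as the library example (type, field, value). It is a sketch rather than fmu's validation engine: check_rule is a hypothetical name, the failure messages are invented, and the behaviour for missing fields (eq, contain, and match fail; ne, not-contain, and not-match pass) is an assumption.

```python
import re
from typing import Any, Dict, Optional


def check_rule(frontmatter: Dict[str, Any], rule: Dict[str, Any]) -> Optional[str]:
    """Return None when the rule passes, otherwise a failure reason."""
    kind, field = rule["type"], rule["field"]
    present = field in frontmatter
    value = frontmatter.get(field)
    expected = rule.get("value")

    if kind == "exist":
        return None if present else f"field '{field}' is missing"
    if kind == "not":
        return f"field '{field}' should not exist" if present else None

    items = value if isinstance(value, list) else [value]
    if kind in ("eq", "ne"):
        equal = present and str(value) == str(expected)
        ok = equal if kind == "eq" else not equal
    elif kind in ("contain", "not-contain"):
        found = present and str(expected) in [str(item) for item in items]
        ok = found if kind == "contain" else not found
    elif kind in ("match", "not-match"):
        matched = present and re.search(str(expected), str(value)) is not None
        ok = matched if kind == "match" else not matched
    else:
        raise ValueError(f"unknown rule type: {kind}")
    return None if ok else f"{kind} check failed for '{field}'"


post = {"title": "Guide", "status": "draft", "tags": ["tech"]}
rules = [
    {"type": "exist", "field": "title"},
    {"type": "eq", "field": "status", "value": "published"},
    {"type": "contain", "field": "tags", "value": "tech"},
    {"type": "not", "field": "private"},
]
failures = []
for rule in rules:
    reason = check_rule(post, rule)
    if reason is not None:
        failures.append(reason)
print(failures)
```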
Update Commands (New in v0.4.0)
# Transform case of frontmatter values
fmu update "*.md" --name title --case "Title Case"
fmu update "*.md" --name author --case lower
# Replace values
fmu update "*.md" --name status --replace draft published
fmu update "*.md" --name category --replace "old-name" "new-name"
# Case-insensitive replacement
fmu update "*.md" --name tags --replace Python python --ignore-case
# Regex-based replacement
fmu update "*.md" --name content --replace "TODO:.*" "DONE" --regex
# Remove specific values
fmu update "*.md" --name tags --remove "deprecated"
fmu update "*.md" --name status --remove "draft"
# Remove with regex patterns
fmu update "*.md" --name tags --remove "^test.*" --regex
# Multiple operations (applied in sequence)
fmu update "*.md" --name tags \
    --replace python programming \
    --remove deprecated \
    --case lower
# Disable deduplication (enabled by default for arrays)
fmu update "*.md" --name tags --deduplication false --case lower
# Complex update with multiple operations
fmu update "blog/*.md" \
    --name tags \
    --case lower \
    --replace "javascript" "js" \
    --replace "python" "py" \
    --remove "deprecated" \
    --remove "old" \
    --deduplication true
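The effect of chaining operations and then deduplicating can be sketched on a single list of tags. This is a simplified illustration, not fmu's update engine: apply_operations is a hypothetical name, only the lower and upper case types are handled, replace is treated as a whole-value swap, and the ignore_case and regex keys from the library example are ignored.

```python
from typing import Any, Dict, List

# Only two of the six case types are covered in this sketch.
CASES = {"lower": str.lower, "upper": str.upper}


def apply_operations(values: List[str], operations: List[Dict[str, Any]],
                     deduplication: bool = True) -> List[str]:
    """Apply operations in order, then optionally drop duplicates."""
    result = list(values)
    for op in operations:
        kind = op["type"]
        if kind == "case":
            transform = CASES[op["case_type"]]
            result = [transform(value) for value in result]
        elif kind == "replace":
            # Assumption: replace swaps whole values that match exactly.
            result = [op["to"] if value == op["from"] else value for value in result]
        elif kind == "remove":
            result = [value for value in result if value != op["value"]]
        else:
            raise ValueError(f"unknown operation: {kind}")
    if deduplication:
        seen = set()
        unique = []
        for value in result:
            if value not in seen:  # keep the first occurrence, preserve order
                seen.add(value)
                unique.append(value)
        result = unique
    return result


tags = ["Python", "python", "deprecated", "Web"]
operations = [
    {"type": "case", "case_type": "lower"},
    {"type": "replace", "from": "python", "to": "programming"},
    {"type": "remove", "value": "deprecated"},
]
print(apply_operations(tags, operations))
print(apply_operations(tags, operations, deduplication=False))
```

Because the operations run in sequence, lowering the case first lets the later replace catch both "Python" and "python".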
Global Options
# Specify frontmatter format (currently only YAML supported)
fmu --format yaml read "*.md"
Documentation
For detailed information about using fmu, see:
- CLI Command Reference: Complete guide to all CLI commands, options, and examples
- Library API Reference: Comprehensive Python API documentation
- Specs File Specification: Format and usage of specs files for command automation
Changelog
Version 0.13.0
- Slice Function for Compute Operations
- New slice() function for list slicing in --compute option
- Support for Python-like slicing syntax: slice(list, start), slice(list, start, stop), slice(list, start, stop, step)
- Negative indices support for reverse indexing (e.g., -1 for last element)
- Negative step support for reverse iteration
- Enhanced Compute Behavior
- When computed value is a list (e.g., from slice()), it now replaces the entire list instead of appending
- Maintains backward compatibility: scalar computed values still append to list fields
- Use Cases
- Extract last element: =slice($frontmatter.aliases, -1)
- Get first N elements: =slice($frontmatter.tags, 0, 3)
- Filter with step: =slice($frontmatter.items, 0, 10, 2) (every other element)
- Reverse lists: =slice($frontmatter.list, -1, 0, -1)
- Documentation
- Updated CLI.md with slice function examples
- Updated API.md with slice function specifications
- Updated SPECS.md with slice function usage
- All 182 tests passing (18 new tests for slice functionality)
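Since the slice() formula is described as Python-like, the use cases above can be previewed with ordinary Python slicing. The helper below is a hypothetical stand-in, not fmu's function, and it only shows what plain Python does with the same arguments.

```python
from typing import Any, List, Optional


def slice_list(items: List[Any], start: int, stop: Optional[int] = None,
               step: Optional[int] = None) -> List[Any]:
    """Plain Python slicing with the same argument order as slice()."""
    return items[start:stop:step]


print(slice_list(["a", "b", "c"], -1))            # last element, as a list
print(slice_list(["a", "b", "c", "d"], 0, 3))     # first three elements
print(slice_list(list(range(10)), 0, 10, 2))      # every other element
print(slice_list([1, 2, 3, 4], -1, 0, -1))        # walk backwards
```

In plain Python the stop index is exclusive, so the last call returns [4, 3, 2] and leaves out the first element. The changelog does not say whether fmu's slice() treats the stop index the same way, so it is worth checking on a sample file before relying on it to reverse a whole list.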
Version 0.12.0
- Compute Operations
- New --compute option for the update command to calculate and set frontmatter values
- Support for literal values, placeholder references, and function calls
- Built-in functions: now(), list(), hash(string, length), concat(string, ...)
- Placeholder references: $filename, $filepath, $content, $frontmatter.name, $frontmatter.name[index]
- Auto-create frontmatter fields if they don't exist
- Automatically append to list fields when computing values
- Formula Types
- Literals: Set static values like 1, 2nd, any text
- Placeholders: Reference file metadata and frontmatter fields
- Functions: Dynamic value generation with built-in functions
- Use Cases
- Generate timestamps with =now()
- Create content IDs with =hash($frontmatter.url, 10)
- Build dynamic URLs with =concat(/post/, $frontmatter.id)
- Initialize empty arrays with =list()
- Store file metadata in frontmatter
- Documentation
- Updated CLI.md with compute examples and function reference
- Updated API.md with compute operation specifications
- Updated SPECS.md with compute formula examples
- All 164 tests passing (28 new tests for compute functionality)
Version 0.11.0
- Documentation Reorganization
- Extracted CLI Command Reference to separate CLI.md file
- Extracted Library API Reference to separate API.md file
- Streamlined README.md to focus on Features, Installation, Getting Started, Changelog, and Misc sections
- Added Documentation section with links to CLI, API, and Specs documentation
- Enhanced SPECS.md with up-to-date command and option information
- All documentation now reflects current implementation and features through v0.10.0
Version 0.10.0
- File Output Feature
- New --file option to save command output directly to files
- Works with all output modes (frontmatter, content, both, template)
- Enable file-based workflows for batch processing
- Multiple commands in specs files can output to different files
- Enhanced Integration
- Seamless integration with specs file execution
- Each command can specify independent output destination
- Console and file output can be mixed in the same workflow
- Use Cases
- Export metadata to JSON files for further processing
- Generate data files for static site generators
- Create automated pipelines with file-based workflows
- Testing
- Added comprehensive tests for file output functionality
- All 136 tests passing
Version 0.9.0
- Template Output Feature
- New --output template option for custom formatting
- Template placeholders: $filename, $filepath, $content, $frontmatter.field
- Array indexing support: $frontmatter.field[N]
- Array values exported as JSON when accessed without index
- Character Escaping
- New --escape option to escape special characters
- Escapes: newline (\n), carriage return (\r), tab (\t), quotes (', ")
- Works with all output modes (frontmatter, content, both, template)
- Enhanced Read Command
- Template mode validation (requires --template when --output template)
- Support for complex output formats (JSON, custom text, etc.)
- Graceful handling of missing frontmatter fields in templates
- Library API Updates
- Template rendering functions available for library users
- Character escaping functions for text processing
Version 0.4.0
- New update command
- update command for modifying frontmatter fields in place
- Six case transformation types: upper, lower, Sentence case, Title Case, snake_case, kebab-case
- Flexible value replacement with substring and regex support
- Value removal with regex pattern support
- Automatic array deduplication (configurable)
- Multiple operations can be applied in sequence
- Enhanced CLI options
- --case option for case transformations
- --replace option for value replacement
- --remove option for value removal
- Shared --ignore-case and --regex options for both replace and remove operations
- --deduplication option to control array deduplication
- Library API enhancements
- update_frontmatter() function for programmatic updates
- update_and_output() function for direct console output
- Comprehensive operation support in library mode
- Comprehensive testing
- 27 new update tests covering all update functionality
- Enhanced error handling and edge case coverage
- Documentation updates
- Complete update command documentation
- Detailed update examples and use cases
- Enhanced API documentation with update functions
Version 0.3.0
- New validation command
- validate command for comprehensive frontmatter validation
- Eight validation types: exist, not, eq, ne, contain, not-contain, match, not-match
- Support for field existence, value equality, array content, and regex pattern validation
- Enhanced CLI capabilities
- Repeatable validation options (e.g., multiple --exist flags)
- Case-insensitive validation with --ignore-case
- CSV export for validation failures with detailed failure reasons
- Library API enhancements
- New validate_frontmatter() function for programmatic validation
- New validate_and_output() function for direct output
- Comprehensive validation rule format
- Comprehensive testing
- 30 new validation tests covering all validation types
- 7 new CLI tests for validation functionality
- Enhanced error handling and edge case coverage
- Documentation updates
- Complete validation command documentation
- Detailed validation examples and use cases
- Enhanced API documentation with validation functions
Version 0.2.0
- Enhanced search capabilities
- Array/list value matching: Search within array frontmatter fields
- Regex pattern matching: Use regular expressions for flexible value search
- Support for both scalar and array field searches
- New CLI options
- --regex flag for enabling regex pattern matching
- Improved help documentation with regex examples
- Library API enhancements
- Updated search_frontmatter() function with regex parameter
- Backward compatible with existing code
- Comprehensive testing
- Added tests for array value matching
- Added tests for regex functionality
- Added CLI tests for new features
- Documentation updates
- Detailed regex support documentation
- Enhanced examples and usage patterns
Version 0.1.0
- Initial release
- YAML frontmatter parsing
- CLI with read and search commands
- Library API for programmatic usage
- Glob pattern support
- CSV export functionality
- Case-sensitive and case-insensitive search
- Comprehensive test suite
Misc
- Download stats: https://pypistats.org/packages/frontmatter-utils