python-recutils

A Python implementation of GNU recutils, a set of tools and libraries to access human-editable, text-based databases called recfiles.

Installation

pip install python-recutils

Or with uv:

uv add python-recutils

Development

To contribute or modify the library:

git clone https://github.com/hkhanna/python-recutils.git
cd python-recutils
uv sync

Run tests with:

uv run pytest tests/ -v

Publishing

  1. Bump the package version: uv version --bump <major|minor|patch>
  2. Tag and push: git tag v0.x.x && git push --tags

Usage

Parsing Rec Files

from recutils import parse, parse_file

# Parse from string
data = """
Name: Ada Lovelace
Age: 36

Name: Peter the Great
Age: 53
"""

record_sets = parse(data)
for rs in record_sets:
    for record in rs.records:
        print(record.get_field('Name'), record.get_field('Age'))

# Parse from file
with open('contacts.rec') as f:
    record_sets = parse_file(f)

Using recsel

The recsel function mirrors the interface of the recsel command-line utility.

from recutils import recsel, format_recsel_output

data = """
%rec: Book
%mandatory: Title

Title: GNU Emacs Manual
Author: Richard M. Stallman
Location: home

Title: The Colour of Magic
Author: Terry Pratchett
Location: loaned

Title: Mio Cid
Author: Anonymous
Location: home
"""

# Select all books
result = recsel(data, record_type='Book')
print(format_recsel_output(result))

# Select with expression (like recsel -e)
result = recsel(data, record_type='Book', expression="Location = 'home'")

# Select by position (like recsel -n)
result = recsel(data, record_type='Book', indexes='0,2')

# Print specific fields (like recsel -p)
result = recsel(data, record_type='Book', print_fields='Title,Author')

# Print values only (like recsel -P)
result = recsel(data, record_type='Book', print_values='Title')
# Returns: "GNU Emacs Manual\nThe Colour of Magic\nMio Cid"

# Count records (like recsel -c)
count = recsel(data, record_type='Book', count=True)
# Returns: 3

# Sort output (like recsel -S)
result = recsel(data, record_type='Book', sort='Title')

# Random selection (like recsel -m)
result = recsel(data, record_type='Book', random_count=2)

recsel Options

| Option | CLI Equivalent | Description |
| --- | --- | --- |
| record_type | -t TYPE | Select records of this type |
| indexes | -n INDEXES | Select by position (e.g., "0,2,4-9") |
| expression | -e EXPR | Selection expression filter |
| quick | -q STR | Quick substring search |
| random_count | -m NUM | Select N random records |
| print_fields | -p FIELDS | Print fields with names |
| print_values | -P FIELDS | Print field values only |
| print_row | -R FIELDS | Print values space-separated |
| count | -c | Return count of matches |
| include_descriptors | -d | Include record descriptors |
| collapse | -C | Don't separate with blank lines |
| case_insensitive | -i | Case-insensitive matching |
| sort | -S FIELDS | Sort by fields |
| group_by | -G FIELDS | Group by fields |
| uniq | -U | Remove duplicate fields |
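
To make the indexes format concrete, here is a minimal standard-library sketch of how a spec such as "0,2,4-9" could be expanded into positions. It is an illustration only, not the library's implementation: parse_indexes is a hypothetical helper, and it assumes zero-based positions and inclusive ranges.

```python
def parse_indexes(spec):
    """Expand a spec such as "0,2,4-9" into a sorted list of positions.

    Hypothetical helper: assumes zero-based positions and inclusive ranges.
    """
    positions = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # A range like "4-9" covers both endpoints.
            start, end = part.split("-", 1)
            positions.update(range(int(start), int(end) + 1))
        else:
            positions.add(int(part))
    return sorted(positions)


titles = ["GNU Emacs Manual", "The Colour of Magic", "Mio Cid"]

# Keep only the positions that exist in the list.
selected = [titles[i] for i in parse_indexes("0,2") if i < len(titles)]
expanded = parse_indexes("0,2,4-9")
```

With the three books from the example above, "0,2" picks the first and third titles.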

Selection Expressions

Selection expressions filter records based on field values:

# Numeric comparisons
recsel(data, expression="Age < 18")
recsel(data, expression="Score >= 90")

# String equality
recsel(data, expression="Name = 'John'")
recsel(data, expression="Status != 'inactive'")

# Regex matching
recsel(data, expression=r"Email ~ '\.org$'")

# Logical operators
recsel(data, expression="Age > 18 && Status = 'active'")
recsel(data, expression="Role = 'admin' || Role = 'superuser'")
recsel(data, expression="!Disabled")

# Field count
recsel(data, expression="#Email > 1")  # Records with multiple Email fields

# Field subscripts
recsel(data, expression="Email[0] ~ 'primary'")  # First Email field

# Implies operator
recsel(data, expression="Premium => Discount")  # If Premium, must have Discount

# Ternary conditional
recsel(data, expression="Age > 18 ? 1 : 0")

# String concatenation
recsel(data, expression="First & ' ' & Last = 'John Doe'")

# Arithmetic
recsel(data, expression="Price * Quantity > 100")
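
The field count, subscript, and implies operators are the least familiar of these. The sketch below restates them in plain Python against a hypothetical stand-in record (a dict mapping field names to lists of values). It does not use the library's evaluator, and two choices are assumptions: a missing subscript yields an empty string, and the operands of => are treated as "field is present" tests.

```python
import re

# Hypothetical stand-in for a record: field name -> list of values, in order.
record = {
    "Email": ["alice@primary.example.org", "alice@work.example.com"],
    "Premium": ["yes"],
}


def field_count(rec, name):
    """Counterpart of #Name: how many fields with this name the record has."""
    return len(rec.get(name, []))


def subscript(rec, name, index):
    """Counterpart of Name[index]; returns '' when absent (an assumption)."""
    values = rec.get(name, [])
    return values[index] if index < len(values) else ""


def implies(a, b):
    """A => B is false only when A holds and B does not."""
    return (not a) or b


# #Email > 1
has_multiple_emails = field_count(record, "Email") > 1

# Email[0] ~ 'primary'  (unanchored regex match)
first_is_primary = bool(re.search("primary", subscript(record, "Email", 0)))

# Premium => Discount, reading both operands as presence tests (an assumption)
premium_ok = implies("Premium" in record, "Discount" in record)
```

Here the record has Premium but no Discount, so the implication fails and the record would not be selected.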

Using recfix

The recfix function checks and fixes rec files, similar to the recfix command-line utility.

from recutils import recfix, format_recfix_output

data = """
%rec: Contact
%mandatory: Name Email
%type: Age int
%key: Id
%auto: Id

Name: Alice
Email: alice@example.com
Age: 30

Name: Bob
Email: bob@example.com
Age: twenty-five
"""

# Check integrity (default behavior)
result = recfix(data)
if not result.success:
    print(result.format_errors())
    # Output: error: type 'Contact' record 1 field 'Age': expected integer, got 'twenty-five'

# Sort records according to %sort specification
data_with_sort = """
%rec: Book
%sort: Title

Title: Zebra Tales

Title: Apple Picking

Title: Mountain Views
"""
result = recfix(data_with_sort, sort=True)
print(format_recfix_output(result))

# Generate auto fields for records missing them
data_with_auto = """
%rec: Item
%key: Id
%auto: Id

Name: First Item

Name: Second Item
"""
result = recfix(data_with_auto, auto=True)
# Records now have auto-generated Id fields (0, 1, ...)

# Encrypt confidential fields
data_with_confidential = """
%rec: User
%confidential: Password

Name: Alice
Password: secret123
"""
result = recfix(data_with_confidential, encrypt=True, password="mykey")

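# Decrypt confidential fields
# (assumes the formatted output of the encrypt step is the encrypted rec text)
encrypted_data = format_recfix_output(result)
result = recfix(encrypted_data, decrypt=True, password="mykey")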

# Force operations even with integrity errors
result = recfix(data, sort=True, force=True)

recfix Options

| Option | CLI Equivalent | Description |
| --- | --- | --- |
| check | (default) | Check database integrity |
| sort | -s | Sort records per %sort specification |
| encrypt | --encrypt | Encrypt confidential fields |
| decrypt | --decrypt | Decrypt confidential fields |
| auto | -A | Generate auto fields |
| password | -p | Password for encryption/decryption |
| force | -f | Force operations even with integrity errors |

Integrity Checks

recfix validates records against their descriptor constraints:

  • %mandatory: Required fields must be present
  • %key: Key field must be unique across records and can only appear once per record
  • %unique: Field can only appear once per record
  • %singular: Field value must be unique across all records
  • %prohibit: Prohibited fields must not be present
  • %allowed: Only listed fields are allowed
  • %type: Field values must match their declared type
  • %typedef: Custom type definitions (checked for circular references and undefined types)
  • %constraint: Custom constraint expressions must evaluate to true
  • %size: Record set must have the specified number of records
  • %confidential: Fields marked confidential must be encrypted
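
The difference between %mandatory, %unique, and %singular is easy to mix up, so here is a small standard-library sketch of the three rules as described in the list above. It is not how recfix is implemented: check_records, its parameters, and the error message wording are all hypothetical, and records are modelled as lists of (name, value) tuples.

```python
from collections import Counter


def check_records(records, mandatory=(), unique=(), singular=()):
    """Return a list of error strings for a list of records.

    Hypothetical helper: each record is a list of (name, value) tuples.
    """
    errors = []
    seen = {name: set() for name in singular}  # values seen so far, per field
    for idx, record in enumerate(records):
        counts = Counter(name for name, _ in record)
        # %mandatory: the field must be present in every record.
        for name in mandatory:
            if counts[name] == 0:
                errors.append(f"record {idx}: missing mandatory field '{name}'")
        # %unique: the field may appear at most once within a record.
        for name in unique:
            if counts[name] > 1:
                errors.append(f"record {idx}: field '{name}' appears {counts[name]} times")
        # %singular: a value may not repeat across records.
        for name, value in record:
            if name in seen:
                if value in seen[name]:
                    errors.append(f"record {idx}: duplicate value '{value}' for '{name}'")
                seen[name].add(value)
    return errors


records = [
    [("Id", "1"), ("Name", "Alice"), ("Email", "a@x.org")],
    [("Id", "1"), ("Email", "b@x.org"), ("Email", "b@y.org")],
]
errors = check_records(records, mandatory=["Name"], unique=["Email"], singular=["Id"])
```

The second record breaks all three rules: it has no Name, two Email fields, and an Id value already used by the first record.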

Supported Types

The %type directive supports these built-in types:

| Type | Description | Example |
| --- | --- | --- |
| int | Integer (decimal, hex with 0x, octal with leading 0) | 42, 0xFF, 077 |
| real | Floating-point number | 3.14, -2.5 |
| bool | Boolean value | yes, no, true, false, 0, 1 |
| range MIN MAX | Integer within range | range 1 100 |
| size N | String with max length N | size 255 |
| line | Single-line string (no newlines) | |
| enum VAL1 VAL2... | One of the listed values | enum draft published archived |
| date | Date value | |
| email | Email address (must contain @) | user@example.com |
| uuid | UUID string | 550e8400-e29b-41d4-a716-446655440000 |
| regexp /PATTERN/ | String matching regex pattern | regexp /^[A-Z]{2}[0-9]{4}$/ |
| field | Valid field name | |
| rec TYPE | Foreign key reference to another record type | rec Contact |
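
As a rough illustration of what a few of these types accept, the sketch below validates int, range, enum, and bool values with the standard library only. It is not the library's validator: check_type and to_int are hypothetical helpers, and the integer pattern is a strict reading of the table above (a leading 0 means octal, so digits 8 and 9 are rejected after it).

```python
import re

# Decimal, hex with 0x, or octal with a leading 0 (strict reading, an assumption).
INT_RE = re.compile(r"-?(0[xX][0-9a-fA-F]+|0[0-7]*|[1-9][0-9]*)")


def to_int(text):
    """Convert an already validated integer literal to a Python int."""
    sign = -1 if text.startswith("-") else 1
    digits = text.lstrip("-")
    if digits.lower().startswith("0x"):
        return sign * int(digits, 16)
    if len(digits) > 1 and digits.startswith("0"):
        return sign * int(digits, 8)
    return sign * int(digits, 10)


def check_type(value, type_spec):
    """Return True if value satisfies a type spec such as 'range 0 100'."""
    kind, *args = type_spec.split()
    text = value.strip()
    if kind == "int":
        return bool(INT_RE.fullmatch(text))
    if kind == "range":
        low, high = int(args[0]), int(args[1])
        return bool(INT_RE.fullmatch(text)) and low <= to_int(text) <= high
    if kind == "enum":
        return text in args
    if kind == "bool":
        return text in {"yes", "no", "true", "false", "0", "1"}
    raise ValueError(f"type not covered by this sketch: {kind}")


age_ok = check_type("30", "int")
age_bad = check_type("twenty-five", "int")
```

The second call mirrors the recfix example earlier, where Age: twenty-five fails the int check.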

Custom types can be defined with %typedef:

data = """
%rec: Person
%typedef: Percentage range 0 100
%typedef: Status enum active inactive pending
%type: Score Percentage
%type: AccountStatus Status

Name: Alice
Score: 85
AccountStatus: active
"""

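The integrity checks mention that %typedef definitions are checked for circular references and undefined types. One way such a check could work is sketched below with the standard library; resolve_type and the BUILTINS set are hypothetical names, and this is not taken from the library's source.

```python
# Built-in type names from the table above.
BUILTINS = {
    "int", "real", "bool", "range", "size", "line", "enum",
    "date", "email", "uuid", "regexp", "field", "rec",
}


def resolve_type(spec, typedefs):
    """Follow typedef aliases until a built-in type spec is reached.

    Raises ValueError for undefined names and for circular definitions.
    """
    seen = []
    while True:
        head = spec.split()[0]
        if head in BUILTINS:
            return spec
        if head in seen:
            raise ValueError("circular typedef: " + " -> ".join(seen + [head]))
        if head not in typedefs:
            raise ValueError(f"undefined type: {head}")
        seen.append(head)
        spec = typedefs[head]


typedefs = {
    "Percentage": "range 0 100",
    "Status": "enum active inactive pending",
}
score_type = resolve_type("Percentage", typedefs)
status_type = resolve_type("Status", typedefs)
```

A pair of definitions such as A -> B and B -> A makes resolve_type raise instead of looping forever.
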
Working with Records

from recutils import Record, Field

# Create a record
record = Record(fields=[
    Field('Name', 'John Doe'),
    Field('Email', 'john@example.com'),
    Field('Email', 'john.doe@work.com'),  # Multiple fields with same name
])

# Access fields
name = record.get_field('Name')           # First value: 'John Doe'
emails = record.get_fields('Email')       # All values: ['john@example.com', 'john.doe@work.com']
count = record.get_field_count('Email')   # Count: 2
has_phone = record.has_field('Phone')     # False

# Convert to string (rec format)
print(str(record))
# Output:
# Name: John Doe
# Email: john@example.com
# Email: john.doe@work.com

Evaluating Expressions Directly

from recutils import evaluate_sex, Record, Field

record = Record(fields=[
    Field('Age', '25'),
    Field('Status', 'active'),
])

# Evaluate expression against a record
matches = evaluate_sex("Age > 18 && Status = 'active'", record)
# Returns: True

Rec Format Overview

Recfiles are text files with a simple format:

# Comments start with #

# Record descriptor (optional, defines record type)
%rec: Contact
%mandatory: Name
%type: Age int

# Records are separated by blank lines
Name: Alice Smith
Email: alice@example.com
Age: 30

Name: Bob Jones
Email: bob@example.com
Email: bob.jones@work.com
Age: 25
Phone: +1 555-1234

Key concepts:

  • Fields: Name: Value pairs
  • Records: Groups of fields separated by blank lines
  • Multi-line values: Use + continuation or \ line continuation
  • Record descriptors: Special records starting with %rec: that define record types
  • Comments: Lines starting with #
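
To show how little machinery the format needs, here is a minimal standard-library sketch that splits rec text into records. It is not the library's parser (use parse for real work): parse_rec is a hypothetical helper that handles comments, blank-line separation, repeated fields, and + continuation lines, and it deliberately skips backslash continuation.

```python
def parse_rec(text):
    """Parse rec-format text into a list of records.

    Each record is a list of (name, value) tuples, so repeated field
    names are kept in order. Descriptors come back as ordinary records.
    """
    records, current = [], []
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # comment line
        if not line.strip():
            # A blank line ends the current record.
            if current:
                records.append(current)
                current = []
        elif line.startswith("+") and current:
            # Continuation: add a new line to the previous field's value.
            name, value = current[-1]
            extra = line[1:]
            if extra.startswith(" "):
                extra = extra[1:]
            current[-1] = (name, value + "\n" + extra)
        elif ":" in line:
            name, _, value = line.partition(":")
            current.append((name.strip(), value.strip()))
    if current:
        records.append(current)
    return records


sample = """\
# Comments start with #
%rec: Contact
%mandatory: Name

Name: Alice Smith
Email: alice@example.com

Name: Bob Jones
Email: bob@example.com
Email: bob.jones@work.com
Note: first line
+ second line
"""

records = parse_rec(sample)
bob_emails = [value for name, value in records[2] if name == "Email"]
```

The first parsed record is the descriptor (%rec, %mandatory); the remaining two are the contacts.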

License

See LICENSE file for details.
