
Modular configuration system with composable settings and environment variable overrides

DataKnobs Config

A modular, reusable configuration system for composable settings with environment variable overrides, file loading, and optional object construction helpers.

Features

  • Modular Configuration: Organize configurations by type with atomic configuration units
  • Multiple Input Formats: Load from YAML or JSON files, or from Python dictionaries
  • Composable: Reference other configurations and compose complex setups
  • Environment Overrides: Override any configuration value via environment variables
  • Path Resolution: Automatically resolve relative paths to absolute paths
  • Object Construction: Optional helpers to build objects from configurations
  • Defaults Management: Global and type-specific default values
  • Caching: Cache constructed objects for efficiency

Installation

pip install dataknobs-config

Quick Start

from dataknobs_config import Config

# Load from dictionary
config = Config({
    "database": [
        {"name": "primary", "host": "localhost", "port": 5432},
        {"name": "secondary", "host": "backup.local", "port": 5433}
    ],
    "cache": [
        {"name": "redis", "host": "localhost", "port": 6379}
    ]
})

# Access configurations
primary_db = config.get("database", "primary")
print(primary_db["host"])  # localhost

# Load from file
config = Config.from_file("config.yaml")

# Load from multiple sources
config = Config("base.yaml", "overrides.json", {"extra": [...]})

Core Concepts

Atomic Configurations

Each configuration is an "atomic" unit - a dictionary of settings for a single object:

{
    "name": "primary",      # Optional, auto-generated if not provided
    "type": "database",     # Optional, inferred from parent key
    "host": "localhost",
    "port": 5432,
    # ... any other attributes
}

Configuration Structure

Internally, configurations are organized by type:

{
    "database": [           # Type name
        {...},              # Atomic config 1
        {...}               # Atomic config 2
    ],
    "cache": [
        {...}               # Atomic config
    ],
    "settings": {           # Special type for global settings
        "config_root": "/app/config",
        "default_timeout": 30
    }
}
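
The type inference and name generation described above can be pictured with a small sketch. This is not the library's implementation: the `normalize` helper and the `<type>_<index>` naming scheme are hypothetical, chosen only to make the idea concrete.

```python
def normalize(raw):
    """Fill in "type" and "name" for each atomic config (illustrative only)."""
    normalized = {}
    for type_name, entries in raw.items():
        if type_name == "settings":
            # The settings section is a plain dictionary, not a list of atomic configs.
            normalized[type_name] = dict(entries)
            continue
        filled = []
        for index, entry in enumerate(entries):
            atomic = dict(entry)
            atomic.setdefault("type", type_name)               # inferred from the parent key
            atomic.setdefault("name", f"{type_name}_{index}")  # hypothetical naming scheme
            filled.append(atomic)
        normalized[type_name] = filled
    return normalized


structure = normalize({
    "database": [{"host": "localhost"}, {"name": "replica", "host": "backup.local"}],
    "settings": {"default_timeout": 30},
})
print(structure["database"][0])
```

Entries that already carry a name or type keep their own values; only the missing keys are filled in.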

String References (xref)

Reference other configurations using the xref format:

config = Config({
    "database": [
        {"name": "primary", "host": "db.example.com"}
    ],
    "api": [
        {
            "name": "main",
            "database": "xref:database[primary]"  # Reference
        }
    ]
})

# Resolve references
api = config.resolve_reference("xref:api[main]")
print(api["database"]["host"])  # db.example.com

Reference Formats

  • xref:type[name] - Reference by name
  • xref:type[0] - Reference by index
  • xref:type[-1] - Reference last item
  • xref:type - Reference first/only item
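
To make the four formats concrete, here is a minimal sketch of how such a reference string could be parsed and looked up. The helpers `parse_xref` and `lookup` are hypothetical and are not part of the package; the actual resolver may behave differently, for example in how it treats names that look like numbers.

```python
import re

# Matches "xref:type" or "xref:type[selector]".
XREF_PATTERN = re.compile(r"^xref:(?P<type>[^\[\]]+)(?:\[(?P<selector>[^\[\]]+)\])?$")


def parse_xref(ref):
    """Split a reference string into (type_name, selector)."""
    match = XREF_PATTERN.match(ref)
    if match is None:
        raise ValueError(f"Not a valid xref: {ref!r}")
    type_name = match.group("type")
    selector = match.group("selector")
    if selector is None:
        return type_name, 0                  # bare "xref:type" means the first item
    if re.fullmatch(r"-?\d+", selector):
        return type_name, int(selector)      # index, possibly negative
    return type_name, selector               # anything else is treated as a name


def lookup(data, ref):
    """Find the atomic config that a reference points at."""
    type_name, selector = parse_xref(ref)
    entries = data[type_name]
    if isinstance(selector, int):
        return entries[selector]
    for entry in entries:
        if entry.get("name") == selector:
            return entry
    raise KeyError(f"No {type_name!r} configuration named {selector!r}")


data = {"database": [{"name": "primary"}, {"name": "secondary"}]}
print(lookup(data, "xref:database[primary]"))  # {'name': 'primary'}
print(lookup(data, "xref:database[-1]"))       # {'name': 'secondary'}
print(lookup(data, "xref:database"))           # {'name': 'primary'}
```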

Environment Variable Overrides

Override any configuration value using environment variables:

export DATAKNOBS_DATABASE__PRIMARY__HOST=prod.example.com
export DATAKNOBS_DATABASE__PRIMARY__PORT=5433
export DATAKNOBS_CACHE__REDIS__TTL=7200

config = Config({
    "database": [{"name": "primary", "host": "localhost", "port": 5432}],
    "cache": [{"name": "redis", "ttl": 3600}]
})

# Environment variables automatically override values
db = config.get("database", "primary")
print(db["host"])  # prod.example.com
print(db["port"])  # 5433 (converted to int)

Environment Variable Format

  • Pattern: DATAKNOBS_<TYPE>__<NAME_OR_INDEX>__<ATTRIBUTE>
  • Nested attributes: DATAKNOBS_DATABASE__0__CONNECTION__TIMEOUT
  • Automatic type conversion for integers, floats, and booleans
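
The pattern can be illustrated with a short sketch that splits a variable name into its parts and converts the value. The functions `parse_override` and `convert_value` are hypothetical, and two details are assumptions: that names are matched case-insensitively by lowercasing, and that booleans are spelled `true` or `false`.

```python
PREFIX = "DATAKNOBS_"


def convert_value(text):
    """Convert a string to bool, int, or float when it looks like one."""
    lowered = text.lower()
    if lowered in ("true", "false"):     # assumed boolean spellings
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text                          # leave everything else as a string


def parse_override(name, value):
    """Return (type_name, name_or_index, attribute_path, converted_value)."""
    if not name.startswith(PREFIX):
        raise ValueError(f"{name!r} does not start with {PREFIX!r}")
    parts = name[len(PREFIX):].lower().split("__")
    if len(parts) < 3:
        raise ValueError(f"{name!r} needs a type, a name or index, and an attribute")
    type_name, selector, *attribute_path = parts
    if selector.isdigit():
        selector = int(selector)         # "0" selects by index rather than by name
    return type_name, selector, attribute_path, convert_value(value)


print(parse_override("DATAKNOBS_DATABASE__PRIMARY__PORT", "5433"))
print(parse_override("DATAKNOBS_DATABASE__0__CONNECTION__TIMEOUT", "2.5"))
```

Each extra double-underscore segment after the name or index becomes one more step in the nested attribute path.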

File References

Reference external configuration files using the @ prefix:

# main.yaml
database:
  - "@database/primary.yaml"    # Load from file
  - "@database/secondary.yaml"

settings:
  config_root: /app/config       # Base path for relative references
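
As a rough sketch of what resolving an "@" reference against config_root involves, the hypothetical helper below maps an entry to a file path. It only computes the path and does not load anything; how the package itself reads and parses the file is not shown here.

```python
from pathlib import PurePosixPath


def resolve_file_reference(entry, config_root):
    """Turn an "@"-prefixed entry into a path under config_root."""
    if not (isinstance(entry, str) and entry.startswith("@")):
        return None                       # not a file reference
    relative = PurePosixPath(entry[1:])   # drop the "@" prefix
    if relative.is_absolute():
        return relative                   # absolute references are left alone
    return PurePosixPath(config_root) / relative


print(resolve_file_reference("@database/primary.yaml", "/app/config"))
# /app/config/database/primary.yaml
```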

Global Settings and Defaults

Configure global settings and defaults in the special settings section:

config = Config({
    "database": [{"name": "db1"}],
    "settings": {
        # Paths
        "config_root": "/app/config",           # Base path for "@"-prefixed config file references
        "global_root": "/app",                   # Base for path resolution (settings.path_resolution_attributes)
        "database.global_root": "/app/db",       # Type-specific base for path resolution
        
        # Path resolution (supports exact names and regex patterns)
        "path_resolution_attributes": [
            "config_path",                       # Exact match for all types
            "database.data_dir",                 # Exact match for database type only
            "/.*_path$/",                        # Regex: all attributes ending with "_path"
            "cache./.*_dir$/"                    # Regex: cache type attributes ending with "_dir"
        ],
        
        # Defaults
        "default_timeout": 30,                   # Global default
        "database.default_pool_size": 10        # Type-specific default
    }
})
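
The four entry styles in path_resolution_attributes can be read as a small matching rule. The sketch below is one way to express that rule; the `matches` function is hypothetical, and the way it tells a type prefix apart from a regex is an assumption rather than the package's documented behavior.

```python
import re


def matches(entry, type_name, attribute):
    """Check whether a path_resolution_attributes entry applies to an attribute."""
    scope = None
    pattern = entry
    # A "type." prefix limits the entry to one type. Split only when the part
    # before the first dot is a plain identifier, so a regex such as
    # "/.*_path$/" is not mistaken for a type prefix.
    head, dot, tail = entry.partition(".")
    if dot and head.isidentifier():
        scope, pattern = head, tail
    if scope is not None and scope != type_name:
        return False
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        # Slash-delimited entries are regular expressions.
        return re.search(pattern[1:-1], attribute) is not None
    return pattern == attribute


print(matches("/.*_path$/", "api", "log_path"))        # True
print(matches("cache./.*_dir$/", "database", "tmp_dir"))  # False
```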

Path Resolution

Attributes listed in path_resolution_attributes are resolved against global_root when they hold relative paths; absolute paths are left unchanged:

config = Config({
    "database": [{
        "name": "db1",
        "data_dir": "./data",              # Relative path
        "backup_dir": "/abs/path"          # Absolute path unchanged
    }],
    "settings": {
        "global_root": "/app",              # Base for path resolution
        "path_resolution_attributes": ["data_dir", "backup_dir"]
    }
})

db = config.get("database", "db1")
print(db["data_dir"])     # /app/data (resolved)
print(db["backup_dir"])   # /abs/path (unchanged)
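
The resolution step itself amounts to joining a relative value onto a root. A minimal sketch, using hypothetical helpers `root_for` and `resolve_path`, and assuming that a type-specific root such as "database.global_root" takes precedence over the global one:

```python
import posixpath


def root_for(settings, type_name):
    """Pick the type-specific root if present, otherwise the global root."""
    return settings.get(f"{type_name}.global_root", settings.get("global_root"))


def resolve_path(value, root):
    """Join a relative path onto root; leave absolute paths unchanged."""
    if posixpath.isabs(value):
        return value
    return posixpath.normpath(posixpath.join(root, value))


settings = {"global_root": "/app", "database.global_root": "/app/db"}
print(resolve_path("./data", "/app"))                            # /app/data
print(resolve_path("/abs/path", "/app"))                         # /abs/path
print(resolve_path("./data", root_for(settings, "database")))    # /app/db/data
```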

Object Construction (Optional)

Build objects directly from configurations:

# Using class attribute
config = Config({
    "database": [{
        "name": "primary",
        "class": "myapp.database.PostgreSQL",
        "host": "localhost",
        "port": 5432
    }]
})

# Build object
db = config.build_object("xref:database[primary]")
# Returns instance of myapp.database.PostgreSQL

# Using factory pattern
config = Config({
    "cache": [{
        "name": "redis",
        "factory": "myapp.cache.CacheFactory",
        "type": "redis",
        "host": "localhost"
    }]
})

cache = config.build_object("xref:cache[redis]")
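
Building an object from a "class" attribute typically comes down to importing a dotted path and calling the result. The sketch below shows that general technique with the standard library only. `load_class` and `build` are hypothetical names, and dropping "name" before calling the constructor is an assumption, not a statement about what build_object does.

```python
import importlib


def load_class(dotted_path):
    """Import "package.module.ClassName" and return the class object."""
    module_name, _, attribute = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


def build(atomic_config):
    """Instantiate the class named in the config with the remaining settings."""
    settings = dict(atomic_config)
    cls = load_class(settings.pop("class"))
    settings.pop("name", None)   # assumption: "name" identifies the config only
    return cls(**settings)


# fractions.Fraction stands in for an application class such as a database client.
value = build({
    "name": "half",
    "class": "fractions.Fraction",
    "numerator": 1,
    "denominator": 2,
})
print(value)  # 1/2
```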

Implementing Configurable Classes

from dataknobs_config import ConfigurableBase

class MyDatabase(ConfigurableBase):
    def __init__(self, host, port, **kwargs):
        self.host = host
        self.port = port
        
    @classmethod
    def from_config(cls, config):
        # Custom configuration logic
        return cls(**config)

Implementing Factories

from dataknobs_config import FactoryBase

class DatabaseFactory(FactoryBase):
    def create(self, **config):
        db_type = config.pop("type", "postgresql")
        if db_type == "postgresql":
            return PostgreSQL(**config)
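        elif db_type == "mysql":
            return MySQL(**config)
        # Fail loudly instead of silently returning None for unknown types
        raise ValueError(f"Unsupported database type: {db_type}")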

Lazy Factory Access

# Configuration with factory
config = Config({
    "database": [{
        "name": "primary",
        "factory": "myapp.db.DatabaseFactory",
        "type": "postgresql",
        "host": "localhost"
    }]
})

# Get the factory instance (cached)
factory = config.get_factory("database", "primary")
db1 = factory.create(database="app1")
db2 = factory.create(database="app2")

# Or get an instance directly
db = config.get_instance("database", "primary", database="myapp")

API Reference

Config Class

class Config:
    def __init__(self, *sources, use_env=True)

    @classmethod
    def from_file(cls, path) -> Config
    @classmethod
    def from_dict(cls, data) -> Config
    
    # Access
    def get_types() -> List[str]
    def get_count(type_name: str) -> int
    def get_names(type_name: str) -> List[str]
    def get(type_name: str, name_or_index: Union[str, int] = 0) -> dict
    def set(type_name: str, name_or_index: Union[str, int], config: dict)
    
    # References
    def resolve_reference(ref: str) -> dict
    def build_reference(type_name: str, name_or_index: Union[str, int]) -> str
    
    # Merging
    def merge(other: Config, precedence: str = "first")
    
    # Export
    def to_dict() -> dict
    def to_file(path: Path, format: str = None)
    
    # Object Construction
    def build_object(ref: str, cache: bool = True, **kwargs) -> Any
    def clear_object_cache(ref: str = None)
    
    # Lazy Factory Access
    def get_factory(type_name: str, name_or_index: Union[str, int] = 0) -> Any
    def get_instance(type_name: str, name_or_index: Union[str, int] = 0, **kwargs) -> Any

Examples

Multi-Environment Configuration

# base.yaml
database:
  - name: primary
    host: localhost
    port: 5432

# production.yaml  
database:
  - name: primary
    host: prod.db.example.com
    pool_size: 50

# Load with overrides
config = Config("base.yaml", "production.yaml")
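
How the two files combine is not spelled out above. One plausible reading, and only an assumption here, is that entries sharing a type and name are merged key by key, with later sources winning. The hypothetical `merge_sources` function sketches that reading on plain dictionaries and requires every entry to have a name.

```python
def merge_sources(*sources):
    """Merge sources in order; later values override earlier ones per key."""
    merged = {}
    for source in sources:
        for type_name, entries in source.items():
            by_name = merged.setdefault(type_name, {})
            for entry in entries:
                by_name.setdefault(entry["name"], {}).update(entry)
    return {type_name: list(named.values()) for type_name, named in merged.items()}


base = {"database": [{"name": "primary", "host": "localhost", "port": 5432}]}
production = {"database": [{"name": "primary", "host": "prod.db.example.com", "pool_size": 50}]}

combined = merge_sources(base, production)
print(combined["database"][0])
```

Under this reading the production host replaces localhost, the port from base.yaml survives, and pool_size is added.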

Service Discovery Integration

config = Config({
    "services": [
        {"name": "auth", "url": "http://auth:8000"},
        {"name": "api", "url": "http://api:8080"}
    ],
    "app": [{
        "name": "main",
        "auth_service": "xref:services[auth]",
        "api_service": "xref:services[api]"
    }]
})

app = config.resolve_reference("xref:app[main]")
# app["auth_service"]["url"] == "http://auth:8000"

Dynamic Configuration with Environment

# Development: export DATAKNOBS_DATABASE__PRIMARY__HOST=localhost
# Production:  export DATAKNOBS_DATABASE__PRIMARY__HOST=prod.db.aws.com

config = Config.from_file("config.yaml")
db = config.get("database", "primary")
# Automatically uses environment-appropriate host

Best Practices

  1. Use Type Organization: Group related configurations by type
  2. Leverage Defaults: Define common values in settings to avoid repetition
  3. Environment Overrides: Use for deployment-specific values (hosts, ports, credentials)
  4. File References: Split large configurations into manageable files
  5. Path Resolution: Use relative paths in configs for portability
  6. Object Caching: Enable caching for expensive object construction

Testing

Run tests with pytest:

pytest tests/

License

MIT License - see LICENSE file for details.
