Modular configuration system with composable settings and environment variable overrides

DataKnobs Config

A modular, reusable configuration system for composable settings with environment variable overrides, file loading, and optional object construction helpers.

Features

  • Modular Configuration: Organize configurations by type with atomic configuration units
  • Multiple Input Formats: Load from YAML, JSON files, or Python dictionaries
  • Composable: Reference other configurations and compose complex setups
  • Environment Overrides: Override any configuration value via environment variables
  • Path Resolution: Automatically resolve relative paths to absolute paths
  • Object Construction: Optional helpers to build objects from configurations
  • Defaults Management: Global and type-specific default values
  • Caching: Cache constructed objects for efficiency

Installation

pip install dataknobs-config

Quick Start

from dataknobs_config import Config

# Load from dictionary
config = Config({
    "database": [
        {"name": "primary", "host": "localhost", "port": 5432},
        {"name": "secondary", "host": "backup.local", "port": 5433}
    ],
    "cache": [
        {"name": "redis", "host": "localhost", "port": 6379}
    ]
})

# Access configurations
primary_db = config.get("database", "primary")
print(primary_db["host"])  # localhost

# Load from file
config = Config.from_file("config.yaml")

# Load from multiple sources
config = Config("base.yaml", "overrides.json", {"extra": [...]})

Core Concepts

Atomic Configurations

Each configuration is an "atomic" unit - a dictionary of settings for a single object:

{
    "name": "primary",      # Optional, auto-generated if not provided
    "type": "database",     # Optional, inferred from parent key
    "host": "localhost",
    "port": 5432,
    # ... any other attributes
}

Configuration Structure

Internally, configurations are organized by type:

{
    "database": [           # Type name
        {...},              # Atomic config 1
        {...}               # Atomic config 2
    ],
    "cache": [
        {...}               # Atomic config
    ],
    "settings": {           # Special type for global settings
        "config_root": "/app/config",
        "default_timeout": 30
    }
}
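A minimal sketch of how a lookup by name or index could work over the type-organized structure above. `lookup` and `DATA` are hypothetical names written for illustration; this is not the library's implementation of `Config.get`.

```python
from typing import Any, Dict, List, Union

# Hypothetical type-organized data, mirroring the structure shown above.
DATA: Dict[str, Any] = {
    "database": [
        {"name": "primary", "host": "localhost", "port": 5432},
        {"name": "secondary", "host": "backup.local", "port": 5433},
    ],
    "cache": [{"name": "redis", "host": "localhost", "port": 6379}],
}


def lookup(data: Dict[str, Any], type_name: str,
           name_or_index: Union[str, int] = 0) -> Dict[str, Any]:
    """Return one atomic config, selected by name (str) or position (int)."""
    items: List[Dict[str, Any]] = data[type_name]
    if isinstance(name_or_index, int):
        return items[name_or_index]  # negative indexes count from the end
    for item in items:
        if item.get("name") == name_or_index:
            return item
    raise KeyError(f"no {type_name!r} config named {name_or_index!r}")


print(lookup(DATA, "database", "secondary")["host"])  # backup.local
print(lookup(DATA, "database", -1)["port"])           # 5433
```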

String References (xref)

Reference other configurations using the xref format:

config = Config({
    "database": [
        {"name": "primary", "host": "db.example.com"}
    ],
    "api": [
        {
            "name": "main",
            "database": "xref:database[primary]"  # Reference
        }
    ]
})

# Resolve references
api = config.resolve_reference("xref:api[main]")
print(api["database"]["host"])  # db.example.com

Reference Formats

  • xref:type[name] - Reference by name
  • xref:type[0] - Reference by index
  • xref:type[-1] - Reference last item
  • xref:type - Reference first/only item
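The four formats above can be told apart with a small parser. The sketch below is an illustration only: `parse_xref` is a hypothetical helper, and the rule that a purely numeric selector is an index rather than a name is an assumption.

```python
import re
from typing import Tuple, Union

# Matches "xref:type" optionally followed by "[name]" or "[index]".
_XREF = re.compile(r"^xref:(?P<type>[^\[\]]+)(?:\[(?P<sel>[^\[\]]+)\])?$")


def parse_xref(ref: str) -> Tuple[str, Union[str, int]]:
    """Split an xref string into (type_name, name_or_index)."""
    match = _XREF.match(ref)
    if match is None:
        raise ValueError(f"not a valid xref: {ref!r}")
    type_name = match.group("type")
    selector = match.group("sel")
    if selector is None:
        return type_name, 0                # bare type: first/only item
    if re.fullmatch(r"-?\d+", selector):
        return type_name, int(selector)    # numeric selector: index
    return type_name, selector             # anything else: name


print(parse_xref("xref:database[primary]"))  # ('database', 'primary')
print(parse_xref("xref:database[-1]"))       # ('database', -1)
print(parse_xref("xref:database"))           # ('database', 0)
```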

Environment Variable Overrides

Override any configuration value using environment variables:

export DATAKNOBS_DATABASE__PRIMARY__HOST=prod.example.com
export DATAKNOBS_DATABASE__PRIMARY__PORT=5433
export DATAKNOBS_CACHE__REDIS__TTL=7200

config = Config({
    "database": [{"name": "primary", "host": "localhost", "port": 5432}],
    "cache": [{"name": "redis", "ttl": 3600}]
})

# Environment variables automatically override values
db = config.get("database", "primary")
print(db["host"])  # prod.example.com
print(db["port"])  # 5433 (converted to int)

Environment Variable Format

  • Pattern: DATAKNOBS_<TYPE>__<NAME_OR_INDEX>__<ATTRIBUTE>
  • Nested attributes: DATAKNOBS_DATABASE__0__CONNECTION__TIMEOUT
  • Automatic type conversion for integers, floats, and booleans
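To make the naming pattern concrete, here is a rough sketch of how such a variable name could be split and its value converted. `parse_env_name` and `convert` are hypothetical helpers. Lowercasing the segments and accepting only "true"/"false" as booleans are assumptions, not documented behavior.

```python
from typing import Any, List, Tuple, Union

PREFIX = "DATAKNOBS_"


def convert(value: str) -> Any:
    """Convert a string to bool, int, or float when it looks like one."""
    lowered = value.lower()
    if lowered in ("true", "false"):  # assumed boolean spellings
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value  # leave everything else as a string


def parse_env_name(name: str) -> Tuple[str, Union[str, int], List[str]]:
    """Split DATAKNOBS_<TYPE>__<NAME_OR_INDEX>__<ATTRIBUTE>[__<NESTED>...]."""
    if not name.startswith(PREFIX):
        raise ValueError(f"missing {PREFIX} prefix: {name!r}")
    parts = name[len(PREFIX):].lower().split("__")
    if len(parts) < 3:
        raise ValueError(f"expected at least three segments: {name!r}")
    type_name, selector, *attr_path = parts
    target: Union[str, int] = int(selector) if selector.isdigit() else selector
    return type_name, target, attr_path


print(parse_env_name("DATAKNOBS_DATABASE__0__CONNECTION__TIMEOUT"))
# ('database', 0, ['connection', 'timeout'])
print(convert("5433"))  # 5433
```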

File References

Reference external configuration files using the @ prefix:

# main.yaml
database:
  - "@database/primary.yaml"    # Load from file
  - "@database/secondary.yaml"

settings:
  config_root: /app/config       # Base path for relative references
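As an illustration of how an "@" entry might map to a file on disk, the sketch below joins the reference to `config_root`. `file_reference_path` is a hypothetical helper, and keeping absolute references unchanged is an assumption.

```python
from pathlib import PurePosixPath
from typing import Optional


def file_reference_path(entry: object, config_root: str) -> Optional[PurePosixPath]:
    """Return the file path for an "@"-prefixed entry, or None otherwise."""
    if not (isinstance(entry, str) and entry.startswith("@")):
        return None  # inline atomic configs are not file references
    target = PurePosixPath(entry[1:])
    # Absolute references are kept; relative ones are joined to config_root.
    return target if target.is_absolute() else PurePosixPath(config_root) / target


print(file_reference_path("@database/primary.yaml", "/app/config"))
# /app/config/database/primary.yaml
```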

Global Settings and Defaults

Configure global settings and defaults in the special settings section:

config = Config({
    "database": [{"name": "db1"}],
    "settings": {
        # Paths
        "config_root": "/app/config",           # Base path for "@"-prefixed config file references
        "global_root": "/app",                   # Base for path resolution (settings.path_resolution_attributes)
        "database.global_root": "/app/db",       # Type-specific base for path resolution
        
        # Path resolution (supports exact names and regex patterns)
        "path_resolution_attributes": [
            "config_path",                       # Exact match for all types
            "database.data_dir",                 # Exact match for database type only
            "/.*_path$/",                        # Regex: all attributes ending with "_path"
            "cache./.*_dir$/"                    # Regex: cache type attributes ending with "_dir"
        ],
        
        # Defaults
        "default_timeout": 30,                   # Global default
        "database.default_pool_size": 10        # Type-specific default
    }
})
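The four pattern styles accepted by path_resolution_attributes can be read as follows. This is a sketch under stated assumptions, not the library's matcher: `attribute_matches` is a hypothetical helper, and using `re.search` for the regex form is a guess.

```python
import re
from typing import Iterable

PATTERNS = [
    "config_path",        # exact match for all types
    "database.data_dir",  # exact match for the database type only
    "/.*_path$/",         # regex for all types
    "cache./.*_dir$/",    # regex for the cache type only
]


def attribute_matches(type_name: str, attr: str, patterns: Iterable[str]) -> bool:
    """Check whether (type_name, attr) is selected by any pattern."""
    for pattern in patterns:
        scope, rule = None, pattern
        # A "type." prefix limits the rule to one type. Regex rules start with
        # "/", so only split on a dot that appears before any slash.
        head, dot, tail = pattern.partition(".")
        if dot and "/" not in head:
            scope, rule = head, tail
        if scope is not None and scope != type_name:
            continue
        if len(rule) >= 2 and rule.startswith("/") and rule.endswith("/"):
            if re.search(rule[1:-1], attr):
                return True
        elif rule == attr:
            return True
    return False


print(attribute_matches("api", "log_path", PATTERNS))    # True
print(attribute_matches("api", "data_dir", PATTERNS))    # False
```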

Path Resolution

Automatically resolve relative paths to absolute paths:

config = Config({
    "database": [{
        "name": "db1",
        "data_dir": "./data",              # Relative path
        "backup_dir": "/abs/path"          # Absolute path unchanged
    }],
    "settings": {
        "global_root": "/app",              # Base for path resolution
        "path_resolution_attributes": ["data_dir", "backup_dir"]
    }
})

db = config.get("database", "db1")
print(db["data_dir"])     # /app/data (resolved)
print(db["backup_dir"])   # /abs/path (unchanged)
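The resolution step itself amounts to joining a relative value to the root and normalizing it. A minimal sketch, assuming POSIX-style paths; `resolve_paths` is a hypothetical helper, not part of the package:

```python
import posixpath
from typing import Any, Dict, Iterable


def resolve_paths(item: Dict[str, Any], root: str,
                  attributes: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of item with the listed relative paths made absolute."""
    resolved = dict(item)
    for attr in attributes:
        value = resolved.get(attr)
        if isinstance(value, str) and not posixpath.isabs(value):
            resolved[attr] = posixpath.normpath(posixpath.join(root, value))
    return resolved


example = resolve_paths(
    {"name": "db1", "data_dir": "./data", "backup_dir": "/abs/path"},
    root="/app",
    attributes=["data_dir", "backup_dir"],
)
print(example["data_dir"])    # /app/data
print(example["backup_dir"])  # /abs/path
```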

Object Construction (Optional)

Build objects directly from configurations:

# Using class attribute
config = Config({
    "database": [{
        "name": "primary",
        "class": "myapp.database.PostgreSQL",
        "host": "localhost",
        "port": 5432
    }]
})

# Build object
db = config.build_object("xref:database[primary]")
# Returns instance of myapp.database.PostgreSQL

# Using factory pattern
config = Config({
    "cache": [{
        "name": "redis",
        "factory": "myapp.cache.CacheFactory",
        "type": "redis",
        "host": "localhost"
    }]
})

cache = config.build_object("xref:cache[redis]")
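Building an object from a dotted "class" path generally comes down to an import followed by a constructor call. The sketch below shows that idea with a standard-library class so it runs anywhere. `build_from_config` is a hypothetical helper, and treating "name" and "class" as metadata rather than constructor arguments is an assumption.

```python
import importlib
from typing import Any, Dict

# Keys treated as metadata rather than constructor arguments (an assumption).
_RESERVED = {"name", "class"}


def build_from_config(item: Dict[str, Any]) -> Any:
    """Import the dotted "class" path and call it with the remaining keys."""
    module_name, _, class_name = item["class"].rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    kwargs = {key: value for key, value in item.items() if key not in _RESERVED}
    return cls(**kwargs)


half = build_from_config(
    {"name": "half", "class": "fractions.Fraction", "numerator": 1, "denominator": 2}
)
print(half)  # 1/2
```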

Implementing Configurable Classes

from dataknobs_config import ConfigurableBase

class MyDatabase(ConfigurableBase):
    def __init__(self, host, port, **kwargs):
        self.host = host
        self.port = port
        
    @classmethod
    def from_config(cls, config):
        # Custom configuration logic
        return cls(**config)

Implementing Factories

from dataknobs_config import FactoryBase

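class DatabaseFactory(FactoryBase):
    def create(self, **config):
        # PostgreSQL and MySQL stand for your own database classes
        db_type = config.pop("type", "postgresql")
        if db_type == "postgresql":
            return PostgreSQL(**config)
        elif db_type == "mysql":
            return MySQL(**config)
        # Fail loudly instead of silently returning None
        raise ValueError(f"Unsupported database type: {db_type}")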

Lazy Factory Access

# Configuration with factory
config = Config({
    "database": [{
        "name": "primary",
        "factory": "myapp.db.DatabaseFactory",
        "type": "postgresql",
        "host": "localhost"
    }]
})

# Get the factory instance (cached)
factory = config.get_factory("database", "primary")
db1 = factory.create(database="app1")
db2 = factory.create(database="app2")

# Or get an instance directly
db = config.get_instance("database", "primary", database="myapp")
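The "lazy" and "cached" parts can be pictured as a dictionary that creates each factory on first request. The sketch below uses stand-in classes (`ConnectionFactory`, `FactoryCache`) invented for illustration. How the real factory combines stored settings with per-call arguments is not specified here, so the merge shown is an assumption.

```python
from typing import Any, Dict, Tuple


class ConnectionFactory:
    """Stand-in factory: keeps shared settings and adds per-call ones."""

    def __init__(self, **defaults: Any) -> None:
        self.defaults = defaults

    def create(self, **overrides: Any) -> Dict[str, Any]:
        # Per-call arguments win over the shared defaults.
        return {**self.defaults, **overrides}


class FactoryCache:
    """Create each factory on first request, then reuse it."""

    def __init__(self) -> None:
        self._factories: Dict[Tuple[str, str], ConnectionFactory] = {}

    def get_factory(self, type_name: str, name: str,
                    **defaults: Any) -> ConnectionFactory:
        key = (type_name, name)
        if key not in self._factories:
            self._factories[key] = ConnectionFactory(**defaults)
        return self._factories[key]


factories = FactoryCache()
primary = factories.get_factory("database", "primary", host="localhost")
app1 = primary.create(database="app1")
app2 = primary.create(database="app2")
print(app1)  # {'host': 'localhost', 'database': 'app1'}
print(factories.get_factory("database", "primary") is primary)  # True
```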

API Reference

Config Class

class Config:
    def __init__(self, *sources, use_env=True)
    @classmethod
    def from_file(cls, path) -> Config
    @classmethod
    def from_dict(cls, data) -> Config
    
    # Access
    def get_types() -> List[str]
    def get_count(type_name: str) -> int
    def get_names(type_name: str) -> List[str]
    def get(type_name: str, name_or_index: Union[str, int] = 0) -> dict
    def set(type_name: str, name_or_index: Union[str, int], config: dict)
    
    # References
    def resolve_reference(ref: str) -> dict
    def build_reference(type_name: str, name_or_index: Union[str, int]) -> str
    
    # Merging
    def merge(other: Config, precedence: str = "first")
    
    # Export
    def to_dict() -> dict
    def to_file(path: Path, format: str = None)
    
    # Object Construction
    def build_object(ref: str, cache: bool = True, **kwargs) -> Any
    def clear_object_cache(ref: str = None)
    
    # Lazy Factory Access
    def get_factory(type_name: str, name_or_index: Union[str, int] = 0) -> Any
    def get_instance(type_name: str, name_or_index: Union[str, int] = 0, **kwargs) -> Any

Examples

Multi-Environment Configuration

# base.yaml
database:
  - name: primary
    host: localhost
    port: 5432

# production.yaml  
database:
  - name: primary
    host: prod.db.example.com
    pool_size: 50

# Load with overrides
config = Config("base.yaml", "production.yaml")
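One plausible reading of layering these two files is sketched below with plain dictionaries. `merge_sources` is a hypothetical helper. Whether the library updates same-named entries key by key or replaces them whole is not stated here, so the key-by-key merge is an assumption.

```python
from typing import Any, Dict, List

Source = Dict[str, List[Dict[str, Any]]]


def merge_sources(*sources: Source) -> Source:
    """Merge sources left to right; same-named entries are updated key by key."""
    merged: Source = {}
    for source in sources:
        for type_name, items in source.items():
            existing = merged.setdefault(type_name, [])
            by_name = {e["name"]: e for e in existing if "name" in e}
            for item in items:
                name = item.get("name")
                if name is not None and name in by_name:
                    by_name[name].update(item)  # later source wins per key
                else:
                    copy = dict(item)           # never mutate the inputs
                    existing.append(copy)
                    if name is not None:
                        by_name[name] = copy
    return merged


base = {"database": [{"name": "primary", "host": "localhost", "port": 5432}]}
production = {"database": [{"name": "primary", "host": "prod.db.example.com", "pool_size": 50}]}

merged = merge_sources(base, production)
print(merged["database"][0])
# {'name': 'primary', 'host': 'prod.db.example.com', 'port': 5432, 'pool_size': 50}
```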

Service Discovery Integration

config = Config({
    "services": [
        {"name": "auth", "url": "http://auth:8000"},
        {"name": "api", "url": "http://api:8080"}
    ],
    "app": [{
        "name": "main",
        "auth_service": "xref:services[auth]",
        "api_service": "xref:services[api]"
    }]
})

app = config.resolve_reference("xref:app[main]")
# app["auth_service"]["url"] == "http://auth:8000"

Dynamic Configuration with Environment

# Development: export DATAKNOBS_DATABASE__PRIMARY__HOST=localhost
# Production:  export DATAKNOBS_DATABASE__PRIMARY__HOST=prod.db.aws.com

config = Config.from_file("config.yaml")
db = config.get("database", "primary")
# Automatically uses environment-appropriate host

Best Practices

  1. Use Type Organization: Group related configurations by type
  2. Leverage Defaults: Define common values in settings to avoid repetition
  3. Environment Overrides: Use for deployment-specific values (hosts, ports, credentials)
  4. File References: Split large configurations into manageable files
  5. Path Resolution: Use relative paths in configs for portability
  6. Object Caching: Enable caching for expensive object construction

Testing

Run tests with pytest:

pytest tests/

License

MIT License - see LICENSE file for details.
