baldur.interfaces — Config, Runtime & Domain State Stores

The configuration-provider interface and its defaults, the database-health and session-invalidation providers, the runtime-config manager, and the per-domain state-store contracts (canary / chaos / config history / cross-cluster).

Configuration provider

ConfigProviderInterface

Bases: ABC

Abstract interface for configuration providers.

Implementations can load config from:

- Django settings
- Environment variables
- YAML/JSON files
- Consul/etcd
- AWS Parameter Store

get abstractmethod

get(key: str, default: Any = None) -> Any

Get a configuration value.

Parameters:

Name Type Description Default
key str

Configuration key (dot notation supported, e.g., "BALDUR.SLA.DEFAULT_HOURS")

required
default Any

Default value if key not found

None

Returns:

Type Description
Any

Configuration value or default

get_nested abstractmethod

get_nested(*keys: str, default: Any = None) -> Any

Get a nested configuration value.

Parameters:

Name Type Description Default
keys str

Path to the configuration value

()
default Any

Default value if path not found

None

Returns:

Type Description
Any

Configuration value or default

Example

provider.get_nested("BALDUR", "SLA", "DEFAULT_HOURS", default=24)

get_section abstractmethod

get_section(section: str) -> dict[str, Any]

Get an entire configuration section as a dictionary.

Parameters:

Name Type Description Default
section str

Section name (e.g., "BALDUR")

required

Returns:

Type Description
dict[str, Any]

Dictionary of configuration values

get_bool

get_bool(key: str, default: bool = False) -> bool

Get a boolean configuration value.

get_int

get_int(key: str, default: int = 0) -> int

Get an integer configuration value.

get_float

get_float(key: str, default: float = 0.0) -> float

Get a float configuration value.

get_str

get_str(key: str, default: str = '') -> str

Get a string configuration value.

DictConfigProvider

DictConfigProvider(config: dict[str, Any] | None = None)

Bases: ConfigProviderInterface

Simple dictionary-based configuration provider.

Useful for testing and simple deployments.

get

get(key: str, default: Any = None) -> Any

Get a configuration value using dot notation.

get_nested

get_nested(*keys: str, default: Any = None) -> Any

Get a nested configuration value.

get_section

get_section(section: str) -> dict[str, Any]

Get an entire configuration section.

set

set(key: str, value: Any) -> None

Set a configuration value (for testing).

update

update(config: dict[str, Any]) -> None

Update configuration with new values.
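The dot-notation lookup described for get can be sketched over a plain nested dict. This is a minimal illustration only, not DictConfigProvider's actual implementation; the helper name lookup_dotted and the sentinel are hypothetical.

```python
from typing import Any

# Sentinel so that a stored None is not mistaken for a missing key.
_MISSING = object()


def lookup_dotted(config: dict[str, Any], key: str, default: Any = None) -> Any:
    """Resolve a dotted key such as "BALDUR.SLA.DEFAULT_HOURS" in a nested dict."""
    node: Any = config
    for part in key.split("."):
        if not isinstance(node, dict):
            return default  # the path runs through a non-dict value
        node = node.get(part, _MISSING)
        if node is _MISSING:
            return default  # a path segment does not exist
    return node


config = {"BALDUR": {"SLA": {"DEFAULT_HOURS": 24}}}
hours = lookup_dotted(config, "BALDUR.SLA.DEFAULT_HOURS", default=8)
missing = lookup_dotted(config, "BALDUR.SLA.MAX_HOURS", default=72)
```

A get_nested("BALDUR", "SLA", "DEFAULT_HOURS") call could plausibly reuse the same walk by iterating over its positional keys instead of splitting a string.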

EnvConfigProvider

EnvConfigProvider(prefix: str = '', separator: str = '__')

Bases: ConfigProviderInterface

Environment variable based configuration provider.

Converts nested keys to environment variable format, e.g. "BALDUR.SLA.DEFAULT_HOURS" -> "BALDUR__SLA__DEFAULT_HOURS".

Supports JSON parsing for complex values.
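The key conversion and JSON parsing described above might look roughly like the sketch below. The helper names env_name and read_env are hypothetical, and two behaviours are assumptions rather than documented facts: that the prefix is simply prepended to the converted key, and that values which are not valid JSON fall back to the raw string.

```python
import json
import os
from typing import Any


def env_name(key: str, prefix: str = "", separator: str = "__") -> str:
    """Map "BALDUR.SLA.DEFAULT_HOURS" to "BALDUR__SLA__DEFAULT_HOURS"."""
    # Assumption: the prefix is concatenated as-is in front of the converted key.
    return prefix + key.replace(".", separator)


def read_env(key: str, default: Any = None, prefix: str = "", separator: str = "__") -> Any:
    raw = os.environ.get(env_name(key, prefix, separator))
    if raw is None:
        return default
    try:
        # JSON parsing turns "24" into 24, "true" into True, '{"a": 1}' into a dict.
        return json.loads(raw)
    except json.JSONDecodeError:
        # Assumption: non-JSON values are returned unchanged as strings.
        return raw


os.environ["BALDUR__SLA__DEFAULT_HOURS"] = "24"
os.environ["BALDUR__REGION"] = "eu-west-1"
```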

get

get(key: str, default: Any = None) -> Any

Get a configuration value from environment.

get_nested

get_nested(*keys: str, default: Any = None) -> Any

Get a nested configuration value.

get_section

get_section(section: str) -> dict[str, Any]

Get all environment variables with given prefix as a dict.

Note: This is limited compared to file-based config.

Database & session

DatabaseConnectionInfo dataclass

DatabaseConnectionInfo(
    alias: str,
    vendor: str = "unknown",
    is_usable: bool = False,
    metadata: dict[str, Any] = dict(),
)

Database connection metadata returned by health check.

DatabaseHealthProvider

Bases: ABC

Abstract interface for database connection health monitoring.

Provides framework-agnostic access to database connection metadata. Replaces direct django.db.connections usage in service layer.

check_connection abstractmethod

check_connection(
    alias: str = "default",
) -> DatabaseConnectionInfo

Check a specific database connection.

Returns connection metadata including vendor and usability status.

list_aliases abstractmethod

list_aliases() -> list[str]

List all configured database aliases.

Replaces: for alias in django.db.connections

close_all abstractmethod

close_all() -> None

Close all database connections.

Used in Gunicorn post_fork hooks to clean up file descriptors inherited from the parent process.

Replaces: for conn in connections.all(): conn.close()

health_check

health_check() -> bool

Convenience: check default connection health.

Can be registered as a callback in ConnectionHealthMonitor:

monitor.register_health_check(ConnectionType.DATABASE, "default", provider.health_check)
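As an illustration of the contract, here is a minimal sketch of a provider backed by sqlite3 connections. The dataclass and the abstract base are local stand-ins that mirror the signatures documented here so the example runs without baldur installed; SqliteHealthProvider is hypothetical, and the body of health_check is an assumption about what "check default connection health" means.

```python
import sqlite3
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DatabaseConnectionInfo:  # local stand-in for the documented dataclass
    alias: str
    vendor: str = "unknown"
    is_usable: bool = False
    metadata: dict[str, Any] = field(default_factory=dict)


class DatabaseHealthProvider(ABC):  # local stand-in for the documented interface
    @abstractmethod
    def check_connection(self, alias: str = "default") -> DatabaseConnectionInfo: ...

    @abstractmethod
    def list_aliases(self) -> list[str]: ...

    @abstractmethod
    def close_all(self) -> None: ...

    def health_check(self) -> bool:
        # Assumed behaviour: report usability of the "default" alias.
        return self.check_connection("default").is_usable


class SqliteHealthProvider(DatabaseHealthProvider):
    """Hypothetical provider over a dict of alias -> sqlite3 connection."""

    def __init__(self, connections: dict[str, sqlite3.Connection]) -> None:
        self._connections = connections

    def check_connection(self, alias: str = "default") -> DatabaseConnectionInfo:
        conn = self._connections.get(alias)
        if conn is None:
            return DatabaseConnectionInfo(alias=alias)  # unknown alias: not usable
        try:
            conn.execute("SELECT 1").fetchone()
            usable = True
        except sqlite3.Error:
            usable = False  # e.g. the connection has already been closed
        return DatabaseConnectionInfo(alias=alias, vendor="sqlite", is_usable=usable)

    def list_aliases(self) -> list[str]:
        return list(self._connections)

    def close_all(self) -> None:
        for conn in self._connections.values():
            conn.close()


provider = SqliteHealthProvider({"default": sqlite3.connect(":memory:")})
```

Because health_check takes no arguments and returns a bool, the bound method provider.health_check can be handed to any callback registry that expects a zero-argument health probe.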

SessionInvalidationProvider

Bases: ABC

Abstract interface for session invalidation across backends.

Implementations should return description strings for audit logging.

invalidate_user_sessions abstractmethod

invalidate_user_sessions(user_id: int | str) -> list[str]

Invalidate all active sessions for a user.

Returns:

Type Description
list[str]

List of invalidation description strings (e.g. ["django_sessions(2)", "redis_sessions(0:cleanup_failed)"]). Callers join these for audit logging and security event details.

get_active_session_count abstractmethod

get_active_session_count(user_id: int | str) -> int

Count active sessions for a user.

Useful for security audit compliance checks.
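A small in-memory sketch can show how the returned description strings are meant to be consumed. The base class is a local stand-in mirroring the documented signatures; InMemorySessionProvider, its add_session helper, and the "memory_sessions(n)" label format are hypothetical, modelled on the example strings given for the return value.

```python
from abc import ABC, abstractmethod
from typing import Union


class SessionInvalidationProvider(ABC):  # local stand-in for the documented interface
    @abstractmethod
    def invalidate_user_sessions(self, user_id: Union[int, str]) -> list[str]: ...

    @abstractmethod
    def get_active_session_count(self, user_id: Union[int, str]) -> int: ...


class InMemorySessionProvider(SessionInvalidationProvider):
    """Hypothetical backend that keeps session keys in a dict."""

    def __init__(self) -> None:
        self._sessions: dict[str, set[str]] = {}  # user id -> session keys

    def add_session(self, user_id: Union[int, str], session_key: str) -> None:
        # Helper for the example only; not part of the documented interface.
        self._sessions.setdefault(str(user_id), set()).add(session_key)

    def invalidate_user_sessions(self, user_id: Union[int, str]) -> list[str]:
        removed = self._sessions.pop(str(user_id), set())
        # One description string per backend, with the number of sessions removed.
        return [f"memory_sessions({len(removed)})"]

    def get_active_session_count(self, user_id: Union[int, str]) -> int:
        return len(self._sessions.get(str(user_id), set()))


provider = InMemorySessionProvider()
provider.add_session(42, "abc")
provider.add_session(42, "def")
descriptions = provider.invalidate_user_sessions(42)
audit_detail = ", ".join(descriptions)  # what a caller might write to the audit log
```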

Runtime config

RuntimeConfigManager

Bases: Protocol

Protocol for the PRO runtime configuration manager.

Domain state stores

CanaryRolloutStore

Bases: ABC

Abstract store for canary rollout state and config locks.

Rollout operations:

- get_rollout / save_rollout: individual rollout CRUD
- get_active_ids / add_active / remove_active: active set management
- find_completed: pattern-based search for terminal rollouts

Config lock operations:

- acquire_config_lock / release_config_lock / extend_config_lock: owner-scoped lock lifecycle
- get_config_lock_owner / is_config_locked: lock inspection
- force_release_config_lock: admin-only zombie lock cleanup

get_rollout abstractmethod

get_rollout(rollout_id: str) -> dict[str, Any] | None

Get rollout data by ID.

Parameters:

Name Type Description Default
rollout_id str

Rollout identifier

required

Returns:

Type Description
dict[str, Any] | None

Rollout data dict or None

save_rollout abstractmethod

save_rollout(
    rollout_id: str,
    data: dict[str, Any],
    ttl_seconds: int,
    expected_version: int | None = None,
) -> bool

Save rollout data with optional optimistic locking.

Parameters:

Name Type Description Default
rollout_id str

Rollout identifier

required
data dict[str, Any]

Serialized rollout data

required
ttl_seconds int

Time-to-live in seconds

required
expected_version int | None

If provided, save only if stored version matches. None = unconditional save (backward compatible).

None

Returns:

Type Description
bool

True if saved, False if version conflict.
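The optimistic-locking behaviour of save_rollout can be sketched with an in-memory store. This is a partial, hypothetical stand-in (InMemoryRolloutStore), not the real backend. It assumes the version lives under a "version" key in the rollout data, which the interface does not specify, and it ignores ttl_seconds.

```python
import threading
from typing import Any, Optional


class InMemoryRolloutStore:
    """Hypothetical partial store: only get_rollout and save_rollout."""

    def __init__(self) -> None:
        self._rollouts: dict[str, dict[str, Any]] = {}
        self._lock = threading.Lock()  # makes check-then-write atomic in-process

    def get_rollout(self, rollout_id: str) -> Optional[dict[str, Any]]:
        with self._lock:
            data = self._rollouts.get(rollout_id)
            return dict(data) if data is not None else None

    def save_rollout(
        self,
        rollout_id: str,
        data: dict[str, Any],
        ttl_seconds: int,  # accepted but ignored in this sketch
        expected_version: Optional[int] = None,
    ) -> bool:
        with self._lock:
            if expected_version is not None:
                stored = self._rollouts.get(rollout_id)
                # Assumption: the stored version is kept under data["version"].
                stored_version = stored.get("version") if stored else None
                if stored_version != expected_version:
                    return False  # version conflict: someone else wrote first
            self._rollouts[rollout_id] = dict(data)
            return True


store = InMemoryRolloutStore()
store.save_rollout("r1", {"version": 1, "percent": 5}, ttl_seconds=3600)
first = store.save_rollout("r1", {"version": 2, "percent": 25}, 3600, expected_version=1)
stale = store.save_rollout("r1", {"version": 2, "percent": 50}, 3600, expected_version=1)
```

The second writer still expects version 1, so its save is rejected; a caller would typically re-read the rollout, reapply its change, and retry.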

get_active_ids abstractmethod

get_active_ids() -> set[str]

Get IDs of all active rollouts.

Returns:

Type Description
set[str]

Set of active rollout IDs

add_active abstractmethod

add_active(rollout_id: str) -> None

Add a rollout to the active set.

Parameters:

Name Type Description Default
rollout_id str

Rollout identifier

required

remove_active abstractmethod

remove_active(rollout_id: str) -> None

Remove a rollout from the active set.

Parameters:

Name Type Description Default
rollout_id str

Rollout identifier

required

find_completed abstractmethod

find_completed(pattern: str) -> list[dict[str, Any]]

Find rollouts matching a key pattern.

Used to discover completed/terminal rollouts via SCAN.

Parameters:

Name Type Description Default
pattern str

Key pattern (e.g., '{prefix}canary:rollout:*')

required

Returns:

Type Description
list[dict[str, Any]]

List of rollout data dicts

acquire_config_lock abstractmethod

acquire_config_lock(
    config_type: str,
    rollout_id: str,
    timeout: timedelta | None = None,
) -> bool

Acquire a config lock (SET NX PX semantics).

Only one rollout can hold the lock for a given config_type. Lock auto-expires after timeout to prevent zombie locks.

Parameters:

Name Type Description Default
config_type str

Configuration type to lock

required
rollout_id str

Rollout ID as lock owner

required
timeout timedelta | None

Lock auto-expire duration (default: implementation-defined)

None

Returns:

Type Description
bool

True if lock was acquired

release_config_lock abstractmethod

release_config_lock(
    config_type: str, rollout_id: str
) -> bool

Release a config lock (atomic check-and-delete).

Only the lock owner (matching rollout_id) can release.

Parameters:

Name Type Description Default
config_type str

Configuration type

required
rollout_id str

Expected lock owner

required

Returns:

Type Description
bool

True if lock was released

get_config_lock_owner abstractmethod

get_config_lock_owner(config_type: str) -> str | None

Get the current lock owner for a config type.

Parameters:

Name Type Description Default
config_type str

Configuration type

required

Returns:

Type Description
str | None

Lock owner (rollout_id) or None if unlocked

is_config_locked abstractmethod

is_config_locked(config_type: str) -> bool

Check if a config type is currently locked.

Parameters:

Name Type Description Default
config_type str

Configuration type

required

Returns:

Type Description
bool

True if locked

force_release_config_lock abstractmethod

force_release_config_lock(config_type: str) -> bool

Force-release a config lock without owner check.

Admin-only operation for zombie lock cleanup.

Parameters:

Name Type Description Default
config_type str

Configuration type

required

Returns:

Type Description
bool

True if a lock was removed

extend_config_lock abstractmethod

extend_config_lock(
    config_type: str,
    rollout_id: str,
    additional_time: timedelta | None = None,
) -> bool

Extend a config lock's TTL.

Only the lock owner can extend.

Parameters:

Name Type Description Default
config_type str

Configuration type

required
rollout_id str

Expected lock owner

required
additional_time timedelta | None

New TTL measured from now (default: implementation-defined). This resets the deadline to now + additional_time (Redis PEXPIRE semantics); it does not add to the existing deadline. Repeated renewals therefore do not accumulate, and the crash-freeze expiry valve is preserved.

None

Returns:

Type Description
bool

True if TTL was extended
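The lock semantics described for acquire_config_lock, release_config_lock, and extend_config_lock can be sketched in memory with an injectable clock. InMemoryConfigLocks is hypothetical and not thread-safe; the five-minute DEFAULT_TIMEOUT is an assumption, since the documented default is implementation-defined.

```python
import time
from datetime import timedelta
from typing import Callable, Optional


class InMemoryConfigLocks:
    """Hypothetical single-process sketch of the config lock operations."""

    DEFAULT_TIMEOUT = timedelta(minutes=5)  # assumed; the real default is implementation-defined

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._locks: dict[str, tuple[str, float]] = {}  # config_type -> (owner, deadline)

    def get_config_lock_owner(self, config_type: str) -> Optional[str]:
        entry = self._locks.get(config_type)
        if entry is None:
            return None
        owner, deadline = entry
        if self._clock() >= deadline:
            del self._locks[config_type]  # expire lazily, like a TTL key
            return None
        return owner

    def acquire_config_lock(self, config_type: str, rollout_id: str,
                            timeout: Optional[timedelta] = None) -> bool:
        if self.get_config_lock_owner(config_type) is not None:
            return False  # set-if-absent: an unexpired lock blocks acquisition
        ttl = timeout if timeout is not None else self.DEFAULT_TIMEOUT
        self._locks[config_type] = (rollout_id, self._clock() + ttl.total_seconds())
        return True

    def release_config_lock(self, config_type: str, rollout_id: str) -> bool:
        if self.get_config_lock_owner(config_type) != rollout_id:
            return False  # only the owner may release
        del self._locks[config_type]
        return True

    def extend_config_lock(self, config_type: str, rollout_id: str,
                           additional_time: Optional[timedelta] = None) -> bool:
        if self.get_config_lock_owner(config_type) != rollout_id:
            return False  # only the owner may extend
        ttl = additional_time if additional_time is not None else self.DEFAULT_TIMEOUT
        # Reset the deadline to now + ttl; do not add to the previous deadline.
        self._locks[config_type] = (rollout_id, self._clock() + ttl.total_seconds())
        return True


now = [0.0]  # fake clock so the example is deterministic
locks = InMemoryConfigLocks(clock=lambda: now[0])
ten = timedelta(seconds=10)
acquired = locks.acquire_config_lock("circuit_breaker", "r1", ten)
contended = locks.acquire_config_lock("circuit_breaker", "r2", ten)
now[0] = 8.0
locks.extend_config_lock("circuit_breaker", "r1", ten)  # deadline becomes 18
locks.extend_config_lock("circuit_breaker", "r1", ten)  # still 18, not 28
now[0] = 18.0
taken_over = locks.acquire_config_lock("circuit_breaker", "r2", ten)
wrong_owner_release = locks.release_config_lock("circuit_breaker", "r1")
```

Because each extension resets the deadline relative to the current time, calling it twice at t=8 leaves the deadline at t=18, so a second rollout can take the lock as soon as that moment passes.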

ChaosExperimentStore

Bases: ABC

Abstract store for chaos experiment data.

Write operations (chaos service):

- save / delete: experiment lifecycle

Read operations (ChaosGuard + chaos service):

- get: single experiment lookup
- find_active: list all active experiments

save abstractmethod

save(
    experiment_id: str,
    data: dict[str, Any],
    ttl_seconds: int,
) -> None

Save an experiment with TTL.

Parameters:

Name Type Description Default
experiment_id str

Experiment identifier

required
data dict[str, Any]

Serialized experiment data

required
ttl_seconds int

Time-to-live in seconds

required

get abstractmethod

get(experiment_id: str) -> dict[str, Any] | None

Get experiment data by ID.

Parameters:

Name Type Description Default
experiment_id str

Experiment identifier

required

Returns:

Type Description
dict[str, Any] | None

Experiment data dict or None

delete abstractmethod

delete(experiment_id: str) -> None

Delete an experiment.

Parameters:

Name Type Description Default
experiment_id str

Experiment identifier

required

find_active abstractmethod

find_active() -> list[dict[str, Any]]

Find all active experiments.

Returns experiments where status == 'active' and not expired.

Returns:

Type Description
list[dict[str, Any]]

List of experiment data dicts
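The find_active filter (status == 'active' and not expired) can be illustrated with an in-memory sketch. InMemoryChaosStore is hypothetical. It assumes the status lives under a "status" key in the experiment data and that expiry is derived from ttl_seconds at save time; the "id" field in the sample data is likewise only for illustration.

```python
import time
from typing import Any, Callable, Optional


class InMemoryChaosStore:
    """Hypothetical single-process sketch of the chaos experiment store."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # experiment_id -> (data, expiry deadline)
        self._items: dict[str, tuple[dict[str, Any], float]] = {}

    def save(self, experiment_id: str, data: dict[str, Any], ttl_seconds: int) -> None:
        self._items[experiment_id] = (dict(data), self._clock() + ttl_seconds)

    def get(self, experiment_id: str) -> Optional[dict[str, Any]]:
        entry = self._items.get(experiment_id)
        if entry is None or self._clock() >= entry[1]:
            return None  # missing or past its TTL
        return dict(entry[0])

    def delete(self, experiment_id: str) -> None:
        self._items.pop(experiment_id, None)

    def find_active(self) -> list[dict[str, Any]]:
        now = self._clock()
        return [
            dict(data)
            for data, deadline in self._items.values()
            if data.get("status") == "active" and now < deadline
        ]


now = [0.0]  # fake clock so the example is deterministic
store = InMemoryChaosStore(clock=lambda: now[0])
store.save("e1", {"id": "e1", "status": "active"}, ttl_seconds=60)
store.save("e2", {"id": "e2", "status": "completed"}, ttl_seconds=60)
store.save("e3", {"id": "e3", "status": "active"}, ttl_seconds=5)
now[0] = 10.0
active_ids = [experiment["id"] for experiment in store.find_active()]
```

At t=10 only e1 qualifies: e2 is not active and e3 has outlived its five-second TTL.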

ConfigHistoryStore

Bases: ABC

Abstract store for configuration version history.

Each method maps to a domain operation:

- next_version: atomic version counter increment
- save_version: atomic version save (history + current)
- get_current / get_history / get_version_count: read operations
- clear: cleanup (testing / admin)

next_version abstractmethod

next_version(config_type: str) -> int

Atomically increment and return the next version number.

Parameters:

Name Type Description Default
config_type str

Configuration type (e.g., 'circuit_breaker', 'dlq')

required

Returns:

Type Description
int

New version number (starts from 1)

save_version abstractmethod

save_version(
    config_type: str,
    version_data: dict[str, Any],
    max_entries: int,
) -> None

Atomically save a version to history and update current pointer.

This operation MUST be atomic:

- Prepend to history list
- Trim history to max_entries
- Update current version pointer

Parameters:

Name Type Description Default
config_type str

Configuration type

required
version_data dict[str, Any]

Serialized version data dict

required
max_entries int

Maximum history entries to retain

required

get_current abstractmethod

get_current(config_type: str) -> dict[str, Any] | None

Get the current (latest) version data.

Parameters:

Name Type Description Default
config_type str

Configuration type

required

Returns:

Type Description
dict[str, Any] | None

Version data dict or None if no versions exist

get_history abstractmethod

get_history(
    config_type: str, limit: int
) -> list[dict[str, Any]]

Get version history (newest first).

Parameters:

Name Type Description Default
config_type str

Configuration type

required
limit int

Maximum number of entries to return

required

Returns:

Type Description
list[dict[str, Any]]

List of version data dicts, newest first

get_version_count abstractmethod

get_version_count(config_type: str) -> int

Get the number of stored versions.

Parameters:

Name Type Description Default
config_type str

Configuration type

required

Returns:

Type Description
int

Version count

clear abstractmethod

clear(config_type: str) -> None

Clear all history for a config type.

WARNING: Destructive operation. Use for testing/admin only.

Parameters:

Name Type Description Default
config_type str

Configuration type

required
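The three-step atomic save (prepend, trim, update current) and the version counter can be sketched in memory, with a threading.Lock standing in for whatever atomicity primitive a real backend provides. InMemoryConfigHistoryStore is hypothetical and partial: clear is omitted because the interface does not say whether it also resets the version counter.

```python
import threading
from typing import Any, Optional


class InMemoryConfigHistoryStore:
    """Hypothetical single-process sketch of the config history store."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counters: dict[str, int] = {}
        self._history: dict[str, list[dict[str, Any]]] = {}
        self._current: dict[str, dict[str, Any]] = {}

    def next_version(self, config_type: str) -> int:
        with self._lock:  # increment and read under one lock
            self._counters[config_type] = self._counters.get(config_type, 0) + 1
            return self._counters[config_type]

    def save_version(self, config_type: str, version_data: dict[str, Any],
                     max_entries: int) -> None:
        with self._lock:  # all three steps happen together or not at all
            history = self._history.setdefault(config_type, [])
            history.insert(0, dict(version_data))           # prepend (newest first)
            del history[max_entries:]                       # trim to max_entries
            self._current[config_type] = dict(version_data)  # update current pointer

    def get_current(self, config_type: str) -> Optional[dict[str, Any]]:
        with self._lock:
            current = self._current.get(config_type)
            return dict(current) if current is not None else None

    def get_history(self, config_type: str, limit: int) -> list[dict[str, Any]]:
        with self._lock:
            return [dict(v) for v in self._history.get(config_type, [])[:limit]]

    def get_version_count(self, config_type: str) -> int:
        with self._lock:
            return len(self._history.get(config_type, []))


store = InMemoryConfigHistoryStore()
for payload in ({"threshold": 5}, {"threshold": 7}, {"threshold": 9}):
    version = store.next_version("circuit_breaker")
    store.save_version("circuit_breaker", {"version": version, **payload}, max_entries=2)

history = store.get_history("circuit_breaker", limit=10)
```

With max_entries=2, the third save pushes version 1 out of the history while the counter keeps counting upward, so version numbers are never reused.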

CrossClusterStore

Bases: ABC

Abstract store for cross-cluster propagation and governance policy state.

Propagation request operations:

- save_request / get_request: request CRUD
- add_pending / remove_pending: per-cluster pending set management

Governance policy operations:

- save_policy / get_policy: policy CRUD

save_request abstractmethod

save_request(
    request_id: str, data: dict[str, Any], ttl_seconds: int
) -> None

Save a propagation request with TTL.

Parameters:

Name Type Description Default
request_id str

Request identifier

required
data dict[str, Any]

Serialized request data

required
ttl_seconds int

Time-to-live in seconds

required

get_request abstractmethod

get_request(request_id: str) -> dict[str, Any] | None

Get a propagation request by ID.

Parameters:

Name Type Description Default
request_id str

Request identifier

required

Returns:

Type Description
dict[str, Any] | None

Request data dict or None

add_pending abstractmethod

add_pending(cluster: str, request_id: str) -> None

Add a request to a cluster's pending set.

Parameters:

Name Type Description Default
cluster str

Target cluster name

required
request_id str

Request identifier

required

remove_pending abstractmethod

remove_pending(cluster: str, request_id: str) -> None

Remove a request from a cluster's pending set.

Parameters:

Name Type Description Default
cluster str

Target cluster name

required
request_id str

Request identifier

required

save_policy abstractmethod

save_policy(policy_id: str, data: dict[str, Any]) -> None

Save a governance policy.

Parameters:

Name Type Description Default
policy_id str

Policy identifier (typically config_type)

required
data dict[str, Any]

Serialized policy data

required

get_policy abstractmethod

get_policy(policy_id: str) -> dict[str, Any] | None

Get a governance policy by ID.

Parameters:

Name Type Description Default
policy_id str

Policy identifier

required

Returns:

Type Description
dict[str, Any] | None

Policy data dict or None
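A propagation request typically touches several of these methods in sequence. The sketch below walks through one such flow against a hypothetical in-memory store. InMemoryCrossClusterStore ignores ttl_seconds, and pending_for is a helper added for the example only; the interface documented here does not expose a way to read a pending set. The cluster names and data fields are made up.

```python
from typing import Any, Optional


class InMemoryCrossClusterStore:
    """Hypothetical single-process sketch of the cross-cluster store."""

    def __init__(self) -> None:
        self._requests: dict[str, dict[str, Any]] = {}
        self._pending: dict[str, set[str]] = {}  # cluster -> pending request ids
        self._policies: dict[str, dict[str, Any]] = {}

    def save_request(self, request_id: str, data: dict[str, Any], ttl_seconds: int) -> None:
        self._requests[request_id] = dict(data)  # ttl_seconds ignored in this sketch

    def get_request(self, request_id: str) -> Optional[dict[str, Any]]:
        data = self._requests.get(request_id)
        return dict(data) if data is not None else None

    def add_pending(self, cluster: str, request_id: str) -> None:
        self._pending.setdefault(cluster, set()).add(request_id)

    def remove_pending(self, cluster: str, request_id: str) -> None:
        self._pending.get(cluster, set()).discard(request_id)

    def save_policy(self, policy_id: str, data: dict[str, Any]) -> None:
        self._policies[policy_id] = dict(data)

    def get_policy(self, policy_id: str) -> Optional[dict[str, Any]]:
        data = self._policies.get(policy_id)
        return dict(data) if data is not None else None

    def pending_for(self, cluster: str) -> set[str]:
        # Example-only helper; not part of the documented interface.
        return set(self._pending.get(cluster, set()))


store = InMemoryCrossClusterStore()
store.save_policy("dlq", {"requires_approval": True})
store.save_request("req-1", {"config_type": "dlq", "targets": ["eu", "us"]}, ttl_seconds=600)
for cluster in ("eu", "us"):
    store.add_pending(cluster, "req-1")  # fan out to every target cluster
store.remove_pending("eu", "req-1")      # the "eu" cluster has applied the change
```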