baldur — Facade & Bootstrap

The top-level entry points: bootstrap wiring, the resilience facade, the leader-elected scheduler, SQL transaction scope, the admin server, and the framework-extra hooks. Every name below is covered by SemVer compatibility guarantees in v1.x.

Bootstrap

init

init(
    quarantine_callback: (
        Callable[[Exception], None] | None
    ) = None,
    task_backend: str = "inline",
) -> None

Initialize the Baldur framework.

Idempotent — safe to call multiple times. Second and later calls are silent DEBUG-level no-ops.

Parameters:

Name Type Description Default
quarantine_callback Callable[[Exception], None] | None

Optional callable invoked with the FatalConfigError instance when startup config validation fails fatally. Django passes its own _activate_quarantine_mode wrapper here. FastAPI/Flask adapters may pass None.

None
task_backend str

Scheduler execution mode (429 Part 6, C13).

- "inline" (default): jobs run on the in-process LeaderScheduler thread. Zero extra dependency; matches Sentry / OTEL precedent.
- "celery": scheduled ticks enqueue via the corresponding Celery @shared_task's .delay() instead of running inline. Requires pip install baldur-framework[celery] and an already-configured Celery app.
- "arq": reserved for a future baldur-framework[arq] extra; currently falls back to inline with a WARNING log.

'inline'
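
The idempotency and backend fallback rules above can be modeled in a few lines. This is a minimal stand-in under stated assumptions, not Baldur's implementation: `sketch_init`, `resolve_task_backend`, and `active_backend` are hypothetical names, and raising `ValueError` for an unknown backend is an assumption the reference does not confirm.

```python
import logging

logger = logging.getLogger("baldur_sketch")

_initialized = False      # hypothetical module-level guard
_active_backend = None    # backend chosen by the first call


def resolve_task_backend(task_backend: str) -> str:
    """Map the requested backend to the one that would actually run."""
    if task_backend in ("inline", "celery"):
        return task_backend
    if task_backend == "arq":
        # Reserved value: documented to fall back to inline with a WARNING.
        logger.warning("task_backend='arq' is reserved; falling back to inline")
        return "inline"
    # Assumption: the reference does not say how unknown values are handled.
    raise ValueError(f"unknown task_backend: {task_backend!r}")


def sketch_init(task_backend: str = "inline") -> bool:
    """Return True if this call did the work, False if it was a no-op."""
    global _initialized, _active_backend
    if _initialized:
        # Second and later calls are silent DEBUG-level no-ops.
        logger.debug("already initialized; skipping")
        return False
    _active_backend = resolve_task_backend(task_backend)
    _initialized = True
    return True


def active_backend() -> str:
    """Expose the backend selected by the first successful call."""
    return _active_backend
```

In this model only the first call decides the backend; a later call with a different `task_backend` changes nothing.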

Resilience facade

protect

protect(
    name: str,
    fn: Callable[[], T],
    *,
    fallback: Callable[[], T] | None = None,
    dlq: bool | None = None,
    retry: (
        bool
        | RetryPolicyConfig
        | ResiliencePolicy[T]
        | None
    ) = None,
    circuit_breaker: bool | None = None,
    timeout: float | None = _TIMEOUT_UNSET,
    idempotency_key: (
        str | Callable[[PolicyContext], str] | None
    ) = None,
    idempotency_fail_open: bool | None = None,
    idempotency_ttl: timedelta | None = None,
    idempotency_execution_ttl: timedelta | None = None,
    context: PolicyContext | None = None
) -> T

Run fn under Baldur's composed resilience pipeline and return its value.

Composition order (outer→inner): CircuitBreaker → Retry → Fallback. Final failures optionally flow through DLQSink. fn runs at most retry.max_attempts times; on all-failed without fallback the original exception is re-raised.
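
To make the nesting concrete, here is a toy model of the outer part of that pipeline: a breaker wrapping a retry loop wrapping fn, with final failures recorded in a list standing in for the DLQ. Every name here (`SketchBreaker`, `with_retry`, `run_pipeline`, `CircuitOpenError`) is hypothetical, the fallback stage is deliberately left out, and the breaker threshold logic is an assumption rather than Baldur's actual policy.

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Hypothetical error raised while the sketch breaker is open."""


class SketchBreaker:
    """Toy breaker: opens after `threshold` consecutive failed calls."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.failures = 0

    def call(self, inner: Callable[[], T]) -> T:
        if self.failures >= self.threshold:
            raise CircuitOpenError("circuit open")
        try:
            result = inner()
        except Exception:
            self.failures += 1  # one failure per exhausted retry run
            raise
        self.failures = 0
        return result


def with_retry(fn: Callable[[], T], max_attempts: int) -> Callable[[], T]:
    """Wrap fn so it runs at most max_attempts times."""
    def run() -> T:
        for attempt in range(1, max_attempts + 1):
            try:
                return fn()
            except Exception:
                if attempt == max_attempts:
                    raise  # re-raise the underlying error unchanged
        raise ValueError("max_attempts must be >= 1")
    return run


def run_pipeline(
    name: str,
    fn: Callable[[], T],
    *,
    breaker: SketchBreaker,
    max_attempts: int,
    dlq: List[Tuple[str, str]],
) -> T:
    """Breaker (outer) wraps retry (inner); final failures go to `dlq`."""
    try:
        return breaker.call(with_retry(fn, max_attempts))
    except Exception as exc:
        dlq.append((name, type(exc).__name__))
        raise
```

Because the breaker sits outside the retry loop, it sees one failure per exhausted retry run rather than one per attempt.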

Parameters:

Name Type Description Default
name str

Service identifier. Used as the Circuit Breaker key, Retry domain, and Prometheus label — keep it stable per downstream.

required
fn Callable[[], T]

Zero-argument callable to protect. Must be idempotent when retry is enabled — or supply idempotency_key= to dedup duplicate executions.

required
fallback Callable[[], T] | None

Optional zero-argument callable invoked when fn raises and the pipeline decides to fall back. Its return value is returned.

None
dlq bool | None

When True, final failures flow into the DLQ repository resolved via ProviderRegistry. None uses ProtectSettings.default_dlq.

None
retry bool | RetryPolicyConfig | ResiliencePolicy[T] | None

True uses RetryPolicyConfig.from_settings(domain=name); pass a RetryPolicyConfig for explicit control; pass a pre-built ResiliencePolicy (e.g. TenacityBridgePolicy) to use it as the retry stage directly. False/None disables retry (None consults ProtectSettings.default_retry).

None
circuit_breaker bool | None

When True, wraps fn in CircuitBreakerPolicy. None uses ProtectSettings.default_circuit_breaker.

None
idempotency_key str | Callable[[PolicyContext], str] | None

Opt into composed dedup so a retried (or re-submitted) operation runs its side effect once, and a concurrent in-flight duplicate (double-submit, duplicate webhook) is blocked rather than executed in parallel. A str is read as a PolicyContext field name (e.g. "order_id") and namespaced as f"{name}:{value}"; a Callable[[PolicyContext], str] builds a composite/custom key verbatim. Requires a context (raises ValueError otherwise); a missing/None/non-primitive field also raises ValueError rather than producing a false-dedup key. None (default) adds no idempotency stage. Note: dedup is exactly-once for the concurrent case; across a process crash between the side effect and the post-execution mark the operation may run again after the gate record's TTL expires (an essential at-least-once limit — true exactly-once requires a transactional outbox spanning your own side effect).

None
idempotency_fail_open bool | None

Cache-error fail direction for the dedup gate. None (default) consults IdempotencySettings.fail_open_on_cache_error (default fail-closed). False forces fail-closed (a cache I/O error during the check raises IdempotencyUnavailableError); True forces fail-open (the unverifiable check proceeds). Ignored when idempotency_key is None.

None
idempotency_ttl timedelta | None

Dedup memory window — how long a completed operation is remembered (how long duplicates stay blocked after success). None (default) uses the gate memory default (BALDUR_IDEMPOTENCY_GATE_MEMORY_TTL_SECONDS, 30 minutes unless tuned). Ignored when idempotency_key is None.

None
idempotency_execution_ttl timedelta | None

In-flight execution window — how long a running claim is honored before a crashed attempt becomes retryable. None (default) uses the gate execution default (30 minutes). Set it >= the worst-case runtime of fn; a value below it risks a concurrent duplicate run via stale takeover. Ignored when idempotency_key is None.

None
context PolicyContext | None

Optional PolicyContext carrying business identifiers (order_id, user_id, trace_id) for Guard/Hook/Sink propagation. Required when idempotency_key is supplied.

None
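
The idempotency_key resolution rules described in the table can be sketched as a single function. This is an illustration only: `SketchContext` is a hypothetical stand-in for PolicyContext, `resolve_key` is not a Baldur function, and the set of types treated as "primitive" is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional, Union


@dataclass
class SketchContext:
    """Hypothetical stand-in for PolicyContext."""
    order_id: Optional[Any] = None
    user_id: Optional[Any] = None
    extra: Dict[str, Any] = field(default_factory=dict)


# Assumption: which types count as primitive is not spelled out in the docs.
_PRIMITIVES = (str, int, float, bool)

KeySpec = Union[str, Callable[[SketchContext], str], None]


def resolve_key(
    name: str, idempotency_key: KeySpec, context: Optional[SketchContext]
) -> Optional[str]:
    if idempotency_key is None:
        return None  # no idempotency stage is added
    if context is None:
        raise ValueError("idempotency_key requires a context")
    if callable(idempotency_key):
        return idempotency_key(context)  # custom key, used verbatim
    value = getattr(context, idempotency_key, None)
    if value is None or not isinstance(value, _PRIMITIVES):
        # Refuse to build a key that could dedup unrelated operations.
        raise ValueError(f"context field {idempotency_key!r} is missing or non-primitive")
    return f"{name}:{value}"  # field value namespaced by service name
```

For example, `resolve_key("payments", "order_id", SketchContext(order_id="o-1"))` yields `"payments:o-1"`, while a context with no order_id raises ValueError instead of producing a shared key.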

Returns:

Type Description
T

Whatever fn (or fallback) returned on the succeeding branch.

Raises:

Type Description
Exception

Re-raises the underlying error when all branches fail and no fallback produced a value.

IdempotencyDuplicateError

A duplicate was blocked by the dedup gate — already completed (SKIP) or a concurrent in-flight call (ABORT). The same type @idempotent raises.

IdempotencyUnavailableError

A cache I/O error prevented the dedup check and idempotency_fail_open resolved to fail-closed (the default).

ValueError

idempotency_key supplied without a context, its field resolves to a missing/non-primitive value, or idempotency_ttl / idempotency_execution_ttl is not a positive timedelta.
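
The two TTL windows and the SKIP/ABORT outcomes can be modeled with an in-memory gate driven by an explicit clock. This is a sketch under assumptions, not the library's gate: `SketchGate`, `SketchDuplicateError`, `claim`, and `complete` are hypothetical names, the store is a plain dict with no locking, and time is passed in as seconds so the behavior is deterministic.

```python
from datetime import timedelta
from typing import Dict, Tuple


class SketchDuplicateError(Exception):
    """Stand-in for IdempotencyDuplicateError; `outcome` is 'SKIP' or 'ABORT'."""

    def __init__(self, outcome: str) -> None:
        super().__init__(outcome)
        self.outcome = outcome


def _seconds(name: str, value: timedelta) -> float:
    # Mirrors the documented rule: TTLs must be positive timedeltas.
    if not isinstance(value, timedelta) or value <= timedelta(0):
        raise ValueError(f"{name} must be a positive timedelta")
    return value.total_seconds()


class SketchGate:
    def __init__(self, memory_ttl: timedelta, execution_ttl: timedelta) -> None:
        self.memory_ttl = _seconds("idempotency_ttl", memory_ttl)
        self.execution_ttl = _seconds("idempotency_execution_ttl", execution_ttl)
        self._records: Dict[str, Tuple[str, float]] = {}  # key -> (state, expires_at)

    def claim(self, key: str, now: float) -> None:
        """Claim the key before running the side effect."""
        record = self._records.get(key)
        if record is not None and now < record[1]:
            # Live record: completed work is skipped, in-flight work aborts.
            raise SketchDuplicateError("SKIP" if record[0] == "done" else "ABORT")
        # No record, or an expired one (stale takeover): start a new claim.
        self._records[key] = ("running", now + self.execution_ttl)

    def complete(self, key: str, now: float) -> None:
        """Post-execution mark: remember the result for the memory window."""
        self._records[key] = ("done", now + self.memory_ttl)
```

A claim that is never completed models a crashed attempt: once the execution window passes, the next claim succeeds, which is why the execution TTL should cover the worst-case runtime of fn.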

aprotect async

aprotect(
    name: str,
    fn: Callable[[], Awaitable[T]],
    *,
    fallback: Callable[[], Awaitable[T]] | None = None,
    dlq: bool | None = None,
    retry: (
        bool
        | RetryPolicyConfig
        | ResiliencePolicy[T]
        | None
    ) = None,
    circuit_breaker: bool | None = None,
    timeout: float | None = _TIMEOUT_UNSET,
    idempotency_key: (
        str | Callable[[PolicyContext], str] | None
    ) = None,
    idempotency_fail_open: bool | None = None,
    idempotency_ttl: timedelta | None = None,
    idempotency_execution_ttl: timedelta | None = None,
    context: PolicyContext | None = None
) -> T

Async counterpart of protect().

Current async limitations (PR1):

- circuit_breaker=True raises NotImplementedError (AsyncCircuitBreakerPolicy is pending).
- retry=True, retry=RetryPolicyConfig, or retry=<ResiliencePolicy> raises NotImplementedError (AsyncRetryPolicy and AsyncTenacityBridgePolicy are pending).
- None defaults resolve to "async-appropriate off" for CB/Retry, regardless of ProtectSettings.default_* sync defaults.

Supported async kwargs: fallback, dlq, idempotency_key, idempotency_fail_open, idempotency_ttl, idempotency_execution_ttl, context.

Idempotency dedup is meaningful here even without async retry: it blocks a duplicate call (double-submit, duplicate webhook). A concurrent in-flight duplicate or an already-completed one raises IdempotencyDuplicateError instead of running the side effect. idempotency_fail_open matches protect (cache-error fail direction; fail-closed by default raises IdempotencyUnavailableError), as do idempotency_ttl (dedup memory window) and idempotency_execution_ttl (in-flight execution window).

The guard and hook are sync and are invoked by AsyncPolicyComposer around the async chain. When the async policies land, the raise paths will be removed without any API change.

Dedup is exactly-once for the concurrent case; across a process crash between the side effect and the post-execution mark the operation may run again after the gate record's TTL expires (an essential at-least-once limit — see protect).

protected

protected(
    name: str,
    *,
    fallback: Callable[[], Any] | None = None,
    dlq: bool | None = None,
    retry: (
        bool
        | RetryPolicyConfig
        | ResiliencePolicy[Any]
        | None
    ) = None,
    circuit_breaker: bool | None = None,
    timeout: float | None = _TIMEOUT_UNSET,
    idempotency_key: (
        str | Callable[[PolicyContext], str] | None
    ) = None,
    idempotency_fail_open: bool | None = None,
    idempotency_ttl: timedelta | None = None,
    idempotency_execution_ttl: timedelta | None = None,
    context_from: (
        Callable[..., PolicyContext] | None | Literal[False]
    ) = None
) -> Callable[[Callable[..., T]], Callable[..., T]]

Decorator form of protect().

Auto-detects coroutine functions and dispatches to aprotect() when appropriate, so @protected(...) works uniformly for sync and async callables. Arguments passed to the decorated function are forwarded to fn via partial binding.
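
The dispatch and argument forwarding described above can be approximated with `inspect.iscoroutinefunction` and `functools.partial`. This is a sketch of the general technique, not the library's decorator: `sketch_protected`, `_run_sync`, and `_run_async` are hypothetical, and the two runner functions simply call fn where the real pipeline would apply its policies.

```python
import asyncio
import functools
import inspect
from typing import Any, Awaitable, Callable


def _run_sync(name: str, fn: Callable[[], Any]) -> Any:
    """Placeholder for a sync pipeline: just calls the zero-argument fn."""
    return fn()


async def _run_async(name: str, fn: Callable[[], Awaitable[Any]]) -> Any:
    """Placeholder for an async pipeline: awaits the zero-argument fn."""
    return await fn()


def sketch_protected(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        if inspect.iscoroutinefunction(func):
            @functools.wraps(func)
            async def async_wrapper(*args: Any, **kwargs: Any) -> Any:
                # Bind call-site arguments so the pipeline sees a zero-arg callable.
                return await _run_async(name, functools.partial(func, *args, **kwargs))
            return async_wrapper

        @functools.wraps(func)
        def sync_wrapper(*args: Any, **kwargs: Any) -> Any:
            return _run_sync(name, functools.partial(func, *args, **kwargs))
        return sync_wrapper
    return decorator
```

The wrapper stays a coroutine function for async targets, so callers and frameworks that inspect the decorated object still see the right kind of callable.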

Parameters:

Name Type Description Default
name str

Service identifier. See the name parameter of protect().

required
fallback Callable[[], Any] | None

Same as protect().

None
dlq bool | None

Same as protect().

None
retry bool | RetryPolicyConfig | ResiliencePolicy[Any] | None

Same as protect().

None
circuit_breaker bool | None

Same as protect().

None
timeout float | None

Same as protect().

_TIMEOUT_UNSET
idempotency_key str | Callable[[PolicyContext], str] | None

Same as protect(). The decorator auto-builds the PolicyContext from the call site, so idempotency_key="order_id" reads the wrapped function's order_id argument. Incompatible with context_from=False (no context to read — raises ValueError).

None
idempotency_fail_open bool | None

Same as protect() — cache-error fail direction for the dedup gate (None consults the global setting).

None
idempotency_ttl timedelta | None

Same as protect() — dedup memory window for the completed-operation record.

None
idempotency_execution_ttl timedelta | None

Same as protect() — in-flight execution window; set it >= the wrapped function's worst-case runtime.

None
context_from Callable[..., PolicyContext] | None | Literal[False]

Controls auto-population of PolicyContext from the wrapped function's bound arguments. None (default) → auto-extract: order_id / user_id flow to the named fields, every primitive-typed bound arg flows into extra["request_data"] so DLQ entries carry searchable business identifiers and the full payload snapshot. Callable[..., PolicyContext] → custom extract. False → skip extraction; pass context=None to protect(). Use the False sentinel at privacy-sensitive callsites (e.g., @protected("auth.verify_password", context_from=False)).

None
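
The three context_from modes can be illustrated with `inspect.signature`. The names below (`SketchContext`, `auto_extract`, `build_context`) are hypothetical; whether default argument values are included, which types count as primitive, and how a custom extractor is invoked are all assumptions made for this sketch.

```python
import inspect
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional, Tuple


@dataclass
class SketchContext:
    """Hypothetical stand-in for PolicyContext."""
    order_id: Optional[Any] = None
    user_id: Optional[Any] = None
    extra: Dict[str, Any] = field(default_factory=dict)


_PRIMITIVES = (str, int, float, bool)  # assumed definition of "primitive"


def auto_extract(func: Callable[..., Any], args: Tuple[Any, ...], kwargs: Dict[str, Any]) -> SketchContext:
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()  # assumption: defaults count as bound arguments
    primitives = {k: v for k, v in bound.arguments.items() if isinstance(v, _PRIMITIVES)}
    return SketchContext(
        order_id=bound.arguments.get("order_id"),
        user_id=bound.arguments.get("user_id"),
        extra={"request_data": primitives},
    )


def build_context(context_from: Any, func: Callable[..., Any],
                  args: Tuple[Any, ...], kwargs: Dict[str, Any]) -> Optional[SketchContext]:
    if context_from is False:
        return None  # privacy-sensitive call site: no extraction at all
    if context_from is None:
        return auto_extract(func, args, kwargs)
    # Assumption: a custom extractor receives the call-site arguments.
    return context_from(*args, **kwargs)


def charge(order_id: str, amount: float, items: list, user_id: str = "u-9") -> None:
    """Example target function; the body is irrelevant to extraction."""
```

With `charge("o-1", 12.5, ["a"])`, auto-extraction captures order_id, amount, and user_id, while the non-primitive items list is left out of request_data.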

aprotected

aprotected(
    name: str,
    *,
    fallback: Callable[[], Awaitable[Any]] | None = None,
    dlq: bool | None = None,
    retry: (
        bool
        | RetryPolicyConfig
        | ResiliencePolicy[Any]
        | None
    ) = None,
    circuit_breaker: bool | None = None,
    timeout: float | None = _TIMEOUT_UNSET,
    idempotency_key: (
        str | Callable[[PolicyContext], str] | None
    ) = None,
    idempotency_fail_open: bool | None = None,
    idempotency_ttl: timedelta | None = None,
    idempotency_execution_ttl: timedelta | None = None,
    context_from: (
        Callable[..., PolicyContext] | None | Literal[False]
    ) = None
) -> Callable[
    [Callable[..., Awaitable[T]]],
    Callable[..., Awaitable[T]],
]

Async-only decorator. Use @protected for mixed sync/async callsites; prefer @aprotected when you want a type-checker error on misuse against a sync function.

context_from, idempotency_key, idempotency_fail_open, idempotency_ttl, and idempotency_execution_ttl behave identically to @protected (async parity). When AsyncCircuitBreakerPolicy / AsyncRetryPolicy land, DLQ entries from async pipelines will already carry the captured context.

Scheduler

get_leader_scheduler

get_leader_scheduler(
    resource_name: str = DEFAULT_SCHEDULER_RESOURCE,
) -> LeaderScheduler

Return the LeaderScheduler singleton.

Parameters:

Name Type Description Default
resource_name str

Resource name

DEFAULT_SCHEDULER_RESOURCE

Returns:

Type Description
LeaderScheduler

The LeaderScheduler instance

SQL storage

sql_transaction

sql_transaction(conn: Any) -> Any

Suspend repo-scoped auto-commit for the duration of the block.

Usage:

with sql_transaction(conn):
    dlq_repo.save(...)
    cb_repo.update(...)
# single commit (or rollback on exception) applies to both.

All repositories whose get_connection returns conn during the block skip their per-call commit. The context manager itself issues the final commit, or rollback on exception.

Admin server

start_admin_server

start_admin_server(
    port: int | None = None,
    bind: str | None = None,
    *,
    register_shutdown: bool = True
) -> AdminServer

Public entry point — start the admin server.

Arguments override the corresponding settings when provided; otherwise settings (BALDUR_ADMIN_*) apply. Subsequent calls return the already running server.

Parameters:

Name Type Description Default
port int | None

Override BALDUR_ADMIN_PORT.

None
bind str | None

Override BALDUR_ADMIN_BIND.

None
register_shutdown bool

When True (default), integrates with baldur.core.shutdown_coordinator.ShutdownCoordinator so the server stops cleanly on process shutdown.

True

Returns:

Type Description
AdminServer

The AdminServer instance.

Raises:

Type Description
AdminAuthRequiredError

Raised when the server is bound to a non-localhost address without an API key.

stop_admin_server

stop_admin_server(timeout: float = 5.0) -> None

Stop the singleton admin server if running. Idempotent.

Framework extras

fastapi_lifespan async

fastapi_lifespan(
    app: FastAPI,
) -> AsyncIterator[dict[str, Any]]

Initialize Baldur on app startup, drain on shutdown.

Yields an empty mapping so callers can attach lifespan-scoped state via the standard lifespan_state pattern; the dict is reserved for Baldur-internal use and may carry diagnostic state in future versions.

init_flask

init_flask(
    app: Flask,
    service_name: str | None = None,
    rate_limit: int | None = None,
    window_seconds: int | None = None,
) -> None

Initialize Baldur for a Flask app.

Parameters:

Name Type Description Default
app Flask

The Flask application instance.

required
service_name str | None

Optional upstream identity. When supplied, CB pre-flight + post-response observation are enabled. When None, the CB hooks are no-ops (rate limit + backpressure still apply).

None
rate_limit int | None

Per-instance override for the middleware rate limit (requests per window). None falls back to RateLimitSettings.middleware_rate_limit (default 0 = disabled). Pass a positive integer to enable rate limiting only for this Flask app.

None
window_seconds int | None

Per-instance window size override. None falls back to RateLimitSettings.middleware_window_seconds.

None
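
The per-instance override rules for rate_limit and window_seconds amount to a None-coalescing lookup. The sketch below assumes a settings object shaped like RateLimitSettings; `SketchRateLimitSettings` and `resolve_rate_limit` are hypothetical, the 60-second window default is invented for illustration, and treating non-positive limits as disabled is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SketchRateLimitSettings:
    """Hypothetical stand-in for RateLimitSettings."""
    middleware_rate_limit: int = 0       # documented default: 0 = disabled
    middleware_window_seconds: int = 60  # assumed value, not from the docs


def resolve_rate_limit(
    rate_limit: Optional[int],
    window_seconds: Optional[int],
    settings: SketchRateLimitSettings,
) -> Tuple[int, int, bool]:
    """Return (limit, window, enabled) after applying per-app overrides."""
    limit = settings.middleware_rate_limit if rate_limit is None else rate_limit
    window = settings.middleware_window_seconds if window_seconds is None else window_seconds
    # Assumption: any non-positive limit means rate limiting is off.
    return limit, window, limit > 0
```

Passing `rate_limit=100` for one app enables limiting there while other apps that pass None keep following the global setting.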