baldur.core.exceptions — Exception Hierarchy
Top-level re-export selection rule: a name is exported by
baldur/__init__.py iff it is (a) a domain base class, or (b) a
leaf class raised by code reachable from a top-level public surface.
Everything else stays in baldur.core.exceptions and is reachable via
the nested import path.
When-to-catch (13 top-level exceptions)
| Exception | When raised | Recovery hint |
|---|---|---|
| BaldurError | Catch-all base | Last-resort logging + alert |
| AdapterError | Adapter base (Redis/SQL/Kafka misuse) | Reconnect; fall back to in-memory |
| AdapterNotFoundError | ProviderRegistry.resolve(...) finds no match | Wire the missing adapter |
| CircuitBreakerError | CB base; domain-specific subclasses | Do NOT retry (non-retryable) |
| DLQError | DLQ base | Inspect DLQ entry; manual replay if needed |
| DLQReplayError | ReplayService.replay(...) could not complete | Mark for manual review |
| ResilienceError | Resilience-pattern base | Domain-specific fallback |
| RetryExhaustedError | All retry attempts failed | Send to DLQ; alert on SLA breach |
| TimeoutPolicyError | protect(timeout=...) exceeded | Cancel downstream; degrade |
| RateLimitExceeded | @rate_limit rejected the call | Surface 429; honor reset_at |
| IdempotencyDuplicateError | @idempotent detected a duplicate | Return cached result; do NOT retry |
| DomainValidationError | Domain-tag validation rejected the call | Caller bug: fix the domain identifier |
| ConfigurationError | Settings or wiring misconfiguration | Crash fast; operator must fix env |
Nested-only classes
exceptions
Baldur library exception hierarchy.
All library exceptions inherit from BaldurError.
Callers can use except BaldurError to catch any library error.
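A minimal sketch of how handler ordering plays out against this hierarchy. The classes below are local stand-ins, not the real baldur imports, and the direct AdapterError-to-BaldurError link is an assumption (the source only states that every library exception inherits from BaldurError).

```python
# Stand-in classes for illustration only; real code would import them
# from baldur (top-level names) or baldur.core.exceptions (nested-only).
class BaldurError(Exception):
    def __init__(self, message: str = "", *, code: str = "") -> None:
        super().__init__(message)
        self.code = code


class AdapterError(BaldurError):  # assumed to derive directly from BaldurError
    pass


class StoreError(AdapterError):  # documented base: AdapterError
    pass


def handled_by(exc: Exception) -> str:
    """Report which except clause catches exc (most specific first)."""
    try:
        raise exc
    except StoreError:
        return "StoreError"
    except AdapterError:
        return "AdapterError"
    except BaldurError:
        return "BaldurError"
    except Exception:
        return "not a library error"
```

Listing the narrow handlers first keeps the BaldurError clause as a last-resort net, which matches the "Last-resort logging + alert" hint in the when-to-catch table.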
Top-level re-export selection rule
A name in core/exceptions.__all__ is re-exported by baldur/__init__.py
iff it is either (a) a domain base class, or (b) a leaf class raised by code
reachable from a top-level public surface (protect, decorators, ...).
Re-exported at baldur top-level (12 names):
Bases — BaldurError, AdapterError, CircuitBreakerError,
DLQError, ResilienceError, ConfigurationError
Leaves — AdapterNotFoundError, RetryExhaustedError,
TimeoutPolicyError, RateLimitExceeded,
IdempotencyDuplicateError, DLQReplayError
Internal / nested-only (baldur.core.exceptions):
AdapterInitializationError, AdapterConnectionError,
RecoveryAdapterError, StoreError,
CircuitBreakerTransitionError, InvalidStateTransitionError,
DLQEntryNotFoundError, AuditError, RunbookError,
SettingsValidationError, StepExecutionError, StepTimeoutError,
CompensationError, ConcurrencyConflictError.
BaldurError
BaldurError(message: str = '', *, code: str = '')
Bases: Exception
Base exception for all baldur library errors.
extra_context
extra_context() -> dict[str, Any]
Return structlog-bindable context. Override in subclasses.
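One way a subclass override might look, sketched with stand-in classes. The empty-dict default on the base and the attribute names stored on the instance are assumptions; the source only specifies the return type and that subclasses override the method.

```python
from typing import Any


class BaldurError(Exception):
    """Stand-in for the documented base class."""

    def __init__(self, message: str = "", *, code: str = "") -> None:
        super().__init__(message)
        self.message = message
        self.code = code

    def extra_context(self) -> dict[str, Any]:
        # Assumption: the base contributes nothing by default.
        return {}


class ConcurrencyConflictError(BaldurError):
    """Stand-in that reuses the documented keyword-only fields."""

    def __init__(self, message: str = "", *, entity_id: str = "",
                 expected_version: int = 0, actual_version: int = 0) -> None:
        super().__init__(message)
        self.entity_id = entity_id
        self.expected_version = expected_version
        self.actual_version = actual_version

    def extra_context(self) -> dict[str, Any]:
        # Flat, JSON-friendly keys so a structured logger can bind them.
        return {
            "entity_id": self.entity_id,
            "expected_version": self.expected_version,
            "actual_version": self.actual_version,
        }


def log_fields(exc: BaldurError) -> dict[str, Any]:
    """Merge the error type with its bindable context for one log event."""
    return {"error_type": type(exc).__name__, **exc.extra_context()}
```

The resulting dict could then be passed to a structlog logger, for example via `logger.bind(**log_fields(exc))`.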
AdapterError
AdapterError(message: str = '', *, code: str = '')
AdapterNotFoundError
AdapterNotFoundError(
message: str = "",
*,
adapter_type: str = "",
adapter_name: str = "",
code: str = ""
)
AdapterInitializationError
AdapterInitializationError(
message: str = "", *, code: str = ""
)
AdapterConnectionError
AdapterConnectionError(
message: str = "", *, code: str = ""
)
RecoveryAdapterError
RecoveryAdapterError(
message: str = "",
*,
service_name: str = "",
replicas: int | None = None,
namespace: str = "",
code: str = ""
)
Bases: AdapterError
Raised when the recovery adapter encounters an error.
Used for input validation and workload detection failures.
StoreError
StoreError(message: str = '', *, code: str = '')
Bases: AdapterError
Base exception for domain state store errors.
Used by domain-specific stores (ConfigHistoryStore, CanaryRolloutStore, ChaosExperimentStore, CrossClusterStore) for data corruption or store-level logic errors.
Infrastructure failures (Redis down) are handled transparently by ResilientStorageBackend's silent fallback — this exception covers cases where the store itself encounters an unrecoverable problem (e.g. unparseable data, schema mismatch).
CircuitBreakerError
CircuitBreakerError(message: str = '', *, code: str = '')
CircuitBreakerTransitionError
CircuitBreakerTransitionError(
message: str = "", *, code: str = ""
)
InvalidStateTransitionError
InvalidStateTransitionError(
message: str = "",
*,
current: str = "",
target: str = "",
entity_id: str = "",
code: str = ""
)
Bases: BaldurError
Raised when an invalid state transition is attempted.
Follows the same pattern as CircuitBreakerTransitionError but is domain-agnostic (Recovery Session, etc.).
DLQError
DLQError(message: str = '', *, code: str = '')
DLQEntryNotFoundError
DLQEntryNotFoundError(message: str = '', *, code: str = '')
DLQStateConflictError
DLQStateConflictError(message: str = '', *, code: str = '')
Bases: DLQError
Raised when a DLQ operation violates an entry's state precondition.
Covers resolved/archived/at-cap/not-in-replayable-state conflicts (e.g. a double-click force-redrive, or a retry of an already-resolved entry). Maps to HTTP 409 Conflict at the handler layer, distinct from a not-found (404) or an unexpected replay-execution failure (500).
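The status mapping described above could be sketched as follows. The classes are stand-ins, `http_status_for` is a hypothetical helper rather than part of the library, and the DLQError base for DLQEntryNotFoundError and DLQReplayError is an assumption.

```python
class BaldurError(Exception):
    pass


class DLQError(BaldurError):
    pass


class DLQEntryNotFoundError(DLQError):  # base assumed
    pass


class DLQStateConflictError(DLQError):  # documented base: DLQError
    pass


class DLQReplayError(DLQError):  # base assumed
    pass


def http_status_for(exc: Exception) -> int:
    """Hypothetical handler-layer mapping from DLQ errors to HTTP status."""
    if isinstance(exc, DLQStateConflictError):
        return 409  # entry state precondition violated
    if isinstance(exc, DLQEntryNotFoundError):
        return 404  # no such entry
    return 500  # unexpected replay-execution failure or anything else
```

Keeping the conflict case separate lets a client tell "this entry was already handled" apart from a genuine server-side failure.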
DLQReplayError
DLQReplayError(message: str = '', *, code: str = '')
ResilienceError
ResilienceError(message: str = '', *, code: str = '')
RetryExhaustedError
RetryExhaustedError(message: str = '', *, code: str = '')
TimeoutPolicyError
TimeoutPolicyError(
timeout_seconds: float, message: str = ""
)
RateLimitExceeded
RateLimitExceeded(
message: str = "",
*,
key: str = "",
limit: int = 0,
window_seconds: int = 0,
reset_at: int = 0
)
Bases: ResilienceError
Raised by @rate_limit when a call is rejected by the limiter.
Function-level rejection signal — distinct from
RateLimitStorageError (storage-backend failure).
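A hedged sketch of surfacing the rejection as HTTP 429. It assumes `reset_at` is a Unix timestamp in seconds, which the signature does not confirm; the stand-in class and the `to_http_response` helper are illustrative only.

```python
import math
import time
from typing import Optional


class RateLimitExceeded(Exception):
    """Stand-in carrying the documented keyword-only fields."""

    def __init__(self, message: str = "", *, key: str = "", limit: int = 0,
                 window_seconds: int = 0, reset_at: int = 0) -> None:
        super().__init__(message)
        self.key = key
        self.limit = limit
        self.window_seconds = window_seconds
        self.reset_at = reset_at


def to_http_response(exc: RateLimitExceeded,
                     now: Optional[float] = None) -> tuple[int, dict[str, str]]:
    """Build a (status, headers) pair; never emit a negative Retry-After."""
    current = time.time() if now is None else now
    # Assumption: reset_at is epoch seconds, so the difference is a delay.
    retry_after = max(0, math.ceil(exc.reset_at - current))
    return 429, {"Retry-After": str(retry_after)}
```

Passing `now` explicitly keeps the helper deterministic in tests; production callers would leave it unset.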
IdempotencyDuplicateError
IdempotencyDuplicateError(
message: str = "",
*,
key: str = "",
domain: str = "",
decision: str = ""
)
Bases: BaldurError
Raised by @idempotent on a detected duplicate or in-flight collision.
Inherits BaldurError directly (correctness contract, not a
resilience stage). Non-retryable by default — outer @dlq_protect
layers should treat this as a terminal signal.
IdempotencyUnavailableError
IdempotencyUnavailableError(
message: str = "", *, key: str = "", error: str = ""
)
Bases: BaldurError
Raised when an idempotency check cannot complete due to a cache I/O failure (e.g. Redis unreachable) on an enabled, explicitly-requested gate.
Distinct from IdempotencyDuplicateError (a successful dedup verdict):
this signals the verdict is unknown, so the caller can assume neither
"safe to skip" nor "safe to run". Fail-closed by default — opt into
fail-open via BALDUR_IDEMPOTENCY_FAIL_OPEN_ON_CACHE_ERROR or the per-call
idempotency_fail_open=True. Wraps the original cache exception (raised
from it) so a backend-specific error never leaks across the boundary.
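The wrapping behavior can be illustrated with Python's exception chaining. This is a sketch under assumptions: `seen_before` and its `cache_lookup` callable are hypothetical, and the stand-in class only mirrors the documented constructor.

```python
class BaldurError(Exception):
    pass


class IdempotencyUnavailableError(BaldurError):
    """Stand-in carrying the documented keyword-only fields."""

    def __init__(self, message: str = "", *, key: str = "", error: str = "") -> None:
        super().__init__(message)
        self.key = key
        self.error = error


def seen_before(key: str, cache_lookup) -> bool:
    """Hypothetical gate: ask the cache whether key was already processed."""
    try:
        return bool(cache_lookup(key))
    except OSError as cache_exc:  # ConnectionError is a subclass of OSError
        # Fail closed: the verdict is unknown, so raise a library-level error
        # and keep the backend exception only as __cause__.
        raise IdempotencyUnavailableError(
            "idempotency verdict unknown", key=key, error=repr(cache_exc)
        ) from cache_exc
```

Callers then catch only the library type, while the original cache error remains available through `__cause__` for debugging.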
DomainValidationError
DomainValidationError(
message: str = "",
*,
original_domain: str = "",
reason: Any = None
)
Bases: BaldurError
Raised when a domain input string fails validation.
Carries the original (pre-normalization) input and a typed reject reason for downstream logging / metric labelling.
Modeled on RecoveryAdapterError: raised at validation sites that have
a loud failure mode (decoration-time, where a CI/dev surface can recover
via test or rename). Runtime APIs catch this and fall back to
FALLBACK_DOMAIN instead of propagating.
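The split between loud validation sites and quiet runtime paths might look like the sketch below. The validation rule, the `validate_domain` and `resolve_domain_at_runtime` helpers, the reason strings, and the value of FALLBACK_DOMAIN are all placeholders chosen for illustration.

```python
from typing import Any

FALLBACK_DOMAIN = "default"  # placeholder value, not the library's constant


class DomainValidationError(Exception):
    """Stand-in carrying the documented keyword-only fields."""

    def __init__(self, message: str = "", *, original_domain: str = "",
                 reason: Any = None) -> None:
        super().__init__(message)
        self.original_domain = original_domain
        self.reason = reason


def validate_domain(raw: str) -> str:
    """Hypothetical rule: lowercase letters, digits, and underscores only."""
    normalized = raw.strip().lower()
    if not normalized:
        raise DomainValidationError("empty domain", original_domain=raw, reason="empty")
    if not normalized.replace("_", "").isalnum():
        raise DomainValidationError(
            "invalid characters", original_domain=raw, reason="invalid_chars"
        )
    return normalized


def resolve_domain_at_runtime(raw: str) -> str:
    """Runtime path: swallow the validation error and use the fallback."""
    try:
        return validate_domain(raw)
    except DomainValidationError:
        return FALLBACK_DOMAIN
```

A decorator would call `validate_domain` directly so a bad identifier fails at import time, while request-path code would go through the fallback wrapper.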
AuditError
AuditError(message: str = '', *, code: str = '')
RunbookError
RunbookError(message: str = '', *, code: str = '')
ConfigurationError
ConfigurationError(message: str = '', *, code: str = '')
SettingsValidationError
SettingsValidationError(
message: str = "", *, code: str = ""
)
StepExecutionError
StepExecutionError(message: str = '', *, code: str = '')
Bases: BaldurError
Base exception for step execution engine errors.
Shared by Saga, Runbook, and other step-based execution engines. Domain-specific subclasses live in each service module.
StepTimeoutError
StepTimeoutError(
step_type: str | float = "",
timeout_seconds: float | int = 0,
message: str = "",
)
Bases: StepExecutionError
Step execution timed out.
Raised by TimeoutExecutor when a handler does not complete within timeout_seconds.
Supports two call conventions

    StepTimeoutError(step_type="X", timeout_seconds=30)  # keyword
    StepTimeoutError("X", 30)  # positional (legacy compat)
    StepTimeoutError(timeout_seconds=30)  # timeout only (TimeoutExecutor)
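A constructor with the documented signature accepts all of the call forms listed above without special casing. The sketch below is a stand-in, not the library's implementation; in particular the default message format is invented for illustration.

```python
from typing import Union


class StepTimeoutError(Exception):
    """Stand-in with the documented parameter order and defaults."""

    def __init__(self, step_type: Union[str, float] = "",
                 timeout_seconds: Union[float, int] = 0,
                 message: str = "") -> None:
        self.step_type = str(step_type)
        self.timeout_seconds = float(timeout_seconds)
        # Hypothetical default message when the caller supplies none.
        label = self.step_type or "step"
        super().__init__(message or f"{label} timed out after {self.timeout_seconds:g}s")


keyword_form = StepTimeoutError(step_type="X", timeout_seconds=30)
positional_form = StepTimeoutError("X", 30)
timeout_only = StepTimeoutError(timeout_seconds=30)
```

Because both parameters have defaults and can be passed positionally, the keyword and legacy positional forms produce equivalent instances.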
CompensationError
CompensationError(message: str = '', *, code: str = '')
Bases: StepExecutionError
Raised when step compensation fails.
Domain-specific subclasses (SagaCompensationError, RunbookCompensationError) add service-specific extra_context.
ConcurrencyConflictError
ConcurrencyConflictError(
message: str = "",
*,
entity_id: str = "",
expected_version: int = 0,
actual_version: int = 0
)
Bases: BaldurError
Raised on optimistic concurrency control (OCC) conflicts.
Covers version CAS failures in Saga, Runbook, and other engines.
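To make the version compare-and-set idea concrete, here is a small in-memory sketch. `VersionedStore` is hypothetical and stands in for whatever persistence the engines use; only the exception's fields come from the documented signature.

```python
class ConcurrencyConflictError(Exception):
    """Stand-in carrying the documented keyword-only fields."""

    def __init__(self, message: str = "", *, entity_id: str = "",
                 expected_version: int = 0, actual_version: int = 0) -> None:
        super().__init__(message)
        self.entity_id = entity_id
        self.expected_version = expected_version
        self.actual_version = actual_version


class VersionedStore:
    """Hypothetical store: each entity has a version bumped on every write."""

    def __init__(self) -> None:
        self._rows: dict[str, tuple[int, dict]] = {}

    def load(self, entity_id: str) -> tuple[int, dict]:
        return self._rows.get(entity_id, (0, {}))

    def save(self, entity_id: str, expected_version: int, state: dict) -> int:
        actual_version, _ = self.load(entity_id)
        if actual_version != expected_version:
            # Someone else wrote since the caller last read: reject the write.
            raise ConcurrencyConflictError(
                "version mismatch",
                entity_id=entity_id,
                expected_version=expected_version,
                actual_version=actual_version,
            )
        self._rows[entity_id] = (actual_version + 1, state)
        return actual_version + 1
```

A caller that hits the conflict would typically reload the entity, reapply its change to the fresh state, and try the save again.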
non_retryable_exceptions
non_retryable_exceptions() -> tuple[type[Exception], ...]
Exceptions that must never be retried.
CircuitBreakerError: CB OPEN means 'stop sending traffic'. Retrying defeats circuit breaker semantics. Industry standard (Hystrix, Resilience4j, Polly).
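A retry loop that honors this tuple could be sketched as below. Everything here is a stand-in: the real tuple may contain more types than CircuitBreakerError, and `call_with_retry` is a hypothetical helper, not the library's retry implementation.

```python
class CircuitBreakerError(Exception):
    pass


class RetryExhaustedError(Exception):
    pass


def non_retryable_exceptions() -> tuple[type[Exception], ...]:
    # Stand-in: only the one type named in the documentation above.
    return (CircuitBreakerError,)


def call_with_retry(fn, attempts: int = 3):
    """Retry fn up to `attempts` times, except for non-retryable errors."""
    last_exc = None
    for _ in range(attempts):
        try:
            return fn()
        except non_retryable_exceptions():
            # An open breaker means "stop sending traffic": propagate at once.
            raise
        except Exception as exc:
            last_exc = exc
    raise RetryExhaustedError(f"all {attempts} attempts failed") from last_exc
```

The non-retryable clause comes first so that it wins over the generic `except Exception` handler.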