Skip to content

Runtime configs & testing

The 0.7-era AgentRuntimeConfig / ResilienceConfig / ObservabilityConfig wrapper-of-flat-kwargs configs were deleted in 0.7.9 — they bundled flat kwargs into shareable objects with a flat kwarg > config object > default precedence game that required a private _UNSET sentinel on every kwarg (a documented LLM trap, T14 in the audit).

For fleet management, use a Python dict spread:

PROD_DEFAULTS = dict(
    timeout=60,
    max_retries=5,
    max_output_retries=2,
    cache=True,
    verbose=False,
    session=session,
)

researcher = Agent(**PROD_DEFAULTS, engine=LLMEngine("model"), name="research")
writer     = Agent(**PROD_DEFAULTS, engine=LLMEngine("model"), name="write")

Same end-user value, no precedence-game complexity, no sentinel.

Cache config (kept)

CacheConfig is intentionally kept — it carries real semantic value (enabled, ttl) consumed inside LLMEngine.

lazybridge.CacheConfig dataclass

CacheConfig(enabled: bool = True, ttl: str = '5m')

Mark the static prefix of a request (system prompt + tool definitions) as cacheable.

Providers with explicit prompt caching (Anthropic) need a cache_control marker on the last block of each cached segment; this makes it a one-flag opt-in instead of asking callers to hand-craft provider-specific content lists.

Provider-specific behaviour:

  • Anthropic — the system prompt is upgraded from a string to a [{type: "text", text, cache_control}] block; a cache breakpoint is also placed on the last tool definition if tools are present. Cache hits cost ~10% of input tokens; writes cost ~25% more. TTL options: "5m" (default) or "1h".
  • OpenAI — automatic for system prompts >1024 tokens; no user-visible opt-in is required. This config is a no-op but accepted for forward-compat.
  • Google Gemini — explicit Context Caching uses a different lifecycle (create a cache resource, reference by name). Not auto-wired from this config; pass via extra if needed.
  • DeepSeek — automatic; no-op.

Testing

lazybridge.MockAgent

MockAgent(responses: Any, *, name: str = 'mock_agent', description: str | None = None, output: type = str, cycle: bool = False, delay_ms: float = 0.0, default_input_tokens: int = 10, default_output_tokens: int = 20, default_cost_usd: float = 0.0, default_latency_ms: float | None = None, default_model: str = 'mock', default_provider: str = 'mock')

Deterministic test double that quacks like :class:lazybridge.Agent.

Designed for testing pipeline composition and data transmission without touching a real provider. Drop-in compatible with:

  • Agent(..., tools=[mock]) — wrapped via _wrap_tool using the _is_lazy_agent duck-type marker; nested Envelope metadata rolls up through the tool boundary.
  • Plan(Step(target=mock, ...)) — PlanEngine detects _is_lazy_agent and calls target.run(env) directly.
  • mock.as_tool() returns a standard :class:Tool.
  • Agent.chain(mock_a, mock_b) / Agent.parallel(mock_a, mock_b).

Response specification. The responses argument is resolved per call in this order:

  1. Callable fn(env) -> value (sync or async) — the incoming :class:Envelope is passed in. Whatever the callable returns is re-fed through rules 2–4.
  2. Dict — keys are substrings checked against env.task in insertion order; "*" is a catch-all default. No match and no default raises :class:RuntimeError.
  3. List — one response per call, in order. Exhausting the list raises :class:RuntimeError unless cycle=True.
  4. Scalar — returned on every call.

Response value types (after callable resolution):

  • :class:Envelope — returned verbatim; payload + metadata + error preserved. Use this to test error-path propagation.
  • :class:ErrorInfo — returned as Envelope(error=...).
  • :class:BaseException instance — raised from run() so tests can assert on provider-style failures.
  • :class:~pydantic.BaseModel / str / dict / list / scalar — wrapped as the envelope payload with the mock's default metadata (tokens, cost, model, provider).

Recording. Every call is appended to :attr:calls. Use :meth:assert_called_with, :meth:assert_call_count, :attr:call_count, :attr:last_call for assertions. :meth:reset clears both the call log and the list/cycle cursor.

Example::

from lazybridge import Agent, Plan, Step
from lazybridge.testing import MockAgent

researcher = MockAgent(
    {"weather": "sunny", "market": "bullish", "*": "no data"},
    name="researcher",
    default_input_tokens=50, default_output_tokens=30,
)
writer = MockAgent(
    lambda env: f"Report based on: {env.text()}",
    name="writer",
)

plan = Plan(
    Step(target=researcher, task="weather today"),
    Step(target=writer),   # from_prev by default
)
agent = Agent(engine=plan)
env = agent("daily brief")

assert "sunny" in env.text()
assert researcher.call_count == 1
assert writer.call_count == 1
assert env.metadata.input_tokens + env.metadata.output_tokens > 0
Source code in lazybridge/testing.py
def __init__(
    self,
    responses: Any,
    *,
    name: str = "mock_agent",
    description: str | None = None,
    output: type = str,
    cycle: bool = False,
    delay_ms: float = 0.0,
    default_input_tokens: int = 10,
    default_output_tokens: int = 20,
    default_cost_usd: float = 0.0,
    default_latency_ms: float | None = None,
    default_model: str = "mock",
    default_provider: str = "mock",
) -> None:
    self._responses = responses
    self._cycle = cycle
    self._cursor = 0
    self.name = name
    self.description = description
    self.output = output
    # Mirror the Agent surface so test code that introspects agents
    # (session graph registration, verbose exporters, …) works
    # uniformly regardless of whether the agent is real or mocked.
    self.session: Any | None = None
    self.memory: Any | None = None
    self.sources: list[Any] = []
    self.guard: Any | None = None
    self.verify: Any | None = None
    self.timeout: float | None = None
    self.fallback: Any | None = None
    self._tools_raw: list[Any] = []
    self._tool_map: dict[str, Any] = {}

    self._delay_ms = float(delay_ms)
    self._default_input_tokens = int(default_input_tokens)
    self._default_output_tokens = int(default_output_tokens)
    self._default_cost_usd = float(default_cost_usd)
    self._default_latency_ms = default_latency_ms  # None = use measured elapsed
    self._default_model = default_model
    self._default_provider = default_provider

    self.calls: list[MockCall] = []

stream async

stream(task: str | Envelope) -> AsyncGenerator[str, None]

Minimal streaming surface: yields the final text in one chunk.

Real streaming isn't meaningful for a deterministic mock, but tests that exercise the .stream() path still need something that behaves like an async generator.

Source code in lazybridge/testing.py
async def stream(self, task: str | Envelope) -> AsyncGenerator[str, None]:
    """Minimal streaming surface: yields the final text in one chunk.

    Real streaming isn't meaningful for a deterministic mock, but
    tests that exercise the ``.stream()`` path still need something
    that behaves like an async generator.
    """
    env = await self.run(task)
    yield env.text()

definition

definition() -> Any

ToolDefinition for this mock — mirrors Agent.definition().

Source code in lazybridge/testing.py
def definition(self) -> Any:
    """ToolDefinition for this mock — mirrors ``Agent.definition()``."""
    return self.as_tool().definition()

reset

reset() -> None

Clear recorded calls and rewind the list/cycle cursor.

Source code in lazybridge/testing.py
def reset(self) -> None:
    """Clear recorded calls and rewind the list/cycle cursor."""
    self.calls.clear()
    self._cursor = 0

assert_called_with

assert_called_with(task: str | None = None, *, contains: str | None = None) -> None

Assert at least one call matched task= exactly or contains=.

Source code in lazybridge/testing.py
def assert_called_with(
    self,
    task: str | None = None,
    *,
    contains: str | None = None,
) -> None:
    """Assert at least one call matched ``task=`` exactly or ``contains=``."""
    for c in self.calls:
        if task is not None and c.task == task:
            return
        if contains is not None and contains in (c.task or ""):
            return
    raise AssertionError(
        f"MockAgent({self.name!r}): no call matched task={task!r} "
        f"contains={contains!r}. Recorded tasks: "
        f"{[c.task for c in self.calls]}"
    )