Custom providers¶

BaseProvider is the stable extension point for integrating any LLM backend. The provider registry on LLMEngine routes model strings to registered providers.

For narrative usage see Guides → Advanced → BaseProvider and Guides → Advanced → Providers (built-in catalogue + tier tables).

Abstract base class¶

lazybridge.BaseProvider ¶

BaseProvider(api_key: str | None = None, model: str | None = None, *, fallback_model: str | None = None, strict_native_tools: bool | None = None, **kwargs: Any)

Bases: ABC

Stable abstract base class for all LLM providers.

Subclass this to integrate any LLM backend with LazyBridge. Plug a custom provider in by constructing an LLMEngine that routes to it (see lazybridge/core/executor.py for resolution):

agent = Agent(engine=LLMEngine("my-model"))

Stability contract: the following are guaranteed stable across minor versions:

__init__(api_key, model, **kwargs) signature
_init_client(**kwargs) — override to initialise your SDK client
complete(request) — synchronous completion
stream(request) — synchronous streaming
acomplete(request) — async completion
astream(request) — async streaming generator
default_model: str — class-level default model name
supported_native_tools: frozenset[NativeTool] — declare web search etc.
get_default_max_tokens(model) — override to set per-model limits
_resolve_model(request) — helper: request.model → self.model → default_model
_compute_cost(model, input_tokens, output_tokens) — override for cost tracking
_check_native_tools(tools) — filters unsupported native tools with a warning

What you MUST implement: complete, stream, acomplete, astream.

What you SHOULD override: _init_client, default_model, get_default_max_tokens, _compute_cost.

What you MUST NOT do:

- Raise exceptions other than Python built-ins or your SDK's own error types. LazyBridge does not wrap provider exceptions; they propagate as-is.
- Mutate request. It is shared and must be treated as read-only.
- Block the event loop inside acomplete / astream. Use await or asyncio.get_event_loop().run_in_executor for blocking SDK calls.
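The rule about not blocking the event loop can be illustrated with a short sketch. It assumes a hypothetical blocking SDK call, `blocking_sdk_call`, and uses `asyncio.to_thread` (Python 3.9+) as one way to keep the loop free; `run_in_executor`, as named above, serves the same purpose.

```python
import asyncio
import time


def blocking_sdk_call(prompt: str) -> str:
    """Hypothetical stand-in for a synchronous SDK call that blocks."""
    time.sleep(0.05)  # simulates network latency
    return f"echo: {prompt}"


async def acomplete_text(prompt: str) -> str:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_sdk_call, prompt)


async def main() -> list:
    # The two calls overlap instead of running back to back.
    return await asyncio.gather(acomplete_text("a"), acomplete_text("b"))


results = asyncio.run(main())
```

If the SDK ships a native async client, awaiting it directly is usually the simpler choice; the thread hop is only needed for synchronous clients.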

Initialise the provider.

Parameters¶

api_key: Provider API key. If None, _init_client reads it from an environment variable (standard pattern for all built-in providers).

model: Model identifier to use for all requests. When None and default_model is also None (recommended for paid cloud providers), _resolve_model raises a clear ValueError rather than silently falling back to an expensive flagship.

fallback_model: Model to use when neither model= nor request.model is set. Two forms:

- Explicit string, e.g. fallback_model="gpt-4o-mini": used verbatim (tier aliases are resolved normally).
- "cheapest": automatically resolves to the cheapest tier alias available on this provider (super_cheap → cheap → medium, in that order).

When None (default) and no model is configured, a ValueError is raised with guidance on how to fix it.

strict_native_tools: When True, requesting an unsupported NativeTool raises UnsupportedNativeToolError. When None (default) the class-level strict_native_tools attribute is used (typically False).

**kwargs: Forwarded verbatim to _init_client.

Source code in lazybridge/core/providers/base.py

def __init__(
    self,
    api_key: str | None = None,
    model: str | None = None,
    *,
    fallback_model: str | None = None,
    strict_native_tools: bool | None = None,
    **kwargs: Any,
) -> None:
    """Initialise the provider.

    Parameters
    ----------
    api_key:
        Provider API key.  If ``None``, ``_init_client`` reads it from
        an environment variable (standard pattern for all built-in providers).
    model:
        Model identifier to use for all requests.  When ``None`` and
        ``default_model`` is also ``None`` (recommended for paid cloud
        providers), ``_resolve_model`` raises a clear ``ValueError``
        rather than silently falling back to an expensive flagship.
    fallback_model:
        Model to use when neither ``model=`` nor ``request.model`` is set.
        Two forms:
        - Explicit string, e.g. ``fallback_model="gpt-4o-mini"`` — used
          verbatim (tier aliases are resolved normally).
        - ``"cheapest"`` — automatically resolves to the cheapest tier
          alias available on this provider
          (``super_cheap`` → ``cheap`` → ``medium``, in that order).
        When ``None`` (default) and no model is configured, a
        ``ValueError`` is raised with guidance on how to fix it.
    strict_native_tools:
        When ``True``, requesting an unsupported :class:`NativeTool`
        raises :class:`UnsupportedNativeToolError`.  When ``None``
        (default) the class-level :attr:`strict_native_tools`
        attribute is used (typically ``False``).
    **kwargs:
        Forwarded verbatim to :meth:`_init_client`.
    """
    if api_key is not None and not api_key.strip():
        raise ValueError(
            f"{self.__class__.__name__}: api_key must not be an empty or "
            "whitespace-only string. Pass None to read from the environment "
            "variable, or provide a valid key."
        )
    self.api_key = api_key
    # Store the user-supplied model separately so _resolve_model can
    # distinguish "user didn't pass a model" from "class default applies".
    # self.model is the effective value for backward-compat reads (e.g.
    # executor.model); _resolve_model uses _user_model to decide when to
    # consult fallback_model before falling through to default_model.
    self._user_model: str | None = model
    self.model = model or self.default_model
    self.fallback_model = fallback_model
    if strict_native_tools is not None:
        # Per-instance override of the class-level default.
        self.strict_native_tools = bool(strict_native_tools)
    self._init_client(**kwargs)
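The lookup order described by the parameters and the `_user_model` comment above can be sketched as a standalone function. This is a toy model of the documented behaviour, not the library's `_resolve_model`; the `TIER_ALIASES` table and the `acme-*` model names are hypothetical, and falling through to `default_model` when no tier alias exists is an assumption.

```python
from __future__ import annotations

# Hypothetical tier-alias table for one provider; real providers define their own.
TIER_ALIASES = {"cheap": "acme-small", "medium": "acme-medium"}


def resolve_model(
    request_model: str | None,
    user_model: str | None,
    fallback_model: str | None,
    default_model: str | None,
) -> str:
    """Toy model of the documented lookup order; not the library's code."""
    if request_model:  # 1. per-request override
        return request_model
    if user_model:  # 2. model= passed to the constructor
        return user_model
    if fallback_model == "cheapest":  # 3a. cheapest available tier alias
        for tier in ("super_cheap", "cheap", "medium"):
            if tier in TIER_ALIASES:
                return TIER_ALIASES[tier]
    elif fallback_model:  # 3b. explicit fallback string
        return fallback_model
    if default_model:  # 4. class-level default
        return default_model
    raise ValueError("No model configured: pass model= or fallback_model=.")
```

Keeping the user-supplied model separate from the class default is what lets step 3 run before step 4; if the two were merged at construction time, a fallback could never take precedence over the default.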

default_model `class-attribute` `instance-attribute` ¶

default_model: str | None = ''

Class-level default model identifier. Used when neither the request nor the constructor model= argument specifies a model. Set to None on paid cloud providers to force explicit model selection and prevent silent fallback to an expensive flagship.

supported_native_tools `class-attribute` `instance-attribute` ¶

supported_native_tools: frozenset[NativeTool] = frozenset()

Declare which lazybridge.core.types.NativeTool values this provider supports (e.g. frozenset({NativeTool.WEB_SEARCH})). Unsupported tools requested by the user are filtered out with a warning, or raise UnsupportedNativeToolError when strict_native_tools=True is set on construction.

supports_streaming `class-attribute` `instance-attribute` ¶

supports_streaming: bool = True

Does this provider expose stream(...) / astream(...)?

supports_structured_output `class-attribute` `instance-attribute` ¶

supports_structured_output: bool = True

Does this provider accept request.structured_output (Pydantic model or JSON-schema dict)?

supports_thinking `class-attribute` `instance-attribute` ¶

supports_thinking: bool = True

Does this provider produce a thinking field on the response (or reasoning_tokens / thoughts_token_count on usage)?

strict_native_tools `class-attribute` `instance-attribute` ¶

strict_native_tools: bool = False

When True, requesting an unsupported NativeTool raises UnsupportedNativeToolError instead of warning-and-dropping. Set on construction (BaseProvider(..., strict_native_tools=True)) or via the subclass. The default, False, preserves the friendly pre-W5.1 behaviour for ad-hoc / interactive use. Production setups should consider opting into strict mode so that a misconfigured provider fails loudly rather than degrading to a non-grounded reply.
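The difference between the two modes can be sketched with stand-in types. The `NativeTool` enum and `UnsupportedNativeToolError` below are local stand-ins rather than imports from the library, and `check_native_tools` is a hypothetical free function that only mimics the filter-or-raise behaviour described above.

```python
import warnings
from enum import Enum


class NativeTool(Enum):
    """Stand-in enum; the real one lives in lazybridge.core.types."""

    WEB_SEARCH = "web_search"
    CODE_EXECUTION = "code_execution"


class UnsupportedNativeToolError(Exception):
    """Stand-in for the library's exception of the same name."""


def check_native_tools(requested, supported, strict=False):
    """Return the supported subset; warn or raise for the rest."""
    unsupported = [t for t in requested if t not in supported]
    if unsupported and strict:
        names = ", ".join(t.value for t in unsupported)
        raise UnsupportedNativeToolError(f"unsupported native tools: {names}")
    for tool in unsupported:
        warnings.warn(f"dropping unsupported native tool {tool.value!r}")
    return [t for t in requested if t in supported]
```

In lenient mode the caller gets a reply without the dropped tool, which is convenient interactively but can hide a misconfiguration; strict mode turns the same situation into an immediate error.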

supports_vision `classmethod` ¶

supports_vision(model: str | None = None) -> bool

Whether the resolved model accepts image input.

Default implementation does a substring scan against _VISION_CAPABLE_MODEL_PATTERNS. Override when the decision needs custom logic (e.g. version-range checks).

Returns False for None / empty model because we don't know what the eventual default will be — caller can re-query once the model is resolved.

Source code in lazybridge/core/providers/base.py

@classmethod
def supports_vision(cls, model: str | None = None) -> bool:
    """Whether the resolved ``model`` accepts image input.

    Default implementation does a substring scan against
    :attr:`_VISION_CAPABLE_MODEL_PATTERNS`.  Override when the
    decision needs custom logic (e.g. version-range checks).

    Returns ``False`` for ``None`` / empty model because we don't
    know what the eventual default will be — caller can re-query
    once the model is resolved.
    """
    if not model:
        return False
    m = model.lower()
    return any(p in m for p in cls._VISION_CAPABLE_MODEL_PATTERNS)
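An override with a version-range check might look like the sketch below. `PatternProvider` is a stand-in that copies the default scan shown above so the example runs on its own; the pattern tuple, the `acme-<major>` naming scheme, and the "version 3 and later" rule are all hypothetical.

```python
from __future__ import annotations


class PatternProvider:
    """Stand-in mirroring the default substring scan shown above."""

    _VISION_CAPABLE_MODEL_PATTERNS: tuple[str, ...] = ("vision",)  # hypothetical

    @classmethod
    def supports_vision(cls, model: str | None = None) -> bool:
        if not model:
            return False
        m = model.lower()
        return any(p in m for p in cls._VISION_CAPABLE_MODEL_PATTERNS)


class VersionedProvider(PatternProvider):
    """Hypothetical override: 'acme-<major>' models accept images from v3 on."""

    @classmethod
    def supports_vision(cls, model: str | None = None) -> bool:
        if not model:
            return False
        prefix, _, version = model.lower().partition("-")
        if prefix == "acme" and version.isdigit():
            return int(version) >= 3
        # Anything else falls back to the substring scan.
        return super().supports_vision(model)
```

Delegating to `super()` for unrecognised names keeps the pattern list useful for models that do not follow the versioned scheme.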

supports_audio `classmethod` ¶

supports_audio(model: str | None = None) -> bool

Whether the resolved model accepts audio input.

See supports_vision: same semantics, audio modality.

Source code in lazybridge/core/providers/base.py

@classmethod
def supports_audio(cls, model: str | None = None) -> bool:
    """Whether the resolved ``model`` accepts audio input.

    See :meth:`supports_vision` — same semantics, audio modality.
    """
    if not model:
        return False
    m = model.lower()
    return any(p in m for p in cls._AUDIO_CAPABLE_MODEL_PATTERNS)

is_retryable ¶

is_retryable(exc: BaseException) -> bool | None

Classify a provider exception as retryable, non-retryable, or defer.

The lazybridge.core.executor.Executor consults this hook before falling back to its generic status/string heuristic. Override when the provider SDK raises structured exception types that encode retry semantics more precisely than HTTP status codes alone, for example a rate-limit exception that carries a retry_after attribute distinguishing "back off" (retryable) from "quota exhausted" (not retryable).

Return values

True — retry with backoff.
False — do not retry; surface the exception.
None — no opinion; Executor falls back to its generic classifier (core.executor._is_retryable) that matches status_code in {429, 5xx} and common transient-error strings.

Default implementation returns None so built-in providers fall through to the generic path with no behaviour change.

Source code in lazybridge/core/providers/base.py

def is_retryable(self, exc: BaseException) -> bool | None:
    """Classify a provider exception as retryable, non-retryable, or defer.

    The :class:`~lazybridge.core.executor.Executor` consults this hook
    before falling back to its generic status/string heuristic.  Override
    when the provider SDK raises structured exception types that encode
    retry semantics more precisely than HTTP status codes alone — for
    example a rate-limit exception that carries a ``retry_after`` attribute
    distinguishing "back off" (retryable) from "quota exhausted" (not).

    Return values:
      * ``True`` — retry with backoff.
      * ``False`` — do not retry; surface the exception.
      * ``None`` — no opinion; Executor falls back to its generic
        classifier (``core.executor._is_retryable``) that matches
        ``status_code in {429, 5xx}`` and common transient-error strings.

    Default implementation returns ``None`` so built-in providers fall
    through to the generic path with no behaviour change.
    """
    return None
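The three-valued contract can be sketched as follows. `RateLimitError` and `AuthError` are hypothetical SDK exception types invented for illustration, and the class below shows only the hook; a real provider would define it on a BaseProvider subclass.

```python
from __future__ import annotations


class RateLimitError(Exception):
    """Hypothetical SDK exception; retry_after is None when quota is exhausted."""

    def __init__(self, message: str, retry_after: float | None = None) -> None:
        super().__init__(message)
        self.retry_after = retry_after


class AuthError(Exception):
    """Hypothetical SDK exception for invalid credentials."""


class MyProviderRetryMixin:
    """Only the hook is shown; a real provider would subclass BaseProvider."""

    def is_retryable(self, exc: BaseException) -> bool | None:
        if isinstance(exc, RateLimitError):
            # A retry_after hint means "back off"; none means quota exhausted.
            return exc.retry_after is not None
        if isinstance(exc, AuthError):
            return False  # retrying cannot fix bad credentials
        return None  # no opinion: defer to the generic classifier
```

Returning None for everything the provider does not recognise keeps the generic classifier in charge of ordinary transport errors.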

complete `abstractmethod` ¶

complete(request: CompletionRequest) -> CompletionResponse

Execute a synchronous completion and return a unified response.

Parameters¶

request: Fully assembled lazybridge.core.types.CompletionRequest. Treat as read-only; do not mutate.

Returns¶

CompletionResponse: At minimum, content must be set to the model's text reply. Populate usage, model, tool_calls, and stop_reason when available. Set raw to the original SDK response object so that callers can access provider-specific fields.

Raises¶

Any exception from your SDK is acceptable. LazyBridge propagates exceptions as-is and handles retry logic in lazybridge.core.executor.Executor.

Source code in lazybridge/core/providers/base.py

@abstractmethod
def complete(self, request: CompletionRequest) -> CompletionResponse:
    """Execute a synchronous completion and return a unified response.

    Parameters
    ----------
    request:
        Fully assembled :class:`~lazybridge.core.types.CompletionRequest`.
        Treat as **read-only** — do not mutate.

    Returns
    -------
    CompletionResponse
        At minimum, ``content`` must be set to the model's text reply.
        Populate ``usage``, ``model``, ``tool_calls``, ``stop_reason``
        when available.  Set ``raw`` to the original SDK response object
        to allow callers to access provider-specific fields.

    Raises
    ------
    Any exception from your SDK is acceptable — LazyBridge propagates them
    as-is and handles retry logic in :class:`~lazybridge.core.executor.Executor`.
    """
    ...
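A minimal sketch of the contract, using local stand-in dataclasses so that it runs without the library: the response fields follow the Returns section above, while the request's `messages` field, the `fake_sdk_call` function, and the `acme-small` model name are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class CompletionRequest:
    """Stand-in request; `messages` is a hypothetical field name."""

    messages: tuple[dict[str, str], ...]
    model: str | None = None


@dataclass
class CompletionResponse:
    """Stand-in with the fields named in the Returns section."""

    content: str
    model: str | None = None
    stop_reason: str | None = None
    tool_calls: list[Any] = field(default_factory=list)
    usage: Any = None
    raw: Any = None


def fake_sdk_call(payload: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical SDK call that upper-cases the last message."""
    return {"text": payload["messages"][-1]["content"].upper(), "finish": "stop"}


def complete(request: CompletionRequest, default_model: str = "acme-small") -> CompletionResponse:
    # Build a new payload instead of mutating the shared request.
    payload = {"model": request.model or default_model, "messages": list(request.messages)}
    raw = fake_sdk_call(payload)
    return CompletionResponse(
        content=raw["text"],
        model=payload["model"],
        stop_reason="end_turn" if raw["finish"] == "stop" else raw["finish"],
        raw=raw,  # keep the SDK object for provider-specific fields
    )
```

Copying request data into a fresh payload is the simplest way to honour the read-only rule, since any provider-specific rewriting then happens on the copy.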

stream `abstractmethod` ¶

stream(request: CompletionRequest) -> Iterator[StreamChunk]

Stream a completion, yielding lazybridge.core.types.StreamChunk objects.

The final chunk must have is_final=True and stop_reason set. Token usage should be reported on the final chunk when available.

Parameters¶

request: Same as complete. Treat as read-only.

Yields¶

StreamChunk: Intermediate chunks carry the new text fragment in delta. The final chunk has is_final=True, stop_reason set, and usage populated.

Example skeleton:

def stream(self, request):
    for raw_chunk in self._client.stream(...):
        yield StreamChunk(delta=raw_chunk.text)
    yield StreamChunk(
        delta="",
        stop_reason="end_turn",
        is_final=True,
        usage=UsageStats(input_tokens=..., output_tokens=...),
    )

Source code in lazybridge/core/providers/base.py

@abstractmethod
def stream(self, request: CompletionRequest) -> Iterator[StreamChunk]:
    """Stream a completion, yielding :class:`~lazybridge.core.types.StreamChunk` objects.

    The final chunk **must** have ``is_final=True`` and ``stop_reason`` set.
    Token usage should be reported on the final chunk when available.

    Parameters
    ----------
    request:
        Same as :meth:`complete`. Treat as read-only.

    Yields
    ------
    StreamChunk
        Intermediate chunks: ``delta`` contains the new text fragment.
        Final chunk: ``is_final=True``, ``stop_reason`` set, ``usage`` populated.

    Example skeleton::

        def stream(self, request):
            for raw_chunk in self._client.stream(...):
                yield StreamChunk(delta=raw_chunk.text)
            yield StreamChunk(
                delta="",
                stop_reason="end_turn",
                is_final=True,
                usage=UsageStats(input_tokens=..., output_tokens=...),
            )
    """
    ...
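Seen from the consuming side, the final-chunk contract can be checked as in the sketch below. `StreamChunk` is a local stand-in with the fields used in the skeleton above, and `stream_words` and `collect` are hypothetical helpers written for this example only.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Iterable, Iterator


@dataclass
class StreamChunk:
    """Stand-in with the fields used in the skeleton above."""

    delta: str = ""
    stop_reason: str | None = None
    is_final: bool = False
    usage: Any = None


def stream_words(text: str) -> Iterator[StreamChunk]:
    """Toy provider stream: one chunk per word, then the final chunk."""
    for word in text.split():
        yield StreamChunk(delta=word + " ")
    yield StreamChunk(stop_reason="end_turn", is_final=True)


def collect(chunks: Iterable[StreamChunk]) -> tuple[str, str]:
    """Join the deltas and enforce the final-chunk contract."""
    parts: list[str] = []
    last: StreamChunk | None = None
    for chunk in chunks:
        parts.append(chunk.delta)
        last = chunk
    if last is None or not last.is_final or last.stop_reason is None:
        raise ValueError("stream ended without a final chunk")
    return "".join(parts), last.stop_reason
```

A provider that forgets the final chunk leaves consumers unable to tell a finished reply from a dropped connection, which is why the contract is stated as a must.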

acomplete `abstractmethod` `async` ¶

acomplete(request: CompletionRequest) -> CompletionResponse

Async version of complete.

Semantics and return contract are identical. Use await for all blocking operations — never call time.sleep or blocking I/O here.

Source code in lazybridge/core/providers/base.py

@abstractmethod
async def acomplete(self, request: CompletionRequest) -> CompletionResponse:
    """Async version of :meth:`complete`.

    Semantics and return contract are identical. Use ``await`` for all
    blocking operations — never call ``time.sleep`` or blocking I/O here.
    """
    ...

astream `abstractmethod` ¶

astream(request: CompletionRequest) -> AsyncIterator[StreamChunk]

Async streaming generator: the async version of stream.

Implement as an async def generator:

async def astream(self, request):
    async for raw_chunk in self._client.astream(...):
        yield StreamChunk(delta=raw_chunk.text)
    yield StreamChunk(stop_reason="end_turn", is_final=True, usage=...)

The same final-chunk contract as stream applies.

Source code in lazybridge/core/providers/base.py

@abstractmethod
def astream(self, request: CompletionRequest) -> AsyncIterator[StreamChunk]:
    """Async streaming generator — async version of :meth:`stream`.

    Implement as an ``async def`` generator::

        async def astream(self, request):
            async for raw_chunk in self._client.astream(...):
                yield StreamChunk(delta=raw_chunk.text)
            yield StreamChunk(stop_reason="end_turn", is_final=True, usage=...)

    The same final-chunk contract as :meth:`stream` applies.
    """
    ...
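A runnable variant of the skeleton, with a consumer, is sketched below. `StreamChunk` is a local stand-in, and `fake_sdk_stream` is a hypothetical async SDK stream; neither is imported from the library.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, AsyncIterator, Optional


@dataclass
class StreamChunk:
    """Stand-in with the fields used in the skeleton above."""

    delta: str = ""
    stop_reason: Optional[str] = None
    is_final: bool = False
    usage: Any = None


async def fake_sdk_stream(text: str) -> AsyncIterator[str]:
    """Hypothetical async SDK stream yielding one word at a time."""
    for word in text.split():
        await asyncio.sleep(0)  # hand control back to the event loop
        yield word


async def astream(text: str) -> AsyncIterator[StreamChunk]:
    async for word in fake_sdk_stream(text):
        yield StreamChunk(delta=word)
    # Same final-chunk contract as the synchronous stream.
    yield StreamChunk(stop_reason="end_turn", is_final=True)


async def main() -> list:
    # Calling astream(...) returns the async iterator directly; no await needed.
    return [chunk async for chunk in astream("one two")]


chunks = asyncio.run(main())
```

Because calling an async generator function returns the iterator without being awaited, the abstract method above can be declared with a plain def while implementations use async def.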

get_default_max_tokens ¶

get_default_max_tokens(model: str | None = None) -> int

Return the default max_tokens cap for the given model.

Override when your model's limit differs from the default of 4096. LazyBridge calls this when max_tokens is not set explicitly.

Source code in lazybridge/core/providers/base.py

def get_default_max_tokens(self, model: str | None = None) -> int:
    """Return the default ``max_tokens`` cap for the given model.

    Override when your model has a limit lower or higher than 4096.
    LazyBridge calls this when ``max_tokens`` is not set explicitly.
    """
    return 4096
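A per-model override could be as small as a lookup table. In this sketch `BaseLike` stands in for the base class so that the example runs on its own, and the model names and limits are hypothetical.

```python
from typing import Optional


class BaseLike:
    """Stand-in for the default implementation shown above."""

    def get_default_max_tokens(self, model: Optional[str] = None) -> int:
        return 4096


class MyProvider(BaseLike):
    # Hypothetical per-model output limits.
    _MAX_TOKENS = {"acme-small": 2048, "acme-large": 16384}

    def get_default_max_tokens(self, model: Optional[str] = None) -> int:
        # Unknown or missing models keep the inherited default.
        return self._MAX_TOKENS.get(model or "", super().get_default_max_tokens(model))
```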

Provider registry surface¶

The registry methods are class-level on LLMEngine. They mutate class-level tables (_PROVIDER_ALIASES, _PROVIDER_RULES, _PROVIDER_DEFAULT) and are documented under the engine class itself; see Engines → LLMEngine for the full method list.

For read-only introspection from caller code, use LLMEngine.provider_aliases(), which returns a fresh dict[str, str] copy of the routing aliases that is safe to mutate without affecting the framework. The old top-level PROVIDER_ALIASES constant was an import-time snapshot that silently diverged from the live registry after register_provider_alias. It has been deprecated since 0.10 and is still present (with a DeprecationWarning) as of 1.0.1; removal is planned for a future major version but is not yet scheduled.

Registry mutation entry points (quick reference):

Method	Effect
`LLMEngine.provider_aliases()`	Snapshot of the current alias map (read-only)
`LLMEngine.register_provider_alias(alias, provider)`	Exact-match (case-insensitive) routing
`LLMEngine.register_provider_rule(pattern, provider, *, kind="contains" | "startswith")`	Substring / prefix routing; new rules are prepended to the rule list
`LLMEngine.set_default_provider(provider | None)`	Fallback when no rule matches; `None` (the 0.7.9 default) makes unknown model strings raise `ValueError` instead of silently routing to Anthropic
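How the three tables might interact can be sketched with a toy registry. This is not LLMEngine's implementation: `ToyRegistry` is hypothetical, and checking aliases before rules and matching rules case-insensitively are assumptions made for the illustration. Only the prepend order, the two rule kinds, and the ValueError on an unset default come from the table above.

```python
from typing import Dict, List, Optional, Tuple


class ToyRegistry:
    """Toy model of alias, rule, and default routing; not the library's code."""

    def __init__(self) -> None:
        self.aliases: Dict[str, str] = {}
        self.rules: List[Tuple[str, str, str]] = []
        self.default: Optional[str] = None

    def register_provider_alias(self, alias: str, provider: str) -> None:
        self.aliases[alias.lower()] = provider  # exact match, case-insensitive

    def register_provider_rule(self, pattern: str, provider: str, *, kind: str = "contains") -> None:
        if kind not in ("contains", "startswith"):
            raise ValueError(f"unknown rule kind: {kind!r}")
        self.rules.insert(0, (pattern.lower(), provider, kind))  # newest first

    def set_default_provider(self, provider: Optional[str]) -> None:
        self.default = provider

    def resolve(self, model: str) -> str:
        key = model.lower()
        if key in self.aliases:  # assumption: aliases are checked first
            return self.aliases[key]
        for pattern, provider, kind in self.rules:
            if kind == "startswith" and key.startswith(pattern):
                return provider
            if kind == "contains" and pattern in key:
                return provider
        if self.default is None:
            raise ValueError(f"no provider for model {model!r}")
        return self.default
```

Prepending means a narrow rule registered later can shadow a broad rule registered earlier, so the order of registration matters when patterns overlap.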

Capability matrix¶

Provider	Streaming	Structured output	Thinking	code_execution	computer_use	file_search	google_maps	google_search	image_generation	web_search
`anthropic`	✓	✓	✓	✓	✓	—	—	—	—	✓
`deepseek`	✓	✓	✓	—	—	—	—	—	—	—
`google`	✓	✓	✓	—	—	—	✓	✓	—	✓
`litellm`	✓	✓	—	—	—	—	—	—	—	—
`lmstudio`	✓	✓	—	—	—	—	—	—	—	—
`openai`	✓	✓	✓	✓	✓	✓	—	—	✓	✓

Generated from lazybridge.matrix.provider_capabilities() at docs build time — see tools/mkdocs_provider_table.py.

provider_capabilities() in turn reads the ClassVar flags on each provider class. To update the matrix, edit the provider's supports_streaming / supports_structured_output / supports_thinking / supported_native_tools declarations; the table re-renders on the next mkdocs build.

from lazybridge.matrix import provider_capabilities

for name, caps in provider_capabilities().items():
    print(name, caps.streaming, caps.structured_output, caps.thinking)

`lazybridge.matrix` reference¶

lazybridge.matrix.provider_capabilities `cached` ¶

provider_capabilities() -> dict[str, ProviderCapabilities]

Return the capability matrix for every registered provider.

Keys are the provider names recognised by LLMEngine (the provider= argument and the LLMEngine._PROVIDER_RULES map); values are ProviderCapabilities instances.

Cached after first call; the underlying ClassVar declarations are immutable in practice so re-querying the providers each call would just thrash the import system.

Graceful degradation: each provider class is imported lazily and individually. If importing one provider's module fails (e.g. a broken optional SDK that explodes at import time), that provider is omitted from the returned matrix and a UserWarning is issued, rather than letting one bad import break introspection for every other provider.

Source code in lazybridge/matrix.py

@lru_cache(maxsize=1)
def provider_capabilities() -> dict[str, ProviderCapabilities]:
    """Return the capability matrix for every registered provider.

    Keys are the provider names recognised by ``LLMEngine`` (the
    ``provider=`` argument and the ``LLMEngine._PROVIDER_RULES`` map);
    values are :class:`ProviderCapabilities` instances.

    Cached after first call; the underlying ``ClassVar`` declarations
    are immutable in practice so re-querying the providers each call
    would just thrash the import system.

    **Graceful degradation** — each provider class is imported lazily and
    *individually*.  If importing one provider's module fails (e.g. a
    broken optional SDK that explodes at import time), that provider is
    omitted from the returned matrix and a :class:`UserWarning` is issued,
    rather than letting one bad import break introspection for every other
    provider.
    """
    out: dict[str, ProviderCapabilities] = {}
    for name, module_path, attr in _PROVIDER_IMPORTS:
        try:
            cls: Any = getattr(importlib.import_module(module_path), attr)
        except Exception as exc:  # defend against any import-time blow-up
            warnings.warn(
                f"lazybridge.matrix: provider {name!r} is unavailable "
                f"(failed to import {module_path}.{attr}: {exc!r}); "
                f"omitting it from the capability matrix.",
                stacklevel=2,
            )
            continue
        out[name] = ProviderCapabilities(
            native_tools=frozenset(getattr(cls, "supported_native_tools", frozenset())),
            streaming=bool(getattr(cls, "supports_streaming", True)),
            structured_output=bool(getattr(cls, "supports_structured_output", True)),
            thinking=bool(getattr(cls, "supports_thinking", True)),
        )
    return out

lazybridge.matrix.native_tool_support ¶

native_tool_support() -> dict[str, list[str]]

Compact provider → [native-tool names] mapping.

Convenient for README tables and doc generation; the full ProviderCapabilities shape is what most callers want.

Source code in lazybridge/matrix.py

def native_tool_support() -> dict[str, list[str]]:
    """Compact ``provider → [native-tool names]`` mapping.

    Convenient for README tables and doc generation; the full
    :class:`ProviderCapabilities` shape is what most callers want.
    """
    return {name: sorted(t.value for t in caps.native_tools) for name, caps in provider_capabilities().items()}

lazybridge.matrix.ProviderCapabilities `dataclass` ¶

ProviderCapabilities(native_tools: frozenset[NativeTool] = frozenset(), streaming: bool = True, structured_output: bool = True, thinking: bool = True)

Snapshot of a single provider's declared capabilities.

All four fields come from ClassVar declarations on the provider class; keep them in sync there, not here.

stop_reason normalisation¶

Each provider exposes its own raw finish-reason vocabulary; LazyBridge maps them to a normalised CompletionResponse.stop_reason so that engine loops can branch on it identically across providers. Notable mappings:

Provider	Raw value	Normalised
Anthropic	`end_turn` / `tool_use` / `max_tokens` / `stop_sequence`	`end_turn` / `tool_use` / `max_tokens` / `end_turn`
OpenAI	`stop` / `tool_calls` / `length` / `content_filter`	`end_turn` / `tool_use` / `max_tokens` / `error`
Google	`STOP` / `MAX_TOKENS` / `SAFETY` / `RECITATION` / `BLOCKLIST`	`end_turn` / `max_tokens` / `error` (the bucket for non-stop terminations)
DeepSeek	passes through OpenAI shape	as OpenAI

The Google MAX_TOKENS mapping was fixed in 0.7.9. Before the fix, the raw value was returned as a literal string, which broke loops that branched on stop_reason == "max_tokens". Inspect Envelope.metadata.stop_reason to read the normalised value.
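The mappings in the table can be written down as a lookup, shown here as a sketch rather than the library's own code. The `normalise_stop_reason` helper is hypothetical, and sending unknown raw values to "error" is an assumption; the table only lists the values shown.

```python
# Raw finish reason -> normalised stop_reason, transcribed from the table above.
_STOP_REASON_MAP = {
    "anthropic": {
        "end_turn": "end_turn",
        "tool_use": "tool_use",
        "max_tokens": "max_tokens",
        "stop_sequence": "end_turn",
    },
    "openai": {
        "stop": "end_turn",
        "tool_calls": "tool_use",
        "length": "max_tokens",
        "content_filter": "error",
    },
    "google": {
        "STOP": "end_turn",
        "MAX_TOKENS": "max_tokens",
        "SAFETY": "error",
        "RECITATION": "error",
        "BLOCKLIST": "error",
    },
}
# DeepSeek passes through the OpenAI shape.
_STOP_REASON_MAP["deepseek"] = _STOP_REASON_MAP["openai"]


def normalise_stop_reason(provider: str, raw: str) -> str:
    """Hypothetical helper; unknown raw values map to "error" by assumption."""
    return _STOP_REASON_MAP[provider].get(raw, "error")
```

With a table like this, a loop can test `stop_reason == "max_tokens"` once and behave the same for every provider, which is the failure mode the 0.7.9 fix addressed for Google.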
