
Custom providers

BaseProvider is the stable extension point for integrating any LLM backend. The provider registry on LLMEngine routes model strings to registered providers.

For narrative usage see Guides → Advanced → BaseProvider and Guides → Advanced → Providers (built-in catalogue + tier tables).

Abstract base class

lazybridge.BaseProvider

BaseProvider(api_key: str | None = None, model: str | None = None, *, fallback_model: str | None = None, strict_native_tools: bool | None = None, **kwargs: Any)

Bases: ABC

Stable abstract base class for all LLM providers.

Subclass this to integrate any LLM backend with LazyBridge. Plug a custom provider in by constructing an LLMEngine that routes to it (see lazybridge/core/executor.py for resolution):

agent = Agent(engine=LLMEngine("my-model"))

Stability contract The following are guaranteed stable across minor versions:

  • __init__(api_key, model, **kwargs) signature
  • _init_client(**kwargs) — override to initialise your SDK client
  • complete(request) — synchronous completion
  • stream(request) — synchronous streaming
  • acomplete(request) — async completion
  • astream(request) — async streaming generator
  • default_model: str — class-level default model name
  • supported_native_tools: frozenset[NativeTool] — declare web search etc.
  • get_default_max_tokens(model) — override to set per-model limits
  • _resolve_model(request) — helper: request.model → self.model → default_model
  • _compute_cost(model, input_tokens, output_tokens) — override for cost tracking
  • _check_native_tools(tools) — filters unsupported native tools with a warning

What you MUST implement: complete, stream, acomplete, astream.

What you SHOULD override: _init_client, default_model, get_default_max_tokens, _compute_cost.

What you MUST NOT do:

  • Raise exceptions other than Python built-ins or your SDK's own error types. LazyBridge does not wrap provider exceptions; they propagate as-is.
  • Mutate request. It is shared and must be treated as read-only.
  • Block the event loop inside acomplete / astream. Use await or asyncio.get_event_loop().run_in_executor for blocking SDK calls.
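The event-loop rule can be sketched with the standard library alone. Here blocking_sdk_call is a hypothetical stand-in for a synchronous SDK request, and asyncio.to_thread (Python 3.9+) is used as one alternative to run_in_executor; neither name comes from LazyBridge.

```python
import asyncio
import time
from typing import List


def blocking_sdk_call(prompt: str) -> str:
    """Hypothetical stand-in for a synchronous SDK request."""
    time.sleep(0.01)  # simulates blocking network I/O
    return f"echo: {prompt}"


async def acomplete_text(prompt: str) -> str:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_sdk_call, prompt)


async def main() -> List[str]:
    # The two calls overlap instead of running back to back.
    return await asyncio.gather(acomplete_text("a"), acomplete_text("b"))


results = asyncio.run(main())
```

A real acomplete would build a CompletionResponse from the SDK result; the threading pattern is the only point of this sketch.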

Initialise the provider.

Parameters

  • api_key: Provider API key. If None, _init_client reads it from an environment variable (the standard pattern for all built-in providers).
  • model: Model identifier to use for all requests. When None and default_model is also None (recommended for paid cloud providers), _resolve_model raises a clear ValueError rather than silently falling back to an expensive flagship.
  • fallback_model: Model to use when neither model= nor request.model is set. Two forms:
      - Explicit string, e.g. fallback_model="gpt-4o-mini": used verbatim (tier aliases are resolved normally).
      - "cheapest": automatically resolves to the cheapest tier alias available on this provider (super_cheap → cheap → medium, in that order).
    When None (default) and no model is configured, a ValueError is raised with guidance on how to fix it.
  • strict_native_tools: When True, requesting an unsupported NativeTool raises UnsupportedNativeToolError. When None (default), the class-level strict_native_tools attribute is used (typically False).
  • **kwargs: Forwarded verbatim to _init_client.
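The lookup order implied by these parameters can be sketched as a standalone function. This is an illustration, not the body of _resolve_model: the order (request, constructor, fallback, class default) is inferred from the parameter descriptions and the comments in __init__, and the "cheapest" tier resolution is left out.

```python
from typing import Optional


def resolve_model(
    request_model: Optional[str],
    user_model: Optional[str],
    fallback_model: Optional[str],
    default_model: Optional[str],
) -> str:
    """Return the first configured model, in the assumed precedence order."""
    for candidate in (request_model, user_model, fallback_model, default_model):
        if candidate:  # skips both None and the empty string
            return candidate
    raise ValueError("No model configured: pass model= or fallback_model=.")


# The per-request model wins over everything else.
picked = resolve_model("req-model", "ctor-model", "gpt-4o-mini", None)
# With nothing else set, the fallback is used.
fallback = resolve_model(None, None, "gpt-4o-mini", None)
```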

Source code in lazybridge/core/providers/base.py
def __init__(
    self,
    api_key: str | None = None,
    model: str | None = None,
    *,
    fallback_model: str | None = None,
    strict_native_tools: bool | None = None,
    **kwargs: Any,
) -> None:
    """Initialise the provider.

    Parameters
    ----------
    api_key:
        Provider API key.  If ``None``, ``_init_client`` reads it from
        an environment variable (standard pattern for all built-in providers).
    model:
        Model identifier to use for all requests.  When ``None`` and
        ``default_model`` is also ``None`` (recommended for paid cloud
        providers), ``_resolve_model`` raises a clear ``ValueError``
        rather than silently falling back to an expensive flagship.
    fallback_model:
        Model to use when neither ``model=`` nor ``request.model`` is set.
        Two forms:
        - Explicit string, e.g. ``fallback_model="gpt-4o-mini"`` — used
          verbatim (tier aliases are resolved normally).
        - ``"cheapest"`` — automatically resolves to the cheapest tier
          alias available on this provider
          (``super_cheap`` → ``cheap`` → ``medium``, in that order).
        When ``None`` (default) and no model is configured, a
        ``ValueError`` is raised with guidance on how to fix it.
    strict_native_tools:
        When ``True``, requesting an unsupported :class:`NativeTool`
        raises :class:`UnsupportedNativeToolError`.  When ``None``
        (default) the class-level :attr:`strict_native_tools`
        attribute is used (typically ``False``).
    **kwargs:
        Forwarded verbatim to :meth:`_init_client`.
    """
    if api_key is not None and not api_key.strip():
        raise ValueError(
            f"{self.__class__.__name__}: api_key must not be an empty or "
            "whitespace-only string. Pass None to read from the environment "
            "variable, or provide a valid key."
        )
    self.api_key = api_key
    # Store the user-supplied model separately so _resolve_model can
    # distinguish "user didn't pass a model" from "class default applies".
    # self.model is the effective value for backward-compat reads (e.g.
    # executor.model); _resolve_model uses _user_model to decide when to
    # consult fallback_model before falling through to default_model.
    self._user_model: str | None = model
    self.model = model or self.default_model
    self.fallback_model = fallback_model
    if strict_native_tools is not None:
        # Per-instance override of the class-level default.
        self.strict_native_tools = bool(strict_native_tools)
    self._init_client(**kwargs)

default_model class-attribute instance-attribute

default_model: str | None = ''

Class-level default model identifier. Used when neither the request nor the constructor model= argument specifies a model. Set to None on paid cloud providers to force explicit model selection and prevent silent fallback to an expensive flagship.

supported_native_tools class-attribute instance-attribute

supported_native_tools: frozenset[NativeTool] = frozenset()

Declares which NativeTool values (lazybridge.core.types.NativeTool) this provider supports, e.g. frozenset({NativeTool.WEB_SEARCH}). Unsupported tools requested by the user are dropped with a warning, or raise an error when strict_native_tools=True is set on construction.

supports_streaming class-attribute instance-attribute

supports_streaming: bool = True

Does this provider expose stream(...) / astream(...)?

supports_structured_output class-attribute instance-attribute

supports_structured_output: bool = True

Does this provider accept request.structured_output (Pydantic model or JSON-schema dict)?

supports_thinking class-attribute instance-attribute

supports_thinking: bool = True

Does this provider produce a thinking field on the response (or reasoning_tokens / thoughts_token_count on usage)?

strict_native_tools class-attribute instance-attribute

strict_native_tools: bool = False

When True, requesting an unsupported NativeTool raises UnsupportedNativeToolError instead of warning and dropping the tool. Set it on construction (BaseProvider(..., strict_native_tools=True)) or on the subclass. The default, False, preserves the friendly pre-W5.1 behaviour for ad-hoc and interactive use. Production setups should consider opting into strict mode so that a misconfigured provider fails loudly rather than degrading to a non-grounded reply.
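The difference between the two modes can be sketched as follows. NativeTool and UnsupportedNativeToolError are local stand-ins for the LazyBridge types of the same name, and check_native_tools is a hypothetical free function, not the library's _check_native_tools.

```python
import warnings
from enum import Enum
from typing import FrozenSet, Iterable, List


class NativeTool(Enum):
    """Stand-in for lazybridge.core.types.NativeTool (two members only)."""
    WEB_SEARCH = "web_search"
    CODE_EXECUTION = "code_execution"


class UnsupportedNativeToolError(Exception):
    """Stand-in for the error raised in strict mode."""


def check_native_tools(
    requested: Iterable[NativeTool],
    supported: FrozenSet[NativeTool],
    strict: bool = False,
) -> List[NativeTool]:
    requested = list(requested)
    unsupported = [t for t in requested if t not in supported]
    if unsupported and strict:
        # Strict mode: fail loudly instead of silently degrading.
        names = ", ".join(t.value for t in unsupported)
        raise UnsupportedNativeToolError(f"unsupported native tools: {names}")
    for tool in unsupported:
        # Lenient mode: warn once per dropped tool.
        warnings.warn(f"dropping unsupported native tool: {tool.value}", stacklevel=2)
    return [t for t in requested if t in supported]
```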

supports_vision classmethod

supports_vision(model: str | None = None) -> bool

Whether the resolved model accepts image input.

The default implementation does a substring scan against _VISION_CAPABLE_MODEL_PATTERNS. Override when the decision needs custom logic (e.g. version-range checks).

Returns False for None / empty model because we don't know what the eventual default will be — caller can re-query once the model is resolved.

Source code in lazybridge/core/providers/base.py
@classmethod
def supports_vision(cls, model: str | None = None) -> bool:
    """Whether the resolved ``model`` accepts image input.

    Default implementation does a substring scan against
    :attr:`_VISION_CAPABLE_MODEL_PATTERNS`.  Override when the
    decision needs custom logic (e.g. version-range checks).

    Returns ``False`` for ``None`` / empty model because we don't
    know what the eventual default will be — caller can re-query
    once the model is resolved.
    """
    if not model:
        return False
    m = model.lower()
    return any(p in m for p in cls._VISION_CAPABLE_MODEL_PATTERNS)
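When substring matching is not enough, an override might parse a version number out of the model name. The sketch below assumes a hypothetical naming scheme ("acme-v<N>", with image input from v3 onwards) and uses a plain class rather than a real BaseProvider subclass so that it runs on its own.

```python
import re
from typing import Optional


class AcmeVisionMixin:
    """Hypothetical: models named 'acme-v<N>' accept images from v3 onwards."""

    _MIN_VISION_VERSION = 3

    @classmethod
    def supports_vision(cls, model: Optional[str] = None) -> bool:
        if not model:
            # Same contract as the default: unknown model means False.
            return False
        match = re.search(r"acme-v(\d+)", model.lower())
        return bool(match) and int(match.group(1)) >= cls._MIN_VISION_VERSION
```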

supports_audio classmethod

supports_audio(model: str | None = None) -> bool

Whether the resolved model accepts audio input.

See supports_vision: same semantics, audio modality.

Source code in lazybridge/core/providers/base.py
@classmethod
def supports_audio(cls, model: str | None = None) -> bool:
    """Whether the resolved ``model`` accepts audio input.

    See :meth:`supports_vision` — same semantics, audio modality.
    """
    if not model:
        return False
    m = model.lower()
    return any(p in m for p in cls._AUDIO_CAPABLE_MODEL_PATTERNS)

is_retryable

is_retryable(exc: BaseException) -> bool | None

Classify a provider exception as retryable, non-retryable, or defer.

The Executor (lazybridge.core.executor.Executor) consults this hook before falling back to its generic status/string heuristic. Override it when the provider SDK raises structured exception types that encode retry semantics more precisely than HTTP status codes alone, for example a rate-limit exception whose retry_after attribute distinguishes "back off" (retryable) from "quota exhausted" (not retryable).

Return values
  • True — retry with backoff.
  • False — do not retry; surface the exception.
  • None — no opinion; Executor falls back to its generic classifier (core.executor._is_retryable) that matches status_code in {429, 5xx} and common transient-error strings.

Default implementation returns None so built-in providers fall through to the generic path with no behaviour change.

Source code in lazybridge/core/providers/base.py
def is_retryable(self, exc: BaseException) -> bool | None:
    """Classify a provider exception as retryable, non-retryable, or defer.

    The :class:`~lazybridge.core.executor.Executor` consults this hook
    before falling back to its generic status/string heuristic.  Override
    when the provider SDK raises structured exception types that encode
    retry semantics more precisely than HTTP status codes alone — for
    example a rate-limit exception that carries a ``retry_after`` attribute
    distinguishing "back off" (retryable) from "quota exhausted" (not).

    Return values:
      * ``True`` — retry with backoff.
      * ``False`` — do not retry; surface the exception.
      * ``None`` — no opinion; Executor falls back to its generic
        classifier (``core.executor._is_retryable``) that matches
        ``status_code in {429, 5xx}`` and common transient-error strings.

    Default implementation returns ``None`` so built-in providers fall
    through to the generic path with no behaviour change.
    """
    return None
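An override of this hook might look like the sketch below. RateLimitError and AuthError are hypothetical SDK exception types invented for the example, and the method lives on a plain mixin class so that the block runs without LazyBridge installed.

```python
from typing import Optional


class RateLimitError(Exception):
    """Hypothetical SDK error; retry_after is None when quota is exhausted."""

    def __init__(self, message: str, retry_after: Optional[float] = None) -> None:
        super().__init__(message)
        self.retry_after = retry_after


class AuthError(Exception):
    """Hypothetical SDK error for bad credentials."""


class RetryPolicyMixin:
    def is_retryable(self, exc: BaseException) -> Optional[bool]:
        if isinstance(exc, RateLimitError):
            # A retry_after hint means "back off"; no hint means quota is gone.
            return exc.retry_after is not None
        if isinstance(exc, AuthError):
            return False  # retrying cannot fix bad credentials
        return None  # no opinion: defer to the generic classifier
```

Returning None for everything the provider does not recognise keeps the generic status-code path in play for ordinary transport errors.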

complete abstractmethod

complete(request: CompletionRequest) -> CompletionResponse

Execute a synchronous completion and return a unified response.

Parameters

request: Fully assembled CompletionRequest (lazybridge.core.types.CompletionRequest). Treat as read-only; do not mutate.

Returns

CompletionResponse At minimum, content must be set to the model's text reply. Populate usage, model, tool_calls, stop_reason when available. Set raw to the original SDK response object to allow callers to access provider-specific fields.

Raises

Any exception from your SDK is acceptable. LazyBridge propagates them as-is and handles retry logic in Executor (lazybridge.core.executor.Executor).

Source code in lazybridge/core/providers/base.py
@abstractmethod
def complete(self, request: CompletionRequest) -> CompletionResponse:
    """Execute a synchronous completion and return a unified response.

    Parameters
    ----------
    request:
        Fully assembled :class:`~lazybridge.core.types.CompletionRequest`.
        Treat as **read-only** — do not mutate.

    Returns
    -------
    CompletionResponse
        At minimum, ``content`` must be set to the model's text reply.
        Populate ``usage``, ``model``, ``tool_calls``, ``stop_reason``
        when available.  Set ``raw`` to the original SDK response object
        to allow callers to access provider-specific fields.

    Raises
    ------
    Any exception from your SDK is acceptable — LazyBridge propagates them
    as-is and handles retry logic in :class:`~lazybridge.core.executor.Executor`.
    """
    ...

stream abstractmethod

stream(request: CompletionRequest) -> Iterator[StreamChunk]

Stream a completion, yielding StreamChunk objects (lazybridge.core.types.StreamChunk).

The final chunk must have is_final=True and stop_reason set. Token usage should be reported on the final chunk when available.

Parameters

request: Same as complete. Treat as read-only.

Yields

StreamChunk Intermediate chunks: delta contains the new text fragment. Final chunk: is_final=True, stop_reason set, usage populated.

Example skeleton:

def stream(self, request):
    for raw_chunk in self._client.stream(...):
        yield StreamChunk(delta=raw_chunk.text)
    yield StreamChunk(
        delta="",
        stop_reason="end_turn",
        is_final=True,
        usage=UsageStats(input_tokens=..., output_tokens=...),
    )
Source code in lazybridge/core/providers/base.py
@abstractmethod
def stream(self, request: CompletionRequest) -> Iterator[StreamChunk]:
    """Stream a completion, yielding :class:`~lazybridge.core.types.StreamChunk` objects.

    The final chunk **must** have ``is_final=True`` and ``stop_reason`` set.
    Token usage should be reported on the final chunk when available.

    Parameters
    ----------
    request:
        Same as :meth:`complete`. Treat as read-only.

    Yields
    ------
    StreamChunk
        Intermediate chunks: ``delta`` contains the new text fragment.
        Final chunk: ``is_final=True``, ``stop_reason`` set, ``usage`` populated.

    Example skeleton::

        def stream(self, request):
            for raw_chunk in self._client.stream(...):
                yield StreamChunk(delta=raw_chunk.text)
            yield StreamChunk(
                delta="",
                stop_reason="end_turn",
                is_final=True,
                usage=UsageStats(input_tokens=..., output_tokens=...),
            )
    """
    ...
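From the consumer's side, the final-chunk contract can be checked as in the sketch below. The StreamChunk dataclass is a minimal stand-in carrying only the fields used here (the real type has more), and fake_stream and collect are hypothetical helpers.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass
class StreamChunk:
    """Minimal stand-in for lazybridge's StreamChunk."""
    delta: str = ""
    stop_reason: Optional[str] = None
    is_final: bool = False


def fake_stream(text: str) -> Iterator[StreamChunk]:
    # Intermediate chunks carry text; the last chunk carries only metadata.
    for word in text.split(" "):
        yield StreamChunk(delta=word + " ")
    yield StreamChunk(delta="", stop_reason="end_turn", is_final=True)


def collect(chunks: Iterator[StreamChunk]) -> Tuple[str, str]:
    parts = []
    for chunk in chunks:
        parts.append(chunk.delta)
        if chunk.is_final:
            if chunk.stop_reason is None:
                raise ValueError("final chunk must set stop_reason")
            return "".join(parts), chunk.stop_reason
    raise ValueError("stream ended without a final chunk")


text, reason = collect(fake_stream("hello world"))
```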

acomplete abstractmethod async

acomplete(request: CompletionRequest) -> CompletionResponse

Async version of complete.

Semantics and return contract are identical. Use await for all blocking operations — never call time.sleep or blocking I/O here.

Source code in lazybridge/core/providers/base.py
@abstractmethod
async def acomplete(self, request: CompletionRequest) -> CompletionResponse:
    """Async version of :meth:`complete`.

    Semantics and return contract are identical. Use ``await`` for all
    blocking operations — never call ``time.sleep`` or blocking I/O here.
    """
    ...

astream abstractmethod

astream(request: CompletionRequest) -> AsyncIterator[StreamChunk]

Async streaming generator; the async version of stream.

Implement as an async def generator:

async def astream(self, request):
    async for raw_chunk in self._client.astream(...):
        yield StreamChunk(delta=raw_chunk.text)
    yield StreamChunk(stop_reason="end_turn", is_final=True, usage=...)

The same final-chunk contract as stream applies.

Source code in lazybridge/core/providers/base.py
@abstractmethod
def astream(self, request: CompletionRequest) -> AsyncIterator[StreamChunk]:
    """Async streaming generator — async version of :meth:`stream`.

    Implement as an ``async def`` generator::

        async def astream(self, request):
            async for raw_chunk in self._client.astream(...):
                yield StreamChunk(delta=raw_chunk.text)
            yield StreamChunk(stop_reason="end_turn", is_final=True, usage=...)

    The same final-chunk contract as :meth:`stream` applies.
    """
    ...

get_default_max_tokens

get_default_max_tokens(model: str | None = None) -> int

Return the default max_tokens cap for the given model.

Override when your model has a limit lower or higher than 4096. LazyBridge calls this when max_tokens is not set explicitly.

Source code in lazybridge/core/providers/base.py
def get_default_max_tokens(self, model: str | None = None) -> int:
    """Return the default ``max_tokens`` cap for the given model.

    Override when your model has a limit lower or higher than 4096.
    LazyBridge calls this when ``max_tokens`` is not set explicitly.
    """
    return 4096
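A per-model override could be as small as a lookup table. The model names and limits below are made up for illustration, and the method is shown on a plain mixin class rather than a real provider.

```python
from typing import Optional


class MaxTokensMixin:
    # Hypothetical per-model caps; real limits come from your provider's docs.
    _MAX_TOKENS = {"acme-small": 2048, "acme-large": 8192}

    def get_default_max_tokens(self, model: Optional[str] = None) -> int:
        # Unknown or missing models keep the base-class default of 4096.
        return self._MAX_TOKENS.get(model or "", 4096)
```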

Provider registry surface

The registry methods are class-level on LLMEngine. They mutate class-level tables (_PROVIDER_ALIASES, _PROVIDER_RULES, _PROVIDER_DEFAULT) and are documented under the engine class itself — see Engines → LLMEngine for the full method list. For read-only introspection from caller code, use the public PROVIDER_ALIASES snapshot or LLMEngine.provider_aliases() (returns a fresh dict[str, str] copy of the routing aliases — safe to mutate without affecting the framework).

lazybridge.PROVIDER_ALIASES module-attribute

PROVIDER_ALIASES: dict[str, str] = provider_aliases()

Registry mutation entry points (quick reference):

| Method | Effect |
| --- | --- |
| LLMEngine.provider_aliases() | Snapshot of the current alias map (read-only) |
| LLMEngine.register_provider_alias(alias, provider) | Exact-match (case-insensitive) routing |
| LLMEngine.register_provider_rule(pattern, provider, *, kind=...) | Substring (kind="contains") or prefix (kind="startswith") routing; new rules are prepended to the rule list |
| LLMEngine.set_default_provider(provider or None) | Fallback when no rule matches; None (the 0.7.9 default) makes unknown model strings raise ValueError instead of silently routing to Anthropic |
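The routing semantics in the quick reference can be illustrated with a toy router. MiniRouter is not LLMEngine's implementation: checking aliases before rules and matching rule patterns case-insensitively are both assumptions made for this sketch.

```python
from typing import Dict, List, Optional, Tuple


class MiniRouter:
    """Illustrative model-string router; not LLMEngine's actual code."""

    def __init__(self) -> None:
        self.aliases: Dict[str, str] = {}
        self.rules: List[Tuple[str, str, str]] = []  # (pattern, provider, kind)
        self.default: Optional[str] = None

    def register_alias(self, alias: str, provider: str) -> None:
        self.aliases[alias.lower()] = provider

    def register_rule(self, pattern: str, provider: str, *, kind: str = "contains") -> None:
        if kind not in ("contains", "startswith"):
            raise ValueError(f"unknown kind: {kind!r}")
        # Prepending means the most recently registered rule is tried first.
        self.rules.insert(0, (pattern.lower(), provider, kind))

    def resolve(self, model: str) -> str:
        m = model.lower()
        if m in self.aliases:  # assumption: exact aliases win over rules
            return self.aliases[m]
        for pattern, provider, kind in self.rules:
            hit = m.startswith(pattern) if kind == "startswith" else pattern in m
            if hit:
                return provider
        if self.default is None:
            raise ValueError(f"no provider matches model {model!r}")
        return self.default


router = MiniRouter()
router.register_rule("acme", "generic")
router.register_rule("acme-pro", "premium", kind="startswith")
router.register_alias("Fast", "local")
```

With this setup, "acme-pro-1" resolves to "premium" because the later rule was prepended, "my-acme" falls through to "generic", and an unmatched string raises ValueError while the default is None.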

Capability matrix

Provider Streaming Structured output Thinking code_execution computer_use file_search google_maps google_search image_generation web_search
anthropic
deepseek
google
litellm
lmstudio
openai

Generated from lazybridge.matrix.provider_capabilities() at docs build time — see tools/mkdocs_provider_table.py.

provider_capabilities() reads the ClassVar flags on each provider class. To update the matrix, edit the provider's supports_streaming / supports_structured_output / supports_thinking / supported_native_tools declarations; the table re-renders on the next mkdocs build.

from lazybridge.matrix import provider_capabilities

for name, caps in provider_capabilities().items():
    print(name, caps.streaming, caps.structured_output, caps.thinking)

lazybridge.matrix reference

lazybridge.matrix.provider_capabilities cached

provider_capabilities() -> dict[str, ProviderCapabilities]

Return the capability matrix for every registered provider.

Keys are the provider names recognised by LLMEngine (the provider= argument and the LLMEngine._PROVIDER_RULES map); values are ProviderCapabilities instances.

Cached after first call; the underlying ClassVar declarations are immutable in practice so re-querying the providers each call would just thrash the import system.

Graceful degradation: each provider class is imported lazily and individually. If importing one provider's module fails (e.g. a broken optional SDK that explodes at import time), that provider is omitted from the returned matrix and a UserWarning is issued, rather than letting one bad import break introspection for every other provider.

Source code in lazybridge/matrix.py
@lru_cache(maxsize=1)
def provider_capabilities() -> dict[str, ProviderCapabilities]:
    """Return the capability matrix for every registered provider.

    Keys are the provider names recognised by ``LLMEngine`` (the
    ``provider=`` argument and the ``LLMEngine._PROVIDER_RULES`` map);
    values are :class:`ProviderCapabilities` instances.

    Cached after first call; the underlying ``ClassVar`` declarations
    are immutable in practice so re-querying the providers each call
    would just thrash the import system.

    **Graceful degradation** — each provider class is imported lazily and
    *individually*.  If importing one provider's module fails (e.g. a
    broken optional SDK that explodes at import time), that provider is
    omitted from the returned matrix and a :class:`UserWarning` is issued,
    rather than letting one bad import break introspection for every other
    provider.
    """
    out: dict[str, ProviderCapabilities] = {}
    for name, module_path, attr in _PROVIDER_IMPORTS:
        try:
            cls: Any = getattr(importlib.import_module(module_path), attr)
        except Exception as exc:  # defend against any import-time blow-up
            warnings.warn(
                f"lazybridge.matrix: provider {name!r} is unavailable "
                f"(failed to import {module_path}.{attr}: {exc!r}); "
                f"omitting it from the capability matrix.",
                stacklevel=2,
            )
            continue
        out[name] = ProviderCapabilities(
            native_tools=frozenset(getattr(cls, "supported_native_tools", frozenset())),
            streaming=bool(getattr(cls, "supports_streaming", True)),
            structured_output=bool(getattr(cls, "supports_structured_output", True)),
            thinking=bool(getattr(cls, "supports_thinking", True)),
        )
    return out
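One consequence of the lru_cache decorator is that flags changed after the first call are not picked up until the cache is cleared. The sketch below shows this with a stand-in function; FLAGS and capabilities are hypothetical names. Since provider_capabilities is wrapped by the same decorator, it should expose cache_clear() in the same way.

```python
from functools import lru_cache
from typing import Dict

FLAGS = {"streaming": True}  # stand-in for ClassVar flags on a provider class


@lru_cache(maxsize=1)
def capabilities() -> Dict[str, bool]:
    # Takes a snapshot of the flags at first call.
    return dict(FLAGS)


first = capabilities()
FLAGS["streaming"] = False
stale = capabilities()       # still the cached snapshot
capabilities.cache_clear()   # lru_cache adds cache_clear() to the wrapper
fresh = capabilities()       # recomputed from the current flags
```

Note that the cached dict object itself is shared between callers, so mutating a returned value changes what later calls see.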

lazybridge.matrix.native_tool_support

native_tool_support() -> dict[str, list[str]]

Compact provider → [native-tool names] mapping.

Convenient for README tables and doc generation; the full ProviderCapabilities shape is what most callers want.

Source code in lazybridge/matrix.py
def native_tool_support() -> dict[str, list[str]]:
    """Compact ``provider → [native-tool names]`` mapping.

    Convenient for README tables and doc generation; the full
    :class:`ProviderCapabilities` shape is what most callers want.
    """
    return {name: sorted(t.value for t in caps.native_tools) for name, caps in provider_capabilities().items()}

lazybridge.matrix.ProviderCapabilities dataclass

ProviderCapabilities(native_tools: frozenset[NativeTool] = frozenset(), streaming: bool = True, structured_output: bool = True, thinking: bool = True)

Snapshot of a single provider's declared capabilities.

All four fields come from ClassVar declarations on the provider class; keep them in sync there, not here.

stop_reason normalisation

Each provider exposes its own raw finish-reason vocabulary; LazyBridge maps them to a normalised CompletionResponse.stop_reason so engine loops can decide identically across providers. Notable mappings:

| Provider | Raw value | Normalised |
| --- | --- | --- |
| Anthropic | end_turn / tool_use / max_tokens / stop_sequence | end_turn / tool_use / max_tokens / end_turn |
| OpenAI | stop / tool_calls / length / content_filter | end_turn / tool_use / max_tokens / error |
| Google | STOP / MAX_TOKENS / SAFETY / RECITATION / BLOCKLIST | end_turn / max_tokens / error (the bucket for non-stop terminations) |
| DeepSeek | passes through OpenAI shape | as OpenAI |

The Google MAX_TOKENS mapping was fixed in 0.7.9; before the fix, the raw value was returned as the literal string, which broke loops that branched on stop_reason == "max_tokens". Inspect Envelope.metadata.stop_reason to read the normalised value.
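The mappings listed in the table can be written out as a lookup, which makes the cross-provider equivalence easy to test. This is a transcription of the table, not LazyBridge's own code; normalise_stop_reason is a hypothetical helper, and raising KeyError for unlisted values is an assumption of the sketch.

```python
from typing import Dict

_NORMALISE: Dict[str, Dict[str, str]] = {
    "anthropic": {
        "end_turn": "end_turn",
        "tool_use": "tool_use",
        "max_tokens": "max_tokens",
        "stop_sequence": "end_turn",
    },
    "openai": {
        "stop": "end_turn",
        "tool_calls": "tool_use",
        "length": "max_tokens",
        "content_filter": "error",
    },
    "google": {
        "STOP": "end_turn",
        "MAX_TOKENS": "max_tokens",
        "SAFETY": "error",
        "RECITATION": "error",
        "BLOCKLIST": "error",
    },
}
# DeepSeek passes through the OpenAI shape, so it shares the OpenAI table.
_NORMALISE["deepseek"] = _NORMALISE["openai"]


def normalise_stop_reason(provider: str, raw: str) -> str:
    """Map a raw finish reason to the normalised vocabulary (KeyError if unlisted)."""
    return _NORMALISE[provider][raw]
```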