Custom providers¶
BaseProvider is the stable extension point for integrating any LLM
backend. The provider registry on LLMEngine routes model strings
to registered providers.
For narrative usage see Guides → Advanced → BaseProvider and Guides → Advanced → Providers (built-in catalogue + tier tables).
Abstract base class¶
lazybridge.BaseProvider ¶
BaseProvider(api_key: str | None = None, model: str | None = None, *, fallback_model: str | None = None, strict_native_tools: bool | None = None, **kwargs: Any)
Bases: ABC
Stable abstract base class for all LLM providers.
Subclass this to integrate any LLM backend with LazyBridge. Plug a
custom provider in by constructing an LLMEngine that routes to it
(see lazybridge/core/executor.py for resolution)::
agent = Agent(engine=LLMEngine("my-model"))
Stability contract
The following are guaranteed stable across minor versions:
- __init__(api_key, model, **kwargs) signature
- _init_client(**kwargs): override to initialise your SDK client
- complete(request): synchronous completion
- stream(request): synchronous streaming
- acomplete(request): async completion
- astream(request): async streaming generator
- default_model (str): class-level default model name
- supported_native_tools (frozenset[NativeTool]): declare web search etc.
- get_default_max_tokens(model): override to set per-model limits
- _resolve_model(request): helper that resolves request.model → self.model → default_model
- _compute_cost(model, input_tokens, output_tokens): override for cost tracking
- _check_native_tools(tools): filters unsupported native tools with a warning
What you MUST implement: complete, stream, acomplete, astream.
What you SHOULD override: _init_client, default_model,
get_default_max_tokens, _compute_cost.
What you MUST NOT do:
- Raise exceptions other than Python built-ins or your SDK's own error types.
LazyBridge does not wrap provider exceptions — they propagate as-is.
- Mutate request — it is shared and must be treated as read-only.
- Block the event loop inside acomplete / astream: use await, or
asyncio.get_running_loop().run_in_executor (or asyncio.to_thread) for blocking SDK calls.
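The last rule can be met by pushing a blocking SDK call onto a worker thread. The following is a minimal sketch, assuming Python 3.9+ for `asyncio.to_thread`; `blocking_sdk_call`, `acomplete_text`, and `demo` are hypothetical names used for illustration and are not part of LazyBridge.

```python
import asyncio
import time


def blocking_sdk_call(prompt: str) -> str:
    """Hypothetical stand-in for a synchronous SDK call."""
    time.sleep(0.05)  # simulates network latency; blocks the calling thread
    return f"echo: {prompt}"


async def acomplete_text(prompt: str) -> str:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_sdk_call, prompt)


async def demo() -> list[str]:
    # The two calls overlap because neither one blocks the loop.
    return await asyncio.gather(acomplete_text("a"), acomplete_text("b"))
```

Calling `asyncio.run(demo())` drives both requests concurrently; a real `acomplete` would wrap the result in a CompletionResponse instead of returning a bare string.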
Initialise the provider.
Parameters¶
api_key:
Provider API key. If None, _init_client reads it from
an environment variable (standard pattern for all built-in providers).
model:
Model identifier to use for all requests. When None and
default_model is also None (recommended for paid cloud
providers), _resolve_model raises a clear ValueError
rather than silently falling back to an expensive flagship.
fallback_model:
Model to use when neither model= nor request.model is set.
Two forms:
- Explicit string, e.g. fallback_model="gpt-4o-mini" — used
verbatim (tier aliases are resolved normally).
- "cheapest" — automatically resolves to the cheapest tier
alias available on this provider
(super_cheap → cheap → medium, in that order).
When None (default) and no model is configured, a
ValueError is raised with guidance on how to fix it.
strict_native_tools:
When True, requesting an unsupported :class:NativeTool
raises :class:UnsupportedNativeToolError. When None
(default) the class-level :attr:strict_native_tools
attribute is used (typically False).
**kwargs:
Forwarded verbatim to :meth:_init_client.
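The fallback_model rules above can be sketched as a small resolver. This is illustrative only: `resolve_fallback` and the `tier_aliases` mapping are hypothetical, and the real lookup inside the provider may be organised differently.

```python
from typing import Optional

# Preference order for "cheapest", as listed in the parameter description.
CHEAPEST_ORDER = ("super_cheap", "cheap", "medium")


def resolve_fallback(fallback_model: Optional[str], tier_aliases: dict[str, str]) -> str:
    """Resolve fallback_model to a concrete model name (hypothetical helper)."""
    if fallback_model is None:
        raise ValueError("No model configured: pass model= or fallback_model=.")
    if fallback_model == "cheapest":
        # Take the first tier this provider actually defines.
        for tier in CHEAPEST_ORDER:
            if tier in tier_aliases:
                return tier_aliases[tier]
        raise ValueError("Provider defines no super_cheap / cheap / medium tier.")
    # Explicit string: resolve it if it is a tier alias, else use it verbatim.
    return tier_aliases.get(fallback_model, fallback_model)
```

With a provider that defines only `cheap` and `medium`, `"cheapest"` resolves to whatever `cheap` points at, while an explicit name such as `"gpt-4o-mini"` passes through unchanged.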
Source code in lazybridge/core/providers/base.py
default_model (class-attribute, instance-attribute) ¶
Class-level default model identifier. Used when neither the request
nor the constructor model= argument specifies a model.
Set to None on paid cloud providers to force explicit model selection
and prevent silent fallback to an expensive flagship.
supported_native_tools (class-attribute, instance-attribute) ¶
Declare which :class:~lazybridge.core.types.NativeTool values this
provider supports (e.g. frozenset({NativeTool.WEB_SEARCH})).
Unsupported tools requested by the user are filtered and warned —
or raised, when strict_native_tools=True is set on construction.
supports_streaming (class-attribute, instance-attribute) ¶
Does this provider expose stream(...) / astream(...)?
supports_structured_output (class-attribute, instance-attribute) ¶
Does this provider accept request.structured_output (Pydantic
model or JSON-schema dict)?
supports_thinking (class-attribute, instance-attribute) ¶
Does this provider produce a thinking field on the response (or
reasoning_tokens / thoughts_token_count on usage)?
strict_native_tools (class-attribute, instance-attribute) ¶
When True, requesting an unsupported :class:NativeTool raises
:class:UnsupportedNativeToolError instead of warning-and-dropping.
Set on construction (BaseProvider(..., strict_native_tools=True))
or via the subclass. Default False preserves the friendly
pre-W5.1 behaviour for ad-hoc / interactive use. Production setups
should consider opting into strict mode so a misconfigured provider
fails loud rather than degrading to a non-grounded reply.
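The difference between the default and strict modes can be sketched as follows. The `NativeTool` enum and `UnsupportedNativeToolError` here are local stand-ins for the LazyBridge types of the same name, and `check_native_tools` is a hypothetical free function, not the library's `_check_native_tools` implementation.

```python
import warnings
from enum import Enum


class NativeTool(Enum):  # stand-in for lazybridge.core.types.NativeTool
    WEB_SEARCH = "web_search"
    CODE_EXECUTION = "code_execution"


class UnsupportedNativeToolError(Exception):  # stand-in for the real exception
    pass


def check_native_tools(requested, supported, strict=False):
    """Return the supported subset; warn (default) or raise (strict) for the rest."""
    unsupported = [t for t in requested if t not in supported]
    if unsupported and strict:
        names = ", ".join(t.value for t in unsupported)
        raise UnsupportedNativeToolError(f"Unsupported native tools: {names}")
    for tool in unsupported:
        warnings.warn(f"Dropping unsupported native tool: {tool.value}")
    return [t for t in requested if t in supported]
```

In the default mode the request proceeds without the dropped tool, which is why a misconfigured provider can silently return a non-grounded reply; strict mode turns that into an immediate failure.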
supports_vision (classmethod) ¶
Whether the resolved model accepts image input.
Default implementation does a substring scan against
:attr:_VISION_CAPABLE_MODEL_PATTERNS. Override when the
decision needs custom logic (e.g. version-range checks).
Returns False for None / empty model because we don't
know what the eventual default will be — caller can re-query
once the model is resolved.
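A substring scan of this kind might look like the sketch below. The class name `VisionCheck` and the pattern values are hypothetical, and whether the real scan is case-sensitive is not stated here, so this version compares strings exactly as given.

```python
from typing import Optional


class VisionCheck:
    # Hypothetical patterns; real providers declare their own tuple.
    _VISION_CAPABLE_MODEL_PATTERNS: tuple[str, ...] = ("vision", "-vl")

    @classmethod
    def supports_vision(cls, model: Optional[str]) -> bool:
        if not model:
            # None or "": the eventual default model is not known yet.
            return False
        # True as soon as any declared pattern occurs inside the model name.
        return any(p in model for p in cls._VISION_CAPABLE_MODEL_PATTERNS)
```

A subclass that needs version-range checks would override `supports_vision` and parse the model string instead of relying on the pattern tuple.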
supports_audio (classmethod) ¶
Whether the resolved model accepts audio input.
See :meth:supports_vision — same semantics, audio modality.
is_retryable ¶
Classify a provider exception as retryable, non-retryable, or defer.
The :class:~lazybridge.core.executor.Executor consults this hook
before falling back to its generic status/string heuristic. Override
when the provider SDK raises structured exception types that encode
retry semantics more precisely than HTTP status codes alone — for
example a rate-limit exception that carries a retry_after attribute
distinguishing "back off" (retryable) from "quota exhausted" (not).
Return values
- True: retry with backoff.
- False: do not retry; surface the exception.
- None: no opinion; Executor falls back to its generic classifier (core.executor._is_retryable), which matches status_code in {429, 5xx} and common transient-error strings.
Default implementation returns None so built-in providers fall
through to the generic path with no behaviour change.
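An override for the rate-limit case described above could be sketched like this. `RateLimitError` is a hypothetical SDK exception, `MyProvider` is a bare stand-in for a BaseProvider subclass, and the `is_retryable(self, exc)` signature is an assumption, since the signature is not shown on this page.

```python
from typing import Optional


class RateLimitError(Exception):
    """Hypothetical SDK exception; retry_after is None when quota is exhausted."""

    def __init__(self, message: str, retry_after: Optional[float] = None):
        super().__init__(message)
        self.retry_after = retry_after


class MyProvider:  # would subclass lazybridge.BaseProvider in real code
    def is_retryable(self, exc: BaseException) -> Optional[bool]:
        if isinstance(exc, RateLimitError):
            # A retry_after hint means "back off"; no hint means "quota exhausted".
            return exc.retry_after is not None
        # No opinion: let the Executor's generic classifier decide.
        return None
```

Returning None for everything the provider does not recognise keeps the generic status-code heuristic in play for ordinary transport errors.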
complete (abstractmethod) ¶
Execute a synchronous completion and return a unified response.
Parameters¶
request:
Fully assembled :class:~lazybridge.core.types.CompletionRequest.
Treat as read-only — do not mutate.
Returns¶
CompletionResponse
At minimum, content must be set to the model's text reply.
Populate usage, model, tool_calls, stop_reason
when available. Set raw to the original SDK response object
to allow callers to access provider-specific fields.
Raises¶
Any exception from your SDK is acceptable — LazyBridge propagates them
as-is and handles retry logic in :class:~lazybridge.core.executor.Executor.
stream (abstractmethod) ¶
Stream a completion, yielding :class:~lazybridge.core.types.StreamChunk objects.
The final chunk must have is_final=True and stop_reason set.
Token usage should be reported on the final chunk when available.
Parameters¶
request:
Same as :meth:complete. Treat as read-only.
Yields¶
StreamChunk
Intermediate chunks: delta contains the new text fragment.
Final chunk: is_final=True, stop_reason set, usage populated.
Example skeleton::
    def stream(self, request):
        for raw_chunk in self._client.stream(...):
            yield StreamChunk(delta=raw_chunk.text)
        yield StreamChunk(
            delta="",
            stop_reason="end_turn",
            is_final=True,
            usage=UsageStats(input_tokens=..., output_tokens=...),
        )
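From the consuming side, the final-chunk contract means a caller can join deltas until it sees `is_final`. The sketch below uses a local `StreamChunk` dataclass that mirrors only the fields named above; it is a stand-in, not the LazyBridge type, and `collect` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class StreamChunk:  # stand-in with only the fields named in the contract
    delta: str = ""
    stop_reason: Optional[str] = None
    is_final: bool = False


def collect(chunks: Iterable[StreamChunk]) -> tuple[str, Optional[str]]:
    """Join deltas and return (text, stop_reason) taken from the final chunk."""
    parts: list[str] = []
    for chunk in chunks:
        parts.append(chunk.delta)
        if chunk.is_final:
            return "".join(parts), chunk.stop_reason
    # A stream that never marks a final chunk violates the contract.
    raise RuntimeError("stream ended without an is_final chunk")
```

A provider that forgets the closing chunk shows up here as a RuntimeError rather than as a silently truncated reply.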
acomplete (abstractmethod, async) ¶
Async version of :meth:complete.
Semantics and return contract are identical. Use await for all
blocking operations — never call time.sleep or blocking I/O here.
astream (abstractmethod) ¶
Async streaming generator — async version of :meth:stream.
Implement as an async def generator::
    async def astream(self, request):
        async for raw_chunk in self._client.astream(...):
            yield StreamChunk(delta=raw_chunk.text)
        yield StreamChunk(stop_reason="end_turn", is_final=True, usage=...)
The same final-chunk contract as :meth:stream applies.
get_default_max_tokens ¶
Return the default max_tokens cap for the given model.
Override when your model has a limit lower or higher than 4096.
LazyBridge calls this when max_tokens is not set explicitly.
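A per-model override might be sketched as below. `BaseLike` stands in for BaseProvider so the example runs on its own, the 4096 default is taken from the description above, and the model names and limits are invented for illustration.

```python
from typing import Optional


class BaseLike:  # stand-in for lazybridge.BaseProvider
    def get_default_max_tokens(self, model: Optional[str] = None) -> int:
        return 4096  # default cap mentioned in the description


class MyProvider(BaseLike):
    # Hypothetical per-model limits.
    _MAX_TOKENS = {"acme-large": 8192, "acme-nano": 1024}

    def get_default_max_tokens(self, model: Optional[str] = None) -> int:
        # Fall back to the base default for models without a known limit.
        return self._MAX_TOKENS.get(model, super().get_default_max_tokens(model))
```

Delegating to `super()` for unknown models keeps the provider working when a new model name appears before the table is updated.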
Provider registry surface¶
The registry methods are class-level on LLMEngine. They mutate
class-level tables (_PROVIDER_ALIASES, _PROVIDER_RULES,
_PROVIDER_DEFAULT) and are documented under the engine class itself
— see Engines → LLMEngine for the
full method list. For read-only introspection from caller code, use
the public PROVIDER_ALIASES snapshot or
LLMEngine.provider_aliases() (returns a fresh dict[str, str]
copy of the routing aliases — safe to mutate without affecting the
framework).
lazybridge.PROVIDER_ALIASES (module-attribute) ¶
Registry mutation entry points (quick reference):
| Method | Effect |
|---|---|
| LLMEngine.provider_aliases() | Snapshot of the current alias map (read-only) |
| LLMEngine.register_provider_alias(alias, provider) | Exact-match (case-insensitive) routing |
| LLMEngine.register_provider_rule(pattern, provider, *, kind="contains" \| "startswith") | Substring / prefix routing; new rules are prepended to the rule list |
| LLMEngine.set_default_provider(provider \| None) | Fallback when no rule matches; None (the 0.7.9 default) makes unknown-model strings raise ValueError instead of silently routing to Anthropic |
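The routing behaviour in the table can be modelled with a small stand-alone class. This is a sketch, not LLMEngine's code: `Router` and its method names are hypothetical, and two details are assumptions, namely that aliases are consulted before rules and that rule patterns match case-insensitively.

```python
from typing import Optional


class Router:
    """Toy model of alias / rule / default routing (not LLMEngine itself)."""

    def __init__(self) -> None:
        self.aliases: dict[str, str] = {}
        self.rules: list[tuple[str, str, str]] = []  # (pattern, provider, kind)
        self.default: Optional[str] = None

    def register_alias(self, alias: str, provider: str) -> None:
        self.aliases[alias.lower()] = provider  # exact match, case-insensitive

    def register_rule(self, pattern: str, provider: str, *, kind: str = "contains") -> None:
        if kind not in ("contains", "startswith"):
            raise ValueError(f"unknown rule kind: {kind!r}")
        self.rules.insert(0, (pattern.lower(), provider, kind))  # newest rule first

    def resolve(self, model: str) -> str:
        key = model.lower()
        if key in self.aliases:  # assumption: aliases win over rules
            return self.aliases[key]
        for pattern, provider, kind in self.rules:
            hit = key.startswith(pattern) if kind == "startswith" else pattern in key
            if hit:
                return provider
        if self.default is None:
            raise ValueError(f"No provider matches model {model!r}")
        return self.default
```

Because new rules go to the front of the list, a narrow rule registered after a broad one takes precedence, which is what makes late overrides possible without editing earlier registrations.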
Capability matrix¶
| Provider | Streaming | Structured output | Thinking | code_execution | computer_use | file_search | google_maps | google_search | image_generation | web_search |
|---|---|---|---|---|---|---|---|---|---|---|
| anthropic | ✓ | ✓ | ✓ | ✓ | ✓ | — | — | — | — | ✓ |
| deepseek | ✓ | ✓ | ✓ | — | — | — | — | — | — | — |
| google | ✓ | ✓ | ✓ | — | — | — | ✓ | ✓ | — | ✓ |
| litellm | ✓ | ✓ | — | — | — | — | — | — | — | — |
| lmstudio | ✓ | ✓ | — | — | — | — | — | — | — | — |
| openai | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | — | — | ✓ | ✓ |
The table above is generated at docs build time from
lazybridge.matrix.provider_capabilities() (see tools/mkdocs_provider_table.py),
which in turn reads the ClassVar flags on each provider class. To update
the matrix, edit the provider's supports_streaming /
supports_structured_output / supports_thinking /
supported_native_tools declarations; the table re-renders on
the next mkdocs build.
from lazybridge.matrix import provider_capabilities
for name, caps in provider_capabilities().items():
    print(name, caps.streaming, caps.structured_output, caps.thinking)
lazybridge.matrix reference¶
lazybridge.matrix.provider_capabilities (cached) ¶
Return the capability matrix for every registered provider.
Keys are the provider names recognised by LLMEngine (the
provider= argument and the LLMEngine._PROVIDER_RULES map);
values are :class:ProviderCapabilities instances.
Cached after first call; the underlying ClassVar declarations
are immutable in practice so re-querying the providers each call
would just thrash the import system.
Graceful degradation — each provider class is imported lazily and
individually. If importing one provider's module fails (e.g. a
broken optional SDK that explodes at import time), that provider is
omitted from the returned matrix and a :class:UserWarning is issued,
rather than letting one bad import break introspection for every other
provider.
Source code in lazybridge/matrix.py
lazybridge.matrix.native_tool_support ¶
Compact provider → [native-tool names] mapping.
Convenient for README tables and doc generation; the full
:class:ProviderCapabilities shape is what most callers want.
lazybridge.matrix.ProviderCapabilities (dataclass) ¶
ProviderCapabilities(native_tools: frozenset[NativeTool] = frozenset(), streaming: bool = True, structured_output: bool = True, thinking: bool = True)
Snapshot of a single provider's declared capabilities.
All four fields come from ClassVar declarations on the provider
class; keep them in sync there, not here.
stop_reason normalisation¶
Each provider exposes its own raw finish-reason vocabulary; LazyBridge
maps them to a normalised CompletionResponse.stop_reason so engine
loops can decide identically across providers. Notable mappings:
| Provider | Raw value | Normalised |
|---|---|---|
| Anthropic | end_turn / tool_use / max_tokens / stop_sequence | end_turn / tool_use / max_tokens / end_turn |
| OpenAI | stop / tool_calls / length / content_filter | end_turn / tool_use / max_tokens / error |
| Google | STOP / MAX_TOKENS / SAFETY / RECITATION / BLOCKLIST | end_turn / max_tokens / error (the bucket for non-stop terminations) |
| DeepSeek | passes through OpenAI shape | as OpenAI |
The Google MAX_TOKENS mapping is fixed in 0.7.9 — pre-fix it was
returned as the literal string and broke loops that branched on
stop_reason == "max_tokens". Inspect Envelope.metadata.stop_reason
to read the normalised value.
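The mappings in the table can be written out as a lookup. This is a sketch built only from the rows above: `normalise_stop_reason` is a hypothetical helper, and sending unlisted raw values to `"error"` is an assumption rather than documented behaviour.

```python
# Raw finish reason -> normalised stop_reason, per provider (from the table).
_STOP_REASON_MAP = {
    "anthropic": {
        "end_turn": "end_turn",
        "tool_use": "tool_use",
        "max_tokens": "max_tokens",
        "stop_sequence": "end_turn",
    },
    "openai": {
        "stop": "end_turn",
        "tool_calls": "tool_use",
        "length": "max_tokens",
        "content_filter": "error",
    },
    "google": {
        "STOP": "end_turn",
        "MAX_TOKENS": "max_tokens",
        "SAFETY": "error",
        "RECITATION": "error",
        "BLOCKLIST": "error",
    },
}
# DeepSeek passes through the OpenAI shape.
_STOP_REASON_MAP["deepseek"] = _STOP_REASON_MAP["openai"]


def normalise_stop_reason(provider: str, raw: str) -> str:
    # Assumption: raw values not listed in the table fall into "error".
    return _STOP_REASON_MAP[provider].get(raw, "error")
```

With a table like this, a loop that branches on `stop_reason == "max_tokens"` behaves the same whether the raw value was `length`, `MAX_TOKENS`, or `max_tokens`.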