Getting Started with Busbar

Busbar is a self-hosted LLM gateway: a single static Rust binary that sits between your application and your LLM providers. Point any SDK at it — OpenAI, Anthropic, Gemini, Cohere, Bedrock, or the OpenAI Responses API — and Busbar routes each request to the provider (or pool of providers) you configured, translating between wire protocols when the ingress and egress differ.

It does the same job as LiteLLM or OpenRouter — one API in front of every model and provider — and goes further: it speaks six provider protocols natively (so any vendor’s SDK can point at it, not just OpenAI-shaped ones), ships as one binary with no Python runtime and no third party in your data path, and gives you per-(pool, lane) circuit breaking with in-flight failover. You run it in your own infra and it holds your keys.

This guide takes you from zero to a working request in about five minutes.

What you need

An API key for at least one supported provider (Anthropic, OpenAI, Gemini, Cohere, or AWS Bedrock credentials)
The Busbar binary (see below)
curl or any LLM SDK

Step 1: Get the binary

One-line install (macOS / Linux) — detects your platform, downloads the latest release binary and the provider catalog into the current directory, and prints the next steps:

curl -fsSL https://ai-bus.bar/install.sh | sh

Drops busbar and providers.yaml where you run it (no sudo). To install onto your PATH instead: BUSBAR_INSTALL_DIR=/usr/local/bin curl -fsSL https://ai-bus.bar/install.sh | sh.

Or download manually — grab the archive for your platform from the latest release (Linux x86_64/aarch64, macOS Intel/Apple Silicon, Windows x86_64), plus the provider catalog from ai-bus.bar/providers.yaml. The binary is self-contained — no runtime, no virtualenv, no dependencies:

tar -xzf busbar-*.tar.gz   # extracts the `busbar` binary
chmod +x busbar
./busbar --version

Or build from source (requires Rust 1.87+):

cargo build --release      # binary at target/release/busbar

Step 2: Write a minimal config

Busbar reads two YAML files:

providers.yaml — the shipped provider catalog (protocol, base_url, error maps). You almost never edit this. The one-line installer fetches it for you, or grab it from ai-bus.bar/providers.yaml.
config.yaml — your deployment: which providers to activate, their key env-var names, your models, and optionally pools.

Important — keys are never written into config. api_key_env names the environment variable that holds a provider’s key; Busbar reads the key from there at startup. (Separately, ${VAR} tokens elsewhere in config.yaml are also expanded from the environment at load time.) An unset referenced variable is a loud startup failure, not a silent skip.

Minimal `config.yaml` (one provider, one model, no auth)

This is the smallest config that boots and serves requests. Use it on a local machine to try Busbar quickly.

# config.yaml (dev/minimal — no client auth gate)
providers:
  anthropic:
    api_key_env: ANTHROPIC_KEY   # the NAME of the env var to read the key from — NOT the key itself

models:
  claude-sonnet:
    provider: anthropic
    max_concurrent: 10

The key itself is never in this file. api_key_env: ANTHROPIC_KEY tells Busbar “read this provider’s key from the $ANTHROPIC_KEY environment variable at startup” — so you set the real secret in your environment (Step 3), and config.yaml stays safe to commit and share.

Save this as config.yaml in your working directory.

providers.yaml must also be present. The one-line installer fetches it for you. If you built from source it lives in the repo root; otherwise grab it from ai-bus.bar/providers.yaml.

What the fields mean

Field	What it does
`providers.<name>.api_key_env`	Name of the environment variable holding this provider’s API key
`models.<name>.provider`	Which provider entry in the `providers` block this model calls
`models.<name>.max_concurrent`	Max simultaneous in-flight requests to this model (the concurrency semaphore); must be ≥ 1

providers and models are the only required sections. listen defaults to 0.0.0.0:8080. auth defaults to none (open relay) when omitted — fine for local dev, not for production.

Step 3: Set environment variables and run

# the actual secret — this is what `api_key_env: ANTHROPIC_KEY` in config.yaml points at
export ANTHROPIC_KEY=sk-ant-...

BUSBAR_PROVIDERS=./providers.yaml BUSBAR_CONFIG=./config.yaml ./busbar

Busbar logs a startup event indicating the listen address (busbar listening, with the bound address as a field). It accepts requests immediately: Prometheus/TSC calibration is deferred to a background thread, so it never blocks the hot path at boot.

Check liveness:

curl -s http://localhost:8080/healthz
# → ok

/healthz is always unauthenticated and returns 200 ok when at least one lane is ready, 503 no usable lanes when every lane’s circuit breaker is open. It is side-effect-free and never steals a recovery probe.

Step 4: Send a request

Via curl — Anthropic-format ingress

The model name goes in the URL path: POST /<model-name>/v1/messages. Busbar resolves <model-name> against your configured pools first, then your models, and routes the request to the matching lane.

curl -s http://localhost:8080/claude-sonnet/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "What is a busbar?"}]
  }' | jq .

You get back a standard Anthropic Messages response. Because both ingress and egress are Anthropic here, Busbar relays it as a native same-protocol passthrough. Routing keys off the name in the URL, not the model field in the body.

Via curl — OpenAI-format ingress

The model name goes in the request body: POST /v1/chat/completions. This works with any OpenAI SDK or tool that targets http://localhost:8080 as the base URL.

curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet",
    "messages": [{"role": "user", "content": "What is a busbar?"}]
  }' | jq .

Busbar translates the request from OpenAI format to Anthropic format on the way out, and the response from Anthropic format back to OpenAI format on the way in — through its intermediate representation, transparently. Your client receives a standard chat.completion object, with the answer in choices[0].message.content.

Via the OpenAI Python SDK

from openai import OpenAI

client = OpenAI(
    api_key="unused",          # no client auth gate in the minimal config
    base_url="http://localhost:8080",
)

response = client.chat.completions.create(
    model="claude-sonnet",     # busbar model name, not OpenAI's
    messages=[{"role": "user", "content": "What is a busbar?"}],
)
print(response.choices[0].message.content)

The OpenAI SDK has no idea it’s talking to an Anthropic backend. Swap model="claude-sonnet" to any model or pool name you configured; no other change required.

Via the Anthropic Python SDK

import anthropic

client = anthropic.Anthropic(
    api_key="unused",
    base_url="http://localhost:8080",
)

message = client.messages.create(
    model="claude-sonnet",     # the model name from your config
    max_tokens=256,
    messages=[{"role": "user", "content": "What is a busbar?"}],
)
print(message.content[0].text)

The Anthropic SDK sends x-api-key; Busbar accepts it on the /v1/messages routes.

Step 5: Add a second provider and a pool

Once the single-provider setup is working, extend the config to introduce a pool. A pool is a named group of models with weighted load balancing, per-member circuit breaking, and automatic failover.

# config.yaml — two providers, two models, one pool, with client auth
auth:
  mode: token
  client_tokens:
    - "${BUSBAR_CLIENT_TOKEN}"

providers:
  anthropic:
    api_key_env: ANTHROPIC_KEY
  openai:
    api_key_env: OPENAI_KEY

models:
  claude-sonnet:
    provider: anthropic
    max_concurrent: 20
  gpt-4o:
    provider: openai
    max_concurrent: 20

pools:
  smart:
    members:
      - target: claude-sonnet
        weight: 2
      - target: gpt-4o
        weight: 1

Set the additional environment variables and restart busbar:

export ANTHROPIC_KEY=sk-ant-...
export OPENAI_KEY=sk-...
export BUSBAR_CLIENT_TOKEN=your-token-here

BUSBAR_PROVIDERS=./providers.yaml BUSBAR_CONFIG=./config.yaml ./busbar

Now call the pool by name. Both ingress styles work against a pool:

# OpenAI ingress — model field selects the pool
curl -s http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer $BUSBAR_CLIENT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "smart", "messages": [{"role": "user", "content": "Hello!"}]}'

# Anthropic ingress — pool name in the URL path
curl -s http://localhost:8080/smart/v1/messages \
  -H "Authorization: Bearer $BUSBAR_CLIENT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"max_tokens": 256, "messages": [{"role": "user", "content": "Hello!"}]}'

With weight: 2 on claude-sonnet and weight: 1 on gpt-4o, Busbar distributes via smooth weighted round-robin — roughly two of every three requests go to Claude and one to GPT-4o. If claude-sonnet’s breaker trips (on upstream 5xx, 429, timeout, or network errors), Busbar fails the request over to gpt-4o — provided the failure happens before the first byte of the response reaches your client. After the first byte, in-flight failover is no longer possible.

What “cross-protocol” means here

claude-sonnet speaks Anthropic; gpt-4o speaks OpenAI. A client using OpenAI-format ingress (/v1/chat/completions) is making an OpenAI request. When Busbar routes that request to claude-sonnet, it translates the request from OpenAI to Anthropic format, and the response back — losslessly, through its intermediate representation. When it routes to gpt-4o, it passes through natively. Your client never needs to know.

Checking health and stats

Liveness (always unauthenticated):

curl -s http://localhost:8080/healthz

Returns 200 ok if any lane is ready, 503 no usable lanes otherwise.

Per-lane topology (/stats):

curl -s http://localhost:8080/stats \
  -H "Authorization: Bearer $BUSBAR_CLIENT_TOKEN" | jq .

/stats goes through the auth middleware, so under auth.mode: token (or governance) it requires a valid token; under mode: none it is open. It returns a per-lane snapshot: model, provider, max_concurrent, inflight, free_slots, ok/err/client_fault counts, usable, dead, dead_reason, cooldown_remaining_s, streak, and budget. A governance key restricted to specific allowed_pools only sees the pools and lanes it can reach.

Prometheus metrics (/metrics):

curl -s http://localhost:8080/metrics \
  -H "Authorization: Bearer $BUSBAR_CLIENT_TOKEN"

Prometheus scrape exposition. Like /stats, /metrics is subject to the auth middleware (it is not auth-exempt — telemetry is a fingerprinting surface), so it requires a token under token/governance mode and is open under none/passthrough. Key metrics: busbar_requests_total, busbar_upstream_failures_total, busbar_breaker_trips_total, busbar_request_duration_seconds, busbar_translations_total.

Common setup variations

auth: none (local dev, open relay)

Omit the auth block entirely, or set mode: none. No Authorization header required. Do not use in production.

auth: passthrough (forward your own key)

auth:
  mode: passthrough

The caller’s own token (Authorization: Bearer, x-api-key, or x-goog-api-key) is forwarded directly to the upstream provider. Use this when each caller has their own provider key and you want Busbar purely for routing and protocol translation, not credential management.

Note: passthrough is incompatible with governance.enabled: true (validation rejects the combination).

Bedrock egress (Busbar signs requests with SigV4)

Add a Bedrock provider. The key env var holds ACCESS_KEY_ID:SECRET_ACCESS_KEY (or with an optional third segment :SESSION_TOKEN):

providers:
  bedrock:
    api_key_env: AWS_BEDROCK_CREDS

models:
  claude-bedrock:
    provider: bedrock
    max_concurrent: 10

export AWS_BEDROCK_CREDS="AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

Busbar signs each outbound request with SigV4 (region parsed from the host); your clients call Busbar normally with a Busbar token. OpenAI-format clients can reach Bedrock backends this way with no SDK changes.

Bedrock ingress (acting as a Bedrock endpoint for native AWS SDK clients) requires auth.mode: passthrough or none. Busbar’s SigV4 is sign-only — it does not verify inbound signatures, so a SigV4-signed request carries no bearer token Busbar can match and is rejected under token or governance mode.

Injecting `max_tokens` for cross-protocol calls

If you route OpenAI-format requests to an Anthropic backend, Anthropic’s API requires a max_tokens field that OpenAI clients often omit. Busbar injects a default only on cross-protocol translation to a backend that requires max_tokens (Anthropic Messages) when the source omitted it. The default is 4096 unless you override it per model:

models:
  claude-sonnet:
    provider: anthropic
    max_concurrent: 20
    default_max_tokens: 8192

A caller-supplied max_tokens is always preserved; this only applies when the field is absent and the egress requires it. It has no effect on same-protocol passthrough.

Production checklist

Before taking Busbar out of dev mode:

Set auth.mode: token with at least one client_tokens entry (or enable governance for per-key virtual tokens)
Set max_concurrent on every model to a value your provider tier actually supports
Set max_requests to -1 (unlimited lifetime budget) or a finite positive budget per model
Verify /healthz returns 200 and /stats shows all lanes usable: true before routing production traffic
Consider health.mode: dead on providers you care about (re-probes tripped lanes so they recover faster after an outage clears)
Set RUST_LOG=info (the default); increase to debug only temporarily for diagnostics

What’s next

Full config reference — every field, default, and validation rule: docs/configuration.md
Pools, breakers, and failover — pool member weighting, breaker tuning, session affinity, context-length failover, exhaustion policies: docs/configuration.md#pools
Running in production — TLS termination, systemd unit, Docker, /stats monitoring, breaker diagnosis: docs/operations.md
Governance — virtual keys, per-key budgets and rate limits, the /admin API: docs/operations.md
Architecture — how the IR works, the six-protocol model, why f64 and not f32: docs/architecture.md