Relay speaks the OpenAI and Anthropic wire protocols, so you don’t need a new client library. Point the official SDK you already use at Relay’s base URL and pass your relay key as the API key. Relay accepts the key either as a bearer token (what the OpenAI SDK sends) or in an x-api-key header (what the Anthropic SDK sends), so neither SDK needs extra authentication setup.
| SDK | Base URL | Endpoint it calls |
| --- | --- | --- |
| OpenAI | http://localhost:8080/openai/v1 | /openai/v1/chat/completions |
| Anthropic | http://localhost:8080/anthropic | /anthropic/v1/messages |
Swap http://localhost:8080 for your relay’s public URL in production. Inference (data-plane) traffic is served on the inference port, 8080 by default.
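
When the relay root comes from configuration rather than a hard-coded string, the base URLs and the two accepted key headers can be derived with a small helper. This is a minimal sketch, not part of Relay: the RELAY_URL environment variable and the function names relay_base_urls and auth_headers are hypothetical, and only the path suffixes and header styles come from the text above.

```python
import os

DEFAULT_RELAY_ROOT = "http://localhost:8080"  # local default used in this guide


def relay_base_urls(root=None):
    """Return the per-SDK base URLs for a relay root.

    RELAY_URL is a hypothetical environment variable used for illustration.
    """
    root = (root or os.environ.get("RELAY_URL", DEFAULT_RELAY_ROOT)).rstrip("/")
    return {
        "openai": f"{root}/openai/v1",    # the OpenAI SDK appends /chat/completions
        "anthropic": f"{root}/anthropic",  # the Anthropic SDK appends /v1/messages
    }


def auth_headers(relay_key, style="bearer"):
    """Return the relay key in either of the two header styles Relay accepts."""
    if style == "bearer":
        return {"Authorization": f"Bearer {relay_key}"}
    if style == "x-api-key":
        return {"x-api-key": relay_key}
    raise ValueError(f"unknown header style: {style!r}")
```

With the official SDKs you would pass only the base URL, for example OpenAI(base_url=relay_base_urls()["openai"], api_key="<your-relay-key>"); the SDK builds the auth header itself. The auth_headers helper is only relevant if you call the endpoints with a plain HTTP client.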

OpenAI shape

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/openai/v1",
    api_key="<your-relay-key>",
)

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)

Anthropic shape

from anthropic import Anthropic

client = Anthropic(
    base_url="http://localhost:8080/anthropic",
    api_key="<your-relay-key>",
)

msg = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=256,
    messages=[{"role": "user", "content": "hello"}],
)
print(msg.content[0].text)

Streaming

Both SDKs stream as usual; Relay passes the event stream straight through. The example below reuses the OpenAI client created above.
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hello"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

Which models can I call?

The model you pass must be granted by your relay key’s policy and have an enabled host binding. List what your key can reach:
curl http://localhost:8080/v1/models \
  -H "Authorization: Bearer <your-relay-key>"
See Troubleshooting if a model returns 403 or 404.
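
The same listing can be done from Python with only the standard library. This is a hedged sketch: the function names are hypothetical, and model_ids assumes the response body follows the OpenAI list shape ({"data": [{"id": ...}, ...]}), which this guide does not confirm for Relay. Check the actual response before relying on it.

```python
import json
import urllib.request


def build_models_request(root, relay_key):
    """Build (but do not send) the GET request equivalent to the curl command."""
    return urllib.request.Request(
        f"{root.rstrip('/')}/v1/models",
        headers={"Authorization": f"Bearer {relay_key}"},
        method="GET",
    )


def model_ids(payload):
    """Extract model IDs, ASSUMING an OpenAI-style body: {"data": [{"id": ...}]}."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(root, relay_key):
    """Send the request and return the model IDs the key can reach."""
    with urllib.request.urlopen(build_models_request(root, relay_key)) as resp:
        return model_ids(json.load(resp))
```

Calling list_models("http://localhost:8080", "<your-relay-key>") requires a running relay; urllib raises urllib.error.HTTPError for non-2xx responses, so a rejected key surfaces as an exception rather than an empty list.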