Introduction
LLMWate provides a unified API gateway for accessing multiple AI models — including GPT-4o, Claude 3.5 Sonnet, Gemini, DeepSeek, Qwen, Meta Llama 3 and more — through a single, OpenAI-compatible interface.
One API key. 45+ models across 10 providers. One unified endpoint.
https://api.llmwate.com/v1
Quick Start
Get up and running in 3 steps:
1. Get Your API Key: Sign up and create an API key from your dashboard.
2. Choose a Model: Browse models at AI Playground or via GET /v1/models.
3. Make Your First Request: Send a chat completions request with your API key.
curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'
Authentication
All API requests require authentication via Bearer token:
Authorization: Bearer YOUR_API_KEY
Manage your API keys on the API Keys Management page.
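As a minimal sketch of the Bearer scheme described above, the helper below builds the two headers that the examples in this guide send. The function name `build_headers` is hypothetical, and reading the key from the `LLMWATE_API_KEY` environment variable is an assumption borrowed from the SDK examples later in this guide.

```python
import os
from typing import Optional


def build_headers(api_key: Optional[str] = None) -> dict:
    """Return the headers sent with every request in this guide."""
    # Fall back to the environment so the key never has to be hard-coded.
    key = api_key or os.environ.get("LLMWATE_API_KEY")
    if not key:
        raise ValueError("No API key: pass one in or set LLMWATE_API_KEY")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Passing the key explicitly is handy in tests; in deployed code the environment variable keeps the key out of source control.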
Chat Completions
Send a conversation and receive an AI-generated response. Fully compatible with the OpenAI Chat Completions API format.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID (e.g., gpt-4o, claude-3.5-sonnet, deepseek-chat) |
| messages | array | Yes | Array of message objects with role and content |
| temperature | float | No | Sampling temperature (0-2), default 1.0 |
| max_tokens | integer | No | Maximum tokens to generate |
| stream | boolean | No | Enable server-sent events streaming, default false |
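The parameter table can be mirrored in a small client-side check that catches malformed bodies before they reach the API. This is only a sketch: `validate_chat_request` is a hypothetical helper, and the "non-empty" and "positive" rules are assumptions, since the table only states types and the 0-2 temperature range.

```python
def validate_chat_request(body: dict) -> list:
    """Return a list of problems found in a chat completions request body."""
    problems = []

    model = body.get("model")
    if not isinstance(model, str) or not model:
        problems.append("model must be a non-empty string")

    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("messages must be a non-empty array")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
                problems.append(f"messages[{i}] needs role and content")

    # bool is a subclass of int in Python, so reject it explicitly.
    temperature = body.get("temperature", 1.0)
    if (isinstance(temperature, bool)
            or not isinstance(temperature, (int, float))
            or not 0 <= temperature <= 2):
        problems.append("temperature must be a number between 0 and 2")

    max_tokens = body.get("max_tokens")
    if max_tokens is not None and (
            isinstance(max_tokens, bool)
            or not isinstance(max_tokens, int)
            or max_tokens <= 0):
        problems.append("max_tokens must be a positive integer")

    if not isinstance(body.get("stream", False), bool):
        problems.append("stream must be a boolean")

    return problems
```

An empty list means the body matches the table; the server remains the authority and may still reject a request with 422.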
curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "What is 2+2?"}], "temperature": 0.7, "max_tokens": 256}'

from openai import OpenAI
client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.llmwate.com/v1")
response = client.chat.completions.create(model="gpt-4o", messages=[{"role": "user", "content": "What is 2+2?"}])
print(response.choices[0].message.content)

import OpenAI from "openai";
const client = new OpenAI({apiKey: "YOUR_API_KEY", baseURL: "https://api.llmwate.com/v1"});
const resp = await client.chat.completions.create({model: "gpt-4o", messages: [{role: "user", content: "What is 2+2?"}]});
console.log(resp.choices[0].message.content);

{"id":"chatcmpl-abc123","object":"chat.completion","created":1715623456,"model":"gpt-4o","choices":[{"index":0,"message":{"role":"assistant","content":"2+2 equals 4."},"finish_reason":"stop"}],"usage":{"prompt_tokens":12,"completion_tokens":8,"total_tokens":20}}
List Available Models
Get all available AI models. Supports optional category filtering.
| Parameter | Type | Required | Description |
|---|---|---|---|
| category | string | No | Filter by category: general, coding, reasoning, vision, fast, cheap, chinese |
curl https://api.llmwate.com/v1/models -H "Authorization: Bearer YOUR_API_KEY"
curl "https://api.llmwate.com/v1/models?category=coding" -H "Authorization: Bearer YOUR_API_KEY"
{"models":[{"id":"gpt-4o","name":"GPT-4o","provider":"OpenAI","category":"general","context_length":128000,"pricing":{"prompt":0.0025,"completion":0.01}},{"id":"claude-3.5-sonnet","name":"Claude 3.5 Sonnet","provider":"Anthropic","category":"coding","context_length":200000,"pricing":{"prompt":0.003,"completion":0.015}}],"total":45}
45 models across 10 providers. Browse in the AI Playground.
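Once the model list is fetched, it can be filtered locally. The sketch below works on a trimmed copy of the sample response above and picks the model with the lowest combined price; `cheapest_model` is a hypothetical helper, and summing the `prompt` and `completion` prices is an illustrative ranking choice, since the pricing units are not stated here.

```python
import json

# Trimmed copy of the sample /v1/models response shown above.
SAMPLE_RESPONSE = json.loads("""
{"models": [
  {"id": "gpt-4o", "provider": "OpenAI", "category": "general",
   "pricing": {"prompt": 0.0025, "completion": 0.01}},
  {"id": "claude-3.5-sonnet", "provider": "Anthropic", "category": "coding",
   "pricing": {"prompt": 0.003, "completion": 0.015}}
], "total": 45}
""")


def cheapest_model(payload: dict, category=None):
    """Return the id of the lowest-priced model, optionally within one category."""
    models = payload["models"]
    if category is not None:
        models = [m for m in models if m["category"] == category]
    if not models:
        return None  # nothing matched the filter
    best = min(models, key=lambda m: m["pricing"]["prompt"] + m["pricing"]["completion"])
    return best["id"]
```

Filtering on the server with `?category=` returns less data; the local filter is useful when one cached list serves several lookups.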
Auto Router
Automatically select the best model for your task type. Returns recommended models ordered by preference.
| Parameter | Type | Required | Description |
|---|---|---|---|
| task | string | Yes | Task type: general, coding, reasoning, vision, fast, cheap, chinese |
curl "https://api.llmwate.com/v1/models/auto?task=coding" -H "Authorization: Bearer YOUR_API_KEY"
{"task":"coding","primary":{"id":"claude-3.5-sonnet","name":"Claude 3.5 Sonnet","provider":"Anthropic","reason":"Best overall coding performance"},"alternatives":[{"id":"gpt-4o","name":"GPT-4o","provider":"OpenAI"},{"id":"deepseek-chat","name":"DeepSeek Chat","provider":"DeepSeek"}]}
| Task | Best For | Recommended |
|---|---|---|
| coding | Code generation, debugging, review | Claude 3.5 Sonnet, GPT-4o |
| reasoning | Complex reasoning, analysis, math | Claude 3.5 Sonnet, DeepSeek R1 |
| vision | Image understanding, OCR | GPT-4o, Claude 3.5 Sonnet |
| fast | Quick responses, low latency | GPT-4o-mini, Claude 3 Haiku |
| cheap | Cost-effective inference | DeepSeek Chat, Qwen Turbo |
| chinese | Chinese language tasks | Qwen 2.5, DeepSeek Chat |
| general | General conversation | GPT-4o, Claude 3.5 Sonnet |
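A client can turn the Auto Router response into an ordered list of model IDs to try. The sketch below assumes the response shape shown in the sample above (`primary` plus `alternatives`); the helper names and the `unavailable` set are hypothetical and only illustrate one way to skip models you cannot use.

```python
def candidate_models(auto_response: dict) -> list:
    """Flatten an Auto Router response into model ids, best first."""
    ordered = [auto_response["primary"]["id"]]
    ordered += [alt["id"] for alt in auto_response.get("alternatives", [])]
    return ordered


def pick_model(auto_response: dict, unavailable=frozenset()):
    """Return the first recommended model that is not marked unavailable."""
    for model_id in candidate_models(auto_response):
        if model_id not in unavailable:
            return model_id
    return None  # every recommendation was excluded


# Trimmed copy of the sample response for task=coding.
SAMPLE_AUTO = {
    "task": "coding",
    "primary": {"id": "claude-3.5-sonnet"},
    "alternatives": [{"id": "gpt-4o"}, {"id": "deepseek-chat"}],
}
```

The returned ID can be passed directly as the `model` field of a chat completions request.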
Account Balance
Check your current account balance, usage, and quota.
curl https://api.llmwate.com/v1/balance -H "Authorization: Bearer YOUR_API_KEY"
{"balance":87.50,"plan":"enterprise","used_this_month":1250000,"quota_this_month":6000000,"quota_reset_at":"2026-06-01T00:00:00Z"}
Smart Routing
Per-API-key routing preferences let you prioritize specific providers and configure automatic failover chains. Set via the API Keys page.
Routing Modes
| Mode | Behavior |
|---|---|
| auto | System defaults: use the best available healthy provider based on task type |
| preferred | Always try your preferred provider first; fall back to system chain on failure |
| failover | Only use the preferred provider; return error if it fails (no fallback) |
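To make the three modes concrete, the sketch below models them as a function that returns the providers to try, in order. It is an illustration of the table only, not the gateway's implementation; `providers_to_try` and the example chain are hypothetical.

```python
def providers_to_try(mode: str, system_chain: list, preferred=None) -> list:
    """Return the ordered providers a request would attempt under each mode."""
    if mode == "auto":
        # System defaults: walk the system chain as given.
        return list(system_chain)
    if preferred is None:
        raise ValueError(f"mode {mode!r} requires a preferred provider")
    if mode == "preferred":
        # Preferred first, then the rest of the system chain as fallback.
        return [preferred] + [p for p in system_chain if p != preferred]
    if mode == "failover":
        # Preferred only: no fallback, so a failure surfaces as an error.
        return [preferred]
    raise ValueError(f"unknown routing mode: {mode!r}")
```

For example, with a system chain of `["openai", "anthropic", "deepseek"]` and `deepseek` preferred, `preferred` mode tries `deepseek` first and then the other two, while `failover` mode tries `deepseek` alone.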
Provider Health Circuit Breaker
Each provider has a circuit breaker: after 3 consecutive failures, it is marked unhealthy and skipped for 10 minutes before retry.
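The rule above (3 consecutive failures, then a 10-minute skip) can be sketched as a small state machine. This is a client-side illustration under those two stated numbers, not the gateway's actual code; the class name, method names, and injectable clock are hypothetical.

```python
import time


class CircuitBreaker:
    """Skips a provider for a cooldown period after repeated failures."""

    def __init__(self, failure_threshold=3, cooldown_seconds=600, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock  # injectable so tests can control time
        self.consecutive_failures = 0
        self.opened_at = None  # time the provider was marked unhealthy

    def record_success(self):
        # Any success resets the streak and closes the breaker.
        self.consecutive_failures = 0
        self.opened_at = None

    def record_failure(self):
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            # (Re)start the cooldown window from this failure.
            self.opened_at = self.clock()

    def is_available(self):
        if self.opened_at is None:
            return True
        # After the cooldown the provider may be retried.
        return self.clock() - self.opened_at >= self.cooldown_seconds
```

If the retry after the cooldown fails again, `record_failure` reopens the breaker immediately because the failure streak was never reset.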
{"providers":{"openai":{"status":"healthy","consecutive_failures":0,"last_failure":null},"siliconflow":{"status":"healthy","consecutive_failures":0,"last_failure":null},"deepseek":{"status":"unhealthy","consecutive_failures":3,"last_failure":"2026-05-28T10:23:00Z"}},"timestamp":"2026-05-28T10:30:00Z"}
Reset all provider circuit breakers (admin).
{"provider": "deepseek"} // optional, resets specific provider
Chat Status and Provider Health
Check which provider APIs are configured and their current status.
curl https://api.llmwate.com/v1/chat/status -H "Authorization: Bearer YOUR_API_KEY"
{"providers":{"openai":{"status":"configured"},"anthropic":{"status":"unconfigured"},"siliconflow":{"status":"configured"},"google":{"status":"unconfigured"},"deepseek":{"status":"unconfigured"},"qwen":{"status":"unconfigured"},"meta":{"status":"unconfigured"},"mistral":{"status":"unconfigured"},"cohere":{"status":"unconfigured"},"xai":{"status":"unconfigured"},"perplexity":{"status":"unconfigured"}}}
Configure additional provider API keys to enable more models.
Streaming Responses
Enable server-sent events (SSE) streaming for real-time token-by-token output. Set stream: true in the request body.
curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Write a story"}],
    "stream": true
  }'

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1234567890,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1234567890,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}
data: [DONE]
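When not using an SDK, the stream can be consumed by reading `data:` lines until the `[DONE]` sentinel. The sketch below parses lines shaped like the sample events above; `iter_stream_text` is a hypothetical helper that takes any iterable of already-decoded text lines, so it does not depend on a particular HTTP client.

```python
import json


def iter_stream_text(lines):
    """Yield the text fragments carried by SSE chat completion chunks."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


SAMPLE_LINES = [
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}',
    "",
    "data: [DONE]",
]
```

Joining the fragments from `SAMPLE_LINES` reconstructs the full message text, `Hello world`.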
Error Codes
| Code | Meaning | Resolution |
|---|---|---|
| 401 | Invalid or missing API key | Check your API key in the dashboard |
| 403 | Model not available for your plan | Upgrade your subscription plan |
| 422 | Invalid request parameters | Check request body format and types |
| 429 | Rate limit exceeded | Wait and retry, or upgrade your plan |
| 500 | Internal server error | Retry or contact support |
| 503 | Service temporarily unavailable | Check provider status and retry |
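The resolutions in the table suggest which codes are worth retrying: 429, 500, and 503 mention retrying, while 401, 403, and 422 need a change on the caller's side. The sketch below encodes that split; the helper names, the attempt limit, and the exponential backoff schedule are assumptions for illustration, not documented gateway behavior.

```python
# Status codes whose resolution in the table involves retrying.
RETRYABLE_STATUS = {429, 500, 503}


def should_retry(status: int, attempt: int, max_attempts: int = 3) -> bool:
    """Return True if a failed request is worth sending again."""
    return status in RETRYABLE_STATUS and attempt < max_attempts


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff in seconds: base * 2**attempt, limited to cap."""
    return min(cap, base * 2 ** attempt)
```

With the defaults, attempts 0, 1, and 2 wait 1, 2, and 4 seconds respectively before the caller gives up.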
Rate Limits
Rate limits are enforced per API key and vary by plan. Limits apply to requests per minute (RPM) and tokens per day.
| Plan | Requests/min | Tokens/day | Notes |
|---|---|---|---|
| Basic | 60 | 500,000 | - |
| Pro | 120 | 2,000,000 | - |
| Enterprise | 300 | 6,000,000 | - |
| Unlimited | Unlimited | Unlimited | - |
Rate limit headers are returned on every response:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1716891660
When exceeded, the API returns 429 Too Many Requests. Retry after the timestamp in X-RateLimit-Reset.
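A client can compute how long to pause after a 429 from the reset header. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which the sample value suggests but this guide does not state explicitly; `seconds_until_reset` is a hypothetical helper.

```python
import time


def seconds_until_reset(headers: dict, now=None) -> float:
    """Return how many seconds to wait before the rate limit window resets."""
    current = time.time() if now is None else now
    reset = headers.get("X-RateLimit-Reset")
    if reset is None:
        return 0.0  # header missing: nothing to wait for
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, float(reset) - current)
```

HTTP header names are case-insensitive, so with a plain dict you may need to normalize keys first; most HTTP clients expose a case-insensitive mapping that avoids this.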
SDK Integration Guide
LLMWate uses an OpenAI-compatible API. Use the official OpenAI SDK with your LLMWate API key and base URL.
Python
Install: pip install openai
from openai import OpenAI
client = OpenAI(
    api_key="YOUR_LLMWATE_API_KEY",
    base_url="https://api.llmwate.com/v1"
)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain quantum computing in 2 sentences."}],
    temperature=0.7,
    max_tokens=512
)
print(response.choices[0].message.content)

from openai import OpenAI
client = OpenAI(
    api_key="YOUR_LLMWATE_API_KEY",
    base_url="https://api.llmwate.com/v1"
)
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a Python function to reverse a string."}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

import asyncio
from openai import AsyncOpenAI
async def main():
    client = AsyncOpenAI(
        api_key="YOUR_LLMWATE_API_KEY",
        base_url="https://api.llmwate.com/v1"
    )
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "What is 2+2?"}]
    )
    print(response.choices[0].message.content)
asyncio.run(main())

from openai import OpenAI
client = OpenAI(
    api_key="YOUR_LLMWATE_API_KEY",
    base_url="https://api.llmwate.com/v1"
)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful Python expert."},
        {"role": "user", "content": "How do I sort a list in Python?"}
    ],
    temperature=0.5
)
print(response.choices[0].message.content)
Node.js / TypeScript
Install: npm install openai
import OpenAI from "openai";
const client = new OpenAI({
  apiKey: process.env.LLMWATE_API_KEY,
  baseURL: "https://api.llmwate.com/v1"
});
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Explain quantum computing in 2 sentences." }],
  temperature: 0.7,
  max_tokens: 512
});
console.log(response.choices[0].message.content);

import OpenAI from "openai";
const client = new OpenAI({
  apiKey: process.env.LLMWATE_API_KEY,
  baseURL: "https://api.llmwate.com/v1"
});
const stream = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Write a Python function to reverse a string." }],
  stream: true
});
for await (const chunk of stream) {
  if (chunk.choices[0].delta.content) {
    process.stdout.write(chunk.choices[0].delta.content);
  }
}
cURL
curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_LLMWATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'

curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_LLMWATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Count to 5"}], "stream": true}'

curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_LLMWATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of Japan?"}
    ],
    "temperature": 0.7,
    "max_tokens": 100
  }'
Go
Install: go get github.com/sashabaranov/go-openai
package main

import (
	"context"
	"fmt"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	// The base URL is set on the client config, not on the client itself.
	config := openai.DefaultConfig("YOUR_LLMWATE_API_KEY")
	config.BaseURL = "https://api.llmwate.com/v1"
	client := openai.NewClientWithConfig(config)

	resp, err := client.CreateChatCompletion(
		context.Background(),
		openai.ChatCompletionRequest{
			Model: "gpt-4o",
			Messages: []openai.ChatCompletionMessage{
				{Role: "user", Content: "Explain quantum computing in 2 sentences."},
			},
		},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Choices[0].Message.Content)
}
Java
Using Spring Boot RestTemplate:
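<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-web</artifactId>
</dependency>

import java.util.List;
import java.util.Map;
import org.springframework.http.HttpEntity;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

@RestController
public class LLMController {
    private final RestTemplate restTemplate;

    public LLMController() {
        this.restTemplate = new RestTemplate();
        // Attach auth and content type to every outgoing request.
        // The set* methods replace existing values instead of duplicating them.
        this.restTemplate.getInterceptors().add((request, body, execution) -> {
            request.getHeaders().setBearerAuth("YOUR_LLMWATE_API_KEY");
            request.getHeaders().setContentType(MediaType.APPLICATION_JSON);
            return execution.execute(request, body);
        });
    }

    @PostMapping("/chat")
    public String chat(@RequestBody Map<String, Object> request) {
        String url = "https://api.llmwate.com/v1/chat/completions";
        HttpEntity<Map<String, Object>> entity = new HttpEntity<>(request);
        ResponseEntity<Map> response = restTemplate.postForEntity(url, entity, Map.class);
        Map body = response.getBody();
        // Walk choices[0].message.content in the parsed JSON.
        List choices = (List) body.get("choices");
        Map message = (Map) ((Map) choices.get(0)).get("message");
        return (String) message.get("content");
    }
}
Ruby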
Install: gem install ruby-openai
require "openai"
client = OpenAI::Client.new(
  access_token: ENV["LLMWATE_API_KEY"],
  uri_base: "https://api.llmwate.com/v1"
)
response = client.chat(
  parameters: {
    model: "gpt-4o",
    messages: [
      { role: "user", content: "Explain quantum computing in 2 sentences." }
    ],
    temperature: 0.7
  }
)
puts response.dig("choices", 0, "message", "content")
PHP
<?php
$api_key = "YOUR_LLMWATE_API_KEY";
$url = "https://api.llmwate.com/v1/chat/completions";
$data = [
    "model" => "gpt-4o",
    "messages" => [
        ["role" => "user", "content" => "Explain quantum computing in 2 sentences."]
    ],
    "temperature" => 0.7
];
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($data));
curl_setopt($ch, CURLOPT_HTTPHEADER, [
    "Authorization: Bearer " . $api_key,
    "Content-Type: application/json"
]);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);
$result = json_decode($response, true);
echo $result["choices"][0]["message"]["content"];
?>
SDK Configuration Reference
| Setting | Value | Description |
|---|---|---|
| API Key | string | Your LLMWate API key from the dashboard |
| Base URL | string | https://api.llmwate.com/v1 |
| Default Model | string | gpt-4o (or any available model) |
| Timeout | integer | Request timeout in seconds (default varies by SDK) |
Environment Variables
# .env file
LLMWATE_API_KEY=lmw_your_api_key_here
LLMWATE_BASE_URL=https://api.llmwate.com/v1
from dotenv import load_dotenv
from openai import OpenAI
import os
load_dotenv()
client = OpenAI(
    api_key=os.getenv("LLMWATE_API_KEY"),
    base_url=os.getenv("LLMWATE_BASE_URL", "https://api.llmwate.com/v1")
)
Common Patterns
from openai import OpenAI
client = OpenAI(api_key="YOUR_KEY", base_url="https://api.llmwate.com/v1")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Return JSON with name and age fields."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "person",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"}
                },
                "required": ["name", "age"]
            }
        }
    }
)
# Access: response.choices[0].message.content (JSON string)

from openai import OpenAI
client = OpenAI(api_key="YOUR_KEY", base_url="https://api.llmwate.com/v1")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the weather in Tokyo?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "parameters": {
                    "type": "object",
                    "properties": {"location": {"type": "string"}},
                    "required": ["location"]
                }
            }
        }
    ]
)
# Tool calls available in response.choices[0].message.tool_calls

from openai import OpenAI
client = OpenAI(api_key="YOUR_KEY", base_url="https://api.llmwate.com/v1")
response = client.chat.completions.create(
    model="gpt-4o",  # or claude-3.5-sonnet for vision
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
            ]
        }
    ]
)
print(response.choices[0].message.content)
Troubleshooting
| Error | Cause | Solution |
|---|---|---|
| 401 Unauthorized | Invalid API key | Check your API key in the dashboard |
| 403 Forbidden | Model not available | Verify your plan includes the model |
| 429 Rate Limited | Too many requests | Add delay between requests or upgrade plan |
| Connection Error | Network issues | Check your internet connection |
API Marketplace
Browse all available models with real-time pricing. Click any model to open it in the Playground.