POST /v1/llm/generations

Example request (cURL):

curl --request POST \
  --url https://api.foxapi.cc/v1/llm/generations \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "claude-opus-4-7",
  "prompt": "Summarize the theory of relativity in two sentences.",
  "max_tokens": 64,
  "temperature": 0.3
}
'

Example response (async submit, sync=false):

{
  "id": "task-llmrouter-1776874481-rj6bs3yb",
  "object": "llm.generation.task",
  "type": "llm",
  "model": "claude-opus-4-7",
  "status": "pending",
  "progress": 0,
  "created": 1776874481,
  "stream": null,
  "results": null,
  "error": null
}

Authorizations

Authorization
string
header
required

All endpoints require Bearer token authentication. Include the token in the request header:

Authorization: Bearer YOUR_API_KEY

YOUR_API_KEY is your API token (sk-... format).

Body

application/json
model
string
default:claude-opus-4-7
required

Model name. Common values:

  • claude-opus-4-7
  • gemini-2.5-pro
  • nemotron-3-nano-omni
  • gpt-5.4 / gpt-5.5 / kimi-k2.6 / gemini-3-pro-preview and others
Examples:

"claude-opus-4-7"

"gemini-2.5-pro"

"nemotron-3-nano-omni"

prompt
string
required

User prompt, up to 100,000 characters.

Maximum string length: 100000
Example:

"Summarize the theory of relativity in two sentences."

sync
boolean
default:false

Synchronous mode. When true, the endpoint blocks until the upstream completes and returns the full response (if stream=true is also set, it returns an SSE stream). When false, the endpoint returns a task ID immediately, and results are fetched via GET /v1/tasks/{task_id} or the SSE endpoint; see the polling sketch below.

Example:

false
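
A minimal async workflow sketch in cURL: submit a generation, then poll GET /v1/tasks/{task_id} until the task settles. The loop simply exits once the status leaves pending, since the terminal status names are not listed on this page; jq is assumed to be available:

# Submit the task and capture the task ID from the Submit response
TASK_ID=$(curl -s --request POST \
  --url https://api.foxapi.cc/v1/llm/generations \
  --header "Authorization: Bearer $FOX_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{"model": "claude-opus-4-7", "prompt": "Hello"}' \
  | jq -r '.id')

# Poll every 2 seconds until the task is no longer pending
while true; do
  STATUS=$(curl -s "https://api.foxapi.cc/v1/tasks/$TASK_ID" \
    --header "Authorization: Bearer $FOX_API_KEY" | jq -r '.status')
  echo "status: $STATUS"
  [ "$STATUS" = "pending" ] || break
  sleep 2
done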

stream
boolean
default:false

Whether to stream. When true, the Submit response includes stream.url, pointing to the SSE subscription path; streaming chunks are normalized to the OpenAI chat.completion.chunk format.

Example:

false
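
A sketch of subscribing to the stream after submitting with stream=true. Here $STREAM_URL stands for the stream.url value from the Submit response; whether that value is an absolute URL or a path relative to the API host is an assumption to verify. curl's -N flag disables output buffering so SSE chunks print as they arrive:

# Subscribe to the SSE stream announced in the Submit response
curl -N "$STREAM_URL" \
  --header "Authorization: Bearer $FOX_API_KEY" \
  --header 'Accept: text/event-stream'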

max_tokens
integer | null

Maximum number of tokens to generate. Optional.

Required range: x >= 1
Example:

64

temperature
number | null

Sampling temperature, range [0, 2]. Optional.

Required range: 0 <= x <= 2
Example:

0.3

system_prompt
string | null

System instruction, prepended to the conversation context. Optional, up to 10,000 characters.

Maximum string length: 10000
Example:

"You are a terse assistant."

reasoning
boolean | null

Whether to include reasoning tokens. The value is passed through to the upstream provider; exact semantics depend on the model (thinking models such as gemini-2.5-pro may require true).
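
A fuller request body combining the optional fields above (all field names are documented on this page; the values are illustrative):

{
  "model": "gemini-2.5-pro",
  "prompt": "Summarize the theory of relativity in two sentences.",
  "system_prompt": "You are a terse assistant.",
  "max_tokens": 64,
  "temperature": 0.3,
  "reasoning": true,
  "sync": false,
  "stream": false
}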

Response

Task created (async mode) / full response (sync mode)

Submit response, conforming to the unified task shape. results and error are always null at submit time; they are populated via GET /v1/tasks/{task_id} after the task completes or fails. With sync=true and stream=false, the endpoint instead returns the full OpenAI ChatCompletion JSON directly and does not follow this shape.
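
For orientation, a sketch of what a sync=true, stream=false response looks like, following the standard OpenAI ChatCompletion shape (all values illustrative):

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1776874481,
  "model": "claude-opus-4-7",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 18, "completion_tokens": 42, "total_tokens": 60 }
}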

id
string
required

Task ID, formatted as task-llmrouter-{timestamp}-{8 random characters}.

Example:

"task-llmrouter-1776874565-yq3szvcu"

object
enum<string>
required
Available options:
llm.generation.task
Example:

"llm.generation.task"

type
enum<string>
required
Available options:
llm
Example:

"llm"

model
string
required

The model name submitted by the client (echoed verbatim).

Example:

"claude-opus-4-7"

status
enum<string>
required
Available options:
pending
Example:

"pending"

progress
integer
required

Task progress; 0 at submit time.

Example:

0

created
integer
required

Unix timestamp (seconds) of task creation.

Example:

1776874565

stream
object

Returns {url: ...} when stream=true; null when stream=false.

results
object[] | null

Always null at submit time; populated via GET /v1/tasks/{task_id} after the task completes. results[0] is the full OpenAI ChatCompletion response.

Example:

null
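
Once the task completes, the assistant text can be extracted from results[0] using the standard ChatCompletion layout (jq assumed; $TASK_ID is the id from the Submit response):

# Fetch the finished task and print the generated text
curl -s "https://api.foxapi.cc/v1/tasks/$TASK_ID" \
  --header "Authorization: Bearer $FOX_API_KEY" \
  | jq -r '.results[0].choices[0].message.content'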

error
object

Always null at submit time; populated via GET /v1/tasks/{task_id} if the task fails.

Example:

null