POST /v1/llm/generations
curl --request POST \
  --url https://api.foxapi.cc/v1/llm/generations \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "claude-opus-4-7",
  "prompt": "Describe this image in one sentence.",
  "image_urls": [
    "https://fal.media/files/lion/AOtzfcyHpx-MOITAUeMrK.jpeg"
  ],
  "max_tokens": 64
}
'
{
  "id": "task-llmrouter-1776874481-rj6bs3yb",
  "object": "llm.generation.task",
  "type": "llm",
  "model": "claude-opus-4-7",
  "status": "pending",
  "progress": 0,
  "created": 1776874481,
  "stream": null,
  "results": null,
  "error": null
}

Documentation Index

Fetch the complete documentation index at: https://docs.foxapi.cc/llms.txt

Use this file to discover all available pages before exploring further.

Authorizations

Authorization
string
header
required

All endpoints require Bearer Token authentication. Add to the request header:

Authorization: Bearer YOUR_API_KEY

YOUR_API_KEY is the API Token (sk-... format).
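As a minimal sketch (the helper name is ours, not part of the API), the required headers can be built and sanity-checked like this:

```python
def auth_headers(api_key: str) -> dict:
    """Build the Authorization and Content-Type headers for foxapi.cc requests.

    API tokens use the sk-... format, so we sanity-check the prefix
    before sending anything.
    """
    if not api_key.startswith("sk-"):
        raise ValueError("expected an API token in sk-... format")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```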

Body

application/json
model
string
default:claude-opus-4-7
required

Model name. Common vision models:

  • claude-opus-4-7
  • gemini-2.5-pro
  • gpt-5.5 (single-image set only; does not support video / audio)
  • nemotron-3-nano-omni (single image only)
Examples:

"claude-opus-4-7"

"gemini-2.5-pro"

"nemotron-3-nano-omni"

prompt
string
required

User prompt, up to 100,000 characters.

Maximum string length: 100000
Example:

"Describe this image in one sentence."

image_urls
string[]
required

Array of image sources (1–10 images). Each element accepts one of the following two forms:

  • Publicly reachable HTTP/HTTPS URL
  • data:image/<type>;base64,<payload> data URI (base64 inline)

Model constraints:

  • nemotron-3-nano-omni: single image only; image_urls.length > 1 → 422; when a URL is intermittently unreachable, fall back to an inline data URI
  • Other models: multiple images supported (up to 10)

Base64 data is not size-validated; oversized payloads may trigger 422.

Required array length: 1 - 10 elements
Example:
[
"https://fal.media/files/lion/AOtzfcyHpx-MOITAUeMrK.jpeg"
]
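A client-side sketch of the two accepted forms and the constraints above; the helper names and the single-image model set are illustrative, mirroring this page rather than any official SDK:

```python
import base64

MULTI_IMAGE_MAX = 10
SINGLE_IMAGE_MODELS = {"nemotron-3-nano-omni"}  # per the constraint above

def to_data_uri(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data:<mime>;base64,<payload> URI,
    the inline fallback for URLs that are intermittently unreachable."""
    payload = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{payload}"

def check_image_urls(model: str, image_urls: list) -> None:
    """Client-side mirror of the server's 422 conditions for image_urls."""
    if not 1 <= len(image_urls) <= MULTI_IMAGE_MAX:
        raise ValueError("image_urls must contain 1-10 elements")
    if model in SINGLE_IMAGE_MODELS and len(image_urls) > 1:
        raise ValueError(f"{model} accepts a single image only")
```

Note the size caveat still applies: this check does not guard against oversized base64 payloads.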
sync
boolean
default:false

Synchronous mode (see llm-text schema).

Example:

false

stream
boolean
default:false

Whether to stream (see llm-text schema).

Example:

false

max_tokens
integer | null

Generation token limit. Optional.

Required range: x >= 1
Example:

64

temperature
number | null

Sampling temperature, range [0, 2]. Optional.

Required range: 0 <= x <= 2
Example:

0.3

system_prompt
string | null

System instruction. Optional.

Maximum string length: 10000
Example:

"You are a vision assistant."

reasoning
boolean | null

Whether to include reasoning tokens. Some thinking models require this to be set to true.

Response

Task created (async mode) / full response (sync mode)

The submit response conforms to the unified task shape: results and error are fixed at null at submit time and are populated via GET /v1/tasks/{task_id} once the task completes or fails. When sync=true and stream=false, the endpoint instead returns the full OpenAI ChatCompletion JSON directly.
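In async mode the task must be polled until results or error is populated. A sketch assuming only what this page states (the polling interval, timeout, and helper names are our own choices, not part of the API):

```python
import json
import time
import urllib.request

BASE_URL = "https://api.foxapi.cc"

def task_url(task_id: str) -> str:
    """URL of the GET /v1/tasks/{task_id} endpoint referenced above."""
    return f"{BASE_URL}/v1/tasks/{task_id}"

def poll_task(task_id: str, api_key: str,
              interval: float = 2.0, timeout: float = 300.0) -> dict:
    """Poll the task until results or error is non-null; raise on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        req = urllib.request.Request(
            task_url(task_id),
            headers={"Authorization": f"Bearer {api_key}"},
        )
        with urllib.request.urlopen(req) as resp:
            task = json.load(resp)
        # results holds the ChatCompletion on success; error is set on failure.
        if task.get("results") is not None or task.get("error") is not None:
            return task
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
```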

id
string
required

Task ID, formatted as task-llmrouter-{timestamp}-{8random}.

Example:

"task-llmrouter-1776874565-yq3szvcu"
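The ID format can be checked client-side. Based on the examples on this page we assume {8random} is eight lowercase alphanumerics; that pattern is an inference, not a documented guarantee:

```python
import re

# Assumed shape of task-llmrouter-{timestamp}-{8random}, inferred from examples.
TASK_ID_RE = re.compile(r"^task-llmrouter-(\d+)-([a-z0-9]{8})$")

def parse_task_id(task_id: str) -> tuple:
    """Split a task ID into (unix_timestamp, random_suffix); raise if malformed."""
    m = TASK_ID_RE.match(task_id)
    if m is None:
        raise ValueError(f"not an llmrouter task ID: {task_id!r}")
    return int(m.group(1)), m.group(2)
```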

object
enum<string>
required
Available options:
llm.generation.task
Example:

"llm.generation.task"

type
enum<string>
required
Available options:
llm
Example:

"llm"

model
string
required

The model name submitted by the client, echoed verbatim.

Example:

"claude-opus-4-7"

status
enum<string>
required
Available options:
pending
Example:

"pending"

progress
integer
required
Example:

0

created
integer
required
Example:

1776874565

stream
object

Returns {url: ...} when stream=true; null when stream=false.

results
object[] | null

Fixed at null during submit; returned via GET /v1/tasks/{task_id} after the task completes — results[0] is the full OpenAI ChatCompletion response.

Example:

null

error
object

Fixed at null during submit; returned via GET /v1/tasks/{task_id} when the task fails.

Example:

null