POST /v1/llm/generations
curl --request POST \
  --url https://api.foxapi.cc/v1/llm/generations \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "gpt-5.4",
  "messages": [
    {
      "role": "user",
      "content": "count 1 to 3"
    }
  ],
  "stream": false,
  "max_tokens": 32
}
'
{
  "id": "task-llm-1776874481-rj6bs3yb",
  "object": "llm.generation.task",
  "type": "llm",
  "model": "gpt-5.4",
  "status": "pending",
  "progress": 0,
  "created": 1776874481,
  "stream": null,
  "results": null,
  "error": null
}


Authorizations

Authorization
string
header
required

All endpoints require Bearer Token authentication. Add to the request header:

Authorization: Bearer YOUR_API_KEY

YOUR_API_KEY is the API Token (sk-... format).
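The two headers can be assembled in one place; a minimal Python sketch (the helper name is illustrative, and the key shown in the test is a placeholder, not a real token):

```python
def auth_headers(api_key: str) -> dict:
    """Build the headers every endpoint expects:
    Bearer authentication plus a JSON content type."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Pass the returned dict as the `headers=` argument to any HTTP client.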

Body

application/json

Request body in messages[] form (OpenAI Chat compatible). Beyond the fields listed below, other OpenAI-compatible parameters (temperature, top_p, stop, frequency_penalty, etc.) are accepted and interpreted per the OpenAI Chat spec.

model
enum<string>
required

Model name. Strict whitelist; unlisted values return 422 model_not_supported.

Special behaviors:

Model                                                   Notes
gpt-5.4                                                 When provided, max_tokens must be ≥ 16
gpt-5.5                                                 When provided, max_tokens must be ≥ 16
kimi-k2.6                                               Thinking model: SSE correctly passes through delta.reasoning_content, but GET /v1/tasks/{task_id}'s results[0].message.content does not accumulate reasoning_content
claude-opus-4-6 / claude-opus-4-7 / claude-sonnet-4-6   max_tokens is required
gemini-3-pro-preview / gemini-3.1-pro-preview           —
Available options:
gpt-5.4,
gpt-5.5,
kimi-k2.6,
claude-opus-4-6,
claude-opus-4-7,
claude-sonnet-4-6,
gemini-3-pro-preview,
gemini-3.1-pro-preview
Example:

"claude-opus-4-7"

messages
object[]
required

Conversation messages array. messages[*].content may be a string (plain text) or array (multimodal blocks). Multimodal block type ∈ {text, image_url, video_url, audio_url, file_url}; not all models support all types — unsupported types return 422 unsupported_content_type.

Content type support:

Model                           text   image_url   video_url   audio_url   file_url
gpt-5.4 / gpt-5.5 / kimi-k2.6
claude-*
gemini-*
Example:
[
{ "role": "user", "content": "count 1 to 3" }
]
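A content array mixing block types might be built like this; the helper and the image URL are illustrative, and which block types a given model accepts follows the support table above:

```python
def user_message(*blocks):
    """Build one messages[] entry. Each block is either a plain string
    (wrapped as a text block) or an already-shaped dict such as
    {"type": "image_url", ...}."""
    content = [
        {"type": "text", "text": b} if isinstance(b, str) else b
        for b in blocks
    ]
    return {"role": "user", "content": content}

msg = user_message(
    "describe this picture",
    {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
)
```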
stream
boolean
default:false

Whether to stream.

Behavior differences:

Value   Submit response stream field                      SSE endpoint
false   null                                              Not available
true    {"url": "/v1/llm/generations/{task_id}/stream"}   Available; meanwhile task.data accumulates the full response
Example:

false

max_tokens
integer | null

Generation token limit. Model constraints:

  • For gpt-5.4 / gpt-5.5, when provided, must be ≥ 16, otherwise 422
  • claude-* requires it; otherwise 422
  • Optional for kimi-k2.6 / gemini-*
Example:

64
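The whitelist and per-model max_tokens rules can be checked client-side before submitting; a sketch (the function name and returned error strings are illustrative, not the server's exact 422 payloads):

```python
MODELS = {
    "gpt-5.4", "gpt-5.5", "kimi-k2.6",
    "claude-opus-4-6", "claude-opus-4-7", "claude-sonnet-4-6",
    "gemini-3-pro-preview", "gemini-3.1-pro-preview",
}

def validate(model: str, max_tokens=None):
    """Mirror the documented 422 rules; return an error string or None."""
    if model not in MODELS:
        return "model_not_supported"
    if model.startswith("gpt-5") and max_tokens is not None and max_tokens < 16:
        return "max_tokens must be >= 16"
    if model.startswith("claude-") and max_tokens is None:
        return "max_tokens is required"
    return None  # kimi-k2.6 / gemini-* accept any max_tokens, including none
```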

temperature
number | null

Sampling temperature.

top_p
number | null

Nucleus sampling.

stop

Stop sequences.

Response

Task created

Submit response, conforming to the unified task standard shape. results / error are fixed at null during submit; they are returned via GET /v1/tasks/{task_id} after the task completes or fails

id
string
required

Task ID, formatted as task-llm-{timestamp}-{8random}. Used for GET /v1/tasks/{task_id} queries or GET /v1/llm/generations/{task_id}/stream SSE subscriptions

Example:

"task-llm-1776874565-yq3szvcu"
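The ID can be split with a regex; a sketch, assuming the timestamp is Unix seconds and the random suffix is 8 lowercase alphanumerics (as in the examples on this page):

```python
import re

# task-llm-{timestamp}-{8random}
TASK_ID_RE = re.compile(r"^task-llm-(\d+)-([a-z0-9]{8})$")

def parse_task_id(task_id: str):
    """Return (created, suffix), or None if the ID does not match
    the documented shape."""
    m = TASK_ID_RE.match(task_id)
    return (int(m.group(1)), m.group(2)) if m else None
```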

object
enum<string>
required

Object type, fixed at llm.generation.task

Available options:
llm.generation.task
Example:

"llm.generation.task"

type
enum<string>
required

Media type, fixed at llm

Available options:
llm
Example:

"llm"

model
string
required

The model name submitted by the client (echoed verbatim)

Example:

"claude-opus-4-7"

status
enum<string>
required

Task status, fixed at pending during submit

Available options:
pending
Example:

"pending"

progress
integer
required

Progress 0-100, fixed at 0 during submit

Example:

0

created
integer
required

Creation time (Unix seconds)

Example:

1776874565

stream
object

Returns {url: ...} when stream=true; null when stream=false. The client uses this to decide whether to connect to SSE
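A client can key the SSE decision off this field; a minimal sketch (the helper name and the base-URL default are illustrative):

```python
def sse_url(task: dict, base: str = "https://api.foxapi.cc"):
    """Return the absolute SSE URL when the submit response carries one,
    else None (stream=false tasks have no SSE endpoint)."""
    stream = task.get("stream")
    if stream and stream.get("url"):
        return base + stream["url"]
    return None
```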

results
object[] | null

Fixed at null during submit; returned via GET /v1/tasks/{task_id} after the task completes — results[0] is the full OpenAI ChatCompletion response.

Known limitation: for model=kimi-k2.6 (a thinking model), reasoning_content is not accumulated into the final message.content, so results[0].message.content may be an empty string

Example:

null
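Given that limitation, code reading completed tasks may want a fallback; a hedged sketch that tries reasoning_content when content is empty — the fallback field name mirrors the SSE delta naming and is an assumption, not confirmed for the final payload:

```python
def final_text(task: dict) -> str:
    """Extract generated text from a task returned by GET /v1/tasks/{task_id}.
    Falls back to reasoning_content for thinking models whose visible
    content may be an empty string (assumed field name)."""
    results = task.get("results") or []
    if not results:
        return ""  # still pending, or the task failed
    message = results[0].get("message", {})
    return message.get("content") or message.get("reasoning_content") or ""
```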

error
object

Fixed at null during submit; returned via GET /v1/tasks/{task_id} when the task fails

Example:

null