POST /v1/audios/generations
curl --request POST \
  --url https://api.foxapi.cc/v1/audios/generations \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "paraformer-v2",
  "file_urls": [
    "https://example.com/audio/meeting.wav"
  ]
}
'
{
  "created": 1757165031,
  "id": "task-unified-1757165031-uyujaw3d",
  "model": "<string>",
  "object": "audio.generation.task",
  "progress": 0,
  "status": "pending",
  "task_info": {
    "can_cancel": true,
    "estimated_time": 45
  },
  "type": "audio"
}

Authorizations

Authorization
string
header
required

All API requests require Bearer token authentication.

Add to request header:

Authorization: Bearer YOUR_API_KEY
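
As a minimal sketch of the header layout above, the following Python prepares (but does not send) a request using only the standard library. The `build_request` helper and the `FOXAPI_KEY` environment variable name are illustrative assumptions, not part of the API.

```python
import json
import os
import urllib.request

API_URL = "https://api.foxapi.cc/v1/audios/generations"


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Prepare a POST request with Bearer authentication (hypothetical helper)."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# FOXAPI_KEY is an assumed variable name; substitute however you store the key.
request = build_request(
    os.environ.get("FOXAPI_KEY", "YOUR_API_KEY"),
    {
        "model": "paraformer-v2",
        "file_urls": ["https://example.com/audio/meeting.wav"],
    },
)
# Sending it would be: urllib.request.urlopen(request)
```

Building the request separately from sending it makes the headers easy to inspect before any network call is made.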

Body

application/json
model
string
default:paraformer-v2
required

paraformer-v2: Supports Chinese, English, Japanese, and other languages
paraformer-8k-v2: 8kHz sample rate, Chinese only

Examples:

"paraformer-v2"

"paraformer-8k-v2"

file_urls
string[]
required

Audio file URL list

Notes:

  • Supports publicly accessible URLs via HTTP/HTTPS
  • Up to 100 URLs per request
  • Supported formats: aac, amr, avi, flac, flv, m4a, mkv, mov, mp3, mp4, mpeg, ogg, opus, wav, webm, wma, wmv
  • Each file must not exceed 2 GB in size or 12 hours in duration
Required array length: 1 - 100 elements
Example:
["https://example.com/audio/meeting.wav"]
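
The constraints on file_urls can be checked locally before submitting, roughly as sketched below. `check_file_urls` is a hypothetical helper. Inferring the format from the URL's file extension is an assumption made for illustration, since the service may detect formats differently, and the size and duration limits cannot be verified from a URL alone.

```python
from urllib.parse import urlparse

# Formats and list-length limit copied from the notes above.
SUPPORTED_FORMATS = {
    "aac", "amr", "avi", "flac", "flv", "m4a", "mkv", "mov", "mp3",
    "mp4", "mpeg", "ogg", "opus", "wav", "webm", "wma", "wmv",
}
MAX_URLS = 100


def check_file_urls(file_urls):
    """Return a list of problems; an empty list means the local checks passed."""
    problems = []
    if not 1 <= len(file_urls) <= MAX_URLS:
        problems.append(f"expected 1-{MAX_URLS} URLs, got {len(file_urls)}")
    for url in file_urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https"):
            problems.append(f"not an HTTP/HTTPS URL: {url}")
            continue
        # Assumption: the format is guessed from the path extension only.
        path = parsed.path
        ext = path.rsplit(".", 1)[-1].lower() if "." in path else ""
        if ext not in SUPPORTED_FORMATS:
            problems.append(f"unrecognized format for: {url}")
    return problems


print(check_file_urls(["https://example.com/audio/meeting.wav"]))
```

A URL that passes these checks can still be rejected by the service, for example if it is not publicly accessible.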
language_hints
string[] | null

Language hints for recognition

Notes:

  • Supported only by paraformer-v2; not applicable to paraformer-8k-v2
  • Supported language codes: zh (Chinese), en (English), ja (Japanese), yue (Cantonese), ko (Korean), de (German), fr (French), ru (Russian)
Example:
["zh", "en"]
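
The model restriction on language_hints could be enforced on the client side along these lines. `build_body` is a hypothetical helper, and raising an error (rather than silently dropping the hints) is a design choice made here; how the service itself reacts to hints sent with paraformer-8k-v2 is not stated in this reference.

```python
# Language codes copied from the notes above.
LANGUAGE_CODES = {"zh", "en", "ja", "yue", "ko", "de", "fr", "ru"}


def build_body(model, file_urls, language_hints=None):
    """Assemble a request body, attaching language_hints only when valid."""
    body = {"model": model, "file_urls": list(file_urls)}
    if language_hints:
        if model != "paraformer-v2":
            raise ValueError("language_hints is only supported by paraformer-v2")
        unknown = sorted(set(language_hints) - LANGUAGE_CODES)
        if unknown:
            raise ValueError(f"unsupported language codes: {unknown}")
        body["language_hints"] = list(language_hints)
    return body


body = build_body(
    "paraformer-v2",
    ["https://example.com/audio/meeting.wav"],
    language_hints=["zh", "en"],
)
```

When no hints are given, the field is omitted entirely rather than sent as null or an empty list.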
channel_id
integer[] | null

Audio track index

Notes:

  • Indexes start at 0; [0] selects the first track
  • Defaults to [0] (only the first track is processed)
  • Each specified track is billed independently

Do not pass this parameter unless necessary.

Example:
[0]
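
Because the default already covers the first track and every listed track is billed, a client may prefer to send channel_id only when it differs from the default. The sketch below illustrates that idea; `with_channels` is a hypothetical helper, not part of the API.

```python
def with_channels(body, channel_id=None):
    """Return a copy of body, adding channel_id only when it is non-default."""
    result = dict(body)
    if not channel_id or list(channel_id) == [0]:
        # The default [0] applies, so the field is left out as advised above.
        return result
    for index in channel_id:
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(index, bool) or not isinstance(index, int) or index < 0:
            raise ValueError("channel_id must contain non-negative integers")
    result["channel_id"] = list(channel_id)
    return result


base = {
    "model": "paraformer-v2",
    "file_urls": ["https://example.com/audio/meeting.wav"],
}
# Two tracks requested; per the notes above, each is billed independently.
stereo = with_channels(base, [0, 1])
```

The helper copies the input dictionary, so `base` can be reused for other requests.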
recognition
object

Recognition configuration

Notes:

  • Includes disfluency filtering, timestamp alignment, hot words, and sensitive word filter settings
  • If not provided, default configuration is used

Do not pass this parameter unless necessary.

diarization
object

Speaker diarization configuration

Notes:

  • Includes diarization toggle and speaker count hint
  • If not provided, speaker diarization is not enabled

Do not pass this parameter unless necessary.

Response

Task created successfully

created
integer

Task creation timestamp

Example:

1757165031

id
string

Task ID

Example:

"task-unified-1757165031-uyujaw3d"

model
string

Actual model name used

object
enum<string>

Specific task type

Available options:
audio.generation.task
progress
integer

Task progress percentage (0-100)

Required range: 0 <= x <= 100
Example:

0

status
enum<string>

Task status

Available options:
pending, processing, completed, failed
Example:

"pending"

task_info
object

Asynchronous task info

type
enum<string>

Task output type

Available options:
audio
Example:

"audio"
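
A response with the fields described above could be read as in the following sketch. The `summarize` helper is hypothetical. Two assumptions are made: `created` is interpreted as a Unix timestamp in seconds, and `completed` and `failed` are treated as the final states. The sample JSON mirrors the example response, with the model placeholder filled in for illustration.

```python
import json
from datetime import datetime, timezone

SAMPLE_RESPONSE = """
{
  "created": 1757165031,
  "id": "task-unified-1757165031-uyujaw3d",
  "model": "paraformer-v2",
  "object": "audio.generation.task",
  "progress": 0,
  "status": "pending",
  "task_info": {"can_cancel": true, "estimated_time": 45},
  "type": "audio"
}
"""

# Assumption: no further status changes occur after these two states.
FINAL_STATUSES = {"completed", "failed"}


def summarize(task):
    """Extract the fields a client typically needs from a task object."""
    return {
        "id": task["id"],
        "status": task["status"],
        "progress": task["progress"],
        "finished": task["status"] in FINAL_STATUSES,
        # Assumption: `created` is in seconds since the Unix epoch.
        "created": datetime.fromtimestamp(task["created"], tz=timezone.utc),
        "can_cancel": task.get("task_info", {}).get("can_cancel", False),
    }


summary = summarize(json.loads(SAMPLE_RESPONSE))
print(summary["id"], summary["status"], summary["finished"])
```

The task ID is the value to keep, since the task is still pending when this response is returned.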