Scribe V2 Speech Recognition

curl --request POST \
  --url https://api.foxapi.cc/v1/audios/generations \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "scribe-v2",
  "audio_url": "https://samplelib.com/lib/preview/mp3/sample-3s.mp3"
}
'

{
  "created": 1757165031,
  "id": "task-unified-1757165031-uyujaw3d",
  "model": "<string>",
  "object": "audio.generation.task",
  "progress": 0,
  "status": "pending",
  "task_info": {
    "can_cancel": true,
    "estimated_time": 45
  },
  "type": "audio"
}

Authorizations

Authorization

string

header

required

All endpoints require Bearer Token authentication.

Add the following to the request headers:

Authorization: Bearer YOUR_API_KEY
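
The header above can be attached to a request as in the following minimal Python sketch. It uses only the standard library and the endpoint from the curl example; the function name `build_request` is a hypothetical helper introduced here for illustration, not part of the API. The request is prepared but not sent.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://api.foxapi.cc/v1/audios/generations"


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated JSON POST request.

    `build_request` is an illustrative helper, not an official client.
    """
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the result to `urllib.request.urlopen(...)` and decode the JSON response body.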

Request Body

application/json

model

string

Default: scribe-v2

required

scribe-v2: a speech recognition model that supports diarization, audio event tagging, and keyterms

Example:

"scribe-v2"

audio_url

string

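required

URL of the audio file to transcribe

Notes:

Must be an address reachable over HTTP or HTTPS
The service must be able to access and read the audio file directly

Example: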

"https://samplelib.com/lib/preview/mp3/sample-3s.mp3"
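
A client can reject obviously unusable values before calling the API. The sketch below checks only the shape of the URL (an `http` or `https` scheme plus a host); it cannot tell whether the service can actually reach and read the file. `is_http_url` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def is_http_url(url: str) -> bool:
    """Return True if `url` has an http/https scheme and a host part.

    This is a local shape check only; reachability is not verified.
    """
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```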

language_code

string | null

Language code of the audio

Notes:

Accepts ISO-639-1 or ISO-639-3 codes
For example: zh / zho / en / eng
If omitted, the model detects the language automatically

Example:

"zh"

tag_audio_events

boolean

Default: true

Whether to tag audio events such as laughter and applause. Enabled by default.

Example:

true

diarize

boolean

Default: true

Whether to perform speaker diarization. Enabled by default.

Example:

true

keyterms

string[] | null

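List of bias terms or phrases

Notes:

At most 100 entries
Each entry is at most 50 characters
Used to bias recognition toward specific terms or proper nouns

Omit this parameter unless you need it.

Maximum array length: 100

Maximum string length: 50

Example: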

[
  "project kickoff",
  "quarterly results",
  "speech to text"
]
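
The request fields described above can be assembled as in the following sketch. The limits (100 entries, 50 characters each) and the defaults come from this page; the helper name `build_body` is hypothetical, and counting characters with Python's `len()` is an assumption, since the page does not say how the service counts them. Optional fields are left out of the body when they are not set, so the server-side behavior (automatic language detection, no keyterm biasing) applies.

```python
from typing import List, Optional

MAX_KEYTERMS = 100        # "Maximum array length: 100"
MAX_KEYTERM_CHARS = 50    # "Maximum string length: 50"


def build_body(
    audio_url: str,
    language_code: Optional[str] = None,
    tag_audio_events: bool = True,
    diarize: bool = True,
    keyterms: Optional[List[str]] = None,
) -> dict:
    """Build the JSON body for a scribe-v2 request (illustrative helper)."""
    body = {
        "model": "scribe-v2",
        "audio_url": audio_url,
        "tag_audio_events": tag_audio_events,
        "diarize": diarize,
    }
    if language_code is not None:
        body["language_code"] = language_code
    if keyterms:
        if len(keyterms) > MAX_KEYTERMS:
            raise ValueError(f"keyterms has more than {MAX_KEYTERMS} entries")
        # Assumption: length is measured in Python characters (code points).
        too_long = [t for t in keyterms if len(t) > MAX_KEYTERM_CHARS]
        if too_long:
            raise ValueError(f"keyterms longer than {MAX_KEYTERM_CHARS} chars: {too_long}")
        body["keyterms"] = list(keyterms)
    return body
```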

Response

Task created successfully

created

integer

Task creation timestamp

Example:

1757165031

id

string

Task ID

Example:

"task-unified-1757165031-uyujaw3d"

model

string

Name of the model actually used

object

enum<string>

Specific type of the task

Available options:

audio.generation.task

progress

integer

Task progress percentage (0-100)

Required range: 0 <= x <= 100

Example:

0

status

enum<string>

Task status

Available options:

pending, processing, completed, failed

Example:

"pending"
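
A client that reads a task object will usually want to know whether it is still in progress. The sketch below treats `completed` and `failed` as final and `pending` and `processing` as in progress; that split is an assumption based on the status names, since this page does not state it. `is_finished` is a hypothetical helper name.

```python
# The four status values listed on this page.
KNOWN_STATUSES = {"pending", "processing", "completed", "failed"}
# Assumption: these two are final; the page does not say so explicitly.
TERMINAL_STATUSES = {"completed", "failed"}


def is_finished(task: dict) -> bool:
    """Return True if the task object is in a (presumed) final state."""
    status = task.get("status")
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected task status: {status!r}")
    return status in TERMINAL_STATUSES
```

For the example response on this page, which has `"status": "pending"`, the helper returns `False`.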

task_info

object

Asynchronous task information

type

enum<string>

Output type of the task

Available options:

audio

Example:

"audio"
