Audio generation

Learn how to generate audio from a text or audio prompt.

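In addition to generating text and images, some models let you generate a spoken audio response to a prompt and use audio inputs to prompt the model. Audio inputs can carry richer data than text alone, allowing the model to detect tone, inflection, and other nuances within the input.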

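You can use these audio capabilities to:

Generate a spoken audio summary of a body of text (text in, audio out)
Perform sentiment analysis on a recording (audio in, text out)
Hold asynchronous speech-to-speech interactions with the model (audio in, audio out)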

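OpenAI provides other models for simple speech-to-text (STT) and text-to-speech (TTS). When your task requires only those conversions, and not dynamic content from a model, the TTS and STT models are more performant and cost-efficient.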

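Quickstart

To generate audio, or to use audio as an input, you can use the chat completions endpoint in the REST API, as shown in the examples below. You can call the REST API from the HTTP client of your choice, or use one of OpenAI's official SDKs for your preferred programming language.

Create a human-like audio response to a prompt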

import { writeFileSync } from "node:fs";
import OpenAI from "openai";

const openai = new OpenAI();

// Generate an audio response to the given prompt
const response = await openai.chat.completions.create({
  model: "gpt-4o-audio-preview",
  modalities: ["text", "audio"],
  audio: { voice: "alloy", format: "wav" },
  messages: [
    {
      role: "user",
      content: "Is a golden retriever a good family dog?"
    }
  ]
});

// Inspect returned data
console.log(response.choices[0]);

// Write audio data to a file
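// The audio payload is base64-encoded; decode it to raw bytes before writing.
// No text encoding option is passed because the data is binary.
writeFileSync(
  "dog.wav",
  Buffer.from(response.choices[0].message.audio.data, "base64")
);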
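
The overview mentions prompting the model with audio as well as text. The following is a minimal sketch of how such a user message might be assembled, assuming the Chat Completions `input_audio` content part with base64-encoded data. The helper name `buildAudioInputMessage` is hypothetical and not part of the SDK.

```javascript
// Hypothetical helper (not part of the OpenAI SDK): wraps raw audio bytes
// and a text question in a single user message.
// Assumption: audio is passed as an "input_audio" content part whose data
// field is base64-encoded and whose format names the container ("wav" here).
function buildAudioInputMessage(audioBytes, question, format = "wav") {
  return {
    role: "user",
    content: [
      { type: "text", text: question },
      {
        type: "input_audio",
        input_audio: {
          data: Buffer.from(audioBytes).toString("base64"),
          format,
        },
      },
    ],
  };
}
```

A message built this way would go in the `messages` array of a request like the one above. For the audio-in, text-out case, the request would presumably ask only for the text modality.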

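Multi-turn conversations

Using audio output from the model as input to a multi-turn conversation requires a generated ID, which appears in the response data for an audio generation. Below is an example JSON data structure for a message you might receive from /chat/completions: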

{
  "index": 0,
  "message": {
    "role": "assistant",
    "content": null,
    "refusal": null,
    "audio": {
      "id": "audio_abc123",
      "expires_at": 1729018505,
      "data": "<bytes omitted>",
      "transcript": "Yes, golden retrievers are known to be ..."
    }
  },
  "finish_reason": "stop"
}
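
As a rough illustration of working with this structure, the sketch below pulls out the fields needed for a later turn. The helper name `extractAudio` is hypothetical. It assumes `expires_at` is a Unix timestamp in seconds and that `data` holds base64-encoded audio, as in the quickstart example.

```javascript
// Hypothetical helper: extracts the reusable pieces from one choice object
// shaped like the JSON example above. Returns null if the choice has no audio.
function extractAudio(choice, nowSeconds = Math.floor(Date.now() / 1000)) {
  const audio = choice?.message?.audio;
  if (!audio) return null;
  return {
    id: audio.id,                 // identifier to reference in a later assistant message
    transcript: audio.transcript, // text transcript of the spoken response
    // Assumption: expires_at is a Unix timestamp in seconds.
    expired: nowSeconds >= audio.expires_at,
    // Assumption: data is base64-encoded audio; decode it to raw bytes.
    bytes: Buffer.from(audio.data, "base64"),
  };
}
```

Checking `expired` before reusing the ID lets the caller fall back to another approach, such as resending the transcript as text, instead of sending a stale identifier.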

The value of message.audio.id above provides an identifier that you can use in an assistant message for a new /chat/completions request, as in the example below.

curl "https://api.openai.com/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -d '{
        "model": "gpt-4o-audio-preview",
        "modalities": ["text", "audio"],
        "audio": { "voice": "alloy", "format": "wav" },
        "messages": [
            {
                "role": "user",
                "content": "Is a golden retriever a good family dog?"
            },
            {
                "role": "assistant",
                "audio": {
                    "id": "audio_abc123"
                }
            },
            {
                "role": "user",
                "content": "Why do you say they are loyal?"
            }
        ]
    }'
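
The same follow-up request can be assembled in JavaScript. This is a minimal sketch, not an official SDK pattern: the helper name `buildFollowUpRequest` is hypothetical, and the model, voice, and format values are copied from the examples above.

```javascript
// Hypothetical helper: builds the request body for the next turn by
// referencing the previous audio response by ID instead of resending it.
function buildFollowUpRequest(history, previousChoice, nextUserText) {
  const audioId = previousChoice?.message?.audio?.id;
  if (!audioId) {
    throw new Error("Previous response has no audio ID to reference");
  }
  return {
    model: "gpt-4o-audio-preview",
    modalities: ["text", "audio"],
    audio: { voice: "alloy", format: "wav" },
    messages: [
      ...history, // earlier messages, unchanged
      { role: "assistant", audio: { id: audioId } },
      { role: "user", content: nextUserText },
    ],
  };
}

// Usage sketch (requires an initialized client and a prior response):
// const next = await openai.chat.completions.create(
//   buildFollowUpRequest(previousMessages, response.choices[0], "Why do you say they are loyal?")
// );
```

Keeping the request construction in a pure function makes it possible to inspect the payload before any network call is made.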