
How to stream chat model responses

All chat models implement the Runnable interface, which comes with **default** implementations of the standard Runnable methods (i.e. `invoke`, `batch`, `stream`, and `streamEvents`). This guide covers how to use these methods to stream output from a chat model.

Tip

The **default** implementation does **not** support token-by-token streaming. Instead, it returns an `AsyncGenerator` that yields all of the model's output in a single chunk. It exists to ensure that the model can be swapped for any other model, since all models support the same standard interface.

Whether output can actually be streamed token-by-token depends on whether the provider has implemented token-by-token streaming support.

You can check which integrations support token-by-token streaming.
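To make the contrast concrete, here is a minimal sketch (plain strings, no LangChain dependency) of what the default, non-token stream looks like: a single chunk containing the entire output. The `defaultStream` and `collect` helpers are hypothetical illustrations, not LangChain APIs.

```typescript
// Hypothetical stand-in for the default `stream` implementation:
// the entire model output is yielded as one chunk.
async function* defaultStream(fullOutput: string): AsyncGenerator<string> {
  yield fullOutput; // everything arrives at once
}

// Drain an async generator into an array of chunks.
async function collect(gen: AsyncGenerator<string>): Promise<string[]> {
  const chunks: string[] = [];
  for await (const chunk of gen) {
    chunks.push(chunk);
  }
  return chunks;
}
```

Iterating over `defaultStream(...)` yields exactly one chunk, whereas a provider with true token streaming would yield many small chunks.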

Streaming

Below, we use `---` to help visualize the separators between tokens.

Pick your chat model

Install dependencies

yarn add @langchain/openai 

Add environment variables

OPENAI_API_KEY=your-api-key

Instantiate the model

import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});

const stream = await model.stream(
  "Write me a 1 verse song about goldfish on the moon"
);

for await (const chunk of stream) {
  console.log(`${chunk.content}\n---`);
}

---
Here's
---
a one
---
-
---
verse song about goldfish on
---
the moon:

Verse
---
:
Swimming
---
through the stars
---
,
---
in
---
a cosmic
---
lag
---
oon
---

Little
---
golden
---
scales
---
,
---
reflecting the moon
---

No
---
gravity to
---
hold them,
---
they
---
float with
---
glee
Goldfish
---
astron
---
auts, on a lunar
---
sp
---
ree
---

Bub
---
bles rise
---
like
---
com
---
ets, in the
---
star
---
ry night
---

Their fins like
---
tiny
---
rockets, a
---
w
---
ondrous sight
Who
---
knew
---
these
---
small
---
creatures
---
,
---
could con
---
quer space?
---

Goldfish on the moon,
---
with
---
such
---
fis
---
hy grace
---

---

---
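If you need the full response after streaming, the chunks can be merged back together. The real `AIMessageChunk` objects support a `concat` method for this; the sketch below mimics that pattern with a hypothetical `Chunk` class in place of the LangChain type.

```typescript
// Illustrative only: a tiny stand-in for AIMessageChunk's concat pattern,
// merging streamed chunks back into one final message.
class Chunk {
  constructor(public content: string) {}
  concat(other: Chunk): Chunk {
    return new Chunk(this.content + other.content);
  }
}

const pieces = ["Here's", " a one", "-", "verse song"].map(
  (c) => new Chunk(c)
);
// Fold the chunks left-to-right into a single message.
const finalMessage = pieces.reduce((acc, c) => acc.concat(c));
console.log(finalMessage.content); // "Here's a one-verse song"
```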

Stream events

Chat models also support the standard `streamEvents()` method for streaming more fine-grained events from within a chain.

This method is useful if you're streaming output from a larger LLM application that contains multiple steps (e.g., a chain composed of a prompt, a chat model, and a parser).

const eventStream = await model.streamEvents(
  "Write me a 1 verse song about goldfish on the moon",
  {
    version: "v2",
  }
);

const events = [];
for await (const event of eventStream) {
  events.push(event);
}

events.slice(0, 3);
Note: the sample output below was generated with an Anthropic chat model, so the run name and metadata differ from the OpenAI setup shown above.

[
  {
    event: "on_chat_model_start",
    data: { input: "Write me a 1 verse song about goldfish on the moon" },
    name: "ChatAnthropic",
    tags: [],
    run_id: "d60a87d6-acf0-4ae1-bf27-e570aa101960",
    metadata: {
      ls_provider: "anthropic",
      ls_model_name: "claude-3-5-sonnet-20240620",
      ls_model_type: "chat",
      ls_temperature: 1,
      ls_max_tokens: 2048,
      ls_stop: undefined
    }
  },
  {
    event: "on_chat_model_stream",
    run_id: "d60a87d6-acf0-4ae1-bf27-e570aa101960",
    name: "ChatAnthropic",
    tags: [],
    metadata: {
      ls_provider: "anthropic",
      ls_model_name: "claude-3-5-sonnet-20240620",
      ls_model_type: "chat",
      ls_temperature: 1,
      ls_max_tokens: 2048,
      ls_stop: undefined
    },
    data: {
      chunk: AIMessageChunk {
        lc_serializable: true,
        lc_kwargs: {
          content: "",
          additional_kwargs: [Object],
          tool_calls: [],
          invalid_tool_calls: [],
          tool_call_chunks: [],
          response_metadata: {}
        },
        lc_namespace: [ "langchain_core", "messages" ],
        content: "",
        name: undefined,
        additional_kwargs: {
          id: "msg_01JaaH9ZUXg7bUnxzktypRak",
          type: "message",
          role: "assistant",
          model: "claude-3-5-sonnet-20240620"
        },
        response_metadata: {},
        id: undefined,
        tool_calls: [],
        invalid_tool_calls: [],
        tool_call_chunks: [],
        usage_metadata: undefined
      }
    }
  },
  {
    event: "on_chat_model_stream",
    run_id: "d60a87d6-acf0-4ae1-bf27-e570aa101960",
    name: "ChatAnthropic",
    tags: [],
    metadata: {
      ls_provider: "anthropic",
      ls_model_name: "claude-3-5-sonnet-20240620",
      ls_model_type: "chat",
      ls_temperature: 1,
      ls_max_tokens: 2048,
      ls_stop: undefined
    },
    data: {
      chunk: AIMessageChunk {
        lc_serializable: true,
        lc_kwargs: {
          content: "Here's",
          additional_kwargs: {},
          tool_calls: [],
          invalid_tool_calls: [],
          tool_call_chunks: [],
          response_metadata: {}
        },
        lc_namespace: [ "langchain_core", "messages" ],
        content: "Here's",
        name: undefined,
        additional_kwargs: {},
        response_metadata: {},
        id: undefined,
        tool_calls: [],
        invalid_tool_calls: [],
        tool_call_chunks: [],
        usage_metadata: undefined
      }
    }
  }
]
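Since most events in a chain are not token chunks, a common pattern is to filter for `on_chat_model_stream` events and pull out the chunk content. The sketch below uses simplified stand-in event objects rather than the real LangChain event types.

```typescript
// Simplified stand-in for the streamEvents event shape.
type StreamEvent = {
  event: string;
  data?: { chunk?: { content: string } };
};

// Keep only token-chunk events and extract their text content.
function tokenContents(events: StreamEvent[]): string[] {
  return events
    .filter((e) => e.event === "on_chat_model_stream" && e.data?.chunk)
    .map((e) => e.data!.chunk!.content);
}

const sample: StreamEvent[] = [
  { event: "on_chat_model_start" },
  { event: "on_chat_model_stream", data: { chunk: { content: "Here's" } } },
  { event: "on_chat_model_stream", data: { chunk: { content: " a song" } } },
];
console.log(tokenContents(sample)); // logs only the two token strings
```

The same filter works on the real event stream, since each `on_chat_model_stream` event carries its `AIMessageChunk` under `data.chunk`.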

Next steps

You've now seen a few ways to stream chat model responses.

Next, check out this guide to learn more about streaming with other LangChain modules.

