How to stream chat model responses
All chat models implement the Runnable interface, which comes with **default** implementations of standard runnable methods (i.e. invoke, batch, stream, streamEvents). This guide goes over how to use those methods to stream output from chat models.
Tip
The **default** implementation does **not** support token-by-token streaming; instead it returns an AsyncGenerator that yields all of the model output in a single chunk. It exists to ensure that the model can be swapped in for any other model, since it supports the same standard interface.
Whether output can be streamed token-by-token depends on whether the provider has implemented token-by-token streaming support.
You can see which integrations support token-by-token streaming.
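As a minimal sketch of what this means in practice (assuming model is any chat model instance, e.g. one instantiated as shown below), calling stream on a provider without token-level streaming still works; the loop simply runs once with the entire output:
// With the default implementation, this stream yields a single chunk
// containing the whole model response, rather than one chunk per token.
const fallbackStream = await model.stream("Hello!");
for await (const chunk of fallbackStream) {
  console.log(chunk.content);
}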
Streaming
Below, we use a --- to help visualize the delimiter between tokens.
Pick your chat model:
- OpenAI
- Anthropic
- FireworksAI
- MistralAI
- Groq
- VertexAI
Install dependencies
Tip
- npm
- yarn
- pnpm
npm i @langchain/openai
yarn add @langchain/openai
pnpm add @langchain/openai
Add environment variables
OPENAI_API_KEY=your-api-key
Instantiate the model
import { ChatOpenAI } from "@langchain/openai";
const model = new ChatOpenAI({
model: "gpt-4o-mini",
temperature: 0
});
Install dependencies
Tip
- npm
- yarn
- pnpm
npm i @langchain/anthropic
yarn add @langchain/anthropic
pnpm add @langchain/anthropic
Add environment variables
ANTHROPIC_API_KEY=your-api-key
Instantiate the model
import { ChatAnthropic } from "@langchain/anthropic";
const model = new ChatAnthropic({
model: "claude-3-5-sonnet-20240620",
temperature: 0
});
Install dependencies
Tip
- npm
- yarn
- pnpm
npm i @langchain/community
yarn add @langchain/community
pnpm add @langchain/community
Add environment variables
FIREWORKS_API_KEY=your-api-key
Instantiate the model
import { ChatFireworks } from "@langchain/community/chat_models/fireworks";
const model = new ChatFireworks({
model: "accounts/fireworks/models/llama-v3p1-70b-instruct",
temperature: 0
});
Install dependencies
Tip
- npm
- yarn
- pnpm
npm i @langchain/mistralai
yarn add @langchain/mistralai
pnpm add @langchain/mistralai
Add environment variables
MISTRAL_API_KEY=your-api-key
Instantiate the model
import { ChatMistralAI } from "@langchain/mistralai";
const model = new ChatMistralAI({
model: "mistral-large-latest",
temperature: 0
});
Install dependencies
Tip
- npm
- yarn
- pnpm
npm i @langchain/groq
yarn add @langchain/groq
pnpm add @langchain/groq
Add environment variables
GROQ_API_KEY=your-api-key
Instantiate the model
import { ChatGroq } from "@langchain/groq";
const model = new ChatGroq({
model: "mixtral-8x7b-32768",
temperature: 0
});
Install dependencies
Tip
- npm
- yarn
- pnpm
npm i @langchain/google-vertexai
yarn add @langchain/google-vertexai
pnpm add @langchain/google-vertexai
Add environment variables
GOOGLE_APPLICATION_CREDENTIALS=credentials.json
Instantiate the model
import { ChatVertexAI } from "@langchain/google-vertexai";
const model = new ChatVertexAI({
model: "gemini-1.5-flash",
temperature: 0
});
const stream = await model.stream(
"Write me a 1 verse song about goldfish on the moon"
);
for await (const chunk of stream) {
console.log(`${chunk.content}\n---`);
}
---
Here's
---
a one
---
-
---
verse song about goldfish on
---
the moon:
Verse
---
:
Swimming
---
through the stars
---
,
---
in
---
a cosmic
---
lag
---
oon
---
Little
---
golden
---
scales
---
,
---
reflecting the moon
---
No
---
gravity to
---
hold them,
---
they
---
float with
---
glee
Goldfish
---
astron
---
auts, on a lunar
---
sp
---
ree
---
Bub
---
bles rise
---
like
---
com
---
ets, in the
---
star
---
ry night
---
Their fins like
---
tiny
---
rockets, a
---
w
---
ondrous sight
Who
---
knew
---
these
---
small
---
creatures
---
,
---
could con
---
quer space?
---
Goldfish on the moon,
---
with
---
such
---
fis
---
hy grace
---
---
---
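Streamed chunks are additive: each AIMessageChunk can be merged into a running total via its concat() method from @langchain/core. A minimal sketch that reassembles the complete message from the stream above (reusing the model instantiated earlier):
import { AIMessageChunk } from "@langchain/core/messages";
const chunkStream = await model.stream(
  "Write me a 1 verse song about goldfish on the moon"
);
// Merge each incoming chunk into one accumulated message.
let full: AIMessageChunk | undefined;
for await (const chunk of chunkStream) {
  full = full === undefined ? chunk : full.concat(chunk);
}
console.log(full?.content);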
Stream events
Chat models also support the standard streamEvents() method to stream more granular events from within a chain.
This method is useful if you're streaming output from a larger LLM application that contains multiple steps (e.g. a chain composed of a prompt, a chat model, and a parser).
const eventStream = await model.streamEvents(
"Write me a 1 verse song about goldfish on the moon",
{
version: "v2",
}
);
const events = [];
for await (const event of eventStream) {
events.push(event);
}
events.slice(0, 3);
[
{
event: "on_chat_model_start",
data: { input: "Write me a 1 verse song about goldfish on the moon" },
name: "ChatAnthropic",
tags: [],
run_id: "d60a87d6-acf0-4ae1-bf27-e570aa101960",
metadata: {
ls_provider: "openai",
ls_model_name: "claude-3-5-sonnet-20240620",
ls_model_type: "chat",
ls_temperature: 1,
ls_max_tokens: 2048,
ls_stop: undefined
}
},
{
event: "on_chat_model_stream",
run_id: "d60a87d6-acf0-4ae1-bf27-e570aa101960",
name: "ChatAnthropic",
tags: [],
metadata: {
ls_provider: "openai",
ls_model_name: "claude-3-5-sonnet-20240620",
ls_model_type: "chat",
ls_temperature: 1,
ls_max_tokens: 2048,
ls_stop: undefined
},
data: {
chunk: AIMessageChunk {
lc_serializable: true,
lc_kwargs: {
content: "",
additional_kwargs: [Object],
tool_calls: [],
invalid_tool_calls: [],
tool_call_chunks: [],
response_metadata: {}
},
lc_namespace: [ "langchain_core", "messages" ],
content: "",
name: undefined,
additional_kwargs: {
id: "msg_01JaaH9ZUXg7bUnxzktypRak",
type: "message",
role: "assistant",
model: "claude-3-5-sonnet-20240620"
},
response_metadata: {},
id: undefined,
tool_calls: [],
invalid_tool_calls: [],
tool_call_chunks: [],
usage_metadata: undefined
}
}
},
{
event: "on_chat_model_stream",
run_id: "d60a87d6-acf0-4ae1-bf27-e570aa101960",
name: "ChatAnthropic",
tags: [],
metadata: {
ls_provider: "openai",
ls_model_name: "claude-3-5-sonnet-20240620",
ls_model_type: "chat",
ls_temperature: 1,
ls_max_tokens: 2048,
ls_stop: undefined
},
data: {
chunk: AIMessageChunk {
lc_serializable: true,
lc_kwargs: {
content: "Here's",
additional_kwargs: {},
tool_calls: [],
invalid_tool_calls: [],
tool_call_chunks: [],
response_metadata: {}
},
lc_namespace: [ "langchain_core", "messages" ],
content: "Here's",
name: undefined,
additional_kwargs: {},
response_metadata: {},
id: undefined,
tool_calls: [],
invalid_tool_calls: [],
tool_call_chunks: [],
usage_metadata: undefined
}
}
}
]
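Most of these events are metadata, so a common pattern is to filter for just the on_chat_model_stream events and read the token content from each chunk. A minimal sketch, reusing the model from above:
const tokenEvents = await model.streamEvents(
  "Write me a 1 verse song about goldfish on the moon",
  { version: "v2" }
);
for await (const event of tokenEvents) {
  // Only surface the token chunks emitted by the chat model itself.
  if (event.event === "on_chat_model_stream") {
    console.log(event.data.chunk.content);
  }
}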
Next steps
You've now seen a few ways to stream chat model responses.
Next, check out this guide for more on streaming with other LangChain modules.