
How to stream chat model responses

All chat models implement the Runnable interface, which comes with **default** implementations of the standard Runnable methods (i.e. `invoke`, `batch`, `stream`, and `streamEvents`). This guide covers how to use these methods to stream output from a chat model.

Tip

The **default** implementation does **not** support token-by-token streaming. Instead, it returns an `AsyncGenerator` that yields all of the model's output in a single chunk. It exists to ensure that the model can be swapped for any other model, since all models support the same standard interface.

Whether output can actually be streamed token-by-token depends on whether the provider has implemented token-by-token streaming support.

You can check which integrations support token-by-token streaming.
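To make the contrast concrete, here is a minimal sketch (plain strings, no LangChain dependency) of what the default, non-token stream looks like: a single chunk containing the entire output. The `defaultStream` and `collect` helpers are hypothetical illustrations, not LangChain APIs.

```typescript
// Hypothetical stand-in for the default `stream` implementation:
// the entire model output is yielded as one chunk.
async function* defaultStream(fullOutput: string): AsyncGenerator<string> {
  yield fullOutput; // everything arrives at once
}

// Drain an async generator into an array of chunks.
async function collect(gen: AsyncGenerator<string>): Promise<string[]> {
  const chunks: string[] = [];
  for await (const chunk of gen) {
    chunks.push(chunk);
  }
  return chunks;
}
```

Iterating over `defaultStream(...)` yields exactly one chunk, whereas a provider with true token streaming would yield many small chunks.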

Streaming

Below, we use `---` to help visualize the separators between tokens.

Pick your chat model

Install dependencies

yarn add @langchain/openai 

Add environment variables

OPENAI_API_KEY=your-api-key

Instantiate the model

import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});

const stream = await model.stream(
  "Write me a 1 verse song about goldfish on the moon"
);

for await (const chunk of stream) {
  console.log(`${chunk.content}\n---`);
}

---
Here's
---
a one
---
-
---
verse song about goldfish on
---
the moon:

Verse
---
:
Swimming
---
through the stars
---
,
---
in
---
a cosmic
---
lag
---
oon
---

Little
---
golden
---
scales
---
,
---
reflecting the moon
---

No
---
gravity to
---
hold them,
---
they
---
float with
---
glee
Goldfish
---
astron
---
auts, on a lunar
---
sp
---
ree
---

Bub
---
bles rise
---
like
---
com
---
ets, in the
---
star
---
ry night
---

Their fins like
---
tiny
---
rockets, a
---
w
---
ondrous sight
Who
---
knew
---
these
---
small
---
creatures
---
,
---
could con
---
quer space?
---

Goldfish on the moon,
---
with
---
such
---
fis
---
hy grace
---

---

---
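If you need the full response after streaming, the chunks can be merged back together. The real `AIMessageChunk` objects support a `concat` method for this; the sketch below mimics that pattern with a hypothetical `Chunk` class in place of the LangChain type.

```typescript
// Illustrative only: a tiny stand-in for AIMessageChunk's concat pattern,
// merging streamed chunks back into one final message.
class Chunk {
  constructor(public content: string) {}
  concat(other: Chunk): Chunk {
    return new Chunk(this.content + other.content);
  }
}

const pieces = ["Here's", " a one", "-", "verse song"].map(
  (c) => new Chunk(c)
);
// Fold the chunks left-to-right into a single message.
const finalMessage = pieces.reduce((acc, c) => acc.concat(c));
console.log(finalMessage.content); // "Here's a one-verse song"
```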

Stream events

Chat models also support the standard `streamEvents()` method for streaming more fine-grained events from within a chain.

This method is useful if you're streaming output from a larger LLM application that contains multiple steps (e.g., a chain composed of a prompt, a chat model, and a parser).

const eventStream = await model.streamEvents(
  "Write me a 1 verse song about goldfish on the moon",
  {
    version: "v2",
  }
);

const events = [];
for await (const event of eventStream) {
  events.push(event);
}

events.slice(0, 3);
Note: the sample output below was generated with an Anthropic chat model, so the run name and metadata differ from the OpenAI setup shown above.

[
  {
    event: "on_chat_model_start",
    data: { input: "Write me a 1 verse song about goldfish on the moon" },
    name: "ChatAnthropic",
    tags: [],
    run_id: "d60a87d6-acf0-4ae1-bf27-e570aa101960",
    metadata: {
      ls_provider: "anthropic",
      ls_model_name: "claude-3-5-sonnet-20240620",
      ls_model_type: "chat",
      ls_temperature: 1,
      ls_max_tokens: 2048,
      ls_stop: undefined
    }
  },
  {
    event: "on_chat_model_stream",
    run_id: "d60a87d6-acf0-4ae1-bf27-e570aa101960",
    name: "ChatAnthropic",
    tags: [],
    metadata: {
      ls_provider: "anthropic",
      ls_model_name: "claude-3-5-sonnet-20240620",
      ls_model_type: "chat",
      ls_temperature: 1,
      ls_max_tokens: 2048,
      ls_stop: undefined
    },
    data: {
      chunk: AIMessageChunk {
        lc_serializable: true,
        lc_kwargs: {
          content: "",
          additional_kwargs: [Object],
          tool_calls: [],
          invalid_tool_calls: [],
          tool_call_chunks: [],
          response_metadata: {}
        },
        lc_namespace: [ "langchain_core", "messages" ],
        content: "",
        name: undefined,
        additional_kwargs: {
          id: "msg_01JaaH9ZUXg7bUnxzktypRak",
          type: "message",
          role: "assistant",
          model: "claude-3-5-sonnet-20240620"
        },
        response_metadata: {},
        id: undefined,
        tool_calls: [],
        invalid_tool_calls: [],
        tool_call_chunks: [],
        usage_metadata: undefined
      }
    }
  },
  {
    event: "on_chat_model_stream",
    run_id: "d60a87d6-acf0-4ae1-bf27-e570aa101960",
    name: "ChatAnthropic",
    tags: [],
    metadata: {
      ls_provider: "anthropic",
      ls_model_name: "claude-3-5-sonnet-20240620",
      ls_model_type: "chat",
      ls_temperature: 1,
      ls_max_tokens: 2048,
      ls_stop: undefined
    },
    data: {
      chunk: AIMessageChunk {
        lc_serializable: true,
        lc_kwargs: {
          content: "Here's",
          additional_kwargs: {},
          tool_calls: [],
          invalid_tool_calls: [],
          tool_call_chunks: [],
          response_metadata: {}
        },
        lc_namespace: [ "langchain_core", "messages" ],
        content: "Here's",
        name: undefined,
        additional_kwargs: {},
        response_metadata: {},
        id: undefined,
        tool_calls: [],
        invalid_tool_calls: [],
        tool_call_chunks: [],
        usage_metadata: undefined
      }
    }
  }
]
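Since most events in a chain are not token chunks, a common pattern is to filter for `on_chat_model_stream` events and pull out the chunk content. The sketch below uses simplified stand-in event objects rather than the real LangChain event types.

```typescript
// Simplified stand-in for the streamEvents event shape.
type StreamEvent = {
  event: string;
  data?: { chunk?: { content: string } };
};

// Keep only token-chunk events and extract their text content.
function tokenContents(events: StreamEvent[]): string[] {
  return events
    .filter((e) => e.event === "on_chat_model_stream" && e.data?.chunk)
    .map((e) => e.data!.chunk!.content);
}

const sample: StreamEvent[] = [
  { event: "on_chat_model_start" },
  { event: "on_chat_model_stream", data: { chunk: { content: "Here's" } } },
  { event: "on_chat_model_stream", data: { chunk: { content: " a song" } } },
];
console.log(tokenContents(sample)); // logs only the two token strings
```

The same filter works on the real event stream, since each `on_chat_model_stream` event carries its `AIMessageChunk` under `data.chunk`.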

Next steps

You've now seen a few ways to stream chat model responses.

Next, check out this guide to learn more about streaming with other LangChain modules.

