Migrating off ConversationTokenBufferMemory
Follow this guide if you're trying to migrate off one of the old memory classes listed below.
Memory Type | Description
---|---
`ConversationTokenBufferMemory` | Keeps only the most recent messages in the conversation under the constraint that the total number of tokens in the conversation does not exceed a certain limit.
`ConversationTokenBufferMemory` applies additional processing on top of the raw conversation history to trim the conversation history to a size that fits inside the context window of a chat model.
This processing can be accomplished using LangChain's built-in trimMessages function.
We'll begin by exploring a straightforward method that applies processing logic to the entire conversation history.
While this approach is easy to implement, it has a downside: as the conversation grows, so does the latency, since the logic is re-applied to all previous exchanges in the conversation at each turn.
More advanced strategies focus on incrementally updating the conversation history to avoid redundant processing.
For instance, LangGraph's how-to guide on summarization demonstrates how to maintain a running summary of the conversation while discarding older messages, ensuring they aren't re-processed during later turns.
Setup
Dependencies
- npm
- yarn
- pnpm
npm i @langchain/openai @langchain/core zod
yarn add @langchain/openai @langchain/core zod
pnpm add @langchain/openai @langchain/core zod
Environment variables
process.env.OPENAI_API_KEY = "YOUR_OPENAI_API_KEY";
Details
Reimplementing ConversationTokenBufferMemory logic
Here, we'll use trimMessages to keep the system message and the most recent messages in the conversation, under the constraint that the total number of tokens in the conversation does not exceed a certain limit.
import {
AIMessage,
HumanMessage,
SystemMessage,
} from "@langchain/core/messages";
const messages = [
new SystemMessage("you're a good assistant, you always respond with a joke."),
new HumanMessage("i wonder why it's called langchain"),
new AIMessage(
'Well, I guess they thought "WordRope" and "SentenceString" just didn\'t have the same ring to it!'
),
new HumanMessage("and who is harrison chasing anyways"),
new AIMessage(
"Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!"
),
new HumanMessage("why is 42 always the answer?"),
new AIMessage(
"Because it's the only number that's constantly right, even when it doesn't add up!"
),
new HumanMessage("What did the cow say?"),
];
import { trimMessages } from "@langchain/core/messages";
import { ChatOpenAI } from "@langchain/openai";
const selectedMessages = await trimMessages(messages, {
// Please see API reference for trimMessages for other ways to specify a token counter.
tokenCounter: new ChatOpenAI({ model: "gpt-4o" }),
maxTokens: 80, // <-- token limit
// The startOn is specified
// to make sure we do not generate a sequence where
// a ToolMessage that contains the result of a tool invocation
// appears before the AIMessage that requested a tool invocation
// as this will cause some chat models to raise an error.
startOn: "human",
strategy: "last",
includeSystem: true, // <-- Keep the system message
});
for (const msg of selectedMessages) {
console.log(msg);
}
SystemMessage {
"content": "you're a good assistant, you always respond with a joke.",
"additional_kwargs": {},
"response_metadata": {}
}
HumanMessage {
"content": "and who is harrison chasing anyways",
"additional_kwargs": {},
"response_metadata": {}
}
AIMessage {
"content": "Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
}
HumanMessage {
"content": "why is 42 always the answer?",
"additional_kwargs": {},
"response_metadata": {}
}
AIMessage {
"content": "Because it's the only number that's constantly right, even when it doesn't add up!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
}
HumanMessage {
"content": "What did the cow say?",
"additional_kwargs": {},
"response_metadata": {}
}
Modern usage with LangGraph
The example below shows how to use LangGraph to add simple conversation pre-processing logic.
If you want to avoid running the computation over the entire conversation history each time, you can follow the how-to guide on summarization, which demonstrates how to discard older messages so they aren't re-processed during later turns (a minimal sketch of that idea follows this example's output).
Details
import { v4 as uuidv4 } from "uuid";
import { ChatOpenAI } from "@langchain/openai";
import {
StateGraph,
MessagesAnnotation,
END,
START,
MemorySaver,
} from "@langchain/langgraph";
import { trimMessages } from "@langchain/core/messages";
// Define a chat model
const model = new ChatOpenAI({ model: "gpt-4o" });
// Define the function that calls the model
const callModel = async (
state: typeof MessagesAnnotation.State
): Promise<Partial<typeof MessagesAnnotation.State>> => {
const selectedMessages = await trimMessages(state.messages, {
tokenCounter: (messages) => messages.length, // Simple message count instead of token count
maxTokens: 5, // Allow up to 5 messages
strategy: "last",
startOn: "human",
includeSystem: true,
allowPartial: false,
});
const response = await model.invoke(selectedMessages);
// With LangGraph, we're able to return a single message, and LangGraph will concatenate
// it to the existing list
return { messages: [response] };
};
// Define a new graph
const workflow = new StateGraph(MessagesAnnotation)
// Define the two nodes we will cycle between
.addNode("model", callModel)
.addEdge(START, "model")
.addEdge("model", END);
const app = workflow.compile({
// Adding memory is straightforward in LangGraph!
// Just pass a checkpointer to the compile method.
checkpointer: new MemorySaver(),
});
// The thread id is a unique key that identifies this particular conversation
// ---
// NOTE: this must be `thread_id` and not `threadId` as the LangGraph internals expect `thread_id`
// ---
const thread_id = uuidv4();
const config = { configurable: { thread_id }, streamMode: "values" as const };
const inputMessage = {
role: "user",
content: "hi! I'm bob",
};
for await (const event of await app.stream(
{ messages: [inputMessage] },
config
)) {
const lastMessage = event.messages[event.messages.length - 1];
console.log(lastMessage.content);
}
// Here, let's confirm that the AI remembers our name!
const followUpMessage = {
role: "user",
content: "what was my name?",
};
// ---
// NOTE: You must pass the same thread id to continue the conversation
// we do that here by passing the same `config` object to the `.stream` call.
// ---
for await (const event of await app.stream(
{ messages: [followUpMessage] },
config
)) {
const lastMessage = event.messages[event.messages.length - 1];
console.log(lastMessage.content);
}
hi! I'm bob
Hello, Bob! How can I assist you today?
what was my name?
You mentioned that your name is Bob. How can I help you today?
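As a rough, hedged sketch of the incremental approach mentioned above (the linked summarization how-to covers the full pattern): a node can fold older messages into a running summary and remove them from state, so later turns only re-process the recent tail of the conversation. The `summary` channel and the `summarizeConversation` node below are illustrative names, not part of the original example, and the sketch reuses the `model` defined above.
import { HumanMessage, RemoveMessage } from "@langchain/core/messages";
import { Annotation, MessagesAnnotation } from "@langchain/langgraph";
// Extend the prebuilt messages state with a running summary channel.
const SummaryAnnotation = Annotation.Root({
  ...MessagesAnnotation.spec,
  summary: Annotation<string>({
    reducer: (_, next) => next,
    default: () => "",
  }),
});
// A node that folds older messages into the summary and deletes them from state.
const summarizeConversation = async (
  state: typeof SummaryAnnotation.State
): Promise<Partial<typeof SummaryAnnotation.State>> => {
  const prompt = state.summary
    ? `This is a summary of the conversation so far: ${state.summary}\n\nExtend the summary by taking the new messages above into account:`
    : "Create a summary of the conversation above:";
  const response = await model.invoke([...state.messages, new HumanMessage(prompt)]);
  // Keep only the two most recent messages; mark the rest for removal.
  const deleteMessages = state.messages
    .slice(0, -2)
    .map((m) => new RemoveMessage({ id: m.id! }));
  return { summary: response.content as string, messages: deleteMessages };
};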
Usage with a prebuilt langgraph agent
This example shows usage of an Agent Executor with a prebuilt agent constructed using the createReactAgent function.
If you're working with one of the old LangChain prebuilt agents, you should be able to replace that code with the new LangGraph prebuilt agent, which leverages the chat model's native tool-calling capabilities and will likely work better out of the box.
Details
import { z } from "zod";
import { v4 as uuidv4 } from "uuid";
import { BaseMessage, trimMessages } from "@langchain/core/messages";
import { tool } from "@langchain/core/tools";
import { ChatOpenAI } from "@langchain/openai";
import { MemorySaver } from "@langchain/langgraph";
import { createReactAgent } from "@langchain/langgraph/prebuilt";
const getUserAge = tool(
(name: string): string => {
// This is a placeholder for the actual implementation
if (name.toLowerCase().includes("bob")) {
return "42 years old";
}
return "41 years old";
},
{
name: "get_user_age",
description: "Use this tool to find the user's age.",
schema: z.string().describe("the name of the user"),
}
);
const memory = new MemorySaver();
const model2 = new ChatOpenAI({ model: "gpt-4o" });
const stateModifier = async (
messages: BaseMessage[]
): Promise<BaseMessage[]> => {
// We're using the message processor defined above.
return trimMessages(messages, {
tokenCounter: (msgs) => msgs.length, // <-- .length will simply count the number of messages rather than tokens
maxTokens: 5, // <-- allow up to 5 messages.
strategy: "last",
// The startOn is specified
// to make sure we do not generate a sequence where
// a ToolMessage that contains the result of a tool invocation
// appears before the AIMessage that requested a tool invocation
// as this will cause some chat models to raise an error.
startOn: "human",
includeSystem: true, // <-- Keep the system message
allowPartial: false,
});
};
const app2 = createReactAgent({
llm: model2,
tools: [getUserAge],
checkpointSaver: memory,
messageModifier: stateModifier,
});
// The thread id is a unique key that identifies
// this particular conversation.
// We'll just generate a random uuid here.
const threadId2 = uuidv4();
const config2 = {
configurable: { thread_id: threadId2 },
streamMode: "values" as const,
};
// Tell the AI that our name is Bob, and ask it to use a tool to confirm
// that it's capable of working like an agent.
const inputMessage2 = {
role: "user",
content: "hi! I'm bob. What is my age?",
};
for await (const event of await app2.stream(
{ messages: [inputMessage2] },
config2
)) {
const lastMessage = event.messages[event.messages.length - 1];
console.log(lastMessage.content);
}
// Confirm that the chat bot has access to previous conversation
// and can respond to the user saying that the user's name is Bob.
const followUpMessage2 = {
role: "user",
content: "do you remember my name?",
};
for await (const event of await app2.stream(
{ messages: [followUpMessage2] },
config2
)) {
const lastMessage = event.messages[event.messages.length - 1];
console.log(lastMessage.content);
}
hi! I'm bob. What is my age?
42 years old
Hi Bob! You are 42 years old.
do you remember my name?
Yes, your name is Bob! If there's anything else you'd like to know or discuss, feel free to ask.
LCEL: Adding a preprocessing step
The simplest way to add complex conversation management is to introduce a pre-processing step in front of the chat model and pass the full conversation history to that step.
This approach is conceptually simple and will work in many situations; for example, if you're using RunnableWithMessageHistory, wrap the chat model together with the pre-processor instead of passing the chat model on its own.
The obvious downside of this approach is that latency starts to increase as the conversation history grows, for two reasons:
- As the conversation gets longer, more data may need to be fetched from whatever store you're using for the conversation history (if it isn't being kept in memory).
- The pre-processing logic will end up doing a lot of redundant computation, repeating work from earlier turns of the conversation.
If you want to use a chat model's tool-calling capabilities, remember to bind the tools to the model before adding the history pre-processing step!
Details
import { ChatOpenAI } from "@langchain/openai";
import {
AIMessage,
HumanMessage,
SystemMessage,
BaseMessage,
trimMessages,
} from "@langchain/core/messages";
import { tool } from "@langchain/core/tools";
import { z } from "zod";
const model3 = new ChatOpenAI({ model: "gpt-4o" });
const whatDidTheCowSay = tool(
(): string => {
return "foo";
},
{
name: "what_did_the_cow_say",
description: "Check to see what the cow said.",
schema: z.object({}),
}
);
const messageProcessor = trimMessages({
tokenCounter: (msgs) => msgs.length, // <-- .length will simply count the number of messages rather than tokens
maxTokens: 5, // <-- allow up to 5 messages.
strategy: "last",
// The startOn is specified
// to make sure we do not generate a sequence where
// a ToolMessage that contains the result of a tool invocation
// appears before the AIMessage that requested a tool invocation
// as this will cause some chat models to raise an error.
startOn: "human",
includeSystem: true, // <-- Keep the system message
allowPartial: false,
});
// Note that we bind tools to the model first!
const modelWithTools = model3.bindTools([whatDidTheCowSay]);
const modelWithPreprocessor = messageProcessor.pipe(modelWithTools);
const fullHistory = [
new SystemMessage("you're a good assistant, you always respond with a joke."),
new HumanMessage("i wonder why it's called langchain"),
new AIMessage(
'Well, I guess they thought "WordRope" and "SentenceString" just didn\'t have the same ring to it!'
),
new HumanMessage("and who is harrison chasing anyways"),
new AIMessage(
"Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!"
),
new HumanMessage("why is 42 always the answer?"),
new AIMessage(
"Because it's the only number that's constantly right, even when it doesn't add up!"
),
new HumanMessage("What did the cow say?"),
];
// We pass it explicitly to the modelWithPreprocessor for illustrative purposes.
// If you're using `RunnableWithMessageHistory` the history will be automatically
// read from the source that you configure.
const result = await modelWithPreprocessor.invoke(fullHistory);
console.log(result);
AIMessage {
"id": "chatcmpl-AB6uzWscxviYlbADFeDlnwIH82Fzt",
"content": "",
"additional_kwargs": {
"tool_calls": [
{
"id": "call_TghBL9dzqXFMCt0zj0VYMjfp",
"type": "function",
"function": "[Object]"
}
]
},
"response_metadata": {
"tokenUsage": {
"completionTokens": 16,
"promptTokens": 95,
"totalTokens": 111
},
"finish_reason": "tool_calls",
"system_fingerprint": "fp_a5d11b2ef2"
},
"tool_calls": [
{
"name": "what_did_the_cow_say",
"args": {},
"type": "tool_call",
"id": "call_TghBL9dzqXFMCt0zj0VYMjfp"
}
],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 95,
"output_tokens": 16,
"total_tokens": 111
}
}
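The code comment above mentions RunnableWithMessageHistory. The following is a hedged sketch, not part of the original guide, of how the pre-processed model might be wired into RunnableWithMessageHistory with an in-memory history store; the `store` map and session handling are purely illustrative:
import { RunnableWithMessageHistory } from "@langchain/core/runnables";
import { InMemoryChatMessageHistory } from "@langchain/core/chat_history";
// Illustrative in-memory store mapping session ids to chat histories.
const store: Record<string, InMemoryChatMessageHistory> = {};
const chainWithHistory = new RunnableWithMessageHistory({
  // Wrap the chat model *together with* the pre-processor, not the model alone.
  runnable: modelWithPreprocessor,
  getMessageHistory: (sessionId: string) => {
    if (store[sessionId] === undefined) {
      store[sessionId] = new InMemoryChatMessageHistory();
    }
    return store[sessionId];
  },
});
const reply = await chainWithHistory.invoke(
  [new HumanMessage("hi! I'm bob")],
  { configurable: { sessionId: "session-1" } }
);
console.log(reply.content);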
If you need to implement more efficient logic and want to use RunnableWithMessageHistory for now, the way to achieve this is to subclass BaseChatMessageHistory and define appropriate logic for addMessages (which doesn't simply append to the history, but instead re-writes it).
Unless you have a good reason to implement this solution, you should use LangGraph instead.
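For completeness, here is a rough, hedged sketch of what such a subclass might look like. `TrimmedChatMessageHistory`, its 5-message limit, and the choice to extend the list-based BaseListChatMessageHistory variant (for brevity) are illustrative, not an API prescribed by the guide:
import { BaseListChatMessageHistory } from "@langchain/core/chat_history";
import { BaseMessage, trimMessages } from "@langchain/core/messages";
// Illustrative subclass: addMessages re-writes the stored history rather than
// only appending to it, so the history stays trimmed without re-processing
// the full conversation on every read.
class TrimmedChatMessageHistory extends BaseListChatMessageHistory {
  lc_namespace = ["langchain", "stores", "message", "in_memory"];
  private storedMessages: BaseMessage[] = [];
  async getMessages(): Promise<BaseMessage[]> {
    return this.storedMessages;
  }
  async addMessage(message: BaseMessage): Promise<void> {
    await this.addMessages([message]);
  }
  async addMessages(messages: BaseMessage[]): Promise<void> {
    // Re-write the history: append the new messages, then trim back down.
    this.storedMessages = await trimMessages(
      [...this.storedMessages, ...messages],
      {
        tokenCounter: (msgs) => msgs.length,
        maxTokens: 5,
        strategy: "last",
        startOn: "human",
        includeSystem: true,
        allowPartial: false,
      }
    );
  }
  async clear(): Promise<void> {
    this.storedMessages = [];
  }
}
A history like this could then be returned from getMessageHistory in the RunnableWithMessageHistory sketch above.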
Next steps
- Explore persistence with LangGraph
- Add persistence with simple LCEL (favor LangGraph for more complex use cases)
- Work with message history