Migrating off ConversationTokenBufferMemory
Follow this guide if you're trying to migrate off one of the old memory classes listed below.
Memory Type | Description |
---|---|
ConversationTokenBufferMemory | Keeps only the most recent messages in the conversation, under the constraint that the total number of tokens in the conversation does not exceed a certain limit. |
ConversationTokenBufferMemory applies additional processing on top of the raw conversation history to trim it to a size that fits inside the context window of a chat model.
This processing functionality can be accomplished using LangChain's built-in trimMessages function.
We'll begin by exploring a straightforward method that applies the processing logic to the entire conversation history.
While this approach is easy to implement, it has a downside: as the conversation grows, so does the latency, since the logic is re-applied to all previous exchanges in the conversation on every turn.
More advanced strategies focus on incrementally updating the conversation history to avoid redundant processing.
For example, the LangGraph how-to guide on summarization demonstrates how to maintain a running summary of the conversation while discarding older messages, ensuring they are not re-processed during later turns.
Setup
Dependencies
- npm: npm i @langchain/openai @langchain/core zod
- yarn: yarn add @langchain/openai @langchain/core zod
- pnpm: pnpm add @langchain/openai @langchain/core zod
Environment variables
process.env.OPENAI_API_KEY = "YOUR_OPENAI_API_KEY";
Reimplementing ConversationTokenBufferMemory logic
Here, we'll use trimMessages to keep the system message and the most recent messages in the conversation, under the constraint that the total number of tokens in the conversation does not exceed a certain limit.
import {
AIMessage,
HumanMessage,
SystemMessage,
} from "@langchain/core/messages";
const messages = [
new SystemMessage("you're a good assistant, you always respond with a joke."),
new HumanMessage("i wonder why it's called langchain"),
new AIMessage(
'Well, I guess they thought "WordRope" and "SentenceString" just didn\'t have the same ring to it!'
),
new HumanMessage("and who is harrison chasing anyways"),
new AIMessage(
"Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!"
),
new HumanMessage("why is 42 always the answer?"),
new AIMessage(
"Because it's the only number that's constantly right, even when it doesn't add up!"
),
new HumanMessage("What did the cow say?"),
];
import { trimMessages } from "@langchain/core/messages";
import { ChatOpenAI } from "@langchain/openai";
const selectedMessages = await trimMessages(messages, {
// Please see API reference for trimMessages for other ways to specify a token counter.
tokenCounter: new ChatOpenAI({ model: "gpt-4o" }),
maxTokens: 80, // <-- token limit
// The startOn is specified
// to make sure we do not generate a sequence where
// a ToolMessage that contains the result of a tool invocation
// appears before the AIMessage that requested a tool invocation
// as this will cause some chat models to raise an error.
startOn: "human",
strategy: "last",
includeSystem: true, // <-- Keep the system message
});
for (const msg of selectedMessages) {
console.log(msg);
}
SystemMessage {
"content": "you're a good assistant, you always respond with a joke.",
"additional_kwargs": {},
"response_metadata": {}
}
HumanMessage {
"content": "and who is harrison chasing anyways",
"additional_kwargs": {},
"response_metadata": {}
}
AIMessage {
"content": "Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
}
HumanMessage {
"content": "why is 42 always the answer?",
"additional_kwargs": {},
"response_metadata": {}
}
AIMessage {
"content": "Because it's the only number that's constantly right, even when it doesn't add up!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
}
HumanMessage {
"content": "What did the cow say?",
"additional_kwargs": {},
"response_metadata": {}
}
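Once trimmed, the selected messages can be passed to a chat model like any other message list. As a minimal usage sketch (reusing the ChatOpenAI class already imported above; printing the reply is just for illustration):

// Invoke the chat model with only the trimmed portion of the history.
const response = await new ChatOpenAI({ model: "gpt-4o" }).invoke(
  selectedMessages
);
console.log(response.content);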
Modern usage with LangGraph
The example below shows how to use LangGraph to add simple conversation pre-processing logic.
If you want to avoid running the computation on the entire conversation history each time, you can follow the how-to guide on summarization, which demonstrates how to maintain a running summary of the conversation while discarding older messages, ensuring they are not re-processed during later turns; a rough sketch of that pattern appears after the example output below.
import { v4 as uuidv4 } from "uuid";
import { ChatOpenAI } from "@langchain/openai";
import {
StateGraph,
MessagesAnnotation,
END,
START,
MemorySaver,
} from "@langchain/langgraph";
import { trimMessages } from "@langchain/core/messages";
// Define a chat model
const model = new ChatOpenAI({ model: "gpt-4o" });
// Define the function that calls the model
const callModel = async (
state: typeof MessagesAnnotation.State
): Promise<Partial<typeof MessagesAnnotation.State>> => {
const selectedMessages = await trimMessages(state.messages, {
tokenCounter: (messages) => messages.length, // Simple message count instead of token count
maxTokens: 5, // Allow up to 5 messages
strategy: "last",
startOn: "human",
includeSystem: true,
allowPartial: false,
});
const response = await model.invoke(selectedMessages);
// With LangGraph, we're able to return a single message, and LangGraph will concatenate
// it to the existing list
return { messages: [response] };
};
// Define a new graph
const workflow = new StateGraph(MessagesAnnotation)
// Define the two nodes we will cycle between
.addNode("model", callModel)
.addEdge(START, "model")
.addEdge("model", END);
const app = workflow.compile({
// Adding memory is straightforward in LangGraph!
// Just pass a checkpointer to the compile method.
checkpointer: new MemorySaver(),
});
// The thread id is a unique key that identifies this particular conversation
// ---
// NOTE: this must be `thread_id` and not `threadId` as the LangGraph internals expect `thread_id`
// ---
const thread_id = uuidv4();
const config = { configurable: { thread_id }, streamMode: "values" as const };
const inputMessage = {
role: "user",
content: "hi! I'm bob",
};
for await (const event of await app.stream(
{ messages: [inputMessage] },
config
)) {
const lastMessage = event.messages[event.messages.length - 1];
console.log(lastMessage.content);
}
// Here, let's confirm that the AI remembers our name!
const followUpMessage = {
role: "user",
content: "what was my name?",
};
// ---
// NOTE: You must pass the same thread id to continue the conversation
// we do that here by passing the same `config` object to the `.stream` call.
// ---
for await (const event of await app.stream(
{ messages: [followUpMessage] },
config
)) {
const lastMessage = event.messages[event.messages.length - 1];
console.log(lastMessage.content);
}
hi! I'm bob
Hello, Bob! How can I assist you today?
what was my name?
You mentioned that your name is Bob. How can I help you today?
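As mentioned above, the how-to guide on summarization covers the incremental-summary approach in full. Purely as an illustration, a summarization node in that style might look roughly like the sketch below. Everything here is an assumption made for this example rather than part of the guide: the extra summary channel on the state, the prompt wording, and the choice to keep only the two most recent messages. Such a node would then be wired into the graph (for instance behind a conditional edge after the model-calling node), as the linked guide shows.

import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage, RemoveMessage } from "@langchain/core/messages";
import { Annotation, MessagesAnnotation } from "@langchain/langgraph";

// Extend the prebuilt messages state with a running summary channel.
const SummaryAnnotation = Annotation.Root({
  ...MessagesAnnotation.spec,
  summary: Annotation<string>({
    reducer: (_prev, next) => next,
    default: () => "",
  }),
});

const summaryModel = new ChatOpenAI({ model: "gpt-4o" });

// A node that folds older messages into the running summary and then
// removes them from state, so later turns never re-process them.
const summarizeConversation = async (
  state: typeof SummaryAnnotation.State
): Promise<Partial<typeof SummaryAnnotation.State>> => {
  const prompt = state.summary
    ? `This is a summary of the conversation so far: ${state.summary}\n\nExtend the summary by taking into account the new messages above:`
    : "Create a summary of the conversation above:";
  const response = await summaryModel.invoke([
    ...state.messages,
    new HumanMessage(prompt),
  ]);
  // Keep only the two most recent messages; mark the rest for removal.
  const deleteMessages = state.messages
    .slice(0, -2)
    .map((m) => new RemoveMessage({ id: m.id! }));
  return {
    summary: response.content as string,
    messages: deleteMessages,
  };
};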
Usage with a pre-built LangGraph agent
This example shows usage of an Agent Executor with a pre-built agent constructed using the createReactAgent function.
If you are using one of the old LangChain pre-built agents, you should be able to replace that code with the new LangGraph pre-built agent, which leverages the chat model's native tool-calling capabilities and will likely work better out of the box.
import { z } from "zod";
import { v4 as uuidv4 } from "uuid";
import { BaseMessage, trimMessages } from "@langchain/core/messages";
import { tool } from "@langchain/core/tools";
import { ChatOpenAI } from "@langchain/openai";
import { MemorySaver } from "@langchain/langgraph";
import { createReactAgent } from "@langchain/langgraph/prebuilt";
const getUserAge = tool(
(name: string): string => {
// This is a placeholder for the actual implementation
if (name.toLowerCase().includes("bob")) {
return "42 years old";
}
return "41 years old";
},
{
name: "get_user_age",
description: "Use this tool to find the user's age.",
schema: z.string().describe("the name of the user"),
}
);
const memory = new MemorySaver();
const model2 = new ChatOpenAI({ model: "gpt-4o" });
const stateModifier = async (
messages: BaseMessage[]
): Promise<BaseMessage[]> => {
// We're using the message processor defined above.
return trimMessages(messages, {
tokenCounter: (msgs) => msgs.length, // <-- .length will simply count the number of messages rather than tokens
maxTokens: 5, // <-- allow up to 5 messages.
strategy: "last",
// The startOn is specified
// to make sure we do not generate a sequence where
// a ToolMessage that contains the result of a tool invocation
// appears before the AIMessage that requested a tool invocation
// as this will cause some chat models to raise an error.
startOn: "human",
includeSystem: true, // <-- Keep the system message
allowPartial: false,
});
};
const app2 = createReactAgent({
llm: model2,
tools: [getUserAge],
checkpointSaver: memory,
messageModifier: stateModifier,
});
// The thread id is a unique key that identifies
// this particular conversation.
// We'll just generate a random uuid here.
const threadId2 = uuidv4();
const config2 = {
configurable: { thread_id: threadId2 },
streamMode: "values" as const,
};
// Tell the AI that our name is Bob, and ask it to use a tool to confirm
// that it's capable of working like an agent.
const inputMessage2 = {
role: "user",
content: "hi! I'm bob. What is my age?",
};
for await (const event of await app2.stream(
{ messages: [inputMessage2] },
config2
)) {
const lastMessage = event.messages[event.messages.length - 1];
console.log(lastMessage.content);
}
// Confirm that the chat bot has access to previous conversation
// and can respond to the user saying that the user's name is Bob.
const followUpMessage2 = {
role: "user",
content: "do you remember my name?",
};
for await (const event of await app2.stream(
{ messages: [followUpMessage2] },
config2
)) {
const lastMessage = event.messages[event.messages.length - 1];
console.log(lastMessage.content);
}
hi! I'm bob. What is my age?
42 years old
Hi Bob! You are 42 years old.
do you remember my name?
Yes, your name is Bob! If there's anything else you'd like to know or discuss, feel free to ask.
LCEL: Adding a preprocessing step
The simplest way to add complex conversation management is by introducing a pre-processing step in front of the chat model and passing the full conversation history to it.
This approach is conceptually simple and will work in many situations; for example, if using RunnableWithMessageHistory instead of wrapping the chat model, wrap the chat model with the pre-processor.
The obvious downside of this approach is that latency starts to increase as the conversation history grows, for two reasons:
- As the conversation gets longer, more data may need to be fetched from whatever store you're using for the conversation history (if it's not stored in memory).
- The pre-processing logic will end up doing a lot of redundant computation, repeating computation from previous steps of the conversation.
If you want to use a chat model's tool-calling capabilities, remember to bind the tools to the model before adding the history pre-processing step!
import { ChatOpenAI } from "@langchain/openai";
import {
AIMessage,
HumanMessage,
SystemMessage,
BaseMessage,
trimMessages,
} from "@langchain/core/messages";
import { tool } from "@langchain/core/tools";
import { z } from "zod";
const model3 = new ChatOpenAI({ model: "gpt-4o" });
const whatDidTheCowSay = tool(
(): string => {
return "foo";
},
{
name: "what_did_the_cow_say",
description: "Check to see what the cow said.",
schema: z.object({}),
}
);
const messageProcessor = trimMessages({
tokenCounter: (msgs) => msgs.length, // <-- .length will simply count the number of messages rather than tokens
maxTokens: 5, // <-- allow up to 5 messages.
strategy: "last",
// The startOn is specified
// to make sure we do not generate a sequence where
// a ToolMessage that contains the result of a tool invocation
// appears before the AIMessage that requested a tool invocation
// as this will cause some chat models to raise an error.
startOn: "human",
includeSystem: true, // <-- Keep the system message
allowPartial: false,
});
// Note that we bind tools to the model first!
const modelWithTools = model3.bindTools([whatDidTheCowSay]);
const modelWithPreprocessor = messageProcessor.pipe(modelWithTools);
const fullHistory = [
new SystemMessage("you're a good assistant, you always respond with a joke."),
new HumanMessage("i wonder why it's called langchain"),
new AIMessage(
'Well, I guess they thought "WordRope" and "SentenceString" just didn\'t have the same ring to it!'
),
new HumanMessage("and who is harrison chasing anyways"),
new AIMessage(
"Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!"
),
new HumanMessage("why is 42 always the answer?"),
new AIMessage(
"Because it's the only number that's constantly right, even when it doesn't add up!"
),
new HumanMessage("What did the cow say?"),
];
// We pass it explicitly to the modelWithPreprocessor for illustrative purposes.
// If you're using `RunnableWithMessageHistory` the history will be automatically
// read from the source that you configure.
const result = await modelWithPreprocessor.invoke(fullHistory);
console.log(result);
AIMessage {
"id": "chatcmpl-AB6uzWscxviYlbADFeDlnwIH82Fzt",
"content": "",
"additional_kwargs": {
"tool_calls": [
{
"id": "call_TghBL9dzqXFMCt0zj0VYMjfp",
"type": "function",
"function": "[Object]"
}
]
},
"response_metadata": {
"tokenUsage": {
"completionTokens": 16,
"promptTokens": 95,
"totalTokens": 111
},
"finish_reason": "tool_calls",
"system_fingerprint": "fp_a5d11b2ef2"
},
"tool_calls": [
{
"name": "what_did_the_cow_say",
"args": {},
"type": "tool_call",
"id": "call_TghBL9dzqXFMCt0zj0VYMjfp"
}
],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 95,
"output_tokens": 16,
"total_tokens": 111
}
}
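The trimmed prompt above ends with the model requesting a tool call. Though not part of the original example, one way to complete the loop, sketched under the assumption that you want the final answer in the same pipeline, is to execute the requested call and pass the result back through the same pre-processor-wrapped model. This relies on passing a tool call object to the tool's invoke method, which returns a matching ToolMessage.

// Run the tool the model asked for; invoking a tool with a tool call
// returns a ToolMessage carrying the matching tool_call_id.
const toolMessage = await whatDidTheCowSay.invoke(result.tool_calls![0]);

// Send the tool result back through the same trimmed-model pipeline.
const finalResult = await modelWithPreprocessor.invoke([
  ...fullHistory,
  result,
  toolMessage,
]);
console.log(finalResult.content);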
If you need to implement more efficient logic and want to use RunnableWithMessageHistory for now, the way to achieve this is to subclass BaseChatMessageHistory and define appropriate logic for addMessages (one that re-writes the history rather than simply appending to it).
Unless you have a good reason to implement this solution, you should instead use LangGraph.
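Purely for illustration, a sketch of such a history class might look like the following. It subclasses the list-based BaseListChatMessageHistory variant and reuses trimMessages to re-write the stored history on every write; the class name and the message-count-based trimming settings are assumptions made for this example.

import { BaseListChatMessageHistory } from "@langchain/core/chat_history";
import { BaseMessage, trimMessages } from "@langchain/core/messages";

// Hypothetical example: a message history that re-writes (trims) itself
// every time messages are added, instead of only appending to the list.
class TrimmedChatMessageHistory extends BaseListChatMessageHistory {
  lc_namespace = ["langchain", "stores", "message", "in_memory"];

  private storedMessages: BaseMessage[] = [];

  async getMessages(): Promise<BaseMessage[]> {
    return this.storedMessages;
  }

  async addMessage(message: BaseMessage): Promise<void> {
    await this.addMessages([message]);
  }

  async addMessages(messages: BaseMessage[]): Promise<void> {
    // Re-write the history: append the new messages, then trim the whole
    // list so only the system message and the most recent turns are kept.
    this.storedMessages = await trimMessages(
      [...this.storedMessages, ...messages],
      {
        tokenCounter: (msgs) => msgs.length, // count messages rather than tokens (assumption)
        maxTokens: 5,
        strategy: "last",
        startOn: "human",
        includeSystem: true,
      }
    );
  }

  async clear(): Promise<void> {
    this.storedMessages = [];
  }
}

An instance of a class like this could then be returned from the getMessageHistory factory that you pass to RunnableWithMessageHistory.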
Next steps
- Explore persistence with LangGraph
- Add persistence with simple LCEL (favor LangGraph for more complex use cases)
- Working with message histories