How to add memory to chatbots
A key feature of chatbots is their ability to use the content of previous conversation turns as context. This state management can take several forms, including:
- Simply stuffing previous messages into a chat model prompt.
- The above, but trimming old messages to reduce the amount of distracting information the model has to deal with.
- More complex modifications, like synthesizing summaries for long running conversations.
We'll go into more detail on a few techniques below!
This how-to guide previously built a chatbot using RunnableWithMessageHistory. You can access this version of the tutorial in the v0.2 docs.
The LangGraph implementation offers a number of advantages over RunnableWithMessageHistory, including the ability to persist arbitrary components of an application's state (instead of just the messages).
Setup
You'll need to install a few packages, select your chat model, and set its environment variable.
- npm
- yarn
- pnpm
npm i @langchain/core @langchain/langgraph
yarn add @langchain/core @langchain/langgraph
pnpm add @langchain/core @langchain/langgraph
Let's set up a chat model that we'll use for the examples below.
Pick your chat model:
- OpenAI
- Anthropic
- FireworksAI
- MistralAI
- Groq
- VertexAI
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/openai
yarn add @langchain/openai
pnpm add @langchain/openai
Add environment variables
OPENAI_API_KEY=your-api-key
Instantiate the model
import { ChatOpenAI } from "@langchain/openai";
const llm = new ChatOpenAI({
model: "gpt-4o-mini",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/anthropic
yarn add @langchain/anthropic
pnpm add @langchain/anthropic
Add environment variables
ANTHROPIC_API_KEY=your-api-key
Instantiate the model
import { ChatAnthropic } from "@langchain/anthropic";
const llm = new ChatAnthropic({
model: "claude-3-5-sonnet-20240620",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/community
yarn add @langchain/community
pnpm add @langchain/community
Add environment variables
FIREWORKS_API_KEY=your-api-key
Instantiate the model
import { ChatFireworks } from "@langchain/community/chat_models/fireworks";
const llm = new ChatFireworks({
model: "accounts/fireworks/models/llama-v3p1-70b-instruct",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/mistralai
yarn add @langchain/mistralai
pnpm add @langchain/mistralai
Add environment variables
MISTRAL_API_KEY=your-api-key
Instantiate the model
import { ChatMistralAI } from "@langchain/mistralai";
const llm = new ChatMistralAI({
model: "mistral-large-latest",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/groq
yarn add @langchain/groq
pnpm add @langchain/groq
Add environment variables
GROQ_API_KEY=your-api-key
Instantiate the model
import { ChatGroq } from "@langchain/groq";
const llm = new ChatGroq({
model: "mixtral-8x7b-32768",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/google-vertexai
yarn add @langchain/google-vertexai
pnpm add @langchain/google-vertexai
Add environment variables
GOOGLE_APPLICATION_CREDENTIALS=credentials.json
Instantiate the model
import { ChatVertexAI } from "@langchain/google-vertexai";
const llm = new ChatVertexAI({
model: "gemini-1.5-flash",
temperature: 0
});
Message passing
The simplest form of memory is simply passing chat history messages into a chain. Here's an example:
import { HumanMessage, AIMessage } from "@langchain/core/messages";
import {
ChatPromptTemplate,
MessagesPlaceholder,
} from "@langchain/core/prompts";
const prompt = ChatPromptTemplate.fromMessages([
[
"system",
"You are a helpful assistant. Answer all questions to the best of your ability.",
],
new MessagesPlaceholder("messages"),
]);
const chain = prompt.pipe(llm);
await chain.invoke({
messages: [
new HumanMessage(
"Translate this sentence from English to French: I love programming."
),
new AIMessage("J'adore la programmation."),
new HumanMessage("What did you just say?"),
],
});
AIMessage {
"id": "chatcmpl-ABSxUXVIBitFRBh9MpasB5jeEHfCA",
"content": "I said \"J'adore la programmation,\" which means \"I love programming\" in French.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 18,
"promptTokens": 58,
"totalTokens": 76
},
"finish_reason": "stop",
"system_fingerprint": "fp_e375328146"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 58,
"output_tokens": 18,
"total_tokens": 76
}
}
We can see that by passing the previous conversation into a chain, the chatbot can use it as context in answering questions. This is the basic concept underpinning chatbot memory. The rest of this guide will demonstrate convenient techniques for passing or reformatting messages.
Automatic history management
The previous examples pass messages to the chain (and model) explicitly. This is a completely acceptable approach, but it does require external management of new messages. LangChain also offers a way to build applications that have memory using LangGraph's persistence. You can enable persistence in LangGraph applications by providing a checkpointer when compiling the graph.
import {
START,
END,
MessagesAnnotation,
StateGraph,
MemorySaver,
} from "@langchain/langgraph";
// Define the function that calls the model
const callModel = async (state: typeof MessagesAnnotation.State) => {
const systemPrompt =
"You are a helpful assistant. " +
"Answer all questions to the best of your ability.";
const messages = [
{ role: "system", content: systemPrompt },
...state.messages,
];
const response = await llm.invoke(messages);
return { messages: response };
};
const workflow = new StateGraph(MessagesAnnotation)
// Define the node and edge
.addNode("model", callModel)
.addEdge(START, "model")
.addEdge("model", END);
// Add simple in-memory checkpointer
const memory = new MemorySaver();
const app = workflow.compile({ checkpointer: memory });
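Note that MemorySaver keeps checkpoints in process memory, so all history is lost when the process exits. For real deployments you can swap in a persistent checkpointer that implements the same interface. Here's a minimal sketch, assuming the optional @langchain/langgraph-checkpoint-sqlite package is installed:
import { SqliteSaver } from "@langchain/langgraph-checkpoint-sqlite";
// Persist checkpoints to a local SQLite file instead of process memory
const sqliteMemory = SqliteSaver.fromConnString("checkpoints.db");
const persistentApp = workflow.compile({ checkpointer: sqliteMemory });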
We'll pass the latest input to the conversation here and let LangGraph keep track of the conversation history using the checkpointer:
await app.invoke(
{
messages: [
{
role: "user",
content: "Translate to French: I love programming.",
},
],
},
{
configurable: { thread_id: "1" },
}
);
{
messages: [
HumanMessage {
"id": "227b82a9-4084-46a5-ac79-ab9a3faa140e",
"content": "Translate to French: I love programming.",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxVrvztgnasTeMSFbpZQmyYqjJZ",
"content": "J'adore la programmation.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 5,
"promptTokens": 35,
"totalTokens": 40
},
"finish_reason": "stop",
"system_fingerprint": "fp_52a7f40b0b"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 35,
"output_tokens": 5,
"total_tokens": 40
}
}
]
}
await app.invoke(
{
messages: [
{
role: "user",
content: "What did I just ask you?",
},
],
},
{
configurable: { thread_id: "1" },
}
);
{
messages: [
HumanMessage {
"id": "1a0560a4-9dcb-47a1-b441-80717e229706",
"content": "Translate to French: I love programming.",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxVrvztgnasTeMSFbpZQmyYqjJZ",
"content": "J'adore la programmation.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 5,
"promptTokens": 35,
"totalTokens": 40
},
"finish_reason": "stop",
"system_fingerprint": "fp_52a7f40b0b"
},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"id": "4f233a7d-4b08-4f53-bb60-cf0141a59721",
"content": "What did I just ask you?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxVs5QnlPfbihTOmJrCVg1Dh7Ol",
"content": "You asked me to translate \"I love programming\" into French.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 13,
"promptTokens": 55,
"totalTokens": 68
},
"finish_reason": "stop",
"system_fingerprint": "fp_9f2bfdaa89"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 55,
"output_tokens": 13,
"total_tokens": 68
}
}
]
}
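Because the checkpointer keys all state by thread_id, each thread gets a fully independent history, and you can inspect what has been persisted for a thread at any time. Here's a small sketch of both (the new thread_id value is arbitrary):
// A new thread_id starts a fresh conversation with no prior history
await app.invoke(
  {
    messages: [{ role: "user", content: "What did I just ask you?" }],
  },
  {
    configurable: { thread_id: "some-new-thread" },
  }
);
// Inspect the state persisted for the original thread
const state = await app.getState({ configurable: { thread_id: "1" } });
console.log(state.values.messages.length); // 4 messages from the two earlier invocations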
Modifying chat history
Modifying stored chat messages can help your chatbot handle a variety of situations. Here are some examples:
Trimming messages
LLMs and chat models have limited context windows, and even if you're not directly hitting those limits, you may want to limit the amount of distraction the model has to deal with. One solution is to trim the history messages before passing them to the model. Let's use an example history with the app we declared above:
const demoEphemeralChatHistory = [
{ role: "user", content: "Hey there! I'm Nemo." },
{ role: "assistant", content: "Hello!" },
{ role: "user", content: "How are you today?" },
{ role: "assistant", content: "Fine thanks!" },
];
await app.invoke(
{
messages: [
...demoEphemeralChatHistory,
{ role: "user", content: "What's my name?" },
],
},
{
configurable: { thread_id: "2" },
}
);
{
messages: [
HumanMessage {
"id": "63057c3d-f980-4640-97d6-497a9f83ddee",
"content": "Hey there! I'm Nemo.",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "c9f0c20a-8f55-4909-b281-88f2a45c4f05",
"content": "Hello!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"id": "fd7fb3a0-7bc7-4e84-99a9-731b30637b55",
"content": "How are you today?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "09b0debb-1d4a-4856-8821-b037f5d96ecf",
"content": "Fine thanks!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"id": "edc13b69-25a0-40ac-81b3-175e65dc1a9a",
"content": "What's my name?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxWKCTdRuh2ZifXsvFHSo5z5I0J",
"content": "Your name is Nemo! How can I assist you today, Nemo?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 14,
"promptTokens": 63,
"totalTokens": 77
},
"finish_reason": "stop",
"system_fingerprint": "fp_a5d11b2ef2"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 63,
"output_tokens": 14,
"total_tokens": 77
}
}
]
}
We can see the app remembers the preloaded name.
But say we have a very small context window, and we want to trim the number of messages passed to the model to only the 2 most recent ones. We can use the built-in trimMessages utility to trim messages based on their token count before they reach our prompt. In this case we'll count each message as 1 "token" and keep only the last two messages:
import {
START,
END,
MessagesAnnotation,
StateGraph,
MemorySaver,
} from "@langchain/langgraph";
import { trimMessages } from "@langchain/core/messages";
// Define trimmer
// count each message as 1 "token" (tokenCounter: (msgs) => msgs.length) and keep only the last two messages
const trimmer = trimMessages({
strategy: "last",
maxTokens: 2,
tokenCounter: (msgs) => msgs.length,
});
// Define the function that calls the model
const callModel2 = async (state: typeof MessagesAnnotation.State) => {
const trimmedMessages = await trimmer.invoke(state.messages);
const systemPrompt =
"You are a helpful assistant. " +
"Answer all questions to the best of your ability.";
const messages = [
{ role: "system", content: systemPrompt },
...trimmedMessages,
];
const response = await llm.invoke(messages);
return { messages: response };
};
const workflow2 = new StateGraph(MessagesAnnotation)
// Define the node and edge
.addNode("model", callModel2)
.addEdge(START, "model")
.addEdge("model", END);
// Add simple in-memory checkpointer
const app2 = workflow2.compile({ checkpointer: new MemorySaver() });
Let's call this new app and check the response:
await app2.invoke(
{
messages: [
...demoEphemeralChatHistory,
{ role: "user", content: "What is my name?" },
],
},
{
configurable: { thread_id: "3" },
}
);
{
messages: [
HumanMessage {
"id": "0d9330a0-d9d1-4aaf-8171-ca1ac6344f7c",
"content": "What is my name?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "3a24e88b-7525-4797-9fcd-d751a378d22c",
"content": "Fine thanks!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"id": "276039c8-eba8-4c68-b015-81ec7704140d",
"content": "How are you today?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "2ad4f461-20e1-4982-ba3b-235cb6b02abd",
"content": "Hello!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"id": "52213cae-953a-463d-a4a0-a7368c9ee4db",
"content": "Hey there! I'm Nemo.",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxWe9BRDl1pmzkNIDawWwU3hvKm",
"content": "I'm sorry, but I don't have access to personal information about you unless you've shared it with me during our conversation. How can I assist you today?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 30,
"promptTokens": 39,
"totalTokens": 69
},
"finish_reason": "stop",
"system_fingerprint": "fp_3537616b13"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 39,
"output_tokens": 30,
"total_tokens": 69
}
}
]
}
We can see that trimMessages was called, and only the two most recent messages will be passed to the model. In this case, this means that the model forgot the name we gave it.
See our guide on how to trim messages for more information.
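Counting each message as one "token" keeps the demo simple, but in practice you'll usually want to trim by real token counts. trimMessages also accepts a chat model as its tokenCounter, in which case the model's own tokenizer is used. A rough sketch (the maxTokens value here is arbitrary):
import { trimMessages } from "@langchain/core/messages";
// Trim by actual token count, keeping any system message in the list,
// and starting the trimmed history on a human message
const tokenTrimmer = trimMessages({
  strategy: "last",
  maxTokens: 80,
  tokenCounter: llm,
  includeSystem: true,
  startOn: "human",
});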
Summary memory
We can use this same pattern in other ways too. For example, we could use an additional LLM call to generate a summary of the conversation before calling our app. Let's recreate our chat history:
const demoEphemeralChatHistory2 = [
{ role: "user", content: "Hey there! I'm Nemo." },
{ role: "assistant", content: "Hello!" },
{ role: "user", content: "How are you today?" },
{ role: "assistant", content: "Fine thanks!" },
];
And now, let's update the model-calling function to distill previous interactions into a summary:
import {
START,
END,
MessagesAnnotation,
StateGraph,
MemorySaver,
} from "@langchain/langgraph";
import { RemoveMessage } from "@langchain/core/messages";
// Define the function that calls the model
const callModel3 = async (state: typeof MessagesAnnotation.State) => {
const systemPrompt =
"You are a helpful assistant. " +
"Answer all questions to the best of your ability. " +
"The provided chat history includes a summary of the earlier conversation.";
const systemMessage = { role: "system", content: systemPrompt };
const messageHistory = state.messages.slice(0, -1); // exclude the most recent user input
// Summarize the messages if the chat history reaches a certain size
if (messageHistory.length >= 4) {
const lastHumanMessage = state.messages[state.messages.length - 1];
// Invoke the model to generate conversation summary
const summaryPrompt =
"Distill the above chat messages into a single summary message. " +
"Include as many specific details as you can.";
const summaryMessage = await llm.invoke([
...messageHistory,
{ role: "user", content: summaryPrompt },
]);
// Delete messages that we no longer want to show up
const deleteMessages = state.messages.map(
(m) => new RemoveMessage({ id: m.id })
);
// Re-add user message
const humanMessage = { role: "user", content: lastHumanMessage.content };
// Call the model with summary & response
const response = await llm.invoke([
systemMessage,
summaryMessage,
humanMessage,
]);
return {
messages: [summaryMessage, humanMessage, response, ...deleteMessages],
};
} else {
const response = await llm.invoke([systemMessage, ...state.messages]);
return { messages: response };
}
};
const workflow3 = new StateGraph(MessagesAnnotation)
// Define the node and edge
.addNode("model", callModel3)
.addEdge(START, "model")
.addEdge("model", END);
// Add simple in-memory checkpointer
const app3 = workflow3.compile({ checkpointer: new MemorySaver() });
Let's see if it remembers the name we gave it:
await app3.invoke(
{
messages: [
...demoEphemeralChatHistory2,
{ role: "user", content: "What did I say my name was?" },
],
},
{
configurable: { thread_id: "4" },
}
);
{
messages: [
AIMessage {
"id": "chatcmpl-ABSxXjFDj6WRo7VLSneBtlAxUumPE",
"content": "Nemo greeted the assistant and asked how it was doing, to which the assistant responded that it was fine.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 22,
"promptTokens": 60,
"totalTokens": 82
},
"finish_reason": "stop",
"system_fingerprint": "fp_e375328146"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 60,
"output_tokens": 22,
"total_tokens": 82
}
},
HumanMessage {
"id": "8b1309b7-c09e-47fb-9ab3-34047f6973e3",
"content": "What did I say my name was?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxYAQKiBsQ6oVypO4CLFDsi1HRH",
"content": "You mentioned that your name is Nemo.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 8,
"promptTokens": 73,
"totalTokens": 81
},
"finish_reason": "stop",
"system_fingerprint": "fp_52a7f40b0b"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 73,
"output_tokens": 8,
"total_tokens": 81
}
}
]
}
Note that invoking the app again will keep accumulating the history until it reaches the specified number of messages (four in our case). At that point, we will generate another summary from the initial summary plus the new messages, and so on.
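For example, a follow-up call on the same thread now runs against the summary rather than the raw history. A quick sketch:
// State currently holds three messages: [summary, question, answer].
// This call responds normally (the history is still under four messages);
// the call after it would cross the threshold and trigger re-summarization.
await app3.invoke(
  {
    messages: [{ role: "user", content: "How are you today?" }],
  },
  {
    configurable: { thread_id: "4" },
  }
);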