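How to add memory to chatbots
A key feature of chatbots is their ability to use the content of previous conversation turns as context. This state management can take several forms, including:
- Simply stuffing previous messages into a chat model prompt.
- The above, but trimming old messages to reduce the amount of distracting information the model has to deal with.
- More complex modifications, such as synthesizing summaries for long-running conversations.
We'll go into more detail on a few of these techniques below!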
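This how-to guide previously built a chatbot using RunnableWithMessageHistory. You can access that version of the guide in the v0.2 docs.
The LangGraph implementation offers a number of advantages over RunnableWithMessageHistory, including the ability to persist arbitrary components of an application's state (instead of only messages).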
Setup
You'll need to install a few packages, select your chat model, and set its environment variable.
- npm
- yarn
- pnpm
npm i @langchain/core @langchain/langgraph
yarn add @langchain/core @langchain/langgraph
pnpm add @langchain/core @langchain/langgraph
Let's set up a chat model that we'll use for the examples below.
Pick your chat model
- Groq
- OpenAI
- Anthropic
- FireworksAI
- MistralAI
- VertexAI
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/groq
yarn add @langchain/groq
pnpm add @langchain/groq
Add environment variables
GROQ_API_KEY=your-api-key
Instantiate the model
import { ChatGroq } from "@langchain/groq";

const llm = new ChatGroq({
  model: "llama-3.3-70b-versatile",
  temperature: 0,
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/openai
yarn add @langchain/openai
pnpm add @langchain/openai
Add environment variables
OPENAI_API_KEY=your-api-key
Instantiate the model
import { ChatOpenAI } from "@langchain/openai";

const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/anthropic
yarn add @langchain/anthropic
pnpm add @langchain/anthropic
Add environment variables
ANTHROPIC_API_KEY=your-api-key
Instantiate the model
import { ChatAnthropic } from "@langchain/anthropic";

const llm = new ChatAnthropic({
  model: "claude-3-5-sonnet-20240620",
  temperature: 0,
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/community
yarn add @langchain/community
pnpm add @langchain/community
Add environment variables
FIREWORKS_API_KEY=your-api-key
Instantiate the model
import { ChatFireworks } from "@langchain/community/chat_models/fireworks";

const llm = new ChatFireworks({
  model: "accounts/fireworks/models/llama-v3p1-70b-instruct",
  temperature: 0,
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/mistralai
yarn add @langchain/mistralai
pnpm add @langchain/mistralai
Add environment variables
MISTRAL_API_KEY=your-api-key
Instantiate the model
import { ChatMistralAI } from "@langchain/mistralai";

const llm = new ChatMistralAI({
  model: "mistral-large-latest",
  temperature: 0,
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/google-vertexai
yarn add @langchain/google-vertexai
pnpm add @langchain/google-vertexai
Add environment variables
GOOGLE_APPLICATION_CREDENTIALS=credentials.json
Instantiate the model
import { ChatVertexAI } from "@langchain/google-vertexai";

const llm = new ChatVertexAI({
  model: "gemini-1.5-flash",
  temperature: 0,
});
Message passing
The simplest form of memory is passing chat history messages into a chain. Here's an example:
import { HumanMessage, AIMessage } from "@langchain/core/messages";
import {
  ChatPromptTemplate,
  MessagesPlaceholder,
} from "@langchain/core/prompts";

// System instructions followed by a placeholder for the chat history
const prompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    "You are a helpful assistant. Answer all questions to the best of your ability.",
  ],
  new MessagesPlaceholder("messages"),
]);

const chain = prompt.pipe(llm);

// Pass the whole conversation so far, ending with the newest question
await chain.invoke({
  messages: [
    new HumanMessage(
      "Translate this sentence from English to French: I love programming."
    ),
    new AIMessage("J'adore la programmation."),
    new HumanMessage("What did you just say?"),
  ],
});
AIMessage {
"id": "chatcmpl-ABSxUXVIBitFRBh9MpasB5jeEHfCA",
"content": "I said \"J'adore la programmation,\" which means \"I love programming\" in French.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 18,
"promptTokens": 58,
"totalTokens": 76
},
"finish_reason": "stop",
"system_fingerprint": "fp_e375328146"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 58,
"output_tokens": 18,
"total_tokens": 76
}
}
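We can see that by passing the previous conversation into the chain, the model can use it as context to answer questions. This is the basic concept underpinning chatbot memory. The rest of this guide demonstrates convenient techniques for passing or reformatting messages.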
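To make the bookkeeping behind message passing concrete, here is a minimal sketch in plain TypeScript. It does not use LangChain: the `ChatMessage` type, `fakeModel`, `chatHistory`, and `chat` are hypothetical names introduced only for illustration, and the stand-in model is synchronous whereas real model calls are asynchronous.

```typescript
// Hypothetical message shape for illustration; not a LangChain type.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Stand-in for a chat model: it only reports how much context it was given.
// (Synchronous for simplicity; a real model call would be awaited.)
function fakeModel(messages: ChatMessage[]): ChatMessage {
  return {
    role: "assistant",
    content: `I received ${messages.length} message(s) of context.`,
  };
}

// The caller owns the history and must append every turn itself.
const chatHistory: ChatMessage[] = [];

function chat(userInput: string): string {
  chatHistory.push({ role: "user", content: userInput });
  const reply = fakeModel(chatHistory);
  chatHistory.push(reply);
  return reply.content;
}

const firstReply = chat("Translate to French: I love programming.");
const secondReply = chat("What did I just ask you?");
console.log(firstReply); // I received 1 message(s) of context.
console.log(secondReply); // I received 3 message(s) of context.
```

The second call sees three messages because the first question and its reply were kept in `chatHistory`. Forgetting either `push` would silently drop context, which is the kind of manual upkeep this approach requires.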
Automatic history management
The previous example passes messages to the chain (and model) explicitly. This is a completely acceptable approach, but it does require external management of new messages. LangChain also provides a way to build applications that have memory using LangGraph's persistence. You can enable persistence in LangGraph applications by providing a checkpointer when compiling the graph.
import {
  START,
  END,
  MessagesAnnotation,
  StateGraph,
  MemorySaver,
} from "@langchain/langgraph";

// Define the function that calls the model
const callModel = async (state: typeof MessagesAnnotation.State) => {
  const systemPrompt =
    "You are a helpful assistant. " +
    "Answer all questions to the best of your ability.";
  const messages = [
    { role: "system", content: systemPrompt },
    ...state.messages,
  ];
  const response = await llm.invoke(messages);
  return { messages: response };
};

const workflow = new StateGraph(MessagesAnnotation)
  // Define the node and edge
  .addNode("model", callModel)
  .addEdge(START, "model")
  .addEdge("model", END);

// Add simple in-memory checkpointer
const memory = new MemorySaver();
const app = workflow.compile({ checkpointer: memory });
Here we pass only the latest input to the conversation and let LangGraph keep track of the conversation history using the checkpointer:
await app.invoke(
  {
    messages: [
      {
        role: "user",
        content: "Translate to French: I love programming.",
      },
    ],
  },
  {
    configurable: { thread_id: "1" },
  }
);
{
messages: [
HumanMessage {
"id": "227b82a9-4084-46a5-ac79-ab9a3faa140e",
"content": "Translate to French: I love programming.",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxVrvztgnasTeMSFbpZQmyYqjJZ",
"content": "J'adore la programmation.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 5,
"promptTokens": 35,
"totalTokens": 40
},
"finish_reason": "stop",
"system_fingerprint": "fp_52a7f40b0b"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 35,
"output_tokens": 5,
"total_tokens": 40
}
}
]
}
await app.invoke(
  {
    messages: [
      {
        role: "user",
        content: "What did I just ask you?",
      },
    ],
  },
  {
    configurable: { thread_id: "1" },
  }
);
{
messages: [
HumanMessage {
"id": "1a0560a4-9dcb-47a1-b441-80717e229706",
"content": "Translate to French: I love programming.",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxVrvztgnasTeMSFbpZQmyYqjJZ",
"content": "J'adore la programmation.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 5,
"promptTokens": 35,
"totalTokens": 40
},
"finish_reason": "stop",
"system_fingerprint": "fp_52a7f40b0b"
},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"id": "4f233a7d-4b08-4f53-bb60-cf0141a59721",
"content": "What did I just ask you?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxVs5QnlPfbihTOmJrCVg1Dh7Ol",
"content": "You asked me to translate \"I love programming\" into French.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 13,
"promptTokens": 55,
"totalTokens": 68
},
"finish_reason": "stop",
"system_fingerprint": "fp_9f2bfdaa89"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 55,
"output_tokens": 13,
"total_tokens": 68
}
}
]
}
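One way to picture what the thread_id buys us is a store that keeps a separate message list per thread. The sketch below is a conceptual model in plain TypeScript only; it is not how MemorySaver or LangGraph checkpointing is implemented, and `ThreadStore`, `invokeWithThread`, and `ChatMessage` are hypothetical names.

```typescript
// Hypothetical message shape for illustration; not a LangChain type.
type ChatMessage = { role: "user" | "assistant"; content: string };

// Hypothetical in-memory store keyed by thread id.
class ThreadStore {
  private threads = new Map<string, ChatMessage[]>();

  // Return a copy of the saved messages for a thread (empty if unseen).
  load(threadId: string): ChatMessage[] {
    return [...(this.threads.get(threadId) ?? [])];
  }

  // Append new messages to a thread's saved history.
  append(threadId: string, messages: ChatMessage[]): void {
    this.threads.set(threadId, [...this.load(threadId), ...messages]);
  }
}

const store = new ThreadStore();

// The caller passes only the newest input; earlier turns come from the store.
function invokeWithThread(threadId: string, userInput: string): ChatMessage[] {
  const userMessage: ChatMessage = { role: "user", content: userInput };
  const context = [...store.load(threadId), userMessage];
  // Stand-in reply; a real app would call the chat model with `context`.
  const reply: ChatMessage = {
    role: "assistant",
    content: `Context size: ${context.length}`,
  };
  store.append(threadId, [userMessage, reply]);
  return store.load(threadId);
}

invokeWithThread("1", "Translate to French: I love programming.");
const threadOne = invokeWithThread("1", "What did I just ask you?");
const threadTwo = invokeWithThread("2", "What did I just ask you?");
console.log(threadOne.length, threadTwo.length); // 4 2
```

Thread "1" answers with three messages of context, while thread "2" starts from scratch. This mirrors why the examples in this guide switch to a new thread_id whenever they want a fresh conversation.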
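Modifying chat history
Modifying stored chat messages can help your chatbot handle a variety of situations. Here are some examples:
Trimming messages
LLMs and chat models have limited context windows, and even if you're not directly hitting limits, you may want to limit the amount of distraction the model has to deal with. One solution is to trim the history messages before passing them to the model. Let's use an example history with the app we declared above:
const demoEphemeralChatHistory = [
  { role: "user", content: "Hey there! I'm Nemo." },
  { role: "assistant", content: "Hello!" },
  { role: "user", content: "How are you today?" },
  { role: "assistant", content: "Fine thanks!" },
];

await app.invoke(
  {
    messages: [
      ...demoEphemeralChatHistory,
      { role: "user", content: "What's my name?" },
    ],
  },
  {
    configurable: { thread_id: "2" },
  }
);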
{
messages: [
HumanMessage {
"id": "63057c3d-f980-4640-97d6-497a9f83ddee",
"content": "Hey there! I'm Nemo.",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "c9f0c20a-8f55-4909-b281-88f2a45c4f05",
"content": "Hello!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"id": "fd7fb3a0-7bc7-4e84-99a9-731b30637b55",
"content": "How are you today?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "09b0debb-1d4a-4856-8821-b037f5d96ecf",
"content": "Fine thanks!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"id": "edc13b69-25a0-40ac-81b3-175e65dc1a9a",
"content": "What's my name?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxWKCTdRuh2ZifXsvFHSo5z5I0J",
"content": "Your name is Nemo! How can I assist you today, Nemo?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 14,
"promptTokens": 63,
"totalTokens": 77
},
"finish_reason": "stop",
"system_fingerprint": "fp_a5d11b2ef2"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 63,
"output_tokens": 14,
"total_tokens": 77
}
}
]
}
We can see the app remembers the preloaded name.
But let's say we have a very small context window, and we want to trim the messages passed to the model down to only the 2 most recent ones. We can use the built-in trimMessages utility to trim messages based on their token count before they reach our prompt. In this case, we'll count each message as 1 "token" and keep only the last two messages:
import {
  START,
  END,
  MessagesAnnotation,
  StateGraph,
  MemorySaver,
} from "@langchain/langgraph";
import { trimMessages } from "@langchain/core/messages";

// Define trimmer
// count each message as 1 "token" (tokenCounter: (msgs) => msgs.length) and keep only the last two messages
const trimmer = trimMessages({
  strategy: "last",
  maxTokens: 2,
  tokenCounter: (msgs) => msgs.length,
});

// Define the function that calls the model
const callModel2 = async (state: typeof MessagesAnnotation.State) => {
  // Only the trimmed messages are sent to the model; the stored state is unchanged
  const trimmedMessages = await trimmer.invoke(state.messages);
  const systemPrompt =
    "You are a helpful assistant. " +
    "Answer all questions to the best of your ability.";
  const messages = [
    { role: "system", content: systemPrompt },
    ...trimmedMessages,
  ];
  const response = await llm.invoke(messages);
  return { messages: response };
};

const workflow2 = new StateGraph(MessagesAnnotation)
  // Define the node and edge
  .addNode("model", callModel2)
  .addEdge(START, "model")
  .addEdge("model", END);

// Add simple in-memory checkpointer
const app2 = workflow2.compile({ checkpointer: new MemorySaver() });
Let's call this new app and check the response:
await app2.invoke(
  {
    messages: [
      ...demoEphemeralChatHistory,
      { role: "user", content: "What is my name?" },
    ],
  },
  {
    configurable: { thread_id: "3" },
  }
);
{
messages: [
HumanMessage {
"id": "0d9330a0-d9d1-4aaf-8171-ca1ac6344f7c",
"content": "What is my name?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "3a24e88b-7525-4797-9fcd-d751a378d22c",
"content": "Fine thanks!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"id": "276039c8-eba8-4c68-b015-81ec7704140d",
"content": "How are you today?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "2ad4f461-20e1-4982-ba3b-235cb6b02abd",
"content": "Hello!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"id": "52213cae-953a-463d-a4a0-a7368c9ee4db",
"content": "Hey there! I'm Nemo.",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxWe9BRDl1pmzkNIDawWwU3hvKm",
"content": "I'm sorry, but I don't have access to personal information about you unless you've shared it with me during our conversation. How can I assist you today?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 30,
"promptTokens": 39,
"totalTokens": 69
},
"finish_reason": "stop",
"system_fingerprint": "fp_3537616b13"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 39,
"output_tokens": 30,
"total_tokens": 69
}
}
]
}
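We can see that trimMessages was called and only the two most recent messages were passed to the model. In this case, that means the model forgot the name we gave it.
Check out our how-to guide on trimming messages for more.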
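To see why the name was lost, it can help to trace the "keep the last messages that fit the budget" idea by hand. The following is a simplified sketch in plain TypeScript, not the library's implementation of trimMessages. `trimToLast`, `demoHistory`, and `ChatMessage` are hypothetical names, and the sketch assumes the counter never decreases as messages are added.

```typescript
// Hypothetical message shape for illustration; not a LangChain type.
type ChatMessage = { role: "user" | "assistant"; content: string };

// Keep the longest run of most-recent messages whose count, as measured by
// `tokenCounter`, stays within `maxTokens`.
function trimToLast(
  messages: ChatMessage[],
  maxTokens: number,
  tokenCounter: (msgs: ChatMessage[]) => number
): ChatMessage[] {
  let start = messages.length;
  // Walk backwards, extending the window while it still fits the budget.
  while (start > 0 && tokenCounter(messages.slice(start - 1)) <= maxTokens) {
    start -= 1;
  }
  return messages.slice(start);
}

const demoHistory: ChatMessage[] = [
  { role: "user", content: "Hey there! I'm Nemo." },
  { role: "assistant", content: "Hello!" },
  { role: "user", content: "How are you today?" },
  { role: "assistant", content: "Fine thanks!" },
  { role: "user", content: "What is my name?" },
];

// Same settings as above: each message counts as one "token", budget of two.
const kept = trimToLast(demoHistory, 2, (msgs) => msgs.length);
console.log(kept.map((m) => m.content));
// [ 'Fine thanks!', 'What is my name?' ]
```

The message that introduced the name falls outside the two-message window, so a model given only `kept` has no way to answer the question.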
Summary memory
We can use this same pattern in other ways too. For example, we could use an additional LLM call to generate a summary of the conversation before calling our app. Let's recreate our chat history:
const demoEphemeralChatHistory2 = [
  { role: "user", content: "Hey there! I'm Nemo." },
  { role: "assistant", content: "Hello!" },
  { role: "user", content: "How are you today?" },
  { role: "assistant", content: "Fine thanks!" },
];
Now, let's update the model-calling function to distill previous interactions into a summary:
import {
  START,
  END,
  MessagesAnnotation,
  StateGraph,
  MemorySaver,
} from "@langchain/langgraph";
import { RemoveMessage } from "@langchain/core/messages";

// Define the function that calls the model
const callModel3 = async (state: typeof MessagesAnnotation.State) => {
  const systemPrompt =
    "You are a helpful assistant. " +
    "Answer all questions to the best of your ability. " +
    "The provided chat history includes a summary of the earlier conversation.";
  const systemMessage = { role: "system", content: systemPrompt };
  const messageHistory = state.messages.slice(0, -1); // exclude the most recent user input

  // Summarize the messages if the chat history reaches a certain size
  if (messageHistory.length >= 4) {
    const lastHumanMessage = state.messages[state.messages.length - 1];
    // Invoke the model to generate conversation summary
    const summaryPrompt =
      "Distill the above chat messages into a single summary message. " +
      "Include as many specific details as you can.";
    const summaryMessage = await llm.invoke([
      ...messageHistory,
      { role: "user", content: summaryPrompt },
    ]);

    // Delete messages that we no longer want to show up
    const deleteMessages = state.messages.map(
      (m) => new RemoveMessage({ id: m.id })
    );
    // Re-add user message
    const humanMessage = { role: "user", content: lastHumanMessage.content };
    // Call the model with summary & response
    const response = await llm.invoke([
      systemMessage,
      summaryMessage,
      humanMessage,
    ]);
    return {
      messages: [summaryMessage, humanMessage, response, ...deleteMessages],
    };
  } else {
    const response = await llm.invoke([systemMessage, ...state.messages]);
    return { messages: response };
  }
};

const workflow3 = new StateGraph(MessagesAnnotation)
  // Define the node and edge
  .addNode("model", callModel3)
  .addEdge(START, "model")
  .addEdge("model", END);

// Add simple in-memory checkpointer
const app3 = workflow3.compile({ checkpointer: new MemorySaver() });
Let's see if it remembers the name we gave it:
await app3.invoke(
  {
    messages: [
      ...demoEphemeralChatHistory2,
      { role: "user", content: "What did I say my name was?" },
    ],
  },
  {
    configurable: { thread_id: "4" },
  }
);
{
messages: [
AIMessage {
"id": "chatcmpl-ABSxXjFDj6WRo7VLSneBtlAxUumPE",
"content": "Nemo greeted the assistant and asked how it was doing, to which the assistant responded that it was fine.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 22,
"promptTokens": 60,
"totalTokens": 82
},
"finish_reason": "stop",
"system_fingerprint": "fp_e375328146"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 60,
"output_tokens": 22,
"total_tokens": 82
}
},
HumanMessage {
"id": "8b1309b7-c09e-47fb-9ab3-34047f6973e3",
"content": "What did I say my name was?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxYAQKiBsQ6oVypO4CLFDsi1HRH",
"content": "You mentioned that your name is Nemo.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 8,
"promptTokens": 73,
"totalTokens": 81
},
"finish_reason": "stop",
"system_fingerprint": "fp_52a7f40b0b"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 73,
"output_tokens": 8,
"total_tokens": 81
}
}
]
}
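Note that invoking the app again will keep accumulating the history until it reaches the specified number of messages (four in our case). At that point, we will generate another summary from the initial summary plus the new messages, and so on.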
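The accumulate-then-summarize cycle can be traced with a small sketch in plain TypeScript. It replaces the LLM with a stand-in summarizer and does not use LangGraph state or RemoveMessage. `fakeSummarize`, `compactHistory`, `takeTurn`, and `ChatMessage` are hypothetical names, and the threshold of four is taken from the example above.

```typescript
// Hypothetical message shape for illustration; not a LangChain type.
type ChatMessage = { role: "user" | "assistant"; content: string };

// Stand-in summarizer. A real app would ask the chat model for the summary.
function fakeSummarize(messages: ChatMessage[]): ChatMessage {
  return {
    role: "assistant",
    content: `Summary of ${messages.length} earlier message(s).`,
  };
}

// Once the history before the latest input reaches `threshold` messages,
// replace that history with a single summary message.
function compactHistory(
  messages: ChatMessage[],
  threshold: number = 4
): ChatMessage[] {
  const earlier = messages.slice(0, -1);
  if (earlier.length < threshold) {
    return messages;
  }
  const latest = messages[messages.length - 1];
  return [fakeSummarize(earlier), latest];
}

// One conversation turn: add the user input, compact if needed, add a reply.
function takeTurn(prior: ChatMessage[], userInput: string): ChatMessage[] {
  const userMessage: ChatMessage = { role: "user", content: userInput };
  const compacted = compactHistory([...prior, userMessage]);
  const reply: ChatMessage = { role: "assistant", content: "(model reply)" };
  return [...compacted, reply];
}

const seedHistory: ChatMessage[] = [
  { role: "user", content: "Hey there! I'm Nemo." },
  { role: "assistant", content: "Hello!" },
  { role: "user", content: "How are you today?" },
  { role: "assistant", content: "Fine thanks!" },
];

const afterFirst = takeTurn(seedHistory, "What did I say my name was?");
const afterSecond = takeTurn(afterFirst, "Thanks!");
const afterThird = takeTurn(afterSecond, "One more question.");
console.log(afterFirst.length, afterSecond.length, afterThird.length); // 3 5 3
```

The first turn collapses the four seeded messages into one summary. The second turn only accumulates. The third turn summarizes again, and this time the input to the summarizer includes the earlier summary itself.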