How to trim messages

Prerequisites

This guide assumes familiarity with the following concepts: messages, chat models, chaining, and chat history.

The methods in this guide also require @langchain/core>=0.2.8. Please see here for a guide on upgrading.
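
If you need to upgrade, reinstalling the core package typically suffices. A sketch assuming npm (substitute your package manager's equivalent):

npm install @langchain/core@latest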

All models have finite context windows, meaning there's a limit to how many tokens they can take as input. If you have very long messages or a chain/agent that accumulates a long message history, you'll need to manage the length of the messages you're passing in to the model.

The trimMessages utility provides some basic strategies for trimming a list of messages to be of a certain token length.

Getting the last maxTokens tokens

To get the last maxTokens tokens in the list of messages, we can set strategy: "last". Notice that for our tokenCounter we can pass in a function (more on that below) or a language model (since language models have a message token counting method). It makes sense to pass in a model when you're trimming your messages to fit into the context window of that specific model:

import {
  AIMessage,
  HumanMessage,
  SystemMessage,
  trimMessages,
} from "@langchain/core/messages";
import { ChatOpenAI } from "@langchain/openai";

const messages = [
  new SystemMessage("you're a good assistant, you always respond with a joke."),
  new HumanMessage("i wonder why it's called langchain"),
  new AIMessage(
    'Well, I guess they thought "WordRope" and "SentenceString" just didn\'t have the same ring to it!'
  ),
  new HumanMessage("and who is harrison chasing anyways"),
  new AIMessage(
    "Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!"
  ),
  new HumanMessage("what do you call a speechless parrot"),
];

// Keep the most recent messages that fit within 45 tokens,
// counting tokens with the model's own token counter.
const trimmed = await trimMessages(messages, {
  maxTokens: 45,
  strategy: "last",
  tokenCounter: new ChatOpenAI({ modelName: "gpt-4" }),
});

console.log(
  trimmed
    .map((x) =>
      JSON.stringify(
        {
          role: x._getType(),
          content: x.content,
        },
        null,
        2
      )
    )
    .join("\n\n")
);
{
  "role": "human",
  "content": "and who is harrison chasing anyways"
}

{
  "role": "ai",
  "content": "Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!"
}

{
  "role": "human",
  "content": "what do you call a speechless parrot"
}

If we want to always keep the initial system message, we can specify includeSystem: true:

await trimMessages(messages, {
  maxTokens: 45,
  strategy: "last",
  tokenCounter: new ChatOpenAI({ modelName: "gpt-4" }),
  includeSystem: true,
});
[
  SystemMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: "you're a good assistant, you always respond with a joke.",
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: "you're a good assistant, you always respond with a joke.",
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined
  },
  AIMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: 'Hmmm let me think.\n' +
        '\n' +
        "Why, he's probably chasing after the last cup of coffee in the office!",
      tool_calls: [],
      invalid_tool_calls: [],
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: 'Hmmm let me think.\n' +
      '\n' +
      "Why, he's probably chasing after the last cup of coffee in the office!",
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined,
    tool_calls: [],
    invalid_tool_calls: [],
    usage_metadata: undefined
  },
  HumanMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: 'what do you call a speechless parrot',
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: 'what do you call a speechless parrot',
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined
  }
]

If we want to allow splitting up the contents of a message, we can specify allowPartial: true:

await trimMessages(messages, {
  maxTokens: 50,
  strategy: "last",
  tokenCounter: new ChatOpenAI({ modelName: "gpt-4" }),
  includeSystem: true,
  allowPartial: true,
});
[
  SystemMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: "you're a good assistant, you always respond with a joke.",
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: "you're a good assistant, you always respond with a joke.",
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined
  },
  AIMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: 'Hmmm let me think.\n' +
        '\n' +
        "Why, he's probably chasing after the last cup of coffee in the office!",
      tool_calls: [],
      invalid_tool_calls: [],
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: 'Hmmm let me think.\n' +
      '\n' +
      "Why, he's probably chasing after the last cup of coffee in the office!",
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined,
    tool_calls: [],
    invalid_tool_calls: [],
    usage_metadata: undefined
  },
  HumanMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: 'what do you call a speechless parrot',
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: 'what do you call a speechless parrot',
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined
  }
]

If we need to make sure that our first message (excluding the system message) is always of a specific type, we can specify startOn:

await trimMessages(messages, {
  maxTokens: 60,
  strategy: "last",
  tokenCounter: new ChatOpenAI({ modelName: "gpt-4" }),
  includeSystem: true,
  startOn: "human",
});
[
  SystemMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: "you're a good assistant, you always respond with a joke.",
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: "you're a good assistant, you always respond with a joke.",
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined
  },
  HumanMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: 'and who is harrison chasing anyways',
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: 'and who is harrison chasing anyways',
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined
  },
  AIMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: 'Hmmm let me think.\n' +
        '\n' +
        "Why, he's probably chasing after the last cup of coffee in the office!",
      tool_calls: [],
      invalid_tool_calls: [],
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: 'Hmmm let me think.\n' +
      '\n' +
      "Why, he's probably chasing after the last cup of coffee in the office!",
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined,
    tool_calls: [],
    invalid_tool_calls: [],
    usage_metadata: undefined
  },
  HumanMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: 'what do you call a speechless parrot',
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: 'what do you call a speechless parrot',
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined
  }
]

Getting the first maxTokens tokens

We can perform the flipped operation of getting the first maxTokens tokens by specifying strategy: "first":

await trimMessages(messages, {
  maxTokens: 45,
  strategy: "first",
  tokenCounter: new ChatOpenAI({ modelName: "gpt-4" }),
});
[
  SystemMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: "you're a good assistant, you always respond with a joke.",
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: "you're a good assistant, you always respond with a joke.",
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined
  },
  HumanMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: "i wonder why it's called langchain",
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: "i wonder why it's called langchain",
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined
  }
]
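
The "first" strategy accepts the same options shown above. For instance, a sketch combining it with allowPartial: true, so that the last retained message may be split to stay under the token budget (same messages and token counter as before):

await trimMessages(messages, {
  maxTokens: 45,
  strategy: "first",
  tokenCounter: new ChatOpenAI({ modelName: "gpt-4" }),
  // Allow the final kept message to be truncated mid-content.
  allowPartial: true,
});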

Writing a custom token counter

We can write a custom token counter function that takes in a list of messages and returns an integer:

import { encodingForModel } from "@langchain/core/utils/tiktoken";
import {
  BaseMessage,
  HumanMessage,
  AIMessage,
  ToolMessage,
  SystemMessage,
  MessageContent,
  MessageContentText,
} from "@langchain/core/messages";

// Count the tokens in a string, or in text-only message content, using tiktoken.
async function strTokenCounter(
  messageContent: MessageContent
): Promise<number> {
  if (typeof messageContent === "string") {
    return (await encodingForModel("gpt-4")).encode(messageContent).length;
  } else {
    if (messageContent.every((x) => x.type === "text" && x.text)) {
      return (await encodingForModel("gpt-4")).encode(
        (messageContent as MessageContentText[])
          .map(({ text }) => text)
          .join("")
      ).length;
    }
    throw new Error(
      `Unsupported message content ${JSON.stringify(messageContent)}`
    );
  }
}

async function tiktokenCounter(messages: BaseMessage[]): Promise<number> {
  let numTokens = 3; // every reply is primed with <|start|>assistant<|message|>
  const tokensPerMessage = 3;
  const tokensPerName = 1;

  for (const msg of messages) {
    let role: string;
    if (msg instanceof HumanMessage) {
      role = "user";
    } else if (msg instanceof AIMessage) {
      role = "assistant";
    } else if (msg instanceof ToolMessage) {
      role = "tool";
    } else if (msg instanceof SystemMessage) {
      role = "system";
    } else {
      throw new Error(`Unsupported message type ${msg.constructor.name}`);
    }

    numTokens +=
      tokensPerMessage +
      (await strTokenCounter(role)) +
      (await strTokenCounter(msg.content));

    if (msg.name) {
      numTokens += tokensPerName + (await strTokenCounter(msg.name));
    }
  }

  return numTokens;
}

await trimMessages(messages, {
  maxTokens: 45,
  strategy: "last",
  tokenCounter: tiktokenCounter,
});
[
  AIMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: 'Hmmm let me think.\n' +
        '\n' +
        "Why, he's probably chasing after the last cup of coffee in the office!",
      tool_calls: [],
      invalid_tool_calls: [],
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: 'Hmmm let me think.\n' +
      '\n' +
      "Why, he's probably chasing after the last cup of coffee in the office!",
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined,
    tool_calls: [],
    invalid_tool_calls: [],
    usage_metadata: undefined
  },
  HumanMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: 'what do you call a speechless parrot',
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: 'what do you call a speechless parrot',
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined
  }
]
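
The token counter doesn't have to be exact, either. If a rough budget is acceptable, a simple heuristic works as well. Below is a minimal sketch that assumes roughly four characters per token, a common rule of thumb for English text rather than a precise count:

import { BaseMessage } from "@langchain/core/messages";

// Rough heuristic: assume ~4 characters per token (an approximation, not exact).
function approxTokenCounter(msgs: BaseMessage[]): number {
  return msgs.reduce((total, msg) => {
    const text =
      typeof msg.content === "string"
        ? msg.content
        : JSON.stringify(msg.content);
    return total + Math.ceil(text.length / 4);
  }, 0);
}

await trimMessages(messages, {
  maxTokens: 45,
  strategy: "last",
  tokenCounter: approxTokenCounter,
});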

Chaining

trimMessages can be used imperatively (as above) or declaratively, making it easy to compose with other components in a chain:

import { ChatOpenAI } from "@langchain/openai";
import { trimMessages } from "@langchain/core/messages";

const llm = new ChatOpenAI({ model: "gpt-4o" });

// Notice we don't pass in messages. This creates
// a RunnableLambda that takes messages as input
const trimmer = trimMessages({
  maxTokens: 45,
  strategy: "last",
  tokenCounter: llm,
  includeSystem: true,
});

const chain = trimmer.pipe(llm);
await chain.invoke(messages);
AIMessage {
  lc_serializable: true,
  lc_kwargs: {
    content: 'Thanks! I do try to keep things light. But for a more serious answer, "LangChain" is likely named to reflect its focus on language processing and the way it connects different components or models together—essentially forming a "chain" of linguistic operations. The "Lang" part emphasizes its focus on language, while "Chain" highlights the interconnected workflows it aims to facilitate.',
    tool_calls: [],
    invalid_tool_calls: [],
    additional_kwargs: { function_call: undefined, tool_calls: undefined },
    response_metadata: {}
  },
  lc_namespace: [ 'langchain_core', 'messages' ],
  content: 'Thanks! I do try to keep things light. But for a more serious answer, "LangChain" is likely named to reflect its focus on language processing and the way it connects different components or models together—essentially forming a "chain" of linguistic operations. The "Lang" part emphasizes its focus on language, while "Chain" highlights the interconnected workflows it aims to facilitate.',
  name: undefined,
  additional_kwargs: { function_call: undefined, tool_calls: undefined },
  response_metadata: {
    tokenUsage: { completionTokens: 77, promptTokens: 59, totalTokens: 136 },
    finish_reason: 'stop'
  },
  id: undefined,
  tool_calls: [],
  invalid_tool_calls: [],
  usage_metadata: { input_tokens: 59, output_tokens: 77, total_tokens: 136 }
}

Looking at the LangSmith trace, we can see that the messages are first trimmed before being passed to the model.

Looking at just the trimmer, we can see that it's a Runnable object that can be invoked like all Runnables:

await trimmer.invoke(messages);
[
  SystemMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: "you're a good assistant, you always respond with a joke.",
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: "you're a good assistant, you always respond with a joke.",
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined
  },
  AIMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: 'Hmmm let me think.\n' +
        '\n' +
        "Why, he's probably chasing after the last cup of coffee in the office!",
      tool_calls: [],
      invalid_tool_calls: [],
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: 'Hmmm let me think.\n' +
      '\n' +
      "Why, he's probably chasing after the last cup of coffee in the office!",
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined,
    tool_calls: [],
    invalid_tool_calls: [],
    usage_metadata: undefined
  },
  HumanMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: 'what do you call a speechless parrot',
      additional_kwargs: {},
      response_metadata: {}
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: 'what do you call a speechless parrot',
    name: undefined,
    additional_kwargs: {},
    response_metadata: {},
    id: undefined
  }
]

Usage with ChatMessageHistory

Trimming messages is especially useful when working with chat histories, which can get arbitrarily long:

import { InMemoryChatMessageHistory } from "@langchain/core/chat_history";
import { RunnableWithMessageHistory } from "@langchain/core/runnables";
import { HumanMessage, trimMessages } from "@langchain/core/messages";
import { ChatOpenAI } from "@langchain/openai";

const chatHistory = new InMemoryChatMessageHistory(messages.slice(0, -1));

const dummyGetSessionHistory = async (sessionId: string) => {
  if (sessionId !== "1") {
    throw new Error("Session not found");
  }
  return chatHistory;
};

const llm = new ChatOpenAI({ model: "gpt-4o" });

const trimmer = trimMessages({
  maxTokens: 45,
  strategy: "last",
  tokenCounter: llm,
  includeSystem: true,
});

const chain = trimmer.pipe(llm);
const chainWithHistory = new RunnableWithMessageHistory({
  runnable: chain,
  getMessageHistory: dummyGetSessionHistory,
});
await chainWithHistory.invoke(
  [new HumanMessage("what do you call a speechless parrot")],
  { configurable: { sessionId: "1" } }
);
AIMessage {
  lc_serializable: true,
  lc_kwargs: {
    content: 'A "polly-no-want-a-cracker"!',
    tool_calls: [],
    invalid_tool_calls: [],
    additional_kwargs: { function_call: undefined, tool_calls: undefined },
    response_metadata: {}
  },
  lc_namespace: [ 'langchain_core', 'messages' ],
  content: 'A "polly-no-want-a-cracker"!',
  name: undefined,
  additional_kwargs: { function_call: undefined, tool_calls: undefined },
  response_metadata: {
    tokenUsage: { completionTokens: 11, promptTokens: 57, totalTokens: 68 },
    finish_reason: 'stop'
  },
  id: undefined,
  tool_calls: [],
  invalid_tool_calls: [],
  usage_metadata: { input_tokens: 57, output_tokens: 11, total_tokens: 68 }
}

Looking at the LangSmith trace, we can see that we retrieve all of our messages, but before they are passed to the model they are trimmed down to just the system message and the last human message.

API reference

For a complete description of all arguments, head to the API reference.

