
Conversational RAG

In many Q&A applications we want to allow the user to have a back-and-forth conversation, meaning the application needs some sort of "memory" of past questions and answers, and some logic for incorporating those into its current thinking.

In this guide we focus on adding logic for incorporating historical messages. Further details on chat history management are covered here.

We will cover two approaches:

  1. Chains, in which we always execute a retrieval step;
  2. Agents, in which we give an LLM discretion over whether and how to execute a retrieval step (or multiple steps).

For the external knowledge source, we will use the same LLM Powered Autonomous Agents blog post by Lilian Weng from the RAG tutorial.

Setup

Dependencies

We'll use an OpenAI chat model and embeddings and a Memory vector store in this walkthrough, but everything shown here works with any ChatModel or LLM, Embeddings, and VectorStore or Retriever.

We'll use the following packages:

npm install --save langchain @langchain/community @langchain/openai cheerio

We need to set the environment variable OPENAI_API_KEY:

export OPENAI_API_KEY=YOUR_KEY

LangSmith

Many of the applications you build with LangChain will contain multiple steps with multiple invocations of LLM calls. As these applications get more and more complex, it becomes crucial to be able to inspect what exactly is going on inside your chain or agent. The best way to do this is with LangSmith.

Note that LangSmith is not needed, but it is helpful. If you do want to use LangSmith, after you sign up at the link above, make sure to set your environment variables to start logging traces:

export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=YOUR_KEY

# Reduce tracing latency if you are not in a serverless environment
# export LANGCHAIN_CALLBACKS_BACKGROUND=true

First, let's revisit the Q&A app we built in the RAG tutorial over the LLM Powered Autonomous Agents blog post by Lilian Weng.

Pick your chat model:

Install dependencies:

yarn add @langchain/openai 

Add environment variables:

OPENAI_API_KEY=your-api-key

Instantiate the model:

import { ChatOpenAI } from "@langchain/openai";

const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});
import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { createRetrievalChain } from "langchain/chains/retrieval";
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";

// 1. Load, chunk and index the contents of the blog to create a retriever.
const loader = new CheerioWebBaseLoader(
  "https://lilianweng.github.io/posts/2023-06-23-agent/",
  {
    selector: ".post-content, .post-title, .post-header",
  }
);
const docs = await loader.load();

const textSplitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const splits = await textSplitter.splitDocuments(docs);
const vectorstore = await MemoryVectorStore.fromDocuments(
  splits,
  new OpenAIEmbeddings()
);
const retriever = vectorstore.asRetriever();

// 2. Incorporate the retriever into a question-answering chain.
const systemPrompt =
  "You are an assistant for question-answering tasks. " +
  "Use the following pieces of retrieved context to answer " +
  "the question. If you don't know the answer, say that you " +
  "don't know. Use three sentences maximum and keep the " +
  "answer concise." +
  "\n\n" +
  "{context}";

const prompt = ChatPromptTemplate.fromMessages([
  ["system", systemPrompt],
  ["human", "{input}"],
]);

const questionAnswerChain = await createStuffDocumentsChain({
  llm,
  prompt,
});

const ragChain = await createRetrievalChain({
  retriever,
  combineDocsChain: questionAnswerChain,
});
const response = await ragChain.invoke({
  input: "What is Task Decomposition?",
});
console.log(response.answer);
Task decomposition involves breaking down large and complex tasks into smaller, more manageable subgoals or steps. This approach helps agents or models efficiently handle intricate tasks by simplifying them into easier components. Task decomposition can be achieved through techniques like Chain of Thought, Tree of Thoughts, or by using task-specific instructions and human input.

Note that we have used the built-in chain constructors `createStuffDocumentsChain` and `createRetrievalChain`, so that the basic ingredients of our solution are:

  1. retriever;
  2. prompt;
  3. LLM.

This will simplify the process of incorporating chat history.

Adding chat history

The chain we have built uses the input query directly to retrieve relevant context. But in a conversational setting, the user query might require conversational context to be understood. For example, consider this exchange:

Human: "What is Task Decomposition?"

AI: "Task decomposition involves breaking down complex tasks into smaller and simpler steps to make them more manageable for an agent or model."

Human: "What are common ways of doing it?"

In order to answer the second question, our system needs to understand that "it" refers to "Task Decomposition."
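To see the problem concretely, here is a quick illustrative sketch (not part of the final app; it reuses the `retriever` defined above): handing the raw follow-up straight to the retriever gives similarity search little to anchor on, because the phrase "task decomposition" never appears in the query.

// Illustrative only: retrieve with the bare follow-up question. The fetched
// chunks may miss the task-decomposition section of the post entirely.
const naiveDocs = await retriever.invoke("What are common ways of doing it?");
console.log(naiveDocs.map((doc) => doc.pageContent.slice(0, 80)));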

We'll need to update two things about our existing app:

  1. **Prompt**: Update our prompt to support historical messages as an input.
  2. **Contextualizing questions**: Add a sub-chain that takes the latest user question and reformulates it in the context of the chat history. This can be thought of simply as building a new "history aware" retriever. Whereas before we had:
    • query -> retriever
      Now we will have:
    • (query, conversation history) -> LLM -> rephrased query -> retriever

Contextualizing the question

First we'll need to define a sub-chain that takes historical messages and the latest user question, and reformulates the question if it makes reference to any information in the historical messages.

We'll use a prompt that includes a `MessagesPlaceholder` variable under the name "chat_history". This allows us to pass in a list of messages to the prompt using the "chat_history" input key, and these messages will be inserted after the system message and before the human message containing the latest question.

Note that we leverage a helper function createHistoryAwareRetriever for this step, which manages the case where `chat_history` is empty, and otherwise applies `prompt.pipe(llm).pipe(new StringOutputParser()).pipe(retriever)` in sequence.

`createHistoryAwareRetriever` constructs a chain that accepts keys `input` and `chat_history` as input, and has the same output schema as a retriever.

import { createHistoryAwareRetriever } from "langchain/chains/history_aware_retriever";
import { MessagesPlaceholder } from "@langchain/core/prompts";

const contextualizeQSystemPrompt =
  "Given a chat history and the latest user question " +
  "which might reference context in the chat history, " +
  "formulate a standalone question which can be understood " +
  "without the chat history. Do NOT answer the question, " +
  "just reformulate it if needed and otherwise return it as is.";

const contextualizeQPrompt = ChatPromptTemplate.fromMessages([
  ["system", contextualizeQSystemPrompt],
  new MessagesPlaceholder("chat_history"),
  ["human", "{input}"],
]);

const historyAwareRetriever = await createHistoryAwareRetriever({
  llm,
  retriever,
  rephrasePrompt: contextualizeQPrompt,
});

This chain prepends a rephrasing of the input query to our retriever, so that the retrieval incorporates the context of the conversation.
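If you want to see the rephrased question itself, you can run the prompt and LLM without the retriever attached. This is an optional debugging sketch under the same setup as above; the `StringOutputParser` import and the sample history messages are our own additions, not part of the chain we keep.

import { StringOutputParser } from "@langchain/core/output_parsers";
import { HumanMessage, AIMessage } from "@langchain/core/messages";

// Sketch: inspect the standalone question the LLM produces before retrieval.
const rephraseChain = contextualizeQPrompt
  .pipe(llm)
  .pipe(new StringOutputParser());

console.log(
  await rephraseChain.invoke({
    input: "What are common ways of doing it?",
    chat_history: [
      new HumanMessage("What is Task Decomposition?"),
      new AIMessage("It means breaking a task into smaller, simpler steps."),
    ],
  })
);
// e.g. "What are common ways of doing task decomposition?"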

Now we can build our full QA chain. This is as simple as updating the retriever to be our new `historyAwareRetriever`.

Again, we will use createStuffDocumentsChain to generate a `questionAnswerChain2`, with input keys `context`, `chat_history`, and `input`: it accepts the retrieved context alongside the conversation history and query to generate an answer. A more detailed explanation is over here.

We build our final `ragChain2` with createRetrievalChain. This chain applies the `historyAwareRetriever` and `questionAnswerChain2` in sequence, retaining intermediate outputs such as the retrieved context for convenience. It has input keys `input` and `chat_history`, and includes `input`, `chat_history`, `context`, and `answer` in its output.

const qaPrompt = ChatPromptTemplate.fromMessages([
  ["system", systemPrompt],
  new MessagesPlaceholder("chat_history"),
  ["human", "{input}"],
]);

const questionAnswerChain2 = await createStuffDocumentsChain({
  llm,
  prompt: qaPrompt,
});

const ragChain2 = await createRetrievalChain({
  retriever: historyAwareRetriever,
  combineDocsChain: questionAnswerChain2,
});

Let's try this. Below we ask a question and a follow-up question that requires contextualization to return a sensible response. Because our chain includes a `"chat_history"` input, the caller needs to manage the chat history. We can achieve this by appending input and output messages to a list:

import { BaseMessage, HumanMessage, AIMessage } from "@langchain/core/messages";

let chatHistory: BaseMessage[] = [];

const question = "What is Task Decomposition?";
const aiMsg1 = await ragChain2.invoke({
  input: question,
  chat_history: chatHistory,
});
chatHistory = chatHistory.concat([
  new HumanMessage(question),
  new AIMessage(aiMsg1.answer),
]);

const secondQuestion = "What are common ways of doing it?";
const aiMsg2 = await ragChain2.invoke({
  input: secondQuestion,
  chat_history: chatHistory,
});

console.log(aiMsg2.answer);
Common ways of doing Task Decomposition include:
1. Using simple prompting with an LLM, such as asking it to outline steps or subgoals for a task.
2. Employing task-specific instructions, like "Write a story outline" for writing a novel.
3. Incorporating human inputs for guidance.
Additionally, advanced approaches like Chain of Thought (CoT) and Tree of Thoughts (ToT) can further refine the process, and using an external classical planner with PDDL (as in LLM+P) is another option.

Stateful management of chat history

Here we've gone over how to add application logic for incorporating historical outputs, but we're still manually updating the chat history and inserting it into each input. In a real Q&A application we'll want some way of persisting chat history, and some way of automatically inserting and updating it.

For this we can use:

  • `BaseChatMessageHistory`: stores the chat history.
  • `RunnableWithMessageHistory`: a wrapper for an LCEL chain and a `BaseChatMessageHistory` that handles injecting chat history into inputs and updating it after each invocation.

For a detailed walkthrough of how to use these classes together to create a stateful conversational chain, head to the How to add message history (memory) LCEL page.

Instances of `RunnableWithMessageHistory` manage the chat history for you. They accept a config with a key (`"sessionId"` by default) that specifies what conversation history to fetch and prepend to the input, and they append the output to the same conversation history. Below is an example:

import { RunnableWithMessageHistory } from "@langchain/core/runnables";
import { ChatMessageHistory } from "langchain/stores/message/in_memory";

const demoEphemeralChatMessageHistoryForChain = new ChatMessageHistory();

const conversationalRagChain = new RunnableWithMessageHistory({
  runnable: ragChain2,
  getMessageHistory: (_sessionId) => demoEphemeralChatMessageHistoryForChain,
  inputMessagesKey: "input",
  historyMessagesKey: "chat_history",
  outputMessagesKey: "answer",
});

const result1 = await conversationalRagChain.invoke(
  { input: "What is Task Decomposition?" },
  { configurable: { sessionId: "abc123" } }
);
console.log(result1.answer);
Task Decomposition involves breaking down complicated tasks into smaller, more manageable subgoals. Techniques such as the Chain of Thought (CoT) and Tree of Thoughts extend this by decomposing problems into multiple thought steps and exploring multiple reasoning possibilities at each step. LLMs can perform task decomposition using simple prompts, task-specific instructions, or human inputs, and some approaches like LLM+P involve using external classical planners.
const result2 = await conversationalRagChain.invoke(
  { input: "What are common ways of doing it?" },
  { configurable: { sessionId: "abc123" } }
);
console.log(result2.answer);
Common ways of doing task decomposition include:

1. Using simple prompting with an LLM, such as "Steps for XYZ.\n1." or "What are the subgoals for achieving XYZ?"
2. Utilizing task-specific instructions, like "Write a story outline." for writing a novel.
3. Incorporating human inputs to guide and refine the decomposition process.

Additionally, the LLM+P approach utilizes an external classical planner, involving PDDL to describe and plan complex tasks.
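Because `RunnableWithMessageHistory` appends each turn for us, the in-memory history now holds both question/answer pairs. A quick way to verify this (a sketch; `getMessages()` and `_getType()` are methods on the history and message classes):

// Inspect the automatically managed history: two human/AI pairs by now.
const storedMessages =
  await demoEphemeralChatMessageHistoryForChain.getMessages();
console.log(storedMessages.map((message) => message._getType()));
// e.g. [ "human", "ai", "human", "ai" ]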

Tying it together

For convenience, we tie together all of the necessary steps in a single code cell:

import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings, ChatOpenAI } from "@langchain/openai";
import {
  ChatPromptTemplate,
  MessagesPlaceholder,
} from "@langchain/core/prompts";
import { createHistoryAwareRetriever } from "langchain/chains/history_aware_retriever";
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
import { createRetrievalChain } from "langchain/chains/retrieval";
import { RunnableWithMessageHistory } from "@langchain/core/runnables";
import { ChatMessageHistory } from "langchain/stores/message/in_memory";
import { BaseChatMessageHistory } from "@langchain/core/chat_history";

const llm2 = new ChatOpenAI({ model: "gpt-3.5-turbo", temperature: 0 });

// Construct retriever
const loader2 = new CheerioWebBaseLoader(
  "https://lilianweng.github.io/posts/2023-06-23-agent/",
  {
    selector: ".post-content, .post-title, .post-header",
  }
);

const docs2 = await loader2.load();

const textSplitter2 = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const splits2 = await textSplitter2.splitDocuments(docs2);
const vectorstore2 = await MemoryVectorStore.fromDocuments(
  splits2,
  new OpenAIEmbeddings()
);
const retriever2 = vectorstore2.asRetriever();

// Contextualize question
const contextualizeQSystemPrompt2 =
  "Given a chat history and the latest user question " +
  "which might reference context in the chat history, " +
  "formulate a standalone question which can be understood " +
  "without the chat history. Do NOT answer the question, " +
  "just reformulate it if needed and otherwise return it as is.";

const contextualizeQPrompt2 = ChatPromptTemplate.fromMessages([
  ["system", contextualizeQSystemPrompt2],
  new MessagesPlaceholder("chat_history"),
  ["human", "{input}"],
]);

const historyAwareRetriever2 = await createHistoryAwareRetriever({
  llm: llm2,
  retriever: retriever2,
  rephrasePrompt: contextualizeQPrompt2,
});

// Answer question
const systemPrompt2 =
  "You are an assistant for question-answering tasks. " +
  "Use the following pieces of retrieved context to answer " +
  "the question. If you don't know the answer, say that you " +
  "don't know. Use three sentences maximum and keep the " +
  "answer concise." +
  "\n\n" +
  "{context}";

const qaPrompt2 = ChatPromptTemplate.fromMessages([
  ["system", systemPrompt2],
  new MessagesPlaceholder("chat_history"),
  ["human", "{input}"],
]);

const questionAnswerChain3 = await createStuffDocumentsChain({
  llm: llm2,
  prompt: qaPrompt2,
});

const ragChain3 = await createRetrievalChain({
  retriever: historyAwareRetriever2,
  combineDocsChain: questionAnswerChain3,
});

// Statefully manage chat history
const store2: Record<string, BaseChatMessageHistory> = {};

function getSessionHistory2(sessionId: string): BaseChatMessageHistory {
  if (!(sessionId in store2)) {
    store2[sessionId] = new ChatMessageHistory();
  }
  return store2[sessionId];
}

const conversationalRagChain2 = new RunnableWithMessageHistory({
  runnable: ragChain3,
  getMessageHistory: getSessionHistory2,
  inputMessagesKey: "input",
  historyMessagesKey: "chat_history",
  outputMessagesKey: "answer",
});

// Example usage
const query2 = "What is Task Decomposition?";

for await (const s of await conversationalRagChain2.stream(
  { input: query2 },
  { configurable: { sessionId: "unique_session_id" } }
)) {
  console.log(s);
  console.log("----");
}
{ input: 'What is Task Decomposition?' }
----
{ chat_history: [] }
----
{
context: [
Document {
pageContent: 'Fig. 1. Overview of a LLM-powered autonomous agent system.\n' +
'Component One: Planning#\n' +
'A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\n' +
'Task Decomposition#\n' +
'Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.\n' +
'Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.',
metadata: [Object],
id: undefined
},
Document {
pageContent: 'Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.\n' +
'Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural language. Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains.\n' +
'Self-Reflection#',
metadata: [Object],
id: undefined
},
Document {
pageContent: 'Planning\n' +
'\n' +
'Subgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.\n' +
'Reflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.\n' +
'\n' +
'\n' +
'Memory\n' +
'\n' +
'Short-term memory: I would consider all the in-context learning (See Prompt Engineering) as utilizing short-term memory of the model to learn.\n' +
'Long-term memory: This provides the agent with the capability to retain and recall (infinite) information over extended periods, often by leveraging an external vector store and fast retrieval.\n' +
'\n' +
'\n' +
'Tool use\n' +
'\n' +
'The agent learns to call external APIs for extra information that is missing from the model weights (often hard to change after pre-training), including current information, code execution capability, access to proprietary information sources and more.',
metadata: [Object],
id: undefined
},
Document {
pageContent: 'Resources:\n' +
'1. Internet access for searches and information gathering.\n' +
'2. Long Term memory management.\n' +
'3. GPT-3.5 powered Agents for delegation of simple tasks.\n' +
'4. File output.\n' +
'\n' +
'Performance Evaluation:\n' +
'1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.\n' +
'2. Constructively self-criticize your big-picture behavior constantly.\n' +
'3. Reflect on past decisions and strategies to refine your approach.\n' +
'4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.',
metadata: [Object],
id: undefined
}
]
}
----
{ answer: '' }
----
{ answer: 'Task' }
----
{ answer: ' decomposition' }
----
{ answer: ' involves' }
----
{ answer: ' breaking' }
----
{ answer: ' down' }
----
{ answer: ' a' }
----
{ answer: ' complex' }
----
{ answer: ' task' }
----
{ answer: ' into' }
----
{ answer: ' smaller' }
----
{ answer: ' and' }
----
{ answer: ' more' }
----
{ answer: ' manageable' }
----
{ answer: ' sub' }
----
{ answer: 'goals' }
----
{ answer: ' or' }
----
{ answer: ' steps' }
----
{ answer: '.' }
----
{ answer: ' This' }
----
{ answer: ' process' }
----
{ answer: ' allows' }
----
{ answer: ' an' }
----
{ answer: ' agent' }
----
{ answer: ' or' }
----
{ answer: ' model' }
----
{ answer: ' to' }
----
{ answer: ' efficiently' }
----
{ answer: ' handle' }
----
{ answer: ' intricate' }
----
{ answer: ' tasks' }
----
{ answer: ' by' }
----
{ answer: ' dividing' }
----
{ answer: ' them' }
----
{ answer: ' into' }
----
{ answer: ' simpler' }
----
{ answer: ' components' }
----
{ answer: '.' }
----
{ answer: ' Task' }
----
{ answer: ' decomposition' }
----
{ answer: ' can' }
----
{ answer: ' be' }
----
{ answer: ' achieved' }
----
{ answer: ' through' }
----
{ answer: ' techniques' }
----
{ answer: ' like' }
----
{ answer: ' Chain' }
----
{ answer: ' of' }
----
{ answer: ' Thought' }
----
{ answer: ',' }
----
{ answer: ' Tree' }
----
{ answer: ' of' }
----
{ answer: ' Thoughts' }
----
{ answer: ',' }
----
{ answer: ' or' }
----
{ answer: ' by' }
----
{ answer: ' using' }
----
{ answer: ' task' }
----
{ answer: '-specific' }
----
{ answer: ' instructions' }
----
{ answer: '.' }
----
{ answer: '' }
----
{ answer: '' }
----

Agents

Agents leverage the reasoning capabilities of LLMs to make decisions during execution. Using agents allows you to offload some discretion over the retrieval process. Although their behavior is less predictable than chains, they offer some advantages in this context:

  • Agents generate the input to the retriever directly, without necessarily needing us to explicitly build in contextualization, as we did above;
  • Agents can execute multiple retrieval steps in service of a query, or refrain from executing a retrieval step altogether (e.g., in response to a generic greeting from a user).

Retrieval tool

Agents can access "tools" and manage their execution. In this case, we will convert our retriever into a LangChain tool to be wielded by the agent:

import { createRetrieverTool } from "langchain/tools/retriever";

const tool = createRetrieverTool(retriever, {
  name: "blog_post_retriever",
  description:
    "Searches and returns excerpts from the Autonomous Agents blog post.",
});
const tools = [tool];

Tools are LangChain Runnables, and implement the usual interface:

console.log(await tool.invoke({ query: "task decomposition" }));
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.
Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural language. Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains.
Self-Reflection#

Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.
Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.

(3) Task execution: Expert models execute on the specific tasks and log results.
Instruction:

With the input and the inference results, the AI assistant needs to describe the process and results. The previous stages can be formed as - User Input: {{ User Input }}, Task Planning: {{ Tasks }}, Model Selection: {{ Model Assignment }}, Task Execution: {{ Predictions }}. You must first answer the user's request in a straightforward manner. Then describe the task process and show your analysis and model inference results to the user in the first person. If inference results contain a file path, must tell the user the complete file path.

Resources:
1. Internet access for searches and information gathering.
2. Long Term memory management.
3. GPT-3.5 powered Agents for delegation of simple tasks.
4. File output.

Performance Evaluation:
1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.
2. Constructively self-criticize your big-picture behavior constantly.
3. Reflect on past decisions and strategies to refine your approach.
4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.

Agent constructor

Now that we have defined the tools and the LLM, we can create the agent. We will be using LangGraph to construct the agent. Currently we are using a high-level interface to construct the agent, but the nice thing about LangGraph is that this high-level interface is backed by a low-level, highly controllable API in case you want to modify the agent logic.

import { createReactAgent } from "@langchain/langgraph/prebuilt";

const agentExecutor = createReactAgent({ llm, tools });

We can now try it out. Note that so far it is not stateful (we still need to add in memory):

const query = "What is Task Decomposition?";

for await (const s of await agentExecutor.stream({
  messages: [new HumanMessage(query)],
})) {
  console.log(s);
  console.log("----");
}
{
agent: {
messages: [
AIMessage {
"id": "chatcmpl-ABABtUmgD1ZlOHZd0nD9TR8yb3mMe",
"content": "",
"additional_kwargs": {
"tool_calls": [
{
"id": "call_dWxEY41mg9VSLamVYHltsUxL",
"type": "function",
"function": "[Object]"
}
]
},
"response_metadata": {
"tokenUsage": {
"completionTokens": 19,
"promptTokens": 66,
"totalTokens": 85
},
"finish_reason": "tool_calls",
"system_fingerprint": "fp_3537616b13"
},
"tool_calls": [
{
"name": "blog_post_retriever",
"args": {
"query": "Task Decomposition"
},
"type": "tool_call",
"id": "call_dWxEY41mg9VSLamVYHltsUxL"
}
],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 66,
"output_tokens": 19,
"total_tokens": 85
}
}
]
}
}
----
{
tools: {
messages: [
ToolMessage {
"content": "Fig. 1. Overview of a LLM-powered autonomous agent system.\nComponent One: Planning#\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\nTask Decomposition#\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\n\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.\nAnother quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural language. Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains.\nSelf-Reflection#\n\n(3) Task execution: Expert models execute on the specific tasks and log results.\nInstruction:\n\nWith the input and the inference results, the AI assistant needs to describe the process and results. The previous stages can be formed as - User Input: {{ User Input }}, Task Planning: {{ Tasks }}, Model Selection: {{ Model Assignment }}, Task Execution: {{ Predictions }}. You must first answer the user's request in a straightforward manner. Then describe the task process and show your analysis and model inference results to the user in the first person. 
If inference results contain a file path, must tell the user the complete file path.\n\nPlanning\n\nSubgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.\nReflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.\n\n\nMemory\n\nShort-term memory: I would consider all the in-context learning (See Prompt Engineering) as utilizing short-term memory of the model to learn.\nLong-term memory: This provides the agent with the capability to retain and recall (infinite) information over extended periods, often by leveraging an external vector store and fast retrieval.\n\n\nTool use\n\nThe agent learns to call external APIs for extra information that is missing from the model weights (often hard to change after pre-training), including current information, code execution capability, access to proprietary information sources and more.",
"name": "blog_post_retriever",
"additional_kwargs": {},
"response_metadata": {},
"tool_call_id": "call_dWxEY41mg9VSLamVYHltsUxL"
}
]
}
}
----
{
agent: {
messages: [
AIMessage {
"id": "chatcmpl-ABABuSj5FHmHFdeR2Pv7Cxcmq5aQz",
"content": "Task Decomposition is a technique that allows an agent to break down a complex task into smaller, more manageable subtasks or steps. The primary goal is to simplify the task to ensure efficient execution and better understanding. \n\n### Methods in Task Decomposition:\n1. **Chain of Thought (CoT)**:\n - **Description**: This technique involves instructing the model to “think step by step” to decompose hard tasks into smaller ones. It transforms large tasks into multiple manageable tasks, enhancing the model's performance and providing insight into its thinking process. \n - **Example**: When given a complex problem, the model outlines sequential steps to reach a solution.\n\n2. **Tree of Thoughts**:\n - **Description**: This extends CoT by exploring multiple reasoning possibilities at each step. The problem is decomposed into multiple thought steps, with several thoughts generated per step, forming a sort of decision tree.\n - **Example**: For a given task, the model might consider various alternative actions at each stage, evaluating each before proceeding.\n\n3. **LLM with Prompts**:\n - **Description**: Basic task decomposition can be done via simple prompts like \"Steps for XYZ\" or \"What are the subgoals for achieving XYZ?\" This can also be guided by task-specific instructions or human inputs when necessary.\n - **Example**: Asking the model to list the subgoals for writing a novel might produce an outline broken down into chapters, character development, and plot points.\n\n4. **LLM+P**:\n - **Description**: This approach involves outsourcing long-term planning to an external classical planner using Planning Domain Definition Language (PDDL). The task is translated into a PDDL problem by the model, planned using classical planning tools, and then translated back into natural language.\n - **Example**: In robotics, translating a task into PDDL and then using a domain-specific planner to generate a sequence of actions.\n\n### Applications:\n- **Planning**: Helps an agent plan tasks by breaking them into clear, manageable steps.\n- **Self-Reflection**: Allows agents to reflect and refine their actions, learning from past mistakes to improve future performance.\n- **Memory**: Utilizes short-term memory for immediate context and long-term memory for retaining and recalling information over extended periods.\n- **Tool Use**: Enables the agent to call external APIs for additional information or capabilities not inherent in the model.\n\nIn essence, task decomposition leverages various methodologies to simplify complex tasks, ensuring better performance, improved reasoning, and effective task execution.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 522,
"promptTokens": 821,
"totalTokens": 1343
},
"finish_reason": "stop",
"system_fingerprint": "fp_e375328146"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 821,
"output_tokens": 522,
"total_tokens": 1343
}
}
]
}
}
----

LangGraph comes with built-in persistence, so we don't need to use ChatMessageHistory! Instead, we can pass in a checkpointer to our LangGraph agent directly:

import { MemorySaver } from "@langchain/langgraph";

const memory = new MemorySaver();

const agentExecutorWithMemory = createReactAgent({
  llm,
  tools,
  checkpointSaver: memory,
});

This is all we need to construct a conversational RAG agent.

Let's observe its behavior. Note that if we input a query that does not require a retrieval step, the agent does not execute one:

const config = { configurable: { thread_id: "abc123" } };

for await (const s of await agentExecutorWithMemory.stream(
  { messages: [new HumanMessage("Hi! I'm bob")] },
  config
)) {
  console.log(s);
  console.log("----");
}
{
agent: {
messages: [
AIMessage {
"id": "chatcmpl-ABACGc1vDPUSHYN7YVkuUMwpKR20P",
"content": "Hello, Bob! How can I assist you today?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 12,
"promptTokens": 64,
"totalTokens": 76
},
"finish_reason": "stop",
"system_fingerprint": "fp_e375328146"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 64,
"output_tokens": 12,
"total_tokens": 76
}
}
]
}
}
----

Further, if we input a query that does require a retrieval step, the agent generates the input to the tool:

for await (const s of await agentExecutorWithMemory.stream(
  { messages: [new HumanMessage(query)] },
  config
)) {
  console.log(s);
  console.log("----");
}
{
agent: {
messages: [
AIMessage {
"id": "chatcmpl-ABACI6WN7hkfJjFhIUBGt3TswtPOv",
"content": "",
"additional_kwargs": {
"tool_calls": [
{
"id": "call_Lys2G4TbOMJ6RBuVvKnFSK4V",
"type": "function",
"function": "[Object]"
}
]
},
"response_metadata": {
"tokenUsage": {
"completionTokens": 19,
"promptTokens": 89,
"totalTokens": 108
},
"finish_reason": "tool_calls",
"system_fingerprint": "fp_f82f5b050c"
},
"tool_calls": [
{
"name": "blog_post_retriever",
"args": {
"query": "Task Decomposition"
},
"type": "tool_call",
"id": "call_Lys2G4TbOMJ6RBuVvKnFSK4V"
}
],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 89,
"output_tokens": 19,
"total_tokens": 108
}
}
]
}
}
----
{
tools: {
messages: [
ToolMessage {
"content": "Fig. 1. Overview of a LLM-powered autonomous agent system.\nComponent One: Planning#\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\nTask Decomposition#\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\n\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.\nAnother quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural language. Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains.\nSelf-Reflection#\n\n(3) Task execution: Expert models execute on the specific tasks and log results.\nInstruction:\n\nWith the input and the inference results, the AI assistant needs to describe the process and results. The previous stages can be formed as - User Input: {{ User Input }}, Task Planning: {{ Tasks }}, Model Selection: {{ Model Assignment }}, Task Execution: {{ Predictions }}. You must first answer the user's request in a straightforward manner. Then describe the task process and show your analysis and model inference results to the user in the first person. 
If inference results contain a file path, must tell the user the complete file path.\n\nPlanning\n\nSubgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.\nReflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.\n\n\nMemory\n\nShort-term memory: I would consider all the in-context learning (See Prompt Engineering) as utilizing short-term memory of the model to learn.\nLong-term memory: This provides the agent with the capability to retain and recall (infinite) information over extended periods, often by leveraging an external vector store and fast retrieval.\n\n\nTool use\n\nThe agent learns to call external APIs for extra information that is missing from the model weights (often hard to change after pre-training), including current information, code execution capability, access to proprietary information sources and more.",
"name": "blog_post_retriever",
"additional_kwargs": {},
"response_metadata": {},
"tool_call_id": "call_Lys2G4TbOMJ6RBuVvKnFSK4V"
}
]
}
}
----
{
agent: {
messages: [
AIMessage {
"id": "chatcmpl-ABACJu56eYSAyyMNaV9UEUwHS8vRu",
"content": "Task Decomposition is a method used to break down complicated tasks into smaller, more manageable steps. This approach leverages the \"Chain of Thought\" (CoT) technique, which prompts models to \"think step by step\" to enhance performance on complex tasks. Here’s a summary of the key concepts related to Task Decomposition:\n\n1. **Chain of Thought (CoT):**\n - A prompting technique that encourages models to decompose hard tasks into simpler steps, transforming big tasks into multiple manageable sub-tasks.\n - CoT helps to provide insights into the model’s thinking process.\n\n2. **Tree of Thoughts:**\n - An extension of CoT, this approach explores multiple reasoning paths at each step.\n - It creates a tree structure by generating multiple thoughts per step, and uses search methods like breadth-first search (BFS) or depth-first search (DFS) to explore these thoughts.\n - Each state is evaluated by a classifier or majority vote.\n\n3. **Methods for Task Decomposition:**\n - Simple prompting such as instructing with phrases like \"Steps for XYZ: 1., 2., 3.\" or \"What are the subgoals for achieving XYZ?\".\n - Using task-specific instructions like \"Write a story outline\" for specific tasks such as writing a novel.\n - Incorporating human inputs for better granularity.\n\n4. **LLM+P (Long-horizon Planning):**\n - A method that involves using an external classical planner for long-horizon planning.\n - The process involves translating the problem into a Planning Domain Definition Language (PDDL) problem, using a classical planner to generate a PDDL plan, and then translating it back into natural language.\n\nTask Decomposition is essential in planning complex tasks, allowing for efficient handling by breaking them into sub-tasks and sub-goals. This process is integral to the functioning of autonomous agent systems and enhances their capability to execute intricate tasks effectively.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 396,
"promptTokens": 844,
"totalTokens": 1240
},
"finish_reason": "stop",
"system_fingerprint": "fp_9f2bfdaa89"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 844,
"output_tokens": 396,
"total_tokens": 1240
}
}
]
}
}
----

Above, instead of inserting our query verbatim into the tool, the agent stripped unnecessary words like "what" and "is".

This same principle allows the agent to use the context of the conversation when necessary:

const query3 =
  "What according to the blog post are common ways of doing it? redo the search";

for await (const s of await agentExecutorWithMemory.stream(
  { messages: [new HumanMessage(query3)] },
  config
)) {
  console.log(s);
  console.log("----");
}
{
agent: {
messages: [
AIMessage {
"id": "chatcmpl-ABACPZzSugzrREQRO4mVQfI3cQOeL",
"content": "",
"additional_kwargs": {
"tool_calls": [
{
"id": "call_5nSZb396Tcg73Pok6Bx1XV8b",
"type": "function",
"function": "[Object]"
}
]
},
"response_metadata": {
"tokenUsage": {
"completionTokens": 22,
"promptTokens": 1263,
"totalTokens": 1285
},
"finish_reason": "tool_calls",
"system_fingerprint": "fp_9f2bfdaa89"
},
"tool_calls": [
{
"name": "blog_post_retriever",
"args": {
"query": "common ways of doing task decomposition"
},
"type": "tool_call",
"id": "call_5nSZb396Tcg73Pok6Bx1XV8b"
}
],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 1263,
"output_tokens": 22,
"total_tokens": 1285
}
}
]
}
}
----
{
tools: {
messages: [
ToolMessage {
"content": "Fig. 1. Overview of a LLM-powered autonomous agent system.\nComponent One: Planning#\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\nTask Decomposition#\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\n\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.\nAnother quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural language. Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains.\nSelf-Reflection#\n\nPlanning\n\nSubgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.\nReflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.\n\n\nMemory\n\nShort-term memory: I would consider all the in-context learning (See Prompt Engineering) as utilizing short-term memory of the model to learn.\nLong-term memory: This provides the agent with the capability to retain and recall (infinite) information over extended periods, often by leveraging an external vector store and fast retrieval.\n\n\nTool use\n\nThe agent learns to call external APIs for extra information that is missing from the model weights (often hard to change after pre-training), including current information, code execution capability, access to proprietary information sources and more.\n\nResources:\n1. Internet access for searches and information gathering.\n2. Long Term memory management.\n3. GPT-3.5 powered Agents for delegation of simple tasks.\n4. File output.\n\nPerformance Evaluation:\n1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.\n2. Constructively self-criticize your big-picture behavior constantly.\n3. Reflect on past decisions and strategies to refine your approach.\n4. Every command has a cost, so be smart and efficient. 
Aim to complete tasks in the least number of steps.",
"name": "blog_post_retriever",
"additional_kwargs": {},
"response_metadata": {},
"tool_call_id": "call_5nSZb396Tcg73Pok6Bx1XV8b"
}
]
}
}
----
{
agent: {
messages: [
AIMessage {
"id": "chatcmpl-ABACQt9pT5dKCTaGQpVawcmCCWdET",
"content": "According to the blog post, common ways of performing Task Decomposition include:\n\n1. **Using Large Language Models (LLMs) with Simple Prompting:**\n - Providing clear and structured prompts such as \"Steps for XYZ: 1., 2., 3.\" or asking \"What are the subgoals for achieving XYZ?\"\n - This allows the model to break down the tasks step-by-step.\n\n2. **Task-Specific Instructions:**\n - Employing specific instructions tailored to the task at hand, for example, \"Write a story outline\" for writing a novel.\n - These instructions guide the model in decomposing the task appropriately.\n\n3. **Involving Human Inputs:**\n - Integrating insights and directives from humans to aid in the decomposition process.\n - This can ensure that the breakdown is comprehensive and accurately reflects the nuances of the task.\n\n4. **LLM+P Approach for Long-Horizon Planning:**\n - Utilizing an external classical planner by translating the problem into Planning Domain Definition Language (PDDL).\n - The process involves:\n 1. Translating the problem into “Problem PDDL”.\n 2. Requesting a classical planner to generate a PDDL plan based on an existing “Domain PDDL”.\n 3. Translating the PDDL plan back into natural language.\n\nThese methods enable effective management and execution of complex tasks by transforming them into simpler, more manageable components.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 292,
"promptTokens": 2010,
"totalTokens": 2302
},
"finish_reason": "stop",
"system_fingerprint": "fp_9f2bfdaa89"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 2010,
"output_tokens": 292,
"total_tokens": 2302
}
}
]
}
}
----

Note that the agent was able to infer that "it" in our query refers to "task decomposition", and generated a reasonable search query as a result; in this case, "common ways of doing task decomposition".
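Since the conversation is checkpointed, we can also read back the persisted state for this thread. This is a sketch assuming LangGraph's `getState` checkpoint-inspection method; the exact snapshot shape may vary by version.

// Fetch the checkpointed state for thread "abc123" and count its messages.
const snapshot = await agentExecutorWithMemory.getState(config);
console.log(snapshot.values.messages.length);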

Tying it together

For convenience, we tie together all of the necessary steps in a single code cell:

import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
import { MemorySaver } from "@langchain/langgraph";
import { createReactAgent } from "@langchain/langgraph/prebuilt";
import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { createRetrieverTool } from "langchain/tools/retriever";

const memory3 = new MemorySaver();
const llm3 = new ChatOpenAI({ model: "gpt-4o", temperature: 0 });

// Construct retriever
const loader3 = new CheerioWebBaseLoader(
  "https://lilianweng.github.io/posts/2023-06-23-agent/",
  {
    selector: ".post-content, .post-title, .post-header",
  }
);

const docs3 = await loader3.load();

const textSplitter3 = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const splits3 = await textSplitter3.splitDocuments(docs3);
const vectorstore3 = await MemoryVectorStore.fromDocuments(
  splits3,
  new OpenAIEmbeddings()
);
const retriever3 = vectorstore3.asRetriever();

// Build retriever tool
const tool3 = createRetrieverTool(retriever3, {
  name: "blog_post_retriever",
  description:
    "Searches and returns excerpts from the Autonomous Agents blog post.",
});
const tools3 = [tool3];

const agentExecutor3 = createReactAgent({
  llm: llm3,
  tools: tools3,
  checkpointSaver: memory3,
});
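As a quick sanity check (not part of the original cell), we can exercise the assembled agent just as before; the `HumanMessage` import and the thread id below are our own choices:

import { HumanMessage } from "@langchain/core/messages";

// Stream a query under a thread id so MemorySaver persists the conversation.
const config3 = { configurable: { thread_id: "demo-thread" } };

for await (const step of await agentExecutor3.stream(
  { messages: [new HumanMessage("What is Task Decomposition?")] },
  config3
)) {
  console.log(step);
  console.log("----");
}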

Next steps

We've covered the steps to build a basic conversational Q&A application:

  • We used chains to build a predictable application that generates search queries for each user input;
  • We used agents to build an application that "decides" when and how to generate search queries.

To explore different types of retrievers and retrieval strategies, visit the retrievers section of the how-to guides.

For a detailed walkthrough of LangChain's conversation memory abstractions, visit the How to add message history (memory) LCEL page.

