Build a Retrieval Augmented Generation (RAG) App: Part 2
In many Q&A applications we want to allow the user to have a back-and-forth conversation, meaning the application needs some sort of "memory" of past questions and answers, and some logic for incorporating those into its current thinking.
This is the second part of a multi-part tutorial.
Here we focus on adding logic for incorporating historical messages. This involves the management of chat history.
We will cover two approaches: chains, which execute at most a single retrieval step per user input, and agents, which give the LLM discretion to execute multiple retrieval steps.
For the external knowledge source, we will use the same LLM Powered Autonomous Agents blog post by Lilian Weng from the Part 1 RAG tutorial.
Setup
Components
We will need to select three components from LangChain's suite of integrations.
A chat model
Pick your chat model:
- Groq
- OpenAI
- Anthropic
- FireworksAI
- MistralAI
- VertexAI
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/groq
yarn add @langchain/groq
pnpm add @langchain/groq
Add environment variables
GROQ_API_KEY=your-api-key
Instantiate the model
import { ChatGroq } from "@langchain/groq";
const llm = new ChatGroq({
model: "llama-3.3-70b-versatile",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/openai
yarn add @langchain/openai
pnpm add @langchain/openai
Add environment variables
OPENAI_API_KEY=your-api-key
Instantiate the model
import { ChatOpenAI } from "@langchain/openai";
const llm = new ChatOpenAI({
model: "gpt-4o-mini",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/anthropic
yarn add @langchain/anthropic
pnpm add @langchain/anthropic
Add environment variables
ANTHROPIC_API_KEY=your-api-key
Instantiate the model
import { ChatAnthropic } from "@langchain/anthropic";
const llm = new ChatAnthropic({
model: "claude-3-5-sonnet-20240620",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/community
yarn add @langchain/community
pnpm add @langchain/community
Add environment variables
FIREWORKS_API_KEY=your-api-key
Instantiate the model
import { ChatFireworks } from "@langchain/community/chat_models/fireworks";
const llm = new ChatFireworks({
model: "accounts/fireworks/models/llama-v3p1-70b-instruct",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/mistralai
yarn add @langchain/mistralai
pnpm add @langchain/mistralai
Add environment variables
MISTRAL_API_KEY=your-api-key
Instantiate the model
import { ChatMistralAI } from "@langchain/mistralai";
const llm = new ChatMistralAI({
model: "mistral-large-latest",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/google-vertexai
yarn add @langchain/google-vertexai
pnpm add @langchain/google-vertexai
Add environment variables
GOOGLE_APPLICATION_CREDENTIALS=credentials.json
Instantiate the model
import { ChatVertexAI } from "@langchain/google-vertexai";
const llm = new ChatVertexAI({
model: "gemini-1.5-flash",
temperature: 0
});
An embedding model
Pick your embedding model:
- OpenAI
- Azure
- AWS
- VertexAI
- MistralAI
- Cohere
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/openai
yarn add @langchain/openai
pnpm add @langchain/openai
OPENAI_API_KEY=your-api-key
import { OpenAIEmbeddings } from "@langchain/openai";
const embeddings = new OpenAIEmbeddings({
model: "text-embedding-3-large"
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/openai
yarn add @langchain/openai
pnpm add @langchain/openai
AZURE_OPENAI_API_INSTANCE_NAME=<YOUR_INSTANCE_NAME>
AZURE_OPENAI_API_KEY=<YOUR_KEY>
AZURE_OPENAI_API_VERSION="2024-02-01"
import { AzureOpenAIEmbeddings } from "@langchain/openai";
const embeddings = new AzureOpenAIEmbeddings({
azureOpenAIApiEmbeddingsDeploymentName: "text-embedding-ada-002"
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/aws
yarn add @langchain/aws
pnpm add @langchain/aws
BEDROCK_AWS_REGION=your-region
import { BedrockEmbeddings } from "@langchain/aws";
const embeddings = new BedrockEmbeddings({
model: "amazon.titan-embed-text-v1"
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/google-vertexai
yarn add @langchain/google-vertexai
pnpm add @langchain/google-vertexai
GOOGLE_APPLICATION_CREDENTIALS=credentials.json
import { VertexAIEmbeddings } from "@langchain/google-vertexai";
const embeddings = new VertexAIEmbeddings({
model: "text-embedding-004"
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/mistralai
yarn add @langchain/mistralai
pnpm add @langchain/mistralai
MISTRAL_API_KEY=your-api-key
import { MistralAIEmbeddings } from "@langchain/mistralai";
const embeddings = new MistralAIEmbeddings({
model: "mistral-embed"
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/cohere
yarn add @langchain/cohere
pnpm add @langchain/cohere
COHERE_API_KEY=your-api-key
import { CohereEmbeddings } from "@langchain/cohere";
const embeddings = new CohereEmbeddings({
model: "embed-english-v3.0"
});
And a vector store
Pick your vector store:
- Memory
- Chroma
- FAISS
- MongoDB
- PGVector
- Pinecone
- Qdrant
Install dependencies
- npm
- yarn
- pnpm
npm i langchain
yarn add langchain
pnpm add langchain
import { MemoryVectorStore } from "langchain/vectorstores/memory";
const vectorStore = new MemoryVectorStore(embeddings);
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/community
yarn add @langchain/community
pnpm add @langchain/community
import { Chroma } from "@langchain/community/vectorstores/chroma";
const vectorStore = new Chroma(embeddings, {
collectionName: "a-test-collection",
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/community
yarn add @langchain/community
pnpm add @langchain/community
import { FaissStore } from "@langchain/community/vectorstores/faiss";
const vectorStore = new FaissStore(embeddings, {});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/mongodb
yarn add @langchain/mongodb
pnpm add @langchain/mongodb
import { MongoDBAtlasVectorSearch } from "@langchain/mongodb"
import { MongoClient } from "mongodb";
const client = new MongoClient(process.env.MONGODB_ATLAS_URI || "");
const collection = client
.db(process.env.MONGODB_ATLAS_DB_NAME)
.collection(process.env.MONGODB_ATLAS_COLLECTION_NAME);
const vectorStore = new MongoDBAtlasVectorSearch(embeddings, {
collection: collection,
indexName: "vector_index",
textKey: "text",
embeddingKey: "embedding",
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/community
yarn add @langchain/community
pnpm add @langchain/community
import { PGVectorStore } from "@langchain/community/vectorstores/pgvector";
const vectorStore = await PGVectorStore.initialize(embeddings, {})
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/pinecone
yarn add @langchain/pinecone
pnpm add @langchain/pinecone
import { PineconeStore } from "@langchain/pinecone";
import { Pinecone as PineconeClient } from "@pinecone-database/pinecone";
const pinecone = new PineconeClient();
// Assumes an existing index; the PINECONE_INDEX env var name here is illustrative.
const pineconeIndex = pinecone.Index(process.env.PINECONE_INDEX!);
const vectorStore = new PineconeStore(embeddings, {
pineconeIndex,
maxConcurrency: 5,
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/qdrant
yarn add @langchain/qdrant
pnpm add @langchain/qdrant
import { QdrantVectorStore } from "@langchain/qdrant";
const vectorStore = await QdrantVectorStore.fromExistingCollection(embeddings, {
url: process.env.QDRANT_URL,
collectionName: "langchainjs-testing",
});
Dependencies
In addition, we'll use the following packages:
- npm
- yarn
- pnpm
npm i langchain @langchain/community @langchain/langgraph cheerio
yarn add langchain @langchain/community @langchain/langgraph cheerio
pnpm add langchain @langchain/community @langchain/langgraph cheerio
LangSmith
Many of the applications you build with LangChain will contain multiple steps with multiple invocations of LLM calls. As these applications get more and more complex, it becomes crucial to be able to inspect what exactly is going on inside your chain or agent. The best way to do this is with LangSmith.
Note that LangSmith is not needed, but it is helpful. If you do want to use LangSmith, after you sign up at the link above, make sure to set your environment variables to start logging traces:
export LANGSMITH_TRACING=true
export LANGSMITH_API_KEY=YOUR_KEY
# Reduce tracing latency if you are not in a serverless environment
# export LANGCHAIN_CALLBACKS_BACKGROUND=true
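If you prefer to configure this from code rather than the shell (for example in a notebook), the same variables can be assigned on process.env before any traced calls run. A minimal sketch; the values are placeholders:
// Equivalent setup from Node.js code; replace the placeholder key with your own.
process.env.LANGSMITH_TRACING = "true";
process.env.LANGSMITH_API_KEY = "YOUR_KEY";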
Chains
Let's first revisit the vector store we built in Part 1, which indexes Lilian Weng's LLM Powered Autonomous Agents blog post.
import "cheerio";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";
// Load and chunk contents of the blog
const pTagSelector = "p";
const cheerioLoader = new CheerioWebBaseLoader(
"https://lilianweng.github.io/posts/2023-06-23-agent/",
{
selector: pTagSelector,
}
);
const docs = await cheerioLoader.load();
const splitter = new RecursiveCharacterTextSplitter({
chunkSize: 1000,
chunkOverlap: 200,
});
const allSplits = await splitter.splitDocuments(docs);
// Index chunks
await vectorStore.addDocuments(allSplits);
In Part 1 of the RAG tutorial, we represented the user input, retrieved context, and generated answer as separate keys in the state. Conversational experiences can be naturally represented using a sequence of messages. In addition to messages from the user and assistant, retrieved documents and other artifacts can be incorporated into a message sequence via tool messages. This motivates us to represent the state of our RAG application using a sequence of messages. Specifically, we will have:
- User input as a HumanMessage;
- Vector store queries as AIMessages with tool calls;
- Retrieved documents as ToolMessages;
- Final responses as AIMessages.
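As a concrete illustration (the content strings below are hypothetical), a single RAG turn represented this way might look like the following sequence:
import { AIMessage, HumanMessage, ToolMessage } from "@langchain/core/messages";
const exampleTurn = [
  new HumanMessage("What is Task Decomposition?"),
  // The model decides to query the vector store via a tool call.
  new AIMessage({
    content: "",
    tool_calls: [{ name: "retrieve", args: { query: "Task Decomposition" }, id: "call_1" }],
  }),
  // The retrieved documents come back as a ToolMessage tied to that tool call.
  new ToolMessage({ content: "...retrieved context...", tool_call_id: "call_1" }),
  // The final grounded answer.
  new AIMessage("Task decomposition breaks a complex task into smaller steps."),
];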
This model for state is so versatile that LangGraph offers a built-in version of it for convenience:
import { MessagesAnnotation, StateGraph } from "@langchain/langgraph";
const graph = new StateGraph(MessagesAnnotation);
Leveraging tool-calling to interact with a retrieval step has another benefit, which is that the query for the retrieval is generated by our model. This is especially important in a conversational setting, where user queries may require contextualization based on the chat history. For instance, consider the following exchange:
Human: "What is task decomposition?"
AI: "Task decomposition involves breaking down complex tasks into smaller and simpler steps, making them more manageable for an agent or model."
Human: "What are common ways of doing it?"
In this scenario, a model could generate a query such as "common approaches to task decomposition". Tool-calling facilitates this naturally. As in the query analysis section of the RAG tutorial, this allows the model to rewrite user queries into more effective search queries. It also provides support for direct responses that do not involve a retrieval step (e.g., in response to a generic greeting from the user).
Let's turn our retrieval step into a tool:
import { z } from "zod";
import { tool } from "@langchain/core/tools";
const retrieveSchema = z.object({ query: z.string() });
const retrieve = tool(
async ({ query }) => {
const retrievedDocs = await vectorStore.similaritySearch(query, 2);
const serialized = retrievedDocs
.map(
(doc) => `Source: ${doc.metadata.source}\nContent: ${doc.pageContent}`
)
.join("\n");
return [serialized, retrievedDocs];
},
{
name: "retrieve",
description: "Retrieve information related to a query.",
schema: retrieveSchema,
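// "content_and_artifact" (below) means the tool returns a [content, artifact] pair:
// the serialized string becomes the ToolMessage content shown to the model, while the
// raw Document objects are kept on the message's artifact field for downstream use.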
responseFormat: "content_and_artifact",
}
);
See this guide for more detail on creating tools.
Our graph will consist of three nodes:
- A node that fields the user input, either generating a query for the retriever or responding directly;
- A node for the retriever tool that executes the retrieval step;
- A node that generates the final response using the retrieved context.
We build them below. Note that we leverage another pre-built LangGraph component, ToolNode, which executes the tool and adds the result as a ToolMessage to the state.
import {
AIMessage,
HumanMessage,
SystemMessage,
ToolMessage,
} from "@langchain/core/messages";
import { MessagesAnnotation } from "@langchain/langgraph";
import { ToolNode } from "@langchain/langgraph/prebuilt";
// Step 1: Generate an AIMessage that may include a tool-call to be sent.
async function queryOrRespond(state: typeof MessagesAnnotation.State) {
const llmWithTools = llm.bindTools([retrieve]);
const response = await llmWithTools.invoke(state.messages);
// MessagesState appends messages to state instead of overwriting
return { messages: [response] };
}
// Step 2: Execute the retrieval.
const tools = new ToolNode([retrieve]);
// Step 3: Generate a response using the retrieved content.
async function generate(state: typeof MessagesAnnotation.State) {
// Get generated ToolMessages
let recentToolMessages = [];
for (let i = state["messages"].length - 1; i >= 0; i--) {
let message = state["messages"][i];
if (message instanceof ToolMessage) {
recentToolMessages.push(message);
} else {
break;
}
}
let toolMessages = recentToolMessages.reverse();
// Format into prompt
const docsContent = toolMessages.map((doc) => doc.content).join("\n");
const systemMessageContent =
"You are an assistant for question-answering tasks. " +
"Use the following pieces of retrieved context to answer " +
"the question. If you don't know the answer, say that you " +
"don't know. Use three sentences maximum and keep the " +
"answer concise." +
"\n\n" +
`${docsContent}`;
const conversationMessages = state.messages.filter(
(message) =>
message instanceof HumanMessage ||
message instanceof SystemMessage ||
(message instanceof AIMessage && (message.tool_calls?.length ?? 0) === 0)
);
const prompt = [
new SystemMessage(systemMessageContent),
...conversationMessages,
];
// Run
const response = await llm.invoke(prompt);
return { messages: [response] };
}
Finally, we compile our application into a single graph object. In this case, we are just connecting the steps into a sequence. We also allow the first queryOrRespond step to "short-circuit" and respond directly to the user if it does not generate a tool call. This allows our application to support conversational experiences, for example responding to generic greetings that may not require a retrieval step:
import { StateGraph } from "@langchain/langgraph";
import { toolsCondition } from "@langchain/langgraph/prebuilt";
const graphBuilder = new StateGraph(MessagesAnnotation)
.addNode("queryOrRespond", queryOrRespond)
.addNode("tools", tools)
.addNode("generate", generate)
.addEdge("__start__", "queryOrRespond")
.addConditionalEdges("queryOrRespond", toolsCondition, {
__end__: "__end__",
tools: "tools",
})
.addEdge("tools", "generate")
.addEdge("generate", "__end__");
const graph = graphBuilder.compile();
// Note: tslab only works inside a jupyter notebook. Don't worry about running this code yourself!
import * as tslab from "tslab";
const image = await graph.getGraph().drawMermaidPng();
const arrayBuffer = await image.arrayBuffer();
await tslab.display.png(new Uint8Array(arrayBuffer));
Let's test our application.
Expand to see the `prettyPrint` code.
import { BaseMessage, isAIMessage } from "@langchain/core/messages";
const prettyPrint = (message: BaseMessage) => {
let txt = `[${message._getType()}]: ${message.content}`;
if (isAIMessage(message) && (message.tool_calls?.length ?? 0) > 0) {
const tool_calls = (message as AIMessage)?.tool_calls
?.map((tc) => `- ${tc.name}(${JSON.stringify(tc.args)})`)
.join("\n");
txt += ` \nTools: \n${tool_calls}`;
}
console.log(txt);
};
Note that it responds appropriately to messages that do not require an additional retrieval step:
let inputs1 = { messages: [{ role: "user", content: "Hello" }] };
for await (const step of await graph.stream(inputs1, {
streamMode: "values",
})) {
const lastMessage = step.messages[step.messages.length - 1];
prettyPrint(lastMessage);
console.log("-----\n");
}
[human]: Hello
-----
[ai]: Hello! How can I assist you today?
-----
When a search is executed, we can stream the steps to observe the query generation, retrieval, and answer generation:
let inputs2 = {
messages: [{ role: "user", content: "What is Task Decomposition?" }],
};
for await (const step of await graph.stream(inputs2, {
streamMode: "values",
})) {
const lastMessage = step.messages[step.messages.length - 1];
prettyPrint(lastMessage);
console.log("-----\n");
}
[human]: What is Task Decomposition?
-----
[ai]:
Tools:
- retrieve({"query":"Task Decomposition"})
-----
[tool]: Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain
Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: System message:Think step by step and reason yourself to the right decisions to make sure we get it right.
You will first lay out the names of the core classes, functions, methods that will be necessary, as well as a quick comment on their purpose.Then you will output the content of each file including ALL code.
Each file must strictly follow a markdown code block format, where the following tokens must be replaced such that
FILENAME is the lowercase file name including the file extension,
LANG is the markup code block language for the code’s language, and CODE is the code:FILENAMEYou will start with the “entrypoint” file, then go to the ones that are imported by that file, and so on.
Please note that the code should be fully functional. No placeholders.Follow a language and framework appropriate best practice file naming convention.
Make sure that files contain all imports, types etc. Make sure that code in different files are compatible with each other.
-----
[ai]: Task decomposition is the process of breaking down a complex task into smaller, more manageable steps or subgoals. This can be achieved through various methods, such as using prompts for large language models (LLMs), task-specific instructions, or human inputs. It helps in simplifying the problem-solving process and enhances understanding of the task at hand.
-----
Check out the LangSmith trace here.
Stateful management of chat history
This section of the tutorial previously used the RunnableWithMessageHistory abstraction. You can access that version of the documentation in the v0.2 docs.
As of the v0.3 release of LangChain, we recommend that LangChain users take advantage of LangGraph persistence to incorporate memory into new LangChain applications.
If your code is already relying on RunnableWithMessageHistory or BaseChatMessageHistory, you do not need to make any changes. We do not plan on deprecating this functionality in the near future, as it works for simple chat applications, and any code that uses RunnableWithMessageHistory will continue to work as expected.
Please see How to migrate to LangGraph memory for more details.
In production, the Q&A application will usually persist the chat history into a database, and be able to read and update it appropriately.
LangGraph implements a built-in persistence layer, making it ideal for chat applications that support multiple conversational turns.
To manage multiple conversational turns and threads, all we have to do is specify a checkpointer when compiling our application. Because the nodes in our graph are appending messages to the state, we will retain a consistent chat history across invocations.
LangGraph comes with a simple in-memory checkpointer, which we use below. See its documentation for more detail, including how to use different persistence backends (e.g., SQLite or Postgres).
For a detailed walkthrough of how to manage message history, head to the How to add message history (memory) guide.
import { MemorySaver } from "@langchain/langgraph";
const checkpointer = new MemorySaver();
const graphWithMemory = graphBuilder.compile({ checkpointer });
// Specify an ID for the thread
const threadConfig = {
configurable: { thread_id: "abc123" },
streamMode: "values" as const,
};
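The in-memory checkpointer above is reset whenever the process restarts. For production persistence you would swap in a database-backed checkpointer instead; the sketch below assumes the separate @langchain/langgraph-checkpoint-postgres package and a POSTGRES_URI environment variable (both are assumptions here; consult the LangGraph persistence docs for the exact API in your version). The rest of this tutorial continues with the in-memory graphWithMemory.
import { PostgresSaver } from "@langchain/langgraph-checkpoint-postgres";
// Assumes a reachable Postgres instance; setup() creates the required tables on first use.
const pgCheckpointer = PostgresSaver.fromConnString(process.env.POSTGRES_URI ?? "");
await pgCheckpointer.setup();
const persistentGraph = graphBuilder.compile({ checkpointer: pgCheckpointer });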
We can now invoke as before:
let inputs3 = {
messages: [{ role: "user", content: "What is Task Decomposition?" }],
};
for await (const step of await graphWithMemory.stream(inputs3, threadConfig)) {
const lastMessage = step.messages[step.messages.length - 1];
prettyPrint(lastMessage);
console.log("-----\n");
}
[human]: What is Task Decomposition?
-----
[ai]:
Tools:
- retrieve({"query":"Task Decomposition"})
-----
[tool]: Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain
Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: System message:Think step by step and reason yourself to the right decisions to make sure we get it right.
You will first lay out the names of the core classes, functions, methods that will be necessary, as well as a quick comment on their purpose.Then you will output the content of each file including ALL code.
Each file must strictly follow a markdown code block format, where the following tokens must be replaced such that
FILENAME is the lowercase file name including the file extension,
LANG is the markup code block language for the code’s language, and CODE is the code:FILENAMEYou will start with the “entrypoint” file, then go to the ones that are imported by that file, and so on.
Please note that the code should be fully functional. No placeholders.Follow a language and framework appropriate best practice file naming convention.
Make sure that files contain all imports, types etc. Make sure that code in different files are compatible with each other.
-----
[ai]: Task decomposition is the process of breaking down a complex task into smaller, more manageable steps or subgoals. This can be achieved through various methods, such as using prompts for large language models (LLMs), task-specific instructions, or human inputs. It helps in simplifying the problem-solving process and enhances understanding of the task at hand.
-----
let inputs4 = {
messages: [
{ role: "user", content: "Can you look up some common ways of doing it?" },
],
};
for await (const step of await graphWithMemory.stream(inputs4, threadConfig)) {
const lastMessage = step.messages[step.messages.length - 1];
prettyPrint(lastMessage);
console.log("-----\n");
}
[human]: Can you look up some common ways of doing it?
-----
[ai]:
Tools:
- retrieve({"query":"common methods of task decomposition"})
-----
[tool]: Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain
Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: be provided by other developers (as in Plugins) or self-defined (as in function calls).HuggingGPT (Shen et al. 2023) is a framework to use ChatGPT as the task planner to select models available in HuggingFace platform according to the model descriptions and summarize the response based on the execution results.The system comprises of 4 stages:(1) Task planning: LLM works as the brain and parses the user requests into multiple tasks. There are four attributes associated with each task: task type, ID, dependencies, and arguments. They use few-shot examples to guide LLM to do task parsing and planning.Instruction:(2) Model selection: LLM distributes the tasks to expert models, where the request is framed as a multiple-choice question. LLM is presented with a list of models to choose from. Due to the limited context length, task type based filtration is needed.Instruction:(3) Task execution: Expert models execute on the specific tasks and log results.Instruction:(4) Response generation:
-----
[ai]: Common ways of task decomposition include using large language models (LLMs) with simple prompts like "Steps for XYZ" or "What are the subgoals for achieving XYZ?", employing task-specific instructions (e.g., "Write a story outline"), and incorporating human inputs. Additionally, methods like the Tree of Thoughts approach explore multiple reasoning possibilities at each step, creating a structured tree of thoughts. These techniques facilitate breaking down tasks into manageable components for better execution.
-----
Note that the query generated by the model in the second question incorporates the conversational context.
The LangSmith trace is particularly informative here, because we can see exactly which messages are visible to our chat model at each step.
Agents
Agents leverage the reasoning capabilities of LLMs to make decisions during execution. Using agents allows you to offload additional discretion over the retrieval process. Although their behavior is less predictable than the above "chain", they are able to execute multiple retrieval steps in service of a query, or iterate on a single search.
Below we assemble a minimal RAG agent. Using LangGraph's pre-built ReAct agent constructor, we can do this in one line.
Check out LangGraph's Agentic RAG tutorial for more advanced formulations.
import { createReactAgent } from "@langchain/langgraph/prebuilt";
const agent = createReactAgent({ llm: llm, tools: [retrieve] });
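As with the chain above, the agent can be given per-thread conversational memory by compiling it with a checkpointer. A hedged sketch; the checkpointSaver option name is an assumption, so check the createReactAgent reference for your LangGraph version:
import { MemorySaver } from "@langchain/langgraph";
// Reuse the in-memory checkpointer pattern from earlier so the agent keeps per-thread history.
const agentMemory = new MemorySaver();
const agentWithMemory = createReactAgent({
  llm: llm,
  tools: [retrieve],
  checkpointSaver: agentMemory,
});
// Invoke agentWithMemory with a thread_id in the config, exactly as with graphWithMemory above.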
Let's inspect the graph:
// Note: tslab only works inside a jupyter notebook. Don't worry about running this code yourself!
import * as tslab from "tslab";
const image = await agent.getGraph().drawMermaidPng();
const arrayBuffer = await image.arrayBuffer();
await tslab.display.png(new Uint8Array(arrayBuffer));
The key difference from our earlier implementation is that instead of a final generation step that ends the run, here the tool invocation loops back to the original LLM call. The model can then either answer the question using the retrieved context, or generate another tool call to gather more information.
Let's test this out. We construct a question that would typically require an iterative sequence of retrieval steps to answer:
let inputMessage = `What is the standard method for Task Decomposition?
Once you get the answer, look up common extensions of that method.`;
let inputs5 = { messages: [{ role: "user", content: inputMessage }] };
for await (const step of await agent.stream(inputs5, {
streamMode: "values",
})) {
const lastMessage = step.messages[step.messages.length - 1];
prettyPrint(lastMessage);
console.log("-----\n");
}
[human]: What is the standard method for Task Decomposition?
Once you get the answer, look up common extensions of that method.
-----
[ai]:
Tools:
- retrieve({"query":"standard method for Task Decomposition"})
-----
[tool]: Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain
Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: System message:Think step by step and reason yourself to the right decisions to make sure we get it right.
You will first lay out the names of the core classes, functions, methods that will be necessary, as well as a quick comment on their purpose.Then you will output the content of each file including ALL code.
Each file must strictly follow a markdown code block format, where the following tokens must be replaced such that
FILENAME is the lowercase file name including the file extension,
LANG is the markup code block language for the code’s language, and CODE is the code:FILENAMEYou will start with the “entrypoint” file, then go to the ones that are imported by that file, and so on.
Please note that the code should be fully functional. No placeholders.Follow a language and framework appropriate best practice file naming convention.
Make sure that files contain all imports, types etc. Make sure that code in different files are compatible with each other.
-----
[ai]:
Tools:
- retrieve({"query":"common extensions of Task Decomposition method"})
-----
[tool]: Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain
Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: be provided by other developers (as in Plugins) or self-defined (as in function calls).HuggingGPT (Shen et al. 2023) is a framework to use ChatGPT as the task planner to select models available in HuggingFace platform according to the model descriptions and summarize the response based on the execution results.The system comprises of 4 stages:(1) Task planning: LLM works as the brain and parses the user requests into multiple tasks. There are four attributes associated with each task: task type, ID, dependencies, and arguments. They use few-shot examples to guide LLM to do task parsing and planning.Instruction:(2) Model selection: LLM distributes the tasks to expert models, where the request is framed as a multiple-choice question. LLM is presented with a list of models to choose from. Due to the limited context length, task type based filtration is needed.Instruction:(3) Task execution: Expert models execute on the specific tasks and log results.Instruction:(4) Response generation:
-----
[ai]: ### Standard Method for Task Decomposition
The standard method for task decomposition involves breaking down hard tasks into smaller, more manageable steps. This can be achieved through various approaches:
1. **Chain of Thought (CoT)**: This method transforms large tasks into multiple manageable tasks, providing insight into the model's reasoning process.
2. **Prompting**: Using simple prompts like "Steps for XYZ" or "What are the subgoals for achieving XYZ?" to guide the decomposition.
3. **Task-Specific Instructions**: Providing specific instructions tailored to the task, such as "Write a story outline" for writing a novel.
4. **Human Inputs**: Involving human input to assist in the decomposition process.
### Common Extensions of Task Decomposition
Several extensions have been developed to enhance the task decomposition process:
1. **Tree of Thoughts (ToT)**: This method extends CoT by exploring multiple reasoning possibilities at each step. It decomposes the problem into multiple thought steps and generates various thoughts per step, creating a tree structure. The search process can utilize either breadth-first search (BFS) or depth-first search (DFS), with each state evaluated by a classifier or through majority voting.
2. **LLM+P**: This approach involves using an external classical planner for long-horizon planning, integrating planning domains to enhance the decomposition process.
3. **HuggingGPT**: This framework utilizes ChatGPT as a task planner to select models from the HuggingFace platform based on model descriptions. It consists of four stages:
- **Task Planning**: Parsing user requests into multiple tasks with attributes like task type, ID, dependencies, and arguments.
- **Model Selection**: Distributing tasks to expert models based on a multiple-choice question format.
- **Task Execution**: Expert models execute specific tasks and log results.
- **Response Generation**: Compiling the results into a coherent response.
These extensions aim to improve the efficiency and effectiveness of task decomposition, making it easier to manage complex tasks.
-----
Note that the agent:
- Generates a query to search for a standard method for task decomposition;
- Upon receiving the answer, generates a second query to search for common extensions of it;
- Having received all necessary context, answers the question.
We can see the full sequence of steps, along with latency and other metadata, in the LangSmith trace.
Next steps
We've covered the steps to build a basic conversational Q&A application:
- We used chains to build a predictable application that generates at most one query per user input;
- We used agents to build an application that can iterate on a sequence of queries.
To explore different types of retrievers and retrieval strategies, visit the retrievers section of the how-to guides.
For a detailed walkthrough of LangChain's conversation memory abstractions, visit the How to add message history (memory) guide.