Summarize Text
This tutorial demonstrates text summarization using built-in chains and LangGraph.
For a previous version of this page showcasing the legacy chain RefineDocumentsChain, see here.
Suppose you have a set of documents (PDFs, Notion pages, customer questions, etc.) and you want to summarize the content.
LLMs are a great tool for this given their proficiency in understanding and synthesizing text.
In the context of retrieval-augmented generation, summarizing text can help distill the information in a large number of retrieved documents to provide context for an LLM.
In this walkthrough we'll go over how to summarize content from multiple documents using LLMs.
Concepts
Concepts we will cover are:
- Using language models.
- Using document loaders, specifically the CheerioWebBaseLoader to load content from an HTML webpage.
- Two ways to summarize or otherwise combine documents:
- Stuff, which simply concatenates documents into a prompt;
- Map-reduce, for larger sets of documents. This splits documents into batches, summarizes those, and then summarizes the summaries.
Setup
Jupyter Notebook
This and other tutorials are perhaps most conveniently run in a Jupyter notebook. Going through guides in an interactive environment is a great way to better understand them. See here for instructions on how to install.
Installation
To install LangChain run:
npm i langchain @langchain/core
For more details, see our Installation guide.
LangSmith
Many of the applications you build with LangChain will contain multiple steps with multiple invocations of LLM calls. As these applications get more and more complex, it becomes crucial to be able to inspect what exactly is going on inside your chain or agent. The best way to do this is with LangSmith.
After you sign up at the link above, make sure to set your environment variables to start logging traces:
export LANGSMITH_TRACING="true"
export LANGSMITH_API_KEY="..."
# Reduce tracing latency if you are not in a serverless environment
# export LANGCHAIN_CALLBACKS_BACKGROUND=true
Overview
A central question for building a summarizer is how to pass your documents into the LLM's context window. Two common approaches for this are:
- Stuff: simply "stuff" all your documents into a single prompt. This is the simplest approach.
- Map-reduce: summarize each document individually in a "map" step, then "reduce" the summaries into a final summary.
Note that map-reduce is especially effective when understanding of a sub-document does not rely on preceding context, for example when summarizing a corpus of many shorter documents. In other cases, such as summarizing a novel or other body of text with an inherent sequence, iterative refinement may be more effective; a rough sketch of that alternative appears below.
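To make the iterative-refinement alternative concrete, here is a minimal sketch (not one of this tutorial's built-in chains; it assumes the llm and docs variables defined later in this guide, and the prompt wording is our own):
// Hypothetical refine loop: fold each document into a running summary,
// one at a time, so each step sees the context accumulated so far.
let runningSummary = "";
for (const doc of docs) {
  const refineResponse = await llm.invoke(
    `Current summary:\n${runningSummary}\n\nRefine the summary with this additional context:\n${doc.pageContent}`
  );
  runningSummary = String(refineResponse.content);
}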
First we load in our documents. We will use WebBaseLoader to load a blog post:
import "cheerio";
import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";
const pTagSelector = "p";
const cheerioLoader = new CheerioWebBaseLoader(
"https://lilianweng.github.io/posts/2023-06-23-agent/",
{
selector: pTagSelector,
}
);
const docs = await cheerioLoader.load();
Next, let's select a chat model:
Pick your chat model:
- Groq
- OpenAI
- Anthropic
- FireworksAI
- MistralAI
- VertexAI
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/groq
yarn add @langchain/groq
pnpm add @langchain/groq
Add environment variables
GROQ_API_KEY=your-api-key
Instantiate the model
import { ChatGroq } from "@langchain/groq";
const llm = new ChatGroq({
model: "llama-3.3-70b-versatile",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/openai
yarn add @langchain/openai
pnpm add @langchain/openai
Add environment variables
OPENAI_API_KEY=your-api-key
Instantiate the model
import { ChatOpenAI } from "@langchain/openai";
const llm = new ChatOpenAI({
model: "gpt-4o-mini",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/anthropic
yarn add @langchain/anthropic
pnpm add @langchain/anthropic
Add environment variables
ANTHROPIC_API_KEY=your-api-key
Instantiate the model
import { ChatAnthropic } from "@langchain/anthropic";
const llm = new ChatAnthropic({
model: "claude-3-5-sonnet-20240620",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/community
yarn add @langchain/community
pnpm add @langchain/community
Add environment variables
FIREWORKS_API_KEY=your-api-key
Instantiate the model
import { ChatFireworks } from "@langchain/community/chat_models/fireworks";
const llm = new ChatFireworks({
model: "accounts/fireworks/models/llama-v3p1-70b-instruct",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/mistralai
yarn add @langchain/mistralai
pnpm add @langchain/mistralai
Add environment variables
MISTRAL_API_KEY=your-api-key
Instantiate the model
import { ChatMistralAI } from "@langchain/mistralai";
const llm = new ChatMistralAI({
model: "mistral-large-latest",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/google-vertexai
yarn add @langchain/google-vertexai
pnpm add @langchain/google-vertexai
Add environment variables
GOOGLE_APPLICATION_CREDENTIALS=credentials.json
Instantiate the model
import { ChatVertexAI } from "@langchain/google-vertexai";
const llm = new ChatVertexAI({
model: "gemini-1.5-flash",
temperature: 0
});
Stuff: summarize in a single LLM call
We can use createStuffDocumentsChain, especially if using larger context window models such as:
- 128k token OpenAI gpt-4o
- 200k token Anthropic claude-3-5-sonnet-20240620
The chain will take a list of documents, insert them all into a single prompt, and pass that prompt to an LLM:
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { PromptTemplate } from "@langchain/core/prompts";
// Define prompt
const prompt = PromptTemplate.fromTemplate(
"Summarize the main themes in these retrieved docs: {context}"
);
// Instantiate
const chain = await createStuffDocumentsChain({
llm: llm,
outputParser: new StringOutputParser(),
prompt,
});
// Invoke
const result = await chain.invoke({ context: docs });
console.log(result);
The retrieved documents discuss the development and capabilities of autonomous agents powered by large language models (LLMs). Here are the main themes:
1. **LLM as a Core Controller**: LLMs are positioned as the central intelligence in autonomous agent systems, capable of performing complex tasks beyond simple text generation. They can be framed as general problem solvers, with various implementations like AutoGPT, GPT-Engineer, and BabyAGI serving as proof-of-concept demonstrations.
2. **Task Decomposition and Planning**: Effective task management is crucial for LLMs. Techniques like Chain of Thought (CoT) and Tree of Thoughts (ToT) are highlighted for breaking down complex tasks into manageable steps. CoT encourages step-by-step reasoning, while ToT explores multiple reasoning paths, enhancing the agent's problem-solving capabilities.
3. **Integration of External Tools**: The use of external tools significantly enhances LLM capabilities. Frameworks like MRKL and Toolformer allow LLMs to interact with various APIs and tools, improving their performance in specific tasks. This modular approach enables LLMs to route inquiries to specialized modules, combining neural and symbolic reasoning.
4. **Self-Reflection and Learning**: Self-reflection mechanisms are essential for agents to learn from past actions and improve over time. Approaches like ReAct and Reflexion integrate reasoning with action, allowing agents to evaluate their performance and adjust strategies based on feedback.
5. **Memory and Context Management**: The documents discuss different types of memory (sensory, short-term, long-term) and their relevance to LLMs. The challenge of finite context length in LLMs is emphasized, as it limits the ability to retain and utilize historical information effectively. Techniques like external memory storage and vector databases are suggested to mitigate these limitations.
6. **Challenges and Limitations**: Several challenges are identified, including the reliability of natural language interfaces, difficulties in long-term planning, and the need for robust task decomposition. The documents note that LLMs may struggle with unexpected errors and formatting issues, which can hinder their performance in real-world applications.
7. **Emerging Applications**: The potential applications of LLM-powered agents are explored, including scientific discovery, autonomous design, and interactive simulations (e.g., generative agents mimicking human behavior). These applications demonstrate the versatility and innovative possibilities of LLMs in various domains.
Overall, the documents present a comprehensive overview of the current state of LLM-powered autonomous agents, highlighting their capabilities, methodologies, and the challenges they face in practical implementations.
Streaming
Note that we can also stream the result token-by-token:
const stream = await chain.stream({ context: docs });
for await (const token of stream) {
process.stdout.write(token + "|");
}
|The| retrieved| documents| discuss| the| development| and| capabilities| of| autonomous| agents| powered| by| large| language| models| (|LL|Ms|).| Here| are| the| main| themes|:
|1|.| **|LL|M| as| a| Core| Controller|**|:| L|LM|s| are| positioned| as| the| central| intelligence| in| autonomous| agent| systems|,| capable| of| performing| complex| tasks| beyond| simple| text| generation|.| They| can| be| framed| as| general| problem| sol|vers|,| with| various| implementations| like| Auto|GPT|,| GPT|-|Engineer|,| and| Baby|AG|I| serving| as| proof|-of|-con|cept| demonstrations|.
|2|.| **|Task| De|composition| and| Planning|**|:| Effective| task| management| is| crucial| for| L|LM|s| to| handle| complicated| tasks|.| Techniques| like| Chain| of| Thought| (|Co|T|)| and| Tree| of| Thoughts| (|To|T|)| are| highlighted| for| breaking| down| tasks| into| manageable| steps| and| exploring| multiple| reasoning| paths|.| Additionally|,| L|LM|+|P| integrates| classical| planning| methods| to| enhance| long|-term| planning| capabilities|.
|3|.| **|Self|-|Reflection| and| Learning|**|:| Self|-ref|lection| mechanisms| are| essential| for| agents| to| learn| from| past| actions| and| improve| their| decision|-making| processes|.| Framework|s| like| Re|Act| and| Reflex|ion| incorporate| dynamic| memory| and| self|-ref|lection| to| refine| reasoning| skills| and| enhance| performance| through| iterative| learning|.
|4|.| **|Tool| Util|ization|**|:| The| integration| of| external| tools| significantly| extends| the| capabilities| of| L|LM|s|.| Appro|aches| like| MR|KL| and| Tool|former| demonstrate| how| L|LM|s| can| be| augmented| with| various| APIs| to| perform| specialized| tasks|,| enhancing| their| functionality| in| real|-world| applications|.
|5|.| **|Memory| and| Context| Management|**|:| The| documents| discuss| different| types| of| memory| (|sens|ory|,| short|-term|,| long|-term|)| and| their| relevance| to| L|LM|s|.| The| challenge| of| finite| context| length| is| emphasized|,| as| it| limits| the| model|'s| ability| to| retain| and| utilize| historical| information| effectively|.| Techniques| like| vector| stores| and| approximate| nearest| neighbors| (|ANN|)| are| suggested| to| improve| retrieval| speed| and| memory| management|.
|6|.| **|Challenges| and| Limit|ations|**|:| Several| limitations| of| current| L|LM|-powered| agents| are| identified|,| including| issues| with| the| reliability| of| natural| language| interfaces|,| difficulties| in| long|-term| planning|,| and| the| need| for| improved| efficiency| in| task| execution|.| The| documents| also| highlight| the| importance| of| human| feedback| in| refining| model| outputs| and| addressing| potential| biases|.
|7|.| **|Emer|ging| Applications|**|:| The| potential| applications| of| L|LM|-powered| agents| are| explored|,| including| scientific| discovery|,| autonomous| design|,| and| interactive| simulations| (|e|.g|.,| gener|ative| agents|).| These| applications| showcase| the| versatility| of| L|LM|s| in| various| domains|,| from| drug| discovery| to| social| behavior| simulations|.
|Overall|,| the| documents| present| a| comprehensive| overview| of| the| current| state| of| L|LM|-powered| autonomous| agents|,| their| capabilities|,| methodologies| for| improvement|,| and| the| challenges| they| face| in| practical| applications|.|||
Go deeper
- You can easily customize the prompt.
- You can easily try different LLMs (e.g., Claude) via the llm parameter; a minimal sketch of a customized prompt follows.
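For example, a customized prompt for the same stuff chain might ask for bullet-point takeaways instead of a thematic summary (a minimal sketch; the bullet-point instruction is our own illustration, and llm and docs are the variables defined above):
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { PromptTemplate } from "@langchain/core/prompts";
// Same {context} input variable as before, different instruction.
const bulletPrompt = PromptTemplate.fromTemplate(
  "List the five most important takeaways from these retrieved docs as bullet points:\n\n{context}"
);
const bulletChain = await createStuffDocumentsChain({
  llm,
  outputParser: new StringOutputParser(),
  prompt: bulletPrompt,
});
const bulletResult = await bulletChain.invoke({ context: docs });
console.log(bulletResult);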
Map-Reduce: summarize long texts via parallelization
Let's unpack the map-reduce approach. For this, we'll first map each document to an individual summary using an LLM. Then we'll reduce, or consolidate, those summaries into a single global summary.
Note that the map step is typically parallelized over the input documents.
LangGraph, which is built on top of @langchain/core, supports map-reduce workflows and is well-suited to this problem:
- LangGraph allows for individual steps (such as successive summarizations) to be streamed, allowing for greater control of execution;
- LangGraph's checkpointing supports error recovery, extending with human-in-the-loop workflows, and easier incorporation into conversational applications (see the sketch after this list);
- The LangGraph implementation is straightforward to modify and extend, as we will see below.
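As a minimal sketch of what checkpointing looks like (not part of this tutorial's code; graph and splitDocs stand for the StateGraph and chunked documents we construct later in this guide, and the thread_id value is our own):
import { MemorySaver } from "@langchain/langgraph";
// Compile the graph with an in-memory checkpointer so each step's state
// is persisted and a run can be resumed after an error.
const checkpointer = new MemorySaver();
const appWithMemory = graph.compile({ checkpointer });
// Each thread_id keys its own persisted state.
await appWithMemory.invoke(
  { contents: splitDocs.map((doc) => doc.pageContent) },
  { configurable: { thread_id: "summarize-demo" } }
);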
Map
Let's first define the prompt associated with the map step. We can use the same summarization prompt as in the stuff approach above:
import { ChatPromptTemplate } from "@langchain/core/prompts";
const mapPrompt = ChatPromptTemplate.fromMessages([
["user", "Write a concise summary of the following: \n\n{context}"],
]);
We can also use the Prompt Hub to store and fetch prompts.
This will work with your LangSmith API key.
For example, see the map prompt here.
import { pull } from "langchain/hub";
import { ChatPromptTemplate } from "@langchain/core/prompts";
const mapPrompt = await pull<ChatPromptTemplate>("rlm/map-prompt");
Reduce
We also define a prompt that takes the document mapping results and reduces them into a single output:
// Also available via the hub at `rlm/reduce-prompt`
let reduceTemplate = `
The following is a set of summaries:
{docs}
Take these and distill it into a final, consolidated summary
of the main themes.
`;
const reducePrompt = ChatPromptTemplate.fromMessages([
["user", reduceTemplate],
]);
Orchestration via LangGraph
Below we implement a simple application that maps the summarization step on a list of documents, then reduces them using the above prompts.
Map-reduce flows are particularly useful when texts are long compared to the context window of an LLM. For long texts, we need a mechanism that ensures the context to be summarized in the reduce step does not exceed the model's context window size. Here we implement a recursive "collapsing" of the summaries: the inputs are partitioned based on a token limit, and summaries are generated of the partitions. This step is repeated until the total length of the summaries is within the desired limit, allowing for the summarization of text of arbitrary length.
First we chunk the blog post into smaller "sub documents" to be mapped:
import { TokenTextSplitter } from "@langchain/textsplitters";
const textSplitter = new TokenTextSplitter({
chunkSize: 1000,
chunkOverlap: 0,
});
const splitDocs = await textSplitter.splitDocuments(docs);
console.log(`Generated ${splitDocs.length} documents.`);
Generated 6 documents.
Next, we define our graph. Note that we define an artificially low maximum token length of 1,000 tokens to illustrate the "collapsing" step:
import {
collapseDocs,
splitListOfDocs,
} from "langchain/chains/combine_documents/reduce";
import { Document } from "@langchain/core/documents";
import { StateGraph, Annotation, Send } from "@langchain/langgraph";
let tokenMax = 1000;
async function lengthFunction(documents) {
const tokenCounts = await Promise.all(
documents.map(async (doc) => {
return llm.getNumTokens(doc.pageContent);
})
);
return tokenCounts.reduce((sum, count) => sum + count, 0);
}
const OverallState = Annotation.Root({
contents: Annotation<string[]>,
// Notice here we pass a reducer function.
// This is because we want to combine all the summaries we generate
// from individual nodes back into one list; this is essentially
// the "reduce" part.
summaries: Annotation<string[]>({
reducer: (state, update) => state.concat(update),
}),
collapsedSummaries: Annotation<Document[]>,
finalSummary: Annotation<string>,
});
// This will be the state of the node that we will "map" all
// documents to in order to generate summaries
interface SummaryState {
content: string;
}
// Here we generate a summary, given a document
const generateSummary = async (
state: SummaryState
): Promise<{ summaries: string[] }> => {
const prompt = await mapPrompt.invoke({ context: state.content });
const response = await llm.invoke(prompt);
return { summaries: [String(response.content)] };
};
// Here we define the logic to map out over the documents
// We will use this as an edge in the graph
const mapSummaries = (state: typeof OverallState.State) => {
// We will return a list of `Send` objects
// Each `Send` object consists of the name of a node in the graph
// as well as the state to send to that node
return state.contents.map(
(content) => new Send("generateSummary", { content })
);
};
const collectSummaries = async (state: typeof OverallState.State) => {
return {
collapsedSummaries: state.summaries.map(
(summary) => new Document({ pageContent: summary })
),
};
};
async function _reduce(input) {
const prompt = await reducePrompt.invoke({ docs: input });
const response = await llm.invoke(prompt);
return String(response.content);
}
// Add node to collapse summaries
const collapseSummaries = async (state: typeof OverallState.State) => {
const docLists = splitListOfDocs(
state.collapsedSummaries,
lengthFunction,
tokenMax
);
const results = [];
for (const docList of docLists) {
results.push(await collapseDocs(docList, _reduce));
}
return { collapsedSummaries: results };
};
// This represents a conditional edge in the graph that determines
// if we should collapse the summaries or not
async function shouldCollapse(state: typeof OverallState.State) {
let numTokens = await lengthFunction(state.collapsedSummaries);
if (numTokens > tokenMax) {
return "collapseSummaries";
} else {
return "generateFinalSummary";
}
}
// Here we will generate the final summary
const generateFinalSummary = async (state: typeof OverallState.State) => {
const response = await _reduce(state.collapsedSummaries);
return { finalSummary: response };
};
// Construct the graph
const graph = new StateGraph(OverallState)
.addNode("generateSummary", generateSummary)
.addNode("collectSummaries", collectSummaries)
.addNode("collapseSummaries", collapseSummaries)
.addNode("generateFinalSummary", generateFinalSummary)
.addConditionalEdges("__start__", mapSummaries, ["generateSummary"])
.addEdge("generateSummary", "collectSummaries")
.addConditionalEdges("collectSummaries", shouldCollapse, [
"collapseSummaries",
"generateFinalSummary",
])
.addConditionalEdges("collapseSummaries", shouldCollapse, [
"collapseSummaries",
"generateFinalSummary",
])
.addEdge("generateFinalSummary", "__end__");
const app = graph.compile();
LangGraph allows the graph structure to be plotted to help visualize its function:
// Note: tslab only works inside a jupyter notebook. Don't worry about running this code yourself!
import * as tslab from "tslab";
const image = await app.getGraph().drawMermaidPng();
const arrayBuffer = await image.arrayBuffer();
await tslab.display.png(new Uint8Array(arrayBuffer));
When running the application, we can stream the graph to observe its sequence of steps. Below, we will simply print out the name of each step.
Note that because we have a loop in the graph, it can be helpful to specify a recursionLimit on its execution. This will raise a specific error when the specified limit is exceeded.
let finalSummary = null;
for await (const step of await app.stream(
{ contents: splitDocs.map((doc) => doc.pageContent) },
{ recursionLimit: 10 }
)) {
console.log(Object.keys(step));
if (step.hasOwnProperty("generateFinalSummary")) {
finalSummary = step.generateFinalSummary;
}
}
[ 'generateSummary' ]
[ 'generateSummary' ]
[ 'generateSummary' ]
[ 'generateSummary' ]
[ 'generateSummary' ]
[ 'generateSummary' ]
[ 'collectSummaries' ]
[ 'generateFinalSummary' ]
finalSummary;
{
finalSummary: 'The summaries highlight the evolving landscape of large language models (LLMs) and their integration into autonomous agents and various applications. Key themes include:\n' +
'\n' +
'1. **Autonomous Agents and LLMs**: Projects like AutoGPT and GPT-Engineer demonstrate the potential of LLMs as core controllers in autonomous systems, utilizing techniques such as Chain of Thought (CoT) and Tree of Thoughts (ToT) for task management and reasoning. These agents can learn from past actions through self-reflection mechanisms, enhancing their problem-solving capabilities.\n' +
'\n' +
'2. **Supervised Fine-Tuning and Human Feedback**: The importance of human feedback in fine-tuning models is emphasized, with methods like Algorithm Distillation (AD) showing promise in improving model performance while preventing overfitting. The integration of various memory types and external memory systems is suggested to enhance cognitive capabilities.\n' +
'\n' +
'3. **Integration of External Tools**: The incorporation of external tools and APIs significantly extends LLM capabilities, particularly in specialized tasks like maximum inner-product search (MIPS) and domain-specific applications such as ChemCrow for drug discovery. Frameworks like MRKL and HuggingGPT illustrate the potential for LLMs to effectively utilize these tools.\n' +
'\n' +
'4. **Evaluation Discrepancies**: There are notable discrepancies between LLM-based assessments and expert evaluations, indicating that LLMs may struggle with specialized knowledge. This raises concerns about their reliability in critical applications, such as scientific discovery.\n' +
'\n' +
'5. **Limitations of LLMs**: Despite advancements, LLMs face limitations, including finite context lengths, challenges in long-term planning, and difficulties in adapting to unexpected errors. These constraints hinder their robustness compared to human capabilities.\n' +
'\n' +
'Overall, the advancements in LLMs and their applications reveal both their potential and limitations, emphasizing the need for ongoing research and development to enhance their effectiveness in various domains.'
}
In the corresponding LangSmith trace we can see the individual LLM calls, grouped under their respective nodes.
Go deeper
Customization
- As shown above, you can customize the LLMs and prompts for the map and reduce stages; a short sketch follows.
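For instance, the map prompt could be swapped for one that extracts something more specific than a generic summary (a minimal sketch; the instruction wording below is our own, not from the tutorial):
import { ChatPromptTemplate } from "@langchain/core/prompts";
// A hypothetical alternative map prompt: extract open questions
// rather than a summary. The reduce prompt and graph are unchanged.
const customMapPrompt = ChatPromptTemplate.fromMessages([
  [
    "user",
    "List the key open questions raised in the following text:\n\n{context}",
  ],
]);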
Real-world use case
- See this blog post case study on analyzing user interactions (questions about LangChain documentation)!
- The blog post and associated repo also introduce clustering as a means of summarization.
- This opens up another path worth considering beyond the stuff or map-reduce approaches.
Next steps
We encourage you to check out the how-to guides for more detail on these and other concepts.