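How to stream from a question-answering chain

Prerequisites

This guide assumes familiarity with the following:

Retrieval-augmented generation

In Q&A applications, it is often important to show users the sources that were used to generate an answer. The simplest way to do this is to have the chain return the documents that were retrieved for each generation.

We'll use the LLM Powered Autonomous Agents blog post by Lilian Weng as the retrieval content for this notebook.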

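Setup

Dependencies

In this walkthrough we'll use an OpenAI chat model and embeddings together with an in-memory vector store, but everything shown here works with any ChatModel or LLM, Embeddings, and VectorStore or Retriever.

We'll use the following packages: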

npm install --save langchain @langchain/openai @langchain/community @langchain/core cheerio

We need to set the environment variable OPENAI_API_KEY:

export OPENAI_API_KEY=YOUR_KEY

LangSmith

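Many of the applications you build with LangChain will contain multiple steps, each of which may call an LLM several times. As these applications grow more complex, it becomes crucial to be able to inspect exactly what is happening inside your chain or agent. The best way to do this is with LangSmith.

Note that LangSmith is not required, but it is helpful. If you do want to use LangSmith, make sure to set the following environment variables after you sign up so that traces start being logged: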

export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=YOUR_KEY

# Reduce tracing latency if you are not in a serverless environment
# export LANGCHAIN_CALLBACKS_BACKGROUND=true

Chain with sources

Here is the Q&A app with sources that we built over the LLM Powered Autonomous Agents blog post by Lilian Weng in the Returning sources guide:

import "cheerio";
import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings, ChatOpenAI } from "@langchain/openai";
import { pull } from "langchain/hub";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { formatDocumentsAsString } from "langchain/util/document";
import {
  RunnableSequence,
  RunnablePassthrough,
  RunnableMap,
} from "@langchain/core/runnables";
import { StringOutputParser } from "@langchain/core/output_parsers";

const loader = new CheerioWebBaseLoader(
  "https://lilianweng.github.io/posts/2023-06-23-agent/"
);

const docs = await loader.load();

const textSplitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const splits = await textSplitter.splitDocuments(docs);
const vectorStore = await MemoryVectorStore.fromDocuments(
  splits,
  new OpenAIEmbeddings()
);

// Retrieve and generate using the relevant snippets of the blog.
const retriever = vectorStore.asRetriever();
const prompt = await pull<ChatPromptTemplate>("rlm/rag-prompt");
const llm = new ChatOpenAI({ model: "gpt-3.5-turbo", temperature: 0 });

const ragChainFromDocs = RunnableSequence.from([
  RunnablePassthrough.assign({
    context: (input) => formatDocumentsAsString(input.context),
  }),
  prompt,
  llm,
  new StringOutputParser(),
]);

let ragChainWithSource = new RunnableMap({
  steps: { context: retriever, question: new RunnablePassthrough() },
});
ragChainWithSource = ragChainWithSource.assign({ answer: ragChainFromDocs });

await ragChainWithSource.invoke("What is Task Decomposition");

{
  question: "What is Task Decomposition",
  context: [
    Document {
      pageContent: "Fig. 1. Overview of a LLM-powered autonomous agent system.\n" +
        "Component One: Planning#\n" +
        "A complicated ta"... 898 more characters,
      metadata: {
        source: "https://lilianweng.github.io/posts/2023-06-23-agent/",
        loc: { lines: [Object] }
      }
    },
    Document {
      pageContent: 'Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\\n1.", "What are'... 887 more characters,
      metadata: {
        source: "https://lilianweng.github.io/posts/2023-06-23-agent/",
        loc: { lines: [Object] }
      }
    },
    Document {
      pageContent: "Agent System Overview\n" +
        "                \n" +
        "                    Component One: Planning\n" +
        "                 "... 850 more characters,
      metadata: {
        source: "https://lilianweng.github.io/posts/2023-06-23-agent/",
        loc: { lines: [Object] }
      }
    },
    Document {
      pageContent: "Resources:\n" +
        "1. Internet access for searches and information gathering.\n" +
        "2. Long Term memory management"... 456 more characters,
      metadata: {
        source: "https://lilianweng.github.io/posts/2023-06-23-agent/",
        loc: { lines: [Object] }
      }
    }
  ],
  answer: "Task decomposition is a technique used to break down complex tasks into smaller and simpler steps fo"... 230 more characters
}
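
Because the result carries the retrieved documents under `context`, the sources can be shown alongside the answer. The following is a minimal sketch, assuming each document exposes a `metadata.source` string as in the output above. `DocLike`, `uniqueSources`, and `exampleContext` are hypothetical names introduced for illustration and are not part of LangChain.

```typescript
// Minimal structural type for a retrieved document (an assumption for this
// sketch; LangChain's own Document class has more to it).
interface DocLike {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical helper: collect the distinct `metadata.source` values in
// first-seen order, skipping documents that have no string source.
function uniqueSources(docs: DocLike[]): string[] {
  const sources: string[] = [];
  for (const doc of docs) {
    const source = doc.metadata["source"];
    if (typeof source === "string" && sources.indexOf(source) === -1) {
      sources.push(source);
    }
  }
  return sources;
}

// Stand-in data shaped like the `context` array in the output above.
const exampleContext: DocLike[] = [
  {
    pageContent: "Fig. 1. Overview of a LLM-powered autonomous agent system.",
    metadata: { source: "https://lilianweng.github.io/posts/2023-06-23-agent/" },
  },
  {
    pageContent: "Task decomposition can be done (1) by LLM with simple prompting",
    metadata: { source: "https://lilianweng.github.io/posts/2023-06-23-agent/" },
  },
  { pageContent: "A document without a source", metadata: {} },
];

console.log(uniqueSources(exampleContext));
```

With the real chain, the same helper could be applied to the `context` field of the object returned by `ragChainWithSource.invoke(...)`.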

Let's check what the prompt actually looks like. You can also view it in the LangChain prompt hub:

console.log(prompt.promptMessages.map((msg) => msg.prompt.template).join("\n"));

You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:
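
To make the roles of the `{question}` and `{context}` placeholders concrete, here is a hand-rolled sketch of the substitution step. It is an illustration only, not how LangChain implements prompt templates. `fillTemplate` is a hypothetical helper, and joining the retrieved chunks with blank lines is an assumption about what the formatting step produces.

```typescript
// The template text printed above.
const ragTemplate =
  "You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\n" +
  "Question: {question}\n" +
  "Context: {context}\n" +
  "Answer:";

// Hypothetical helper: replace every {key} placeholder with its value.
// split/join substitutes all occurrences without a regular expression.
function fillTemplate(
  template: string,
  values: Record<string, string>
): string {
  let filled = template;
  for (const key of Object.keys(values)) {
    filled = filled.split("{" + key + "}").join(values[key]);
  }
  return filled;
}

// Assumption: retrieved chunks are joined with blank lines before substitution.
const joinedContext = ["First retrieved chunk.", "Second retrieved chunk."].join(
  "\n\n"
);

const filledPrompt = fillTemplate(ragTemplate, {
  question: "What is task decomposition?",
  context: joinedContext,
});
console.log(filledPrompt);
```

The filled text is what the chat model ultimately receives, which is why the chain converts the retrieved documents to a single string before the prompt step.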

Streaming final outputs

With LCEL, we can stream outputs as they are generated:

for await (const chunk of await ragChainWithSource.stream(
  "What is task decomposition?"
)) {
  console.log(chunk);
}

{ question: "What is task decomposition?" }
{
  context: [
    Document {
      pageContent: "Fig. 1. Overview of a LLM-powered autonomous agent system.\n" +
        "Component One: Planning#\n" +
        "A complicated ta"... 898 more characters,
      metadata: {
        source: "https://lilianweng.github.io/posts/2023-06-23-agent/",
        loc: { lines: [Object] }
      }
    },
    Document {
      pageContent: 'Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\\n1.", "What are'... 887 more characters,
      metadata: {
        source: "https://lilianweng.github.io/posts/2023-06-23-agent/",
        loc: { lines: [Object] }
      }
    },
    Document {
      pageContent: "Agent System Overview\n" +
        "                \n" +
        "                    Component One: Planning\n" +
        "                 "... 850 more characters,
      metadata: {
        source: "https://lilianweng.github.io/posts/2023-06-23-agent/",
        loc: { lines: [Object] }
      }
    },
    Document {
      pageContent: "(3) Task execution: Expert models execute on the specific tasks and log results.\n" +
        "Instruction:\n" +
        "\n" +
        "With "... 539 more characters,
      metadata: {
        source: "https://lilianweng.github.io/posts/2023-06-23-agent/",
        loc: { lines: [Object] }
      }
    }
  ]
}
{ answer: "" }
{ answer: "Task" }
{ answer: " decomposition" }
{ answer: " is" }
{ answer: " a" }
{ answer: " technique" }
{ answer: " used" }
{ answer: " to" }
{ answer: " break" }
{ answer: " down" }
{ answer: " complex" }
{ answer: " tasks" }
{ answer: " into" }
{ answer: " smaller" }
{ answer: " and" }
{ answer: " simpler" }
{ answer: " steps" }
{ answer: "." }
{ answer: " It" }
{ answer: " can" }
{ answer: " be" }
{ answer: " done" }
{ answer: " through" }
{ answer: " various" }
{ answer: " methods" }
{ answer: " such" }
{ answer: " as" }
{ answer: " using" }
{ answer: " prompting" }
{ answer: " techniques" }
{ answer: "," }
{ answer: " task" }
{ answer: "-specific" }
{ answer: " instructions" }
{ answer: "," }
{ answer: " or" }
{ answer: " human" }
{ answer: " inputs" }
{ answer: "." }
{ answer: " Another" }
{ answer: " approach" }
{ answer: " involves" }
{ answer: " outsourcing" }
{ answer: " the" }
{ answer: " planning" }
{ answer: " step" }
{ answer: " to" }
{ answer: " an" }
{ answer: " external" }
{ answer: " classical" }
{ answer: " planner" }
{ answer: "." }
{ answer: "" }
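
In the output above, every chunk carries a single key, so a consumer that only needs the answer text can ignore the `question` and `context` chunks. The sketch below shows that filtering on hard-coded chunks shaped like the stream above rather than on a live chain. `StreamChunk`, `answerText`, and `sampleChunks` are hypothetical names used for illustration.

```typescript
// Shape of the chunks seen in the output above (an assumption for this sketch).
type StreamChunk = {
  question?: string;
  context?: unknown[];
  answer?: string;
};

// Hypothetical helper: return the answer text carried by a chunk, or an empty
// string when the chunk carries a different key (question or context).
function answerText(chunk: StreamChunk): string {
  return typeof chunk.answer === "string" ? chunk.answer : "";
}

// Stand-in for the first few chunks of the real stream.
const sampleChunks: StreamChunk[] = [
  { question: "What is task decomposition?" },
  { context: [] },
  { answer: "" },
  { answer: "Task" },
  { answer: " decomposition" },
  { answer: " is" },
];

let streamedAnswer = "";
for (const chunk of sampleChunks) {
  streamedAnswer += answerText(chunk);
}
console.log(streamedAnswer); // Task decomposition is
```

With the real chain, the same helper could be called inside the `for await` loop, for example passing its result to `process.stdout.write` in Node.js so that tokens appear on one line as they arrive.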

We can add some logic to compile the stream as it is being returned:

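// Accumulates the streamed chunks, keyed by "question", "context", and "answer".
const output: Record<string, any> = {};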
let currentKey: string | null = null;

for await (const chunk of await ragChainWithSource.stream(
  "What is task decomposition?"
)) {
  for (const key of Object.keys(chunk)) {
    if (output[key] === undefined) {
      output[key] = chunk[key];
    } else {
      output[key] += chunk[key];
    }

    if (key !== currentKey) {
      console.log(`\n\n${key}: ${JSON.stringify(chunk[key])}`);
    } else {
      console.log(chunk[key]);
    }
    currentKey = key;
  }
}

question: "What is task decomposition?"


context: [{"pageContent":"Fig. 1. Overview of a LLM-powered autonomous agent system.\nComponent One: Planning#\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\nTask Decomposition#\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.","metadata":{"source":"https://lilianweng.github.io/posts/2023-06-23-agent/","loc":{"lines":{"from":176,"to":181}}}},{"pageContent":"Task decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.\nAnother quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural language. 
Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains.\nSelf-Reflection#","metadata":{"source":"https://lilianweng.github.io/posts/2023-06-23-agent/","loc":{"lines":{"from":182,"to":184}}}},{"pageContent":"Agent System Overview\n                \n                    Component One: Planning\n                        \n                \n                    Task Decomposition\n                \n                    Self-Reflection\n                \n                \n                    Component Two: Memory\n                        \n                \n                    Types of Memory\n                \n                    Maximum Inner Product Search (MIPS)\n                \n                \n                    Component Three: Tool Use\n                \n                    Case Studies\n                        \n                \n                    Scientific Discovery Agent\n                \n                    Generative Agents Simulation\n                \n                    Proof-of-Concept Examples\n                \n                \n                    Challenges\n                \n                    Citation\n                \n                    References","metadata":{"source":"https://lilianweng.github.io/posts/2023-06-23-agent/","loc":{"lines":{"from":112,"to":146}}}},{"pageContent":"(3) Task execution: Expert models execute on the specific tasks and log results.\nInstruction:\n\nWith the input and the inference results, the AI assistant needs to describe the process and results. The previous stages can be formed as - User Input: {{ User Input }}, Task Planning: {{ Tasks }}, Model Selection: {{ Model Assignment }}, Task Execution: {{ Predictions }}. You must first answer the user's request in a straightforward manner. 
Then describe the task process and show your analysis and model inference results to the user in the first person. If inference results contain a file path, must tell the user the complete file path.","metadata":{"source":"https://lilianweng.github.io/posts/2023-06-23-agent/","loc":{"lines":{"from":277,"to":280}}}}]


answer: ""
Task
 decomposition
 is
 a
 technique
 used
 to
 break
 down
 complex
 tasks
 into
 smaller
 and
 simpler
 steps
.
 It
 can
 be
 done
 through
 various
 methods
 such
 as
 using
 prompting
 techniques
,
 task
-specific
 instructions
,
 or
 human
 inputs
.
 Another
 approach
 involves
 outsourcing
 the
 planning
 step
 to
 an
 external
 classical
 planner
.

"answer"

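The loop above merges values with `+=`. That works here because `question` and `context` each arrive in a single chunk and only `answer` is split into string pieces. If an array-valued key were ever delivered across several chunks, `+=` would coerce the arrays to strings. The following sketch shows one way to merge by type instead. `mergeChunk`, `Accumulated`, and `demoChunks` are hypothetical names, not LangChain APIs.

```typescript
type Accumulated = Record<string, unknown>;

// Hypothetical helper: fold one streamed chunk into the accumulated output.
// Strings are concatenated, arrays are appended, anything else is overwritten.
// The accumulator is mutated in place and also returned for convenience.
function mergeChunk(acc: Accumulated, chunk: Accumulated): Accumulated {
  for (const key of Object.keys(chunk)) {
    const previous = acc[key];
    const incoming = chunk[key];
    if (typeof previous === "string" && typeof incoming === "string") {
      acc[key] = previous + incoming;
    } else if (Array.isArray(previous) && Array.isArray(incoming)) {
      acc[key] = previous.concat(incoming);
    } else {
      acc[key] = incoming;
    }
  }
  return acc;
}

// Stand-in chunks shaped like the stream above, with `context` deliberately
// split across two chunks to exercise the array branch.
const demoChunks: Accumulated[] = [
  { question: "What is task decomposition?" },
  { context: [{ pageContent: "first chunk" }] },
  { context: [{ pageContent: "second chunk" }] },
  { answer: "" },
  { answer: "Task" },
  { answer: " decomposition" },
];

let merged: Accumulated = {};
for (const chunk of demoChunks) {
  merged = mergeChunk(merged, chunk);
}
console.log(JSON.stringify(merged, null, 2));
```

In the real loop, `mergeChunk(output, chunk)` could stand in for the inner `if`/`else` that assigns or appends to `output[key]`.
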
Next steps

You've now learned how to stream responses from a QA chain.

Next, check out some of the other how-to guides on RAG, such as how to add chat history.
