How to do extraction without using function calling

Prerequisites

This guide assumes familiarity with the following:

Extraction

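LLMs that follow prompt instructions well can be tasked with outputting information in a given format without using function calling.

This approach relies on designing a good prompt and then parsing the LLM's output to extract the information, though it lacks some of the guarantees provided by function calling or JSON mode.

Here, we'll use Claude, which is very good at following instructions! For more information, see the documentation on Anthropic models.

First, we'll install the integration package: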

Tip

See the general instructions on installing integration packages.

npm

npm i @langchain/anthropic zod zod-to-json-schema

yarn

yarn add @langchain/anthropic zod zod-to-json-schema

pnpm

pnpm add @langchain/anthropic zod zod-to-json-schema

import { ChatAnthropic } from "@langchain/anthropic";

const model = new ChatAnthropic({
  model: "claude-3-sonnet-20240229",
  temperature: 0,
});

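Tip

All the same considerations for extraction quality apply to the parsing approach.

This tutorial is meant to be simple, but in general you should include reference examples to improve performance!

Using StructuredOutputParser

The following example uses the built-in StructuredOutputParser to parse the output of a chat model. We use the built-in prompt formatting instructions contained in the parser.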

import { z } from "zod";
import { StructuredOutputParser } from "langchain/output_parsers";
import { ChatPromptTemplate } from "@langchain/core/prompts";

const personSchema = z
  .object({
    name: z.optional(z.string()).describe("The name of the person"),
    hair_color: z
      .optional(z.string())
      .describe("The color of the person's hair, if known"),
    height_in_meters: z
      .optional(z.string())
      .describe("Height measured in meters"),
  })
  .describe("Information about a person.");

const parser = StructuredOutputParser.fromZodSchema(personSchema);

const prompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    "Answer the user query. Wrap the output in `json` tags\n{format_instructions}",
  ],
  ["human", "{query}"],
]);

const partialedPrompt = await prompt.partial({
  format_instructions: parser.getFormatInstructions(),
});

Let's take a look at what information is sent to the model:

const query = "Anna is 23 years old and she is 6 feet tall";

const promptValue = await partialedPrompt.invoke({ query });

console.log(promptValue.toChatMessages());

[
  SystemMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: "Answer the user query. Wrap the output in `json` tags\n" +
        "You must format your output as a JSON value th"... 1444 more characters,
      additional_kwargs: {}
    },
    lc_namespace: [ "langchain_core", "messages" ],
    content: "Answer the user query. Wrap the output in `json` tags\n" +
      "You must format your output as a JSON value th"... 1444 more characters,
    name: undefined,
    additional_kwargs: {}
  },
  HumanMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: "Anna is 23 years old and she is 6 feet tall",
      additional_kwargs: {}
    },
    lc_namespace: [ "langchain_core", "messages" ],
    content: "Anna is 23 years old and she is 6 feet tall",
    name: undefined,
    additional_kwargs: {}
  }
]

const chain = partialedPrompt.pipe(model).pipe(parser);

await chain.invoke({ query });

{ name: "Anna", hair_color: "", height_in_meters: "1.83" }
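The parsed result above represents the unknown hair color as an empty string and the height as the string "1.83" (6 ft × 0.3048 m/ft = 1.8288 m, rounded). If downstream code prefers explicit absences and numeric values, a small post-processing step can follow the parser. The sketch below is one possible way to do that; `RawPerson`, `Person`, `emptyToUndefined`, and `normalizePerson` are hypothetical names introduced for illustration and are not part of LangChain.

```typescript
// Shape the parser produces for personSchema (all fields are optional strings).
interface RawPerson {
  name?: string;
  hair_color?: string;
  height_in_meters?: string;
}

// Hypothetical cleaned-up shape: missing values are undefined, height is numeric.
interface Person {
  name?: string;
  hairColor?: string;
  heightInMeters?: number;
}

// Treat empty or whitespace-only strings as "not provided".
const emptyToUndefined = (value?: string): string | undefined => {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
};

const normalizePerson = (raw: RawPerson): Person => {
  const heightText = emptyToUndefined(raw.height_in_meters);
  const height = heightText === undefined ? NaN : Number(heightText);
  return {
    name: emptyToUndefined(raw.name),
    hairColor: emptyToUndefined(raw.hair_color),
    // Drop heights that do not parse as a finite number.
    heightInMeters: isFinite(height) ? height : undefined,
  };
};

// The parser output shown above, passed through the normalizer.
const person = normalizePerson({
  name: "Anna",
  hair_color: "",
  height_in_meters: "1.83",
});
console.log(person.name, person.hairColor, person.heightInMeters);

// Sanity check on the unit conversion the model performed: 6 ft in meters.
console.log((6 * 0.3048).toFixed(2)); // "1.83"
```

Keeping this step separate from the schema leaves the prompt unchanged and makes it easy to tighten the rules later.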

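Custom parsing

You can also create a custom prompt and parser with LangChain and LCEL.

You can use a raw function to parse the output from the model.

In the following example, we pass the schema into the prompt as JSON schema. For convenience, we declare our schema with Zod and then convert it to JSON schema using the zod-to-json-schema utility.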

import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

const personSchema = z
  .object({
    name: z.optional(z.string()).describe("The name of the person"),
    hair_color: z
      .optional(z.string())
      .describe("The color of the person's hair, if known"),
    height_in_meters: z
      .optional(z.string())
      .describe("Height measured in meters"),
  })
  .describe("Information about a person.");

const peopleSchema = z.object({
  people: z.array(personSchema),
});

const SYSTEM_PROMPT_TEMPLATE = [
  "Answer the user's query. You must return your answer as JSON that matches the given schema:",
  "```json\n{schema}\n```.",
  "Make sure to wrap the answer in ```json and ``` tags. Conform to the given schema exactly.",
].join("\n");

const prompt = ChatPromptTemplate.fromMessages([
  ["system", SYSTEM_PROMPT_TEMPLATE],
  ["human", "{query}"],
]);

const extractJsonFromOutput = (message) => {
  // Message content may be a string or an array of content blocks;
  // this parser only handles the plain-string case.
  const text = message.content;
  if (typeof text !== "string") {
    throw new Error("Expected string message content");
  }

  // Match the first ```json ... ``` block; [\s\S] matches any character, including newlines
  const pattern = /```json\s*([\s\S]*?)\s*```/;

  // exec returns the first match (or null); capture group 1 holds the JSON text
  const match = pattern.exec(text);

  if (!match) {
    throw new Error(`No JSON found in: ${text}`);
  }

  try {
    return JSON.parse(match[1].trim());
  } catch (error) {
    throw new Error(`Failed to parse: ${match[1]}`);
  }
};
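The parser above reads a single fenced block. If the model may emit several JSON blocks in one reply, a variant along these lines collects all of them. This is a sketch that operates on a plain string rather than a message object; `extractAllJsonBlocks` is a hypothetical helper, the sample reply is made up, and the backticks are written as `\x60` only so the example can sit inside a fenced code block.

```typescript
// "\x60" is the backtick character; FENCE is three backticks.
const FENCE = "\x60\x60\x60";

// Hypothetical helper: return every JSON value found in fenced json blocks.
const extractAllJsonBlocks = (text: string): unknown[] => {
  // With the g flag, exec() advances through successive matches via lastIndex.
  const pattern = /\x60{3}json\s*([\s\S]*?)\s*\x60{3}/g;
  const results: unknown[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    // JSON.parse throws a SyntaxError if a block is malformed.
    results.push(JSON.parse(match[1]));
  }
  return results;
};

// Made-up model reply containing two fenced JSON blocks.
const reply = [
  "Here are the people I found:",
  FENCE + "json",
  '{"people": [{"name": "Anna", "height_in_meters": "1.83"}]}',
  FENCE,
  "And one more:",
  FENCE + "json",
  '{"people": [{"name": "Ben"}]}',
  FENCE,
].join("\n");

const blocks = extractAllJsonBlocks(reply);
console.log(blocks.length); // 2
```

Returning an array, possibly empty, instead of throwing when nothing matches lets the caller decide whether a reply without JSON is an error.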

const query = "Anna is 23 years old and she is 6 feet tall";

const promptValue = await prompt.invoke({
  schema: zodToJsonSchema(peopleSchema),
  query,
});

promptValue.toString();

"System: Answer the user's query. You must return your answer as JSON that matches the given schema:\n"... 170 more characters
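The character counts in the truncated output above are worth a second look. The rendered prompt is 270 characters long (100 shown plus 170 elided). Assuming it is laid out as "System: ...", a newline, then "Human: ...", the fixed text accounts for 255 of those characters, which leaves 15 for the `{schema}` slot. That is exactly the length of "[object Object]", so the schema object appears to have been interpolated without being serialized. The sketch below reproduces the arithmetic with plain string substitution; `render` is a hypothetical stand-in, not LangChain's template engine, and `schemaObject` is a hand-written placeholder for the output of zodToJsonSchema.

```typescript
// "\x60" is the backtick character, escaped so this snippet can itself be fenced.
const TICKS = "\x60\x60\x60";

// Same wording as SYSTEM_PROMPT_TEMPLATE in this guide.
const systemTemplate = [
  "Answer the user's query. You must return your answer as JSON that matches the given schema:",
  TICKS + "json\n{schema}\n" + TICKS + ".",
  "Make sure to wrap the answer in " + TICKS + "json and " + TICKS + " tags. Conform to the given schema exactly.",
].join("\n");

// Hypothetical stand-in for template rendering: plain substitution that
// stringifies whatever value it is given.
const render = (schema: unknown, query: string): string =>
  "System: " +
  systemTemplate.replace("{schema}", () => String(schema)) +
  "\nHuman: " +
  query;

// Hand-written placeholder; the real schema comes from zodToJsonSchema(peopleSchema).
const schemaObject = {
  type: "object",
  properties: { people: { type: "array", items: { type: "object" } } },
  required: ["people"],
};

const userQuery = "Anna is 23 years old and she is 6 feet tall";

const withRawObject = render(schemaObject, userQuery);
const withSerialized = render(JSON.stringify(schemaObject), userQuery);

console.log(withRawObject.length); // 270, i.e. 100 + 170
console.log(withRawObject.indexOf("[object Object]") !== -1); // true
console.log(withSerialized.indexOf('"people"') !== -1); // true
```

If that is what happened here, passing `JSON.stringify(zodToJsonSchema(peopleSchema))` as the `schema` value should put the actual schema text in front of the model.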

const chain = prompt.pipe(model).pipe(extractJsonFromOutput);

await chain.invoke({
  schema: zodToJsonSchema(peopleSchema),
  query,
});

{ name: "Anna", age: 23, height: { feet: 6, inches: 0 } }
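The result above does not match peopleSchema: there is no `people` array, and `age` and `height` are not fields of the schema. This is the kind of guarantee the parsing approach gives up, so it is worth validating whatever the custom parser returns before using it. The sketch below hand-rolls the check so that it runs without dependencies; `PersonRecord`, `PeopleRecord`, `isPersonRecord`, and `isPeopleRecord` are hypothetical names, and rejecting unknown keys is a choice made here for strictness.

```typescript
// Hypothetical TypeScript mirror of peopleSchema from this guide.
interface PersonRecord {
  name?: string;
  hair_color?: string;
  height_in_meters?: string;
}
interface PeopleRecord {
  people: PersonRecord[];
}

const PERSON_KEYS = ["name", "hair_color", "height_in_meters"];

const isPlainObject = (value: unknown): value is Record<string, unknown> =>
  typeof value === "object" && value !== null && !Array.isArray(value);

// A person is valid if every key is known and every present value is a string.
const isPersonRecord = (value: unknown): value is PersonRecord => {
  if (!isPlainObject(value)) return false;
  const record = value;
  return Object.keys(record).every(
    (key) => PERSON_KEYS.indexOf(key) !== -1 && typeof record[key] === "string"
  );
};

// The top-level value must be an object with a "people" array of valid persons.
const isPeopleRecord = (value: unknown): value is PeopleRecord => {
  if (!isPlainObject(value)) return false;
  const people = value["people"];
  return Array.isArray(people) && people.every(isPersonRecord);
};

// The result shown above: no "people" array, and keys outside the schema.
const observed: unknown = { name: "Anna", age: 23, height: { feet: 6, inches: 0 } };
// A made-up result that does follow the schema.
const conforming: unknown = { people: [{ name: "Anna", height_in_meters: "1.83" }] };

console.log(isPeopleRecord(observed)); // false
console.log(isPeopleRecord(conforming)); // true
```

Since Zod is already installed for this guide, calling `peopleSchema.safeParse(result)` is a comparable check that also reports which fields failed; note that Zod objects strip unknown keys by default rather than rejecting them.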

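Next steps

You've now learned how to perform extraction without using tool calling.

Next, check out some of the other guides in this section, such as the tips on how to improve extraction quality with examples.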
