How to map values to a database

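In this guide we go over strategies to improve graph database query generation by mapping values from user inputs to the database. When using the built-in graph chains, the LLM is aware of the graph schema but has no information about the property values stored in the database. We can therefore add a new step to the graph database QA system that maps values accurately.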

Setup

Install dependencies

Tip

See this section for general instructions on installing integration packages.

npm i langchain @langchain/community @langchain/openai @langchain/core neo4j-driver zod

yarn add langchain @langchain/community @langchain/openai @langchain/core neo4j-driver zod

pnpm add langchain @langchain/community @langchain/openai @langchain/core neo4j-driver zod

Set environment variables

This example uses OpenAI:

OPENAI_API_KEY=your-api-key

# Optional, use LangSmith for best-in-class observability
LANGSMITH_API_KEY=your-api-key
LANGSMITH_TRACING=true

# Reduce tracing latency if you are not in a serverless environment
# LANGCHAIN_CALLBACKS_BACKGROUND=true

Next, we need to define the Neo4j credentials. Follow these installation steps to set up a Neo4j database.

NEO4J_URI="bolt://localhost:7687"
NEO4J_USERNAME="neo4j"
NEO4J_PASSWORD="password"
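
A missing or misspelled variable otherwise only surfaces as a connection error later on. As a minimal sketch, assuming the three variable names above, the following hypothetical helper (`readNeo4jConfig` is not part of LangChain or the Neo4j driver) checks them up front:

```typescript
// Hypothetical helper for illustration; not part of LangChain or neo4j-driver.
interface Neo4jConfig {
  url: string;
  username: string;
  password: string;
}

function readNeo4jConfig(
  env: Record<string, string | undefined>
): Neo4jConfig {
  const required = ["NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD"];
  // Collect every missing or empty variable so the error lists them all at once
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    url: env["NEO4J_URI"] as string,
    username: env["NEO4J_USERNAME"] as string,
    password: env["NEO4J_PASSWORD"] as string,
  };
}
```

In a Node.js environment this could be called as `readNeo4jConfig(process.env)` and the result passed to `Neo4jGraph.initialize`.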

The example below creates a connection to a Neo4j database and populates it with sample data about movies and their actors.

import "neo4j-driver";
import { Neo4jGraph } from "@langchain/community/graphs/neo4j_graph";

const url = process.env.NEO4J_URI;
// Must match the NEO4J_USERNAME variable defined above
const username = process.env.NEO4J_USERNAME;
const password = process.env.NEO4J_PASSWORD;
const graph = await Neo4jGraph.initialize({ url, username, password });

// Import movie information
const moviesQuery = `LOAD CSV WITH HEADERS FROM 
'https://raw.githubusercontent.com/tomasonjo/blog-datasets/main/movies/movies_small.csv'
AS row
MERGE (m:Movie {id:row.movieId})
SET m.released = date(row.released),
    m.title = row.title,
    m.imdbRating = toFloat(row.imdbRating)
FOREACH (director in split(row.director, '|') | 
    MERGE (p:Person {name:trim(director)})
    MERGE (p)-[:DIRECTED]->(m))
FOREACH (actor in split(row.actors, '|') | 
    MERGE (p:Person {name:trim(actor)})
    MERGE (p)-[:ACTED_IN]->(m))
FOREACH (genre in split(row.genres, '|') | 
    MERGE (g:Genre {name:trim(genre)})
    MERGE (m)-[:IN_GENRE]->(g))`;

await graph.query(moviesQuery);

Schema refreshed successfully.

[]

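Detecting entities in the user input

We first have to extract the types of entities or values we want to map to the graph database. In this example we are working with a movie graph, so we can map movies and people to the database.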

import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI } from "@langchain/openai";
import { z } from "zod";

const llm = new ChatOpenAI({ model: "gpt-3.5-turbo", temperature: 0 });

const entitySchema = z
  .object({
    names: z
      .array(z.string())
      .describe("All the person or movies appearing in the text"),
  })
  .describe("Identifying information about entities.");

const prompt = ChatPromptTemplate.fromMessages([
  ["system", "You are extracting person and movies from the text."],
  [
    "human",
    "Use the given format to extract information from the following\ninput: {question}",
  ],
]);

const entityChain = prompt.pipe(llm.withStructuredOutput(entitySchema));

We can test the entity extraction chain.

const entities = await entityChain.invoke({
  question: "Who played in Casino movie?",
});
entities;

{ names: [ "Casino" ] }
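
The model may return the same entity more than once or with stray whitespace, and each name costs one database lookup in the next step. As a rough sketch, the hypothetical helper below (`normalizeNames` is an illustration, not part of the original guide) trims and deduplicates the extracted names before they are matched:

```typescript
// Hypothetical helper: tidy extracted names before looking them up.
function normalizeNames(names: string[]): string[] {
  const seen = new Set<string>();
  const result: string[] = [];
  for (const raw of names) {
    // Trim and collapse internal runs of whitespace
    const name = raw.trim().replace(/\s+/g, " ");
    // Compare case-insensitively, but keep the first spelling seen
    const key = name.toLowerCase();
    if (name.length === 0 || seen.has(key)) continue;
    seen.add(key);
    result.push(name);
  }
  return result;
}
```

It could be applied as `normalizeNames(entities.names)` before the values are passed on.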

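We will use a simple CONTAINS clause to match entities to the database. CONTAINS is case-sensitive and does not tolerate typos, so in practice you may want to use fuzzy search or a full-text index to allow for minor misspellings.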

const matchQuery = `
MATCH (p:Person|Movie)
WHERE p.name CONTAINS $value OR p.title CONTAINS $value
RETURN coalesce(p.name, p.title) AS result, labels(p)[0] AS type
LIMIT 1`;

// `values` is the object returned by entityChain, e.g. { names: ["Casino"] }
const matchToDatabase = async (values) => {
  let result = "";
  for (const entity of values.names) {
    const response = await graph.query(matchQuery, {
      value: entity,
    });
    if (response.length > 0) {
      result += `${entity} maps to ${response[0]["result"]} ${response[0]["type"]} in database\n`;
    }
  }
  return result;
};

await matchToDatabase(entities);

"Casino maps to Casino Movie in database\n"
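
To tolerate misspellings such as "Casno", the CONTAINS lookup could be swapped for a Neo4j full-text index queried with Lucene fuzzy terms. The following is a sketch under stated assumptions: the index name `entityIndex` and the helper `toFuzzyQuery` are hypothetical and not part of the original guide, and the index has to be created once before it is queried.

```typescript
// Assumed index name "entityIndex"; run this statement once to create the index.
const createIndexQuery = `
CREATE FULLTEXT INDEX entityIndex IF NOT EXISTS
FOR (n:Person|Movie) ON EACH [n.name, n.title]`;

// Query the full-text index instead of using CONTAINS.
const fuzzyMatchQuery = `
CALL db.index.fulltext.queryNodes('entityIndex', $query)
YIELD node, score
RETURN coalesce(node.name, node.title) AS result, labels(node)[0] AS type
LIMIT 1`;

// Hypothetical helper: turn free text into a Lucene query in which every
// word may differ from the indexed term by up to `maxEdits` edits.
function toFuzzyQuery(input: string, maxEdits: 1 | 2 = 2): string {
  const words = input
    // Replace Lucene special characters with spaces so they are not parsed as operators
    .replace(/[+\-&|!(){}\[\]^"~*?:\\\/]/g, " ")
    .split(/\s+/)
    .filter((word) => word.length > 0);
  return words.map((word) => `${word}~${maxEdits}`).join(" AND ");
}
```

With these in place, the lookup inside `matchToDatabase` could call `graph.query(fuzzyMatchQuery, { query: toFuzzyQuery(entity) })` instead of the CONTAINS query. Results come back ordered by relevance score, so `LIMIT 1` keeps the best match.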

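Custom Cypher generating chain

We need to define a custom Cypher prompt that takes the entity mapping information along with the schema and the user question to construct a Cypher statement. We will use the LangChain Expression Language to accomplish that.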

import { StringOutputParser } from "@langchain/core/output_parsers";
import {
  RunnablePassthrough,
  RunnableSequence,
} from "@langchain/core/runnables";

// Generate Cypher statement based on natural language input
const cypherTemplate = `Based on the Neo4j graph schema below, write a Cypher query that would answer the user's question:
{schema}
Entities in the question map to the following database values:
{entities_list}
Question: {question}
Cypher query:`;

const cypherPrompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    "Given an input question, convert it to a Cypher query. No pre-amble.",
  ],
  ["human", cypherTemplate],
]);

const llmWithStop = llm.bind({ stop: ["\nCypherResult:"] });

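const cypherResponse = RunnableSequence.from([
  // Step 1: extract entities; `names` holds the entityChain output, e.g. { names: ["Casino"] }
  RunnablePassthrough.assign({ names: entityChain }),
  // Step 2: map the extracted entities to database values and fetch the graph schema
  RunnablePassthrough.assign({
    entities_list: async (x) => matchToDatabase(x.names),
    schema: async (_) => graph.getSchema(),
  }),
  // Step 3: fill the prompt, call the model, and return the Cypher text
  cypherPrompt,
  llmWithStop,
  new StringOutputParser(),
]);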

const cypher = await cypherResponse.invoke({
  question: "Who played in Casino movie?",
});
cypher;

'MATCH (:Movie {title: "Casino"})<-[:ACTED_IN]-(actor)\nRETURN actor.name'
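
Because the statement is generated by a model, it may be worth checking it before running it against the database. The sketch below is a deliberately naive guard, not a Cypher parser: the hypothetical `isReadOnlyCypher` helper only looks for common write keywords, does not handle backtick-quoted identifiers, and does not detect procedures that write data.

```typescript
// Keywords that modify data or schema; an illustrative list, not exhaustive.
const WRITE_CLAUSES =
  /\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|LOAD\s+CSV|FOREACH)\b/i;

// Hypothetical helper: returns false if the query appears to write data.
function isReadOnlyCypher(query: string): boolean {
  // Blank out string literals first so a title such as "Set It Off" is not flagged
  const withoutStrings = query.replace(
    /"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'/g,
    "''"
  );
  return !WRITE_CLAUSES.test(withoutStrings);
}
```

A statement that passes the check could then be executed with `graph.query(cypher)`. Connecting with a database user that only has read privileges is a more reliable safeguard than keyword matching.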
