Neo4j Vector Index

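Neo4j is an open-source graph database with integrated support for vector similarity search. It supports:

Approximate nearest neighbor search
Euclidean similarity and cosine similarity
Hybrid search combining vector and keyword searches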
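
To make the two similarity metrics listed above concrete, here is a minimal TypeScript sketch that computes them for plain number arrays. The function names are illustrative and are not part of Neo4j or LangChain. The mapping of each metric onto a score between 0 and 1 reflects how Neo4j's vector index is understood to normalize scores, so treat those two formulas as an assumption rather than a specification.

```typescript
// Raw cosine similarity, in the range [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for zero vectors");
  }
  return dot / Math.sqrt(normA * normB);
}

// Euclidean (L2) distance between two vectors.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return Math.sqrt(sum);
}

// Assumed normalizations that map each metric onto a score in [0, 1],
// where 1 means the vectors match exactly.
function cosineScore(a: number[], b: number[]): number {
  return (1 + cosineSimilarity(a, b)) / 2;
}

function euclideanScore(a: number[], b: number[]): number {
  const distance = euclideanDistance(a, b);
  return 1 / (1 + distance * distance);
}

console.log(cosineScore([1, 0], [0, 1]));
console.log(euclideanScore([0, 0], [3, 4]));
```

Approximate nearest neighbor search avoids running this comparison against every stored vector; the index narrows the candidates first and scores only those.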

Setup

To work with the Neo4j Vector Index, you need to install the neo4j-driver package:

npm install neo4j-driver

yarn add neo4j-driver

pnpm add neo4j-driver

Tip

See this section for general instructions on installing integration packages.

npm install @langchain/openai @langchain/community

yarn add @langchain/openai @langchain/community

pnpm add @langchain/openai @langchain/community

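Set up a self-hosted `Neo4j` instance with `docker-compose`

Neo4j provides a prebuilt Docker image that can be used to quickly set up a self-hosted Neo4j database instance. Create a file named docker-compose.yml with the following content: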

services:
  database:
    image: neo4j
    ports:
      - 7687:7687
      - 7474:7474
    environment:
      - NEO4J_AUTH=neo4j/pleaseletmein

Then run docker compose up in the same directory to start the container.

You can find more information on how to set up Neo4j on their website.
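
The `NEO4J_AUTH` value in the compose file has the form `username/password`, and the examples below repeat those credentials by hand. As a small sketch, the helpers here derive the connection settings from that one string so the two stay in sync. `parseNeo4jAuth` and `buildConnectionConfig` are hypothetical names introduced for illustration, and the default host and port simply mirror the port mapping in the compose file.

```typescript
interface Neo4jCredentials {
  username: string;
  password: string;
}

// Hypothetical helper: split a NEO4J_AUTH value of the form "username/password".
// Only the first "/" is treated as the separator, so passwords may contain "/".
function parseNeo4jAuth(auth: string): Neo4jCredentials {
  const separator = auth.indexOf("/");
  if (separator <= 0 || separator === auth.length - 1) {
    throw new Error('Expected NEO4J_AUTH in the form "username/password"');
  }
  return {
    username: auth.slice(0, separator),
    password: auth.slice(separator + 1),
  };
}

// Hypothetical helper: build the url/username/password fields used by the
// configuration objects in the examples on this page.
function buildConnectionConfig(
  auth: string,
  host = "127.0.0.1",
  boltPort = 7687
): Neo4jCredentials & { url: string } {
  return {
    url: `bolt://${host}:${boltPort}`,
    ...parseNeo4jAuth(auth),
  };
}

const connection = buildConnectionConfig("neo4j/pleaseletmein");
console.log(connection);
```

In a real deployment the auth string would more likely come from an environment variable or a secret store than from a literal in source code.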

Usage

Below is a complete example of using Neo4jVectorStore:

import { OpenAIEmbeddings } from "@langchain/openai";
import { Neo4jVectorStore } from "@langchain/community/vectorstores/neo4j_vector";

// Configuration object for Neo4j connection and other related settings
const config = {
  url: "bolt://127.0.0.1:7687", // URL for the Neo4j instance
  username: "neo4j", // Username for Neo4j authentication
  password: "pleaseletmein", // Password for Neo4j authentication
  indexName: "vector", // Name of the vector index
  keywordIndexName: "keyword", // Name of the keyword index if using hybrid search
  searchType: "vector" as const, // Type of search (e.g., vector, hybrid)
  nodeLabel: "Chunk", // Label for the nodes in the graph
  textNodeProperty: "text", // Property of the node containing text
  embeddingNodeProperty: "embedding", // Property of the node containing embedding
};

const documents = [
  { pageContent: "what's this", metadata: { a: 2 } },
  { pageContent: "Cat drinks milk", metadata: { a: 1 } },
];

const neo4jVectorIndex = await Neo4jVectorStore.fromDocuments(
  documents,
  new OpenAIEmbeddings(),
  config
);

const results = await neo4jVectorIndex.similaritySearch("water", 1);

console.log(results);

/*
  [ Document { pageContent: 'Cat drinks milk', metadata: { a: 1 } } ]
*/

await neo4jVectorIndex.close();

API Reference

OpenAIEmbeddings from @langchain/openai
Neo4jVectorStore from @langchain/community/vectorstores/neo4j_vector

Use the retrievalQuery parameter to customize the response

import { OpenAIEmbeddings } from "@langchain/openai";
import { Neo4jVectorStore } from "@langchain/community/vectorstores/neo4j_vector";

/*
 * The retrievalQuery is a customizable Cypher query fragment used in the Neo4jVectorStore class to define how
 * search results should be retrieved and presented from the Neo4j database. It allows developers to specify
 * the format and structure of the data returned after a similarity search.
 * Mandatory columns for `retrievalQuery`:
 *
 * 1. text:
 *    - Description: Represents the textual content of the node.
 *    - Type: String
 *
 * 2. score:
 *    - Description: Represents the similarity score of the node in relation to the search query. A
 *      higher score indicates a closer match.
 *    - Type: Float (ranging between 0 and 1, where 1 is a perfect match)
 *
 * 3. metadata:
 *    - Description: Contains additional properties and information about the node. This can include
 *      any other attributes of the node that might be relevant to the application.
 *    - Type: Object (key-value pairs)
 *    - Example: { "id": "12345", "category": "Books", "author": "John Doe" }
 *
 * Note: While you can customize the `retrievalQuery` to fetch additional columns or perform
 * transformations, never omit the mandatory columns. The names of these columns (`text`, `score`,
 * and `metadata`) should remain consistent. Renaming them might lead to errors or unexpected behavior.
 */

// Configuration object for Neo4j connection and other related settings
const config = {
  url: "bolt://127.0.0.1:7687", // URL for the Neo4j instance
  username: "neo4j", // Username for Neo4j authentication
  password: "pleaseletmein", // Password for Neo4j authentication
  retrievalQuery: `
    RETURN node.text AS text, score, {a: node.a * 2} AS metadata
  `,
};

const documents = [
  { pageContent: "what's this", metadata: { a: 2 } },
  { pageContent: "Cat drinks milk", metadata: { a: 1 } },
];

const neo4jVectorIndex = await Neo4jVectorStore.fromDocuments(
  documents,
  new OpenAIEmbeddings(),
  config
);

const results = await neo4jVectorIndex.similaritySearch("water", 1);

console.log(results);

/*
  [ Document { pageContent: 'Cat drinks milk', metadata: { a: 2 } } ]
*/

await neo4jVectorIndex.close();

API Reference

OpenAIEmbeddings from @langchain/openai
Neo4jVectorStore from @langchain/community/vectorstores/neo4j_vector
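
Because a retrievalQuery must always return the `text`, `score`, and `metadata` columns, it can help to generate the fragment instead of writing it by hand. The sketch below is one possible approach: `buildRetrievalQuery` and `missingColumns` are hypothetical helpers, not part of LangChain, and the column check is a deliberately naive pattern match rather than a Cypher parser.

```typescript
const REQUIRED_COLUMNS = ["text", "score", "metadata"] as const;

// Hypothetical helper: build a retrievalQuery fragment that returns the text
// property as `text`, passes `score` through, and copies the listed node
// properties into the `metadata` map.
function buildRetrievalQuery(
  textProperty: string,
  metadataProperties: string[]
): string {
  const identifier = /^[A-Za-z_][A-Za-z0-9_]*$/;
  for (const name of [textProperty, ...metadataProperties]) {
    // Reject anything that is not a plain identifier so that property names
    // cannot inject arbitrary Cypher into the fragment.
    if (!identifier.test(name)) {
      throw new Error(`Unsupported property name: ${name}`);
    }
  }
  const metadataMap = metadataProperties
    .map((name) => `${name}: node.${name}`)
    .join(", ");
  return `RETURN node.${textProperty} AS text, score, {${metadataMap}} AS metadata`;
}

// Hypothetical helper: report which mandatory columns a fragment appears to
// omit. A column counts as present if it is aliased with `AS <name>` or
// returned bare (for example `, score,`).
function missingColumns(query: string): string[] {
  const trimmed = query.trim();
  return REQUIRED_COLUMNS.filter((column) => {
    const pattern = new RegExp(
      `\\bAS\\s+${column}\\b|(?:RETURN|,)\\s*${column}\\s*(?:,|$)`,
      "i"
    );
    return !pattern.test(trimmed);
  });
}

const retrievalQuery = buildRetrievalQuery("text", ["a"]);
console.log(retrievalQuery);
console.log(missingColumns(retrievalQuery));
```

The generated string can be passed as the `retrievalQuery` field of the configuration object, in the same position as the hand-written fragment in the example above.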

Instantiate Neo4jVectorStore from an existing graph

import { OpenAIEmbeddings } from "@langchain/openai";
import { Neo4jVectorStore } from "@langchain/community/vectorstores/neo4j_vector";

/**
 * `fromExistingGraph` Method:
 *
 * Description:
 * This method initializes a `Neo4jVectorStore` instance using an existing graph in the Neo4j database.
 * It's designed to work with nodes that already have textual properties but might not have embeddings.
 * The method will compute and store embeddings for nodes that lack them.
 *
 * Note:
 * This method is particularly useful when you have a pre-existing graph with textual data and you want
 * to enhance it with vector embeddings for similarity searches without altering the original data structure.
 */

// Configuration object for Neo4j connection and other related settings
const config = {
  url: "bolt://127.0.0.1:7687", // URL for the Neo4j instance
  username: "neo4j", // Username for Neo4j authentication
  password: "pleaseletmein", // Password for Neo4j authentication
  indexName: "wikipedia",
  nodeLabel: "Wikipedia",
  textNodeProperties: ["title", "description"],
  embeddingNodeProperty: "embedding",
  searchType: "hybrid" as const,
};

// You should have a populated Neo4j database to use this method
const neo4jVectorIndex = await Neo4jVectorStore.fromExistingGraph(
  new OpenAIEmbeddings(),
  config
);

await neo4jVectorIndex.close();

API Reference

OpenAIEmbeddings from @langchain/openai
Neo4jVectorStore from @langchain/community/vectorstores/neo4j_vector
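
The comment above says that nodes without embeddings get them computed from their text properties. The sketch below illustrates that idea on plain in-memory objects. It is not the library's implementation: `combineTextProperties` and `nodesMissingEmbeddings` are hypothetical helpers, and the `property: value` line format used to join several text properties is an assumption made for illustration.

```typescript
type GraphNode = Record<string, unknown>;

// Hypothetical helper: join the configured text properties of a node into a
// single string that could be sent to an embedding model. Properties that are
// missing or empty are skipped.
function combineTextProperties(
  node: GraphNode,
  textNodeProperties: string[]
): string {
  return textNodeProperties
    .filter((prop) => {
      const value = node[prop];
      return typeof value === "string" && value.length > 0;
    })
    .map((prop) => `${prop}: ${node[prop] as string}`)
    .join("\n");
}

// Hypothetical helper: select the nodes that do not yet carry an embedding
// array under the configured property name.
function nodesMissingEmbeddings(
  nodes: GraphNode[],
  embeddingNodeProperty: string
): GraphNode[] {
  return nodes.filter((node) => !Array.isArray(node[embeddingNodeProperty]));
}

const nodes: GraphNode[] = [
  { title: "Neo4j", description: "Graph database" },
  { title: "Cypher", embedding: [0.1, 0.2] },
];

for (const node of nodesMissingEmbeddings(nodes, "embedding")) {
  console.log(combineTextProperties(node, ["title", "description"]));
}
```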

Metadata filtering

import { OpenAIEmbeddings } from "@langchain/openai";
import { Neo4jVectorStore } from "@langchain/community/vectorstores/neo4j_vector";

/**
 * `similaritySearch` Method with Metadata Filtering:
 *
 * Description:
 * This method facilitates advanced similarity searches within a Neo4j vector index, leveraging both text embeddings and metadata attributes.
 * The third parameter, `filter`, allows for the specification of metadata-based conditions that pre-filter the nodes before performing the similarity search.
 * This approach enhances the search precision by allowing users to query based on complex metadata criteria alongside textual similarity.
 * Metadata filtering supports the following operators:
 *
 *  $eq: Equal
 *  $ne: Not Equal
 *  $lt: Less than
 *  $lte: Less than or equal
 *  $gt: Greater than
 *  $gte: Greater than or equal
 *  $in: In a list of values
 *  $nin: Not in a list of values
 *  $between: Between two values
 *  $like: Text contains value
 *  $ilike: Case-insensitive text contains value
 *
 * The filter supports a range of query operations such as equality checks, range queries, and compound conditions (using logical operators like $and, $or).
 * This makes it highly adaptable to varied use cases requiring detailed and specific retrieval of documents based on both content and contextual information.
 *
 * Note:
 * Effective use of this method requires a well-structured Neo4j database where nodes are enriched with both text and metadata properties.
 * The method is particularly useful in scenarios where the integration of text analysis with detailed metadata querying is crucial, such as in content recommendation systems, detailed archival searches, or any application where contextual relevance is key.
 */

// Configuration object for Neo4j connection and other related settings
const config = {
  url: "bolt://127.0.0.1:7687", // URL for the Neo4j instance
  username: "neo4j", // Username for Neo4j authentication
  password: "pleaseletmein", // Password for Neo4j authentication
  indexName: "vector", // Name of the vector index
  keywordIndexName: "keyword", // Name of the keyword index if using hybrid search
  searchType: "vector" as const, // Type of search (e.g., vector, hybrid)
  nodeLabel: "Chunk", // Label for the nodes in the graph
  textNodeProperty: "text", // Property of the node containing text
  embeddingNodeProperty: "embedding", // Property of the node containing embedding
};

const documents = [
  { pageContent: "what's this", metadata: { a: 2 } },
  { pageContent: "Cat drinks milk", metadata: { a: 1 } },
];

const neo4jVectorIndex = await Neo4jVectorStore.fromDocuments(
  documents,
  new OpenAIEmbeddings(),
  config
);

const filter = { a: { $eq: 1 } };
const results = await neo4jVectorIndex.similaritySearch("water", 1, { filter });

console.log(results);

/*
  [ Document { pageContent: 'Cat drinks milk', metadata: { a: 1 } } ]
*/

await neo4jVectorIndex.close();

API Reference

OpenAIEmbeddings from @langchain/openai
Neo4jVectorStore from @langchain/community/vectorstores/neo4j_vector
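
To clarify what the filter operators mean, the following sketch evaluates a filter object against plain metadata objects in memory. It only illustrates the semantics described in the comment above; the library itself applies filters inside the database, and `matchesFilter` is a hypothetical name. Treating `$between` as inclusive on both ends, and treating a bare value as shorthand for `$eq`, are assumptions made here.

```typescript
type Scalar = string | number | boolean;
type Metadata = Record<string, Scalar>;
type Filter = Record<string, unknown>;

// Order two values of the same primitive type; undefined if not comparable.
function compare(a: unknown, b: unknown): number | undefined {
  if (typeof a === "number" && typeof b === "number") return a - b;
  if (typeof a === "string" && typeof b === "string") {
    return a < b ? -1 : a > b ? 1 : 0;
  }
  return undefined;
}

// Check one metadata value against one condition such as { $gte: 1 }.
function matchesCondition(value: Scalar | undefined, condition: unknown): boolean {
  if (typeof condition !== "object" || condition === null) {
    return value === condition; // assumed shorthand for { $eq: condition }
  }
  return Object.entries(condition).every(([operator, operand]) => {
    const order = compare(value, operand);
    switch (operator) {
      case "$eq": return value === operand;
      case "$ne": return value !== operand;
      case "$lt": return order !== undefined && order < 0;
      case "$lte": return order !== undefined && order <= 0;
      case "$gt": return order !== undefined && order > 0;
      case "$gte": return order !== undefined && order >= 0;
      case "$in": return Array.isArray(operand) && operand.includes(value);
      case "$nin": return Array.isArray(operand) && !operand.includes(value);
      case "$between": {
        if (!Array.isArray(operand) || operand.length !== 2) return false;
        const low = compare(value, operand[0]);
        const high = compare(value, operand[1]);
        return low !== undefined && high !== undefined && low >= 0 && high <= 0;
      }
      case "$like":
        return typeof value === "string" && value.includes(String(operand));
      case "$ilike":
        return (
          typeof value === "string" &&
          value.toLowerCase().includes(String(operand).toLowerCase())
        );
      default:
        throw new Error(`Unsupported operator: ${operator}`);
    }
  });
}

// Evaluate a whole filter, including the logical operators $and and $or.
function matchesFilter(metadata: Metadata, filter: Filter): boolean {
  return Object.entries(filter).every(([key, condition]) => {
    if (key === "$and" || key === "$or") {
      if (!Array.isArray(condition)) return false;
      const results = condition.map((f) => matchesFilter(metadata, f as Filter));
      return key === "$and" ? results.every(Boolean) : results.some(Boolean);
    }
    return matchesCondition(metadata[key], condition);
  });
}

const sample: Metadata[] = [{ a: 2 }, { a: 1 }];
console.log(sample.filter((m) => matchesFilter(m, { a: { $eq: 1 } })));
```

With the two sample documents used on this page, the `{ a: { $eq: 1 } }` filter keeps only the metadata object whose `a` is 1, which is why the filtered search above can only return the "Cat drinks milk" document.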

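Disclaimer ⚠️

Security note: Make sure that the database connection uses credentials that are narrowly scoped to include only the necessary permissions. Failing to do so may result in data corruption or loss, since the calling code may attempt commands that delete data, mutate data (if appropriately prompted), or read sensitive data (if such data is present in the database). The best way to guard against such outcomes is to limit, as appropriate, the permissions granted to the credentials used with this tool. For example, creating a read-only user for the database is a good way to ensure that the calling code cannot mutate or delete data. See the security page for more information.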

Related

Vector store conceptual guide
Vector store how-to guides
