Elasticsearch

Compatibility

Only available on Node.js.

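Elasticsearch is a distributed, RESTful search engine optimized for speed and relevance on production-scale workloads. It also supports vector search using the k-nearest neighbor (kNN) algorithm, as well as custom models for natural language processing (NLP). You can read more about vector search support in Elasticsearch here.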
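To make the kNN idea concrete, the sketch below ranks a few in-memory vectors by cosine similarity to a query vector and keeps the top k. It is a brute-force illustration only: the names `cosineSimilarity`, `kNearest`, and `sampleItems` are hypothetical, and this linear scan is not how Elasticsearch itself executes a kNN search.

```typescript
// Brute-force k-nearest-neighbor search over in-memory vectors (illustration only).
type VectorItem = { id: string; vector: number[] };

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every item, sort by descending similarity, and keep the first k.
function kNearest(query: number[], items: VectorItem[], k: number): VectorItem[] {
  return items
    .map((item) => ({ item, score: cosineSimilarity(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map(({ item }) => item);
}

const sampleItems: VectorItem[] = [
  { id: "a", vector: [1, 0] },
  { id: "b", vector: [0, 1] },
  { id: "c", vector: [0.9, 0.1] },
];
const nearestIds = kNearest([1, 0], sampleItems, 2).map((item) => item.id);
```

In practice the vectors would come from an embedding model and would have hundreds or thousands of dimensions rather than two.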

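This guide provides a quick overview for getting started with Elasticsearch vector stores. For detailed documentation of all ElasticVectorSearch features and configurations, head to the API reference.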

Overview

Integration details

Class	Package	PY support	Latest package
`ElasticVectorSearch`	`@langchain/community`	✅

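Setup

To use Elasticsearch vector stores, you'll need to install the @langchain/community integration package.

LangChain.js accepts @elastic/elasticsearch as the client for the Elasticsearch vector store. You'll need to install it as a peer dependency.

This guide also uses OpenAI embeddings, which require the @langchain/openai integration package. You can use any other supported embeddings model if you prefer.

Tip

See this section for general instructions on installing integration packages.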

npm

npm i @langchain/community @elastic/elasticsearch @langchain/openai @langchain/core

yarn

yarn add @langchain/community @elastic/elasticsearch @langchain/openai @langchain/core

pnpm

pnpm add @langchain/community @elastic/elasticsearch @langchain/openai @langchain/core

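Credentials

To use Elasticsearch vector stores, you'll need a running Elasticsearch instance.

You can get started with the official Docker image, or you can use Elastic Cloud, Elastic's official cloud service.

To connect to Elastic Cloud, read the documentation linked here to obtain an API key.

If you are using OpenAI embeddings for this guide, you'll also need to set your OpenAI key: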

process.env.OPENAI_API_KEY = "YOUR_API_KEY";
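Since several settings in this guide come from environment variables, it can help to check them before creating any clients. The following is a minimal sketch of such a check; `findMissingEnv` and `exampleEnv` are hypothetical names, and the helper takes the environment as a parameter so it can be tried without touching real credentials.

```typescript
// Hypothetical helper: report which required settings are unset or empty.
function findMissingEnv(
  env: Record<string, string | undefined>,
  required: string[]
): string[] {
  return required.filter((name) => !env[name]);
}

// A stand-in environment object used here instead of process.env.
const exampleEnv: Record<string, string | undefined> = {
  OPENAI_API_KEY: "YOUR_API_KEY",
  ELASTIC_URL: "",
};
const missingNames = findMissingEnv(exampleEnv, [
  "OPENAI_API_KEY",
  "ELASTIC_URL",
]);
```

In a real script you could pass `process.env` instead and throw an error when the returned list is not empty.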

If you want automated tracing of your model calls, you can also set your LangSmith API key by uncommenting the lines below:

// process.env.LANGSMITH_TRACING="true"
// process.env.LANGSMITH_API_KEY="your-api-key"

Instantiation

How you instantiate the Elasticsearch vector store depends on where your instance is hosted.

import {
  ElasticVectorSearch,
  type ElasticClientArgs,
} from "@langchain/community/vectorstores/elasticsearch";
import { OpenAIEmbeddings } from "@langchain/openai";

import { Client, type ClientOptions } from "@elastic/elasticsearch";

import * as fs from "node:fs";

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

const config: ClientOptions = {
  node: process.env.ELASTIC_URL ?? "https://127.0.0.1:9200",
};

if (process.env.ELASTIC_API_KEY) {
  config.auth = {
    apiKey: process.env.ELASTIC_API_KEY,
  };
} else if (process.env.ELASTIC_USERNAME && process.env.ELASTIC_PASSWORD) {
  config.auth = {
    username: process.env.ELASTIC_USERNAME,
    password: process.env.ELASTIC_PASSWORD,
  };
}
// Local Docker deploys require a TLS certificate
if (process.env.ELASTIC_CERT_PATH) {
  config.tls = {
    ca: fs.readFileSync(process.env.ELASTIC_CERT_PATH),
    // Skips certificate verification; only appropriate for local development
    rejectUnauthorized: false,
  };
}
const clientArgs: ElasticClientArgs = {
  client: new Client(config),
  indexName: process.env.ELASTIC_INDEX ?? "test_vectorstore",
};

const vectorStore = new ElasticVectorSearch(embeddings, clientArgs);
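Because the authentication branch depends on which environment variables are set, it can be useful to factor that decision into a pure function that is easy to test. The sketch below does this under stated assumptions: `SketchClientConfig` is a simplified, hypothetical stand-in for the client's `ClientOptions` type, `buildClientConfig` is a hypothetical helper, and the TLS branch is left out.

```typescript
// Simplified, hypothetical stand-in for the client's ClientOptions type.
type SketchAuth =
  | { apiKey: string }
  | { username: string; password: string };

interface SketchClientConfig {
  node: string;
  auth?: SketchAuth;
}

type EnvLike = Record<string, string | undefined>;

// Mirrors the branching above: prefer an API key, else username and password.
function buildClientConfig(env: EnvLike): SketchClientConfig {
  const result: SketchClientConfig = {
    node: env["ELASTIC_URL"] ?? "https://127.0.0.1:9200",
  };
  const apiKey = env["ELASTIC_API_KEY"];
  const username = env["ELASTIC_USERNAME"];
  const password = env["ELASTIC_PASSWORD"];
  if (apiKey) {
    result.auth = { apiKey };
  } else if (username && password) {
    result.auth = { username, password };
  }
  return result;
}

// Placeholder values only; no connection is made.
const cloudConfig = buildClientConfig({
  ELASTIC_URL: "https://example.invalid:9243",
  ELASTIC_API_KEY: "YOUR_API_KEY",
});
const localConfig = buildClientConfig({
  ELASTIC_USERNAME: "elastic",
  ELASTIC_PASSWORD: "YOUR_PASSWORD",
});
```

The returned object has the same `node` and `auth` shape as the `config` built above, so in a real script it could be used to populate the options passed to `new Client(...)`.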

Manage vector store

Add items to vector store

import type { Document } from "@langchain/core/documents";

const document1: Document = {
  pageContent: "The powerhouse of the cell is the mitochondria",
  metadata: { source: "https://example.com" },
};

const document2: Document = {
  pageContent: "Buildings are made out of brick",
  metadata: { source: "https://example.com" },
};

const document3: Document = {
  pageContent: "Mitochondria are made out of lipids",
  metadata: { source: "https://example.com" },
};

const document4: Document = {
  pageContent: "The 2024 Olympics are in Paris",
  metadata: { source: "https://example.com" },
};

const documents = [document1, document2, document3, document4];

await vectorStore.addDocuments(documents, { ids: ["1", "2", "3", "4"] });

[ '1', '2', '3', '4' ]

Delete items from vector store

You can delete values from the store by passing the same IDs you used when adding them:

await vectorStore.delete({ ids: ["4"] });
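Deleting by ID only works if you still know the IDs you passed in. One option, shown as a sketch below, is to derive each ID from the document text so it can be recomputed later instead of stored. The helper `contentId` is hypothetical and not part of LangChain; it uses SHA-256 from Node's built-in `node:crypto` module.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: derive a stable hex ID from a document's text.
function contentId(pageContent: string): string {
  return createHash("sha256").update(pageContent, "utf8").digest("hex");
}

// The same text always yields the same ID; different text yields a different one.
const idFirst = contentId("The 2024 Olympics are in Paris");
const idAgain = contentId("The 2024 Olympics are in Paris");
const idOther = contentId("Buildings are made out of brick");
```

Such IDs could be supplied through the `ids` option when adding documents and recomputed when calling `delete`. Note that two documents with identical text would share an ID under this scheme.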

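Query vector store

Once your vector store has been created and the relevant documents have been added, you will most likely want to query it while your chain or agent is running.

Query directly

A simple similarity search can be performed as follows: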

const filter = [
  {
    operator: "match",
    field: "source",
    value: "https://example.com",
  },
];

const similaritySearchResults = await vectorStore.similaritySearch(
  "biology",
  2,
  filter
);

for (const doc of similaritySearchResults) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}

* The powerhouse of the cell is the mitochondria [{"source":"https://example.com"}]
* Mitochondria are made out of lipids [{"source":"https://example.com"}]

The vector store supports Elasticsearch filter syntax operators.
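To give an intuition for how an `operator` / `field` / `value` entry could correspond to an Elasticsearch query clause, the sketch below turns each entry into a clause keyed by its operator. Everything here is an assumption made for illustration: the `metadata.` field prefix, the `SketchFilter` type, and the `toQueryClauses` function are hypothetical, and this is not the library's actual translation code.

```typescript
// Hypothetical shape matching the filter entries used in this guide.
type SketchFilter = { operator: string; field: string; value: unknown };

// Map each filter entry to a clause such as { match: { "metadata.source": ... } }.
// Assumes metadata is stored under a "metadata" object field.
function toQueryClauses(
  filters: SketchFilter[]
): Record<string, Record<string, unknown>>[] {
  return filters.map(({ operator, field, value }) => ({
    [operator]: { [`metadata.${field}`]: value },
  }));
}

const sketchClauses = toQueryClauses([
  { operator: "match", field: "source", value: "https://example.com" },
]);
const sketchClausesJson = JSON.stringify(sketchClauses);
```

Consult the API reference for the operators the vector store actually accepts.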

If you want to execute a similarity search and receive the corresponding scores, you can run:

const similaritySearchWithScoreResults =
  await vectorStore.similaritySearchWithScore("biology", 2, filter);

for (const [doc, score] of similaritySearchWithScoreResults) {
  console.log(
    `* [SIM=${score.toFixed(3)}] ${doc.pageContent} [${JSON.stringify(
      doc.metadata
    )}]`
  );
}

* [SIM=0.374] The powerhouse of the cell is the mitochondria [{"source":"https://example.com"}]
* [SIM=0.370] Mitochondria are made out of lipids [{"source":"https://example.com"}]
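Scores make it possible to drop weak matches before passing results on. The sketch below filters `[document, score]` pairs against a minimum score; `keepAboveThreshold` is a hypothetical helper, the sample pairs reuse the values printed above, and it assumes that a higher score means a closer match.

```typescript
// A [document, score] pair, as yielded by a scored similarity search.
type ScoredPair<T> = [T, number];

// Keep only the pairs whose score is at least minScore.
// Assumes higher scores indicate closer matches.
function keepAboveThreshold<T>(
  results: ScoredPair<T>[],
  minScore: number
): ScoredPair<T>[] {
  return results.filter(([, score]) => score >= minScore);
}

const scoredSample: ScoredPair<{ pageContent: string }>[] = [
  [{ pageContent: "The powerhouse of the cell is the mitochondria" }, 0.374],
  [{ pageContent: "Mitochondria are made out of lipids" }, 0.37],
];
const strongMatches = keepAboveThreshold(scoredSample, 0.372);
```

A suitable threshold depends on the embedding model and similarity metric, so treat 0.372 as a placeholder rather than a recommendation.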

Query by turning into retriever

You can also transform the vector store into a retriever for easier usage in your chains.

const retriever = vectorStore.asRetriever({
  // Optional filter
  filter: filter,
  k: 2,
});
await retriever.invoke("biology");

[
  Document {
    pageContent: 'The powerhouse of the cell is the mitochondria',
    metadata: { source: 'https://example.com' },
    id: undefined
  },
  Document {
    pageContent: 'Mitochondria are made out of lipids',
    metadata: { source: 'https://example.com' },
    id: undefined
  }
]
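Retrieved documents are commonly joined into a single context string before being placed in a prompt. The following is a minimal sketch of that step; `formatDocsForPrompt` and `RetrievedDoc` are hypothetical names, and the numbering format is just one possible choice.

```typescript
// Minimal shape of a retrieved document, matching the output above.
type RetrievedDoc = { pageContent: string; metadata: Record<string, unknown> };

// Join documents into a numbered, newline-separated context block.
function formatDocsForPrompt(docs: RetrievedDoc[]): string {
  return docs
    .map((doc, index) => `[${index + 1}] ${doc.pageContent}`)
    .join("\n");
}

const contextBlock = formatDocsForPrompt([
  {
    pageContent: "The powerhouse of the cell is the mitochondria",
    metadata: { source: "https://example.com" },
  },
  {
    pageContent: "Mitochondria are made out of lipids",
    metadata: { source: "https://example.com" },
  },
]);
```

The resulting string could then be interpolated into a prompt alongside the user's question.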

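Usage for retrieval-augmented generation

For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections.

API reference

For detailed documentation of all ElasticVectorSearch features and configurations, head to the API reference.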

Related

Vector store conceptual guide
Vector store how-to guides
