MemoryVectorStore

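LangChain offers an in-memory, ephemeral vector store that keeps embeddings in memory and runs an exact, linear search for the most similar ones. The default similarity metric is cosine similarity, but it can be changed to any of the similarity metrics supported by ml-distance.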
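To make "exact, linear search" concrete, here is a minimal, self-contained sketch. It is not the library's implementation: the `StoredVector` type, the `linearSearch` helper, its `similarity` parameter, and the hand-picked 2-dimensional vectors are all assumptions made for illustration. The idea is simply to score every stored vector against the query and keep the top k.

```typescript
// Illustrative sketch only; none of these names are LangChain APIs.
type StoredVector = { content: string; embedding: number[] };

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored vector against the query, sort descending, keep the top k.
// The metric is a parameter so another similarity function can be swapped in.
function linearSearch(
  query: number[],
  store: StoredVector[],
  k: number,
  similarity: (a: number[], b: number[]) => number = cosineSimilarity
): Array<[StoredVector, number]> {
  return store
    .map((item): [StoredVector, number] => [
      item,
      similarity(query, item.embedding),
    ])
    .sort((x, y) => y[1] - x[1])
    .slice(0, k);
}

// Toy "embeddings" chosen by hand; real embeddings have many more dimensions.
const toyStore: StoredVector[] = [
  { content: "mitochondria", embedding: [1, 0] },
  { content: "brick", embedding: [0, 1] },
  { content: "lipids", embedding: [0.8, 0.6] },
];

const topTwo = linearSearch([1, 0.1], toyStore, 2);
// The two closest items are "mitochondria" and then "lipids".
console.log(topTwo.map(([item, score]) => `${item.content}: ${score.toFixed(3)}`));
```

Because every vector is compared on each query, the cost grows linearly with the number of stored items, which is acceptable for demos and small datasets.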

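Because it is intended for demos, it does not yet support IDs or deletion.

This guide gives a quick overview of getting started with in-memory vector stores. For detailed documentation of all MemoryVectorStore features and configurations, head to the API reference.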

Overview

Integration details

Class	Package	PY support	Package latest
`MemoryVectorStore`	`langchain`	❌

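Setup

To use in-memory vector stores, you need to install the langchain package.

This guide also uses OpenAI embeddings, which require the @langchain/openai integration package. You can use other supported embedding models instead if you prefer.

Tip

See this section for general instructions on installing integration packages.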

npm:

npm i langchain @langchain/openai @langchain/core

yarn:

yarn add langchain @langchain/openai @langchain/core

pnpm:

pnpm add langchain @langchain/openai @langchain/core

Credentials

No credentials are required to use in-memory vector stores.

If you use OpenAI embeddings for this guide, you also need to set your OpenAI key:

process.env.OPENAI_API_KEY = "YOUR_API_KEY";

If you want automated tracing of your model calls, you can also set your LangSmith API key by uncommenting the lines below:

// process.env.LANGSMITH_TRACING="true"
// process.env.LANGSMITH_API_KEY="your-api-key"

Instantiation

import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

const vectorStore = new MemoryVectorStore(embeddings);

Manage vector store

Add items to vector store

import type { Document } from "@langchain/core/documents";

const document1: Document = {
  pageContent: "The powerhouse of the cell is the mitochondria",
  metadata: { source: "https://example.com" },
};

const document2: Document = {
  pageContent: "Buildings are made out of brick",
  metadata: { source: "https://example.com" },
};

const document3: Document = {
  pageContent: "Mitochondria are made out of lipids",
  metadata: { source: "https://example.com" },
};

const documents = [document1, document2, document3];

await vectorStore.addDocuments(documents);

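Query vector store

Once your vector store has been created and the relevant documents have been added, you will most likely want to query it while your chain or agent is running.

Query directly

A simple similarity search can be performed as follows: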

// Predicate filter: keep only documents whose metadata.source matches.
const filter = (doc: Document) => doc.metadata.source === "https://example.com";

const similaritySearchResults = await vectorStore.similaritySearch(
  "biology",
  2,
  filter
);

for (const doc of similaritySearchResults) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}

* The powerhouse of the cell is the mitochondria [{"source":"https://example.com"}]
* Mitochondria are made out of lipids [{"source":"https://example.com"}]

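The filter is optional. If provided, it must be a predicate function that takes a document as input and returns true or false depending on whether the document should be returned.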
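As a minimal sketch of what such a predicate looks like in isolation, the snippet below applies one to a plain array. The local `Doc` type stands in for the library's document type, and the second source URL (`https://example.org`) is a hypothetical value added only so the filter has something to exclude. How and when the store applies the predicate internally is not covered here.

```typescript
// Illustrative stand-in for a document: text plus free-form metadata.
type Doc = { pageContent: string; metadata: Record<string, unknown> };

const sampleDocs: Doc[] = [
  {
    pageContent: "The powerhouse of the cell is the mitochondria",
    metadata: { source: "https://example.com" },
  },
  {
    pageContent: "Buildings are made out of brick",
    // Hypothetical second source, used only to show a document being excluded.
    metadata: { source: "https://example.org" },
  },
];

// A predicate: takes a document, returns true to keep it and false to drop it.
const fromExampleCom = (doc: Doc): boolean =>
  doc.metadata.source === "https://example.com";

const keptDocs = sampleDocs.filter(fromExampleCom);
console.log(keptDocs.map((doc) => doc.pageContent));
```

Any function with this shape works, so a filter can combine several metadata fields or inspect the page content itself.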

If you want to run a similarity search and receive the corresponding scores, you can run:

const similaritySearchWithScoreResults =
  await vectorStore.similaritySearchWithScore("biology", 2, filter);

for (const [doc, score] of similaritySearchWithScoreResults) {
  console.log(
    `* [SIM=${score.toFixed(3)}] ${doc.pageContent} [${JSON.stringify(
      doc.metadata
    )}]`
  );
}

* [SIM=0.165] The powerhouse of the cell is the mitochondria [{"source":"https://example.com"}]
* [SIM=0.148] Mitochondria are made out of lipids [{"source":"https://example.com"}]

Query by turning into retriever

You can also transform the vector store into a retriever for easier use in your chains:

const retriever = vectorStore.asRetriever({
  // Optional filter
  filter: filter,
  k: 2,
});

await retriever.invoke("biology");

[
  Document {
    pageContent: 'The powerhouse of the cell is the mitochondria',
    metadata: { source: 'https://example.com' },
    id: undefined
  },
  Document {
    pageContent: 'Mitochondria are made out of lipids',
    metadata: { source: 'https://example.com' },
    id: undefined
  }
]

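Maximal marginal relevance

This vector store also supports maximal marginal relevance (MMR), a technique that first fetches a larger number of results (given by searchKwargs.fetchK) with a classic similarity search, then reranks them for diversity and returns the top k results. This helps guard against redundant information: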

const mmrRetriever = vectorStore.asRetriever({
  searchType: "mmr",
  searchKwargs: {
    fetchK: 10,
  },
  // Optional filter
  filter: filter,
  k: 2,
});

await mmrRetriever.invoke("biology");

[
  Document {
    pageContent: 'The powerhouse of the cell is the mitochondria',
    metadata: { source: 'https://example.com' },
    id: undefined
  },
  Document {
    pageContent: 'Buildings are made out of brick',
    metadata: { source: 'https://example.com' },
    id: undefined
  }
]
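The reranking step described above can be sketched as a greedy loop. This is the textbook MMR formulation, not necessarily the exact scoring the library uses. The `Candidate` type, the `mmrSelect` helper, the `lambda` trade-off weight, and the toy vectors are all assumptions for illustration.

```typescript
// Illustrative sketch of greedy MMR reranking; not the library's code.
type Candidate = { content: string; embedding: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Repeatedly pick the candidate that maximizes
//   lambda * relevance(query) - (1 - lambda) * max similarity to already-picked items.
function mmrSelect(
  query: number[],
  candidates: Candidate[],
  k: number,
  lambda = 0.5
): Candidate[] {
  const remaining = [...candidates];
  const selected: Candidate[] = [];
  while (selected.length < k && remaining.length > 0) {
    let bestIndex = 0;
    let bestScore = -Infinity;
    for (let i = 0; i < remaining.length; i++) {
      const relevance = cosine(query, remaining[i].embedding);
      let redundancy = 0;
      for (const picked of selected) {
        redundancy = Math.max(
          redundancy,
          cosine(remaining[i].embedding, picked.embedding)
        );
      }
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestIndex = i;
      }
    }
    selected.push(remaining.splice(bestIndex, 1)[0]);
  }
  return selected;
}

// Toy candidates: "a-duplicate" is nearly identical to "a"; "c" is less relevant but different.
const fetched: Candidate[] = [
  { content: "a", embedding: [1, 0.9] },
  { content: "a-duplicate", embedding: [1, 0.8] },
  { content: "c", embedding: [0.2, 1] },
];
const mmrQuery = [1, 1];

// Plain similarity ranking keeps both near-duplicates.
const plainTopTwo = [...fetched]
  .sort((x, y) => cosine(mmrQuery, y.embedding) - cosine(mmrQuery, x.embedding))
  .slice(0, 2)
  .map((item) => item.content);

// MMR keeps "a" and then prefers the more diverse "c" over the near-duplicate.
const mmrTopTwo = mmrSelect(mmrQuery, fetched, 2).map((item) => item.content);
console.log({ plainTopTwo, mmrTopTwo });
```

In this sketch, a `lambda` of 1 reduces to plain similarity ranking, while lower values weight diversity more heavily.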

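Usage for retrieval-augmented generation

For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections

API reference

For detailed documentation of all MemoryVectorStore features and configurations, head to the API reference.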

Related

Vector store conceptual guide
Vector store how-to guides
