MariaDB

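Compatibility

Only available on Node.js.

This requires MariaDB 11.7 or later.

This guide provides a quick overview for getting started with MariaDB vector stores. For detailed documentation of all MariaDBStore features and configurations, head to the API reference.

Overview

Integration details

Class	Package	PY support	Latest package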
`MariaDBStore`	`@langchain/community`	✅

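Setup

To use the MariaDB vector store, you'll need to set up MariaDB 11.7 or later and install the mariadb connector as a peer dependency.

This guide also uses OpenAI embeddings, which require you to install the @langchain/openai integration package. You can use other supported embedding models if you prefer.

We'll also use the uuid package to generate ids in the required format.

Tip

See this section for general instructions on installing integration packages.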

npm

npm i @langchain/community @langchain/openai @langchain/core mariadb uuid

yarn

yarn add @langchain/community @langchain/openai @langchain/core mariadb uuid

pnpm

pnpm add @langchain/community @langchain/openai @langchain/core mariadb uuid

Setting up an instance

Create a file named docker-compose.yml with the following content:

# Run this command to start the database:
# docker-compose up --build
version: "3"
services:
  db:
    hostname: 127.0.0.1
    image: mariadb/mariadb:11.7-rc
    ports:
      - 3306:3306
    restart: always
    environment:
      - MARIADB_DATABASE=api
      - MARIADB_USER=myuser
      - MARIADB_PASSWORD=ChangeMe
      - MARIADB_ROOT_PASSWORD=ChangeMe
    volumes:
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql

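Then, in the same directory, run docker compose up to start the container.

Credentials

To connect to your MariaDB instance, you'll need the corresponding credentials. For a full list of supported options, see the mariadb docs.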
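
As a rough illustration of keeping credentials out of source code, the sketch below builds the connection options from environment-style variables. The helper name connectionOptionsFromEnv and the MARIADB_* variable names are assumptions chosen to mirror the docker-compose file above; the library itself does not read them.

```typescript
// Hypothetical helper: the MARIADB_* names are assumptions that mirror the
// docker-compose file, not variables that the library or connector reads.
type ConnectionOptions = {
  host: string;
  port: number;
  user: string;
  password: string;
  database: string;
};

function connectionOptionsFromEnv(
  env: Record<string, string | undefined>
): ConnectionOptions {
  const port = Number(env["MARIADB_PORT"] ?? "3306");
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`Invalid MARIADB_PORT: ${env["MARIADB_PORT"]}`);
  }
  // Refuse to fall back to a default password.
  const password = env["MARIADB_PASSWORD"];
  if (password === undefined) {
    throw new Error("MARIADB_PASSWORD must be set");
  }
  return {
    host: env["MARIADB_HOST"] ?? "127.0.0.1",
    port,
    user: env["MARIADB_USER"] ?? "myuser",
    password,
    database: env["MARIADB_DATABASE"] ?? "api",
  };
}

// Example with an explicit object; a Node.js script would pass process.env.
const options = connectionOptionsFromEnv({ MARIADB_PASSWORD: "ChangeMe" });
console.log(options.host, options.port, options.database);
```

The resulting object has the same host, port, user, password, and database fields that this guide passes as connectionOptions.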

If you are using OpenAI embeddings for this guide, you'll also need to set your OpenAI key:

process.env.OPENAI_API_KEY = "YOUR_API_KEY";

If you want automated tracing of your model calls, you can also set your LangSmith API key by uncommenting the lines below:

// process.env.LANGCHAIN_TRACING_V2="true"
// process.env.LANGCHAIN_API_KEY="your-api-key"

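Instantiation

To instantiate the vector store, call the .initialize() static method. It automatically checks whether the table named by tableName in the passed config exists. If it does not, the method creates it with the required columns.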

import { OpenAIEmbeddings } from "@langchain/openai";

import {
  DistanceStrategy,
  MariaDBStore,
} from "@langchain/community/vectorstores/mariadb";
import { PoolConfig } from "mariadb";

const config = {
  connectionOptions: {
    type: "mariadb",
    host: "127.0.0.1",
    port: 3306,
    user: "myuser",
    password: "ChangeMe",
    database: "api",
  } as PoolConfig,
  distanceStrategy: "EUCLIDEAN" as DistanceStrategy,
};
const vectorStore = await MariaDBStore.initialize(
  new OpenAIEmbeddings(),
  config
);
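
The config above selects the "EUCLIDEAN" distance strategy. For intuition only, the sketch below shows what a Euclidean distance computes, next to cosine distance as a commonly used alternative. These are plain TypeScript definitions written for this guide, not the library's implementation; in practice the database computes the distances.

```typescript
// Plain-TypeScript definitions of two distance metrics, for intuition only.
// This is not the library's code; the database performs the real computation.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have equal length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i += 1) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return Math.sqrt(sum);
}

function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine distance is undefined for zero vectors");
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const shortVector = [3, 4];
const longVector = [6, 8]; // same direction, twice the length

console.log(euclideanDistance(shortVector, longVector)); // 5
console.log(cosineDistance(shortVector, longVector)); // 0
```

The two vectors point in the same direction, so their cosine distance is 0 even though their Euclidean distance is 5. Which metric fits best depends on whether vector length carries meaning for your embeddings.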

Manage vector store

Add items to vector store

import { v4 as uuidv4 } from "uuid";
import type { Document } from "@langchain/core/documents";

const document1: Document = {
  pageContent: "The powerhouse of the cell is the mitochondria",
  metadata: { source: "https://example.com" },
};

const document2: Document = {
  pageContent: "Buildings are made out of brick",
  metadata: { source: "https://example.com" },
};

const document3: Document = {
  pageContent: "Mitochondria are made out of lipids",
  metadata: { source: "https://example.com" },
};

const document4: Document = {
  pageContent: "The 2024 Olympics are in Paris",
  metadata: { source: "https://example.com" },
};

const documents = [document1, document2, document3, document4];

const ids = [uuidv4(), uuidv4(), uuidv4(), uuidv4()];

// ids are not mandatory, but that's for the example
await vectorStore.addDocuments(documents, { ids: ids });
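
If you supply your own ids, it can help to validate them before calling addDocuments. The sketch below is a hypothetical guard, assuming that the "required format" mentioned in the setup section is the canonical UUID text layout; the helper names isUuid and assertValidIds are not part of any library.

```typescript
// Matches the canonical 8-4-4-4-12 hexadecimal layout of a UUID string.
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isUuid(id: string): boolean {
  return UUID_PATTERN.test(id);
}

// Hypothetical guard: expects one well-formed, unique id per document.
function assertValidIds(ids: string[], documentCount: number): void {
  if (ids.length !== documentCount) {
    throw new Error(`Expected ${documentCount} ids, got ${ids.length}`);
  }
  const invalid = ids.filter((id) => !isUuid(id));
  if (invalid.length > 0) {
    throw new Error(`Not UUIDs: ${invalid.join(", ")}`);
  }
  if (new Set(ids).size !== ids.length) {
    throw new Error("Ids must be unique");
  }
}

const exampleIds = [
  "123e4567-e89b-12d3-a456-426614174000",
  "123e4567-e89b-12d3-a456-426614174001",
];
assertValidIds(exampleIds, 2); // passes silently
console.log(isUuid("not-a-uuid")); // false
```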

Delete items from vector store

const id4 = ids[ids.length - 1];

await vectorStore.delete({ ids: [id4] });

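Query vector store

Once your vector store has been created and the relevant documents have been added, you will most likely want to query it while your chain or agent runs.

Query directly

A simple similarity search can be performed as follows: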

// Filter on a metadata key that the documents added above actually have
const filter = { source: "https://example.com" };

const similaritySearchResults = await vectorStore.similaritySearch(
  "biology",
  2,
  filter
);
for (const doc of similaritySearchResults) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}

* The powerhouse of the cell is the mitochondria [{"source":"https://example.com"}]
* Mitochondria are made out of lipids [{"source":"https://example.com"}]

The filter syntax above also supports more complex expressions:

// name = 'martin' OR firstname = 'john'
const res = await vectorStore.similaritySearch("biology", 2, {
  $or: [{ name: "martin" }, { firstname: "john" }],
});
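
To make the filter semantics concrete, here is a minimal in-memory sketch of how such a filter could be evaluated against a document's metadata. It covers only the two shapes shown in this guide (key equality and $or) and is an illustration written for this guide; the store itself applies filters in the database, and matchesFilter is a hypothetical name.

```typescript
type Metadata = Record<string, unknown>;
type Filter = Record<string, unknown>;

// Hypothetical evaluator: every key must match; "$or" needs any sub-filter to match.
function matchesFilter(metadata: Metadata, filter: Filter): boolean {
  return Object.keys(filter).every((key) => {
    const expected = filter[key];
    if (key === "$or") {
      if (!Array.isArray(expected)) {
        throw new Error("$or expects an array of filters");
      }
      return expected.some((sub) => matchesFilter(metadata, sub as Filter));
    }
    return metadata[key] === expected;
  });
}

// name = 'martin' OR firstname = 'john'
const orFilter: Filter = { $or: [{ name: "martin" }, { firstname: "john" }] };

console.log(matchesFilter({ name: "martin", firstname: "paul" }, orFilter)); // true
console.log(matchesFilter({ firstname: "john" }, orFilter)); // true
console.log(matchesFilter({ name: "alice" }, orFilter)); // false
```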

If you want to run a similarity search and receive the corresponding scores, you can run:

const similaritySearchWithScoreResults =
  await vectorStore.similaritySearchWithScore("biology", 2);

for (const [doc, score] of similaritySearchWithScoreResults) {
  console.log(
    `* [SIM=${score.toFixed(3)}] ${doc.pageContent} [${JSON.stringify(
      doc.metadata
    )}]`
  );
}

* [SIM=0.835] The powerhouse of the cell is the mitochondria [{"source":"https://example.com"}]
* [SIM=0.852] Mitochondria are made out of lipids [{"source":"https://example.com"}]
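
The scores above appear in ascending order, which suggests that under the "EUCLIDEAN" strategy a score is a distance and a smaller value means a closer match. Under that assumption, the sketch below keeps only results within a chosen distance; the helper name withinDistance and the 0.84 threshold are hypothetical.

```typescript
// Assumption: each score is a distance, so smaller means more similar.
function withinDistance<T>(
  results: Array<[T, number]>,
  maxDistance: number
): Array<[T, number]> {
  return results
    .filter(([, score]) => score <= maxDistance)
    .sort(([, left], [, right]) => left - right);
}

// Sample pairs copied from the output above, deliberately out of order.
const scored: Array<[string, number]> = [
  ["Mitochondria are made out of lipids", 0.852],
  ["The powerhouse of the cell is the mitochondria", 0.835],
];

const closeMatches = withinDistance(scored, 0.84);
console.log(closeMatches.length); // 1
```

In real use, you would pass the array returned by similaritySearchWithScore in place of the sample pairs.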

Query by turning into retriever

You can also transform the vector store into a retriever for easier use in your chains.

const retriever = vectorStore.asRetriever({
  // Optional filter
  // filter: filter,
  k: 2,
});
await retriever.invoke("biology");

[
  Document {
    pageContent: 'The powerhouse of the cell is the mitochondria',
    metadata: { source: 'https://example.com' },
    id: undefined
  },
  Document {
    pageContent: 'Mitochondria are made out of lipids',
    metadata: { source: 'https://example.com' },
    id: undefined
  }
]
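
Retrieved documents are usually turned into a single text block before being placed in a prompt. The sketch below shows one possible formatting step, using a local RetrievedDocument type as a stand-in for the Document objects printed above; the helper name formatDocsAsContext and the output layout are assumptions.

```typescript
// Local stand-in for the Document shape printed above.
type RetrievedDocument = {
  pageContent: string;
  metadata: Record<string, unknown>;
};

// Hypothetical formatter: numbers each document and appends its source.
function formatDocsAsContext(docs: RetrievedDocument[]): string {
  return docs
    .map((doc, index) => {
      const rawSource = doc.metadata["source"];
      const source = typeof rawSource === "string" ? rawSource : "unknown";
      return `[${index + 1}] ${doc.pageContent} (source: ${source})`;
    })
    .join("\n");
}

const retrieved: RetrievedDocument[] = [
  {
    pageContent: "The powerhouse of the cell is the mitochondria",
    metadata: { source: "https://example.com" },
  },
  {
    pageContent: "Mitochondria are made out of lipids",
    metadata: { source: "https://example.com" },
  },
];

const context = formatDocsAsContext(retrieved);
console.log(context);
```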

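Usage for retrieval-augmented generation

For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections.

Advanced: reusing connections

You can reuse connections by creating a pool and then creating new MariaDBStore instances directly via the constructor.

Note that you should call .initialize() at least once to set up your database, so that your tables exist before you use the constructor.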

import { OpenAIEmbeddings } from "@langchain/openai";
import { MariaDBStore } from "@langchain/community/vectorstores/mariadb";
import mariadb from "mariadb";

// First, follow set-up instructions at
// https://js.langchain.ac.cn/docs/modules/indexes/vector_stores/integrations/mariadb

const reusablePool = mariadb.createPool({
  host: "127.0.0.1",
  port: 3306,
  user: "myuser",
  password: "ChangeMe",
  database: "api",
});

const originalConfig = {
  pool: reusablePool,
  tableName: "testlangchainjs",
  collectionName: "sample",
  collectionTableName: "collections",
  columns: {
    idColumnName: "id",
    vectorColumnName: "vect",
    contentColumnName: "content",
    metadataColumnName: "metadata",
  },
};

// Set up the DB.
// Can skip this step if you've already initialized the DB.
// await MariaDBStore.initialize(new OpenAIEmbeddings(), originalConfig);
const mariadbStore = new MariaDBStore(new OpenAIEmbeddings(), originalConfig);

await mariadbStore.addDocuments([
  { pageContent: "what's this", metadata: { a: 2 } },
  { pageContent: "Cat drinks milk", metadata: { a: 1 } },
]);

const results = await mariadbStore.similaritySearch("water", 1);

console.log(results);

/*
  [ Document { pageContent: 'Cat drinks milk', metadata: { a: 1 } } ]
*/

const mariadbStore2 = new MariaDBStore(new OpenAIEmbeddings(), {
  pool: reusablePool,
  tableName: "testlangchainjs",
  collectionTableName: "collections",
  collectionName: "some_other_collection",
  columns: {
    idColumnName: "id",
    vectorColumnName: "vect",
    contentColumnName: "content",
    metadataColumnName: "metadata",
  },
});

const results2 = await mariadbStore2.similaritySearch("water", 1);

console.log(results2);

/*
  []
*/

await reusablePool.end();
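
The second search above returns an empty array even though both stores use the same table. One way to read this is that rows are scoped by collectionName. The in-memory sketch below illustrates that idea only; CollectionView is a hypothetical class and does not reflect how the library stores collections.

```typescript
type Row = { collection: string; content: string };

// Hypothetical in-memory stand-in for two stores sharing one table.
class CollectionView {
  private readonly table: Row[];
  private readonly collectionName: string;

  constructor(table: Row[], collectionName: string) {
    this.table = table;
    this.collectionName = collectionName;
  }

  // Writes are tagged with this view's collection name.
  add(content: string): void {
    this.table.push({ collection: this.collectionName, content });
  }

  // Reads only see rows tagged with this view's collection name.
  list(): string[] {
    return this.table
      .filter((row) => row.collection === this.collectionName)
      .map((row) => row.content);
  }
}

const sharedTable: Row[] = [];
const sampleView = new CollectionView(sharedTable, "sample");
const otherView = new CollectionView(sharedTable, "some_other_collection");

sampleView.add("Cat drinks milk");

console.log(sampleView.list()); // [ 'Cat drinks milk' ]
console.log(otherView.list()); // []
```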

Closing connections

Make sure you close the connection when you are done, to avoid excessive resource consumption:

await vectorStore.end();

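API reference

For detailed documentation of all MariaDBStore features and configurations, head to the API reference.

Related

Vector store conceptual guide
Vector store how-to guides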
