LanceDB
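LanceDB is an embedded vector database for AI applications. It is open source and distributed under the Apache-2.0 license.
LanceDB datasets are persisted to disk and can be shared between Node.js and Python.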
Setup
Install the LanceDB Node.js bindings:
- npm: npm install -S vectordb
- Yarn: yarn add vectordb
- pnpm: pnpm add vectordb
Tip
See this section for general instructions on installing integration packages.
- npm: npm install @langchain/openai @langchain/community
- Yarn: yarn add @langchain/openai @langchain/community
- pnpm: pnpm add @langchain/openai @langchain/community
Usage
Create a new index from texts
import { LanceDB } from "@langchain/community/vectorstores/lancedb";
import { OpenAIEmbeddings } from "@langchain/openai";
import { connect } from "vectordb";
import * as fs from "node:fs/promises";
import * as path from "node:path";
import os from "node:os";

export const run = async () => {
  // Create a temporary directory to hold the LanceDB dataset.
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), "lancedb-"));
  const db = await connect(dir);
  // The seed row lists the table columns: vector, text, and the id metadata field.
  const table = await db.createTable("vectors", [
    { vector: Array(1536), text: "sample", id: 1 },
  ]);

  // Embed the texts and store them, with their metadata, in the table.
  const vectorStore = await LanceDB.fromTexts(
    ["Hello world", "Bye bye", "hello nice world"],
    [{ id: 2 }, { id: 1 }, { id: 3 }],
    new OpenAIEmbeddings(),
    { table }
  );

  // Return the single closest document to the query.
  const resultOne = await vectorStore.similaritySearch("hello world", 1);
  console.log(resultOne);
  // [ Document { pageContent: 'hello nice world', metadata: { id: 3 } } ]
};
API Reference
- LanceDB from @langchain/community/vectorstores/lancedb
- OpenAIEmbeddings from @langchain/openai
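The seed row passed to createTable in the example above appears to define the table's columns, and 1536 matches the embedding size of OpenAI's text-embedding-ada-002 model. As a minimal sketch, the helper below builds such a row with an explicit zero-filled vector rather than the empty slots that Array(1536) produces in JavaScript. makeSeedRow and SeedRow are hypothetical names, not part of vectordb or LangChain, and the dimension you need depends on the embedding model you actually use.

```typescript
// Hypothetical helper (not part of vectordb or LangChain): builds a seed row
// with a zero-filled vector plus any metadata columns the table should have.
type SeedRow = { vector: number[]; text: string } & Record<string, unknown>;

function makeSeedRow(
  dimensions: number,
  metadata: Record<string, unknown> = {}
): SeedRow {
  if (!Number.isInteger(dimensions) || dimensions <= 0) {
    throw new RangeError(
      `dimensions must be a positive integer, got ${dimensions}`
    );
  }
  return {
    ...metadata,
    // Array(n) alone leaves empty slots; fill(0) stores actual numbers.
    vector: new Array<number>(dimensions).fill(0),
    text: "sample",
  };
}

// Assumption: 1536 dimensions, as produced by text-embedding-ada-002.
const seed = makeSeedRow(1536, { id: 1 });
console.log(seed.vector.length); // 1536
```

Under these assumptions, the returned object could be used as the single element of the array passed to db.createTable("vectors", [...]).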
Create a new index from a loader
import { LanceDB } from "@langchain/community/vectorstores/lancedb";
import { OpenAIEmbeddings } from "@langchain/openai";
import { TextLoader } from "langchain/document_loaders/fs/text";
import fs from "node:fs/promises";
import path from "node:path";
import os from "node:os";
import { connect } from "vectordb";

export const run = async () => {
  // Create docs with a loader
  const loader = new TextLoader("src/document_loaders/example_data/example.txt");
  const docs = await loader.load();

  // Create a temporary directory to hold the LanceDB dataset.
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), "lancedb-"));
  const db = await connect(dir);
  // The seed row includes a source column to match the metadata TextLoader produces.
  const table = await db.createTable("vectors", [
    { vector: Array(1536), text: "sample", source: "a" },
  ]);

  // Embed the loaded documents and store them in the table.
  const vectorStore = await LanceDB.fromDocuments(
    docs,
    new OpenAIEmbeddings(),
    { table }
  );

  const resultOne = await vectorStore.similaritySearch("hello world", 1);
  console.log(resultOne);
  // [
  //   Document {
  //     pageContent: 'Foo\nBar\nBaz\n\n',
  //     metadata: { source: 'src/document_loaders/example_data/example.txt' }
  //   }
  // ]
};
API Reference
- LanceDB from @langchain/community/vectorstores/lancedb
- OpenAIEmbeddings from @langchain/openai
- TextLoader from langchain/document_loaders/fs/text
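In the loader example, the seed row carries a source field, which lines up with the source key in the metadata of the documents TextLoader returns. Assuming the table only accepts metadata keys that exist as columns in that seed row (an inference from the examples, not something this page states), a quick check before indexing might look like the sketch below. DocLike and findMissingColumns are hypothetical names introduced for illustration.

```typescript
// Minimal stand-in for LangChain's Document shape (pageContent + metadata).
interface DocLike {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical helper: returns the metadata keys that have no matching
// column in the seed row, sorted alphabetically.
function findMissingColumns(
  seedRow: Record<string, unknown>,
  docs: DocLike[]
): string[] {
  const columns = new Set(Object.keys(seedRow));
  const missing = new Set<string>();
  for (const doc of docs) {
    for (const key of Object.keys(doc.metadata)) {
      if (!columns.has(key)) missing.add(key);
    }
  }
  return Array.from(missing).sort();
}

// Sample data for illustration only; the second document is made up.
const seedRow = { vector: [], text: "sample", source: "a" };
const docs: DocLike[] = [
  {
    pageContent: "Foo\nBar\nBaz\n\n",
    metadata: { source: "src/document_loaders/example_data/example.txt" },
  },
  { pageContent: "Other", metadata: { source: "b.txt", page: 2 } },
];

const missingColumns = findMissingColumns(seedRow, docs);
console.log(missingColumns); // [ 'page' ]
```

An empty result suggests the seed row covers every metadata key; a non-empty one points at fields to add to the seed row or strip from the documents.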
Open an existing dataset
import { LanceDB } from "@langchain/community/vectorstores/lancedb";
import { OpenAIEmbeddings } from "@langchain/openai";
import { connect } from "vectordb";
import * as fs from "node:fs/promises";
import * as path from "node:path";
import os from "node:os";

// You can open a LanceDB dataset created elsewhere, such as LangChain Python,
// by opening an existing table.
export const run = async () => {
  const uri = await createdTestDb();
  const db = await connect(uri);
  const table = await db.openTable("vectors");

  // Wrap the existing table without re-embedding its contents.
  const vectorStore = new LanceDB(new OpenAIEmbeddings(), { table });

  const resultOne = await vectorStore.similaritySearch("hello world", 1);
  console.log(resultOne);
  // [ Document { pageContent: 'Hello world', metadata: { id: 1 } } ]
};

// Creates a small dataset in a temporary directory and returns its path,
// standing in for a dataset that already exists on disk.
async function createdTestDb(): Promise<string> {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), "lancedb-"));
  const db = await connect(dir);
  await db.createTable("vectors", [
    { vector: Array(1536), text: "Hello world", id: 1 },
    { vector: Array(1536), text: "Bye bye", id: 2 },
    { vector: Array(1536), text: "hello nice world", id: 3 },
  ]);
  return dir;
}
API Reference
- LanceDB from @langchain/community/vectorstores/lancedb
- OpenAIEmbeddings from @langchain/openai
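The examples on this page write to a fresh temporary directory, which suits a demo but not a dataset meant to be shared with Python. A rough sketch of choosing a stable location instead is shown below. resolveDatasetUri is a hypothetical helper, and LANCEDB_URI is an assumed environment variable name chosen for illustration, not one that LanceDB or LangChain reads on its own.

```typescript
// Hypothetical helper: picks a stable dataset location so the same path can be
// given to connect() here and to a LanceDB client in another process.
// LANCEDB_URI is an assumed variable name, not something LanceDB reads itself.
function resolveDatasetUri(
  env: Record<string, string | undefined>,
  fallback = "./data/lancedb"
): string {
  const fromEnv = env["LANCEDB_URI"]?.trim();
  // Use the configured value when present and non-empty, else the fallback.
  return fromEnv ? fromEnv : fallback;
}

const uriFromEnv = resolveDatasetUri({ LANCEDB_URI: " /srv/shared/lancedb " });
const uriDefault = resolveDatasetUri({});
console.log(uriFromEnv); // /srv/shared/lancedb
console.log(uriDefault); // ./data/lancedb
```

In a Node.js program the helper would typically be called as resolveDatasetUri(process.env), and the result passed to connect() in place of the path returned by createdTestDb().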