Supabase
This guide will help you get started with this retriever, which is backed by a Supabase vector store. For detailed documentation of all features and configurations, head to the API reference.
Overview
A self-query retriever retrieves documents by dynamically generating metadata filters from the input query. This lets the retriever weigh the underlying document metadata when fetching results, in addition to pure semantic similarity.
It uses a module called a Translator, which generates filters based on information about the metadata fields and on the query language supported by the given vector store.
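As an illustration only (the real SupabaseTranslator has a different interface; every name below is hypothetical), a translator conceptually maps a structured comparison onto the store's native filter syntax. For Supabase that means PostgREST-style filters over a jsonb metadata column:

```typescript
// Hypothetical sketch of the translation step, NOT the SupabaseTranslator API.
// It only illustrates the idea of turning a structured
// (attribute, comparator, value) triple into a backend filter expression.
type Comparison = {
  attribute: string;
  comparator: "eq" | "ne" | "gt" | "gte" | "lt" | "lte";
  value: string | number;
};

// Supabase stores metadata as jsonb, so (roughly) string attributes are
// addressed with ->> (text extraction) and numeric ones with -> (json).
function toPostgrestFilter(c: Comparison): string {
  const column =
    typeof c.value === "number"
      ? `metadata->${c.attribute}`
      : `metadata->>${c.attribute}`;
  return `${column}.${c.comparator}.${c.value}`;
}

console.log(toPostgrestFilter({ attribute: "rating", comparator: "gt", value: 8.5 }));
// metadata->rating.gt.8.5
```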
Integration details
Backing vector store | Self-host | Cloud offering | Package | Py support |
---|---|---|---|---|
SupabaseVectorStore | ✅ | ✅ | @langchain/community | ✅ |
Setup
Follow the instructions described here to set up a Supabase instance, then set the following environment variables:
process.env.SUPABASE_PRIVATE_KEY = "YOUR_SUPABASE_PRIVATE_KEY";
process.env.SUPABASE_URL = "YOUR_SUPABASE_URL";
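If you have not created the backing table yet, the Supabase vector store expects (by default) a `documents` table and a `match_documents` function roughly like the following. This is adapted from the standard Supabase + pgvector setup; verify the exact SQL against the current Supabase documentation:

```sql
-- Enable the pgvector extension to work with embedding vectors
create extension if not exists vector;

-- Table holding the documents; 1536 dimensions matches OpenAI embeddings
create table documents (
  id bigserial primary key,
  content text,        -- corresponds to Document.pageContent
  metadata jsonb,      -- corresponds to Document.metadata
  embedding vector(1536)
);

-- Similarity-search function the vector store calls via RPC
create function match_documents (
  query_embedding vector(1536),
  match_count int default null,
  filter jsonb default '{}'
) returns table (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
language plpgsql
as $$
#variable_conflict use_column
begin
  return query
  select
    id,
    content,
    metadata,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where metadata @> filter
  order by documents.embedding <=> query_embedding
  limit match_count;
end;
$$;
```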
If you want to get automated tracing from individual queries, you can also set your LangSmith API key by uncommenting the lines below:
// process.env.LANGSMITH_API_KEY = "<YOUR API KEY HERE>";
// process.env.LANGSMITH_TRACING = "true";
Installation
The vector store lives in the @langchain/community package, which requires the official Supabase SDK as a peer dependency. You will also need to install the langchain package to import the main SelfQueryRetriever class.
For this example, we will also use OpenAI embeddings, so you will need to install the @langchain/openai package and obtain an API key.
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/community langchain @langchain/openai @supabase/supabase-js
yarn add @langchain/community langchain @langchain/openai @supabase/supabase-js
pnpm add @langchain/community langchain @langchain/openai @supabase/supabase-js
Instantiation
First, initialize your Supabase vector store with some documents that contain metadata:
import { OpenAIEmbeddings } from "@langchain/openai";
import { SupabaseVectorStore } from "@langchain/community/vectorstores/supabase";
import { Document } from "@langchain/core/documents";
import type { AttributeInfo } from "langchain/chains/query_constructor";
import { createClient } from "@supabase/supabase-js";
/**
* First, we create a bunch of documents. You can load your own documents here instead.
* Each document has a pageContent and a metadata field. Make sure your metadata matches the AttributeInfo below.
*/
const docs = [
new Document({
pageContent:
"A bunch of scientists bring back dinosaurs and mayhem breaks loose",
metadata: { year: 1993, rating: 7.7, genre: "science fiction" },
}),
new Document({
pageContent:
"Leo DiCaprio gets lost in a dream within a dream within a dream within a ...",
metadata: { year: 2010, director: "Christopher Nolan", rating: 8.2 },
}),
new Document({
pageContent:
"A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea",
metadata: { year: 2006, director: "Satoshi Kon", rating: 8.6 },
}),
new Document({
pageContent:
"A bunch of normal-sized women are supremely wholesome and some men pine after them",
metadata: { year: 2019, director: "Greta Gerwig", rating: 8.3 },
}),
new Document({
pageContent: "Toys come alive and have a blast doing so",
metadata: { year: 1995, genre: "animated" },
}),
new Document({
pageContent: "Three men walk into the Zone, three men walk out of the Zone",
metadata: {
year: 1979,
director: "Andrei Tarkovsky",
genre: "science fiction",
rating: 9.9,
},
}),
];
/**
* Next, we define the attributes we want to be able to query on.
* In this case, we want to be able to query on the genre, year, director, rating, and length of the movie.
* We also provide a description of each attribute and the type of the attribute.
* This is used to generate the query prompts.
*/
const attributeInfo: AttributeInfo[] = [
{
name: "genre",
description: "The genre of the movie",
type: "string or array of strings",
},
{
name: "year",
description: "The year the movie was released",
type: "number",
},
{
name: "director",
description: "The director of the movie",
type: "string",
},
{
name: "rating",
description: "The rating of the movie (1-10)",
type: "number",
},
{
name: "length",
description: "The length of the movie in minutes",
type: "number",
},
];
/**
* Next, we instantiate a vector store. This is where we store the embeddings of the documents.
* We also need to provide an embeddings object. This is used to embed the documents.
*/
const client = createClient(
process.env.SUPABASE_URL,
process.env.SUPABASE_PRIVATE_KEY
);
const embeddings = new OpenAIEmbeddings();
const vectorStore = await SupabaseVectorStore.fromDocuments(docs, embeddings, {
client,
});
Now we can instantiate our retriever:
Pick your chat model:
- OpenAI
- Anthropic
- FireworksAI
- MistralAI
- Groq
- VertexAI
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/openai
yarn add @langchain/openai
pnpm add @langchain/openai
Add environment variables
OPENAI_API_KEY=your-api-key
Instantiate the model
import { ChatOpenAI } from "@langchain/openai";
const llm = new ChatOpenAI({
model: "gpt-4o-mini",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/anthropic
yarn add @langchain/anthropic
pnpm add @langchain/anthropic
Add environment variables
ANTHROPIC_API_KEY=your-api-key
Instantiate the model
import { ChatAnthropic } from "@langchain/anthropic";
const llm = new ChatAnthropic({
model: "claude-3-5-sonnet-20240620",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/community
yarn add @langchain/community
pnpm add @langchain/community
Add environment variables
FIREWORKS_API_KEY=your-api-key
Instantiate the model
import { ChatFireworks } from "@langchain/community/chat_models/fireworks";
const llm = new ChatFireworks({
model: "accounts/fireworks/models/llama-v3p1-70b-instruct",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/mistralai
yarn add @langchain/mistralai
pnpm add @langchain/mistralai
Add environment variables
MISTRAL_API_KEY=your-api-key
Instantiate the model
import { ChatMistralAI } from "@langchain/mistralai";
const llm = new ChatMistralAI({
model: "mistral-large-latest",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/groq
yarn add @langchain/groq
pnpm add @langchain/groq
Add environment variables
GROQ_API_KEY=your-api-key
Instantiate the model
import { ChatGroq } from "@langchain/groq";
const llm = new ChatGroq({
model: "mixtral-8x7b-32768",
temperature: 0
});
Install dependencies
See this section for general instructions on installing integration packages.
- npm
- yarn
- pnpm
npm i @langchain/google-vertexai
yarn add @langchain/google-vertexai
pnpm add @langchain/google-vertexai
Add environment variables
GOOGLE_APPLICATION_CREDENTIALS=credentials.json
Instantiate the model
import { ChatVertexAI } from "@langchain/google-vertexai";
const llm = new ChatVertexAI({
model: "gemini-1.5-flash",
temperature: 0
});
import { SelfQueryRetriever } from "langchain/retrievers/self_query";
import { SupabaseTranslator } from "@langchain/community/structured_query/supabase";
const selfQueryRetriever = SelfQueryRetriever.fromLLM({
llm: llm,
vectorStore: vectorStore,
/** A short summary of what the document contents represent. */
documentContents: "Brief summary of a movie",
attributeInfo: attributeInfo,
structuredQueryTranslator: new SupabaseTranslator(),
});
Usage
Now, ask a question that requires some knowledge of the documents' metadata to answer. You can see that the retriever generates the correct result:
await selfQueryRetriever.invoke("Which movies are rated higher than 8.5?");
[
Document {
pageContent: 'A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea',
metadata: { year: 2006, rating: 8.6, director: 'Satoshi Kon' },
id: undefined
},
Document {
pageContent: 'Three men walk into the Zone, three men walk out of the Zone',
metadata: {
year: 1979,
genre: 'science fiction',
rating: 9.9,
director: 'Andrei Tarkovsky'
},
id: undefined
}
]
Use within a chain
Like other retrievers, the Supabase self-query retriever can be incorporated into LLM applications via chains.
Note that because the returned answers can depend heavily on document metadata, we format the retrieved documents differently to include that information.
import { ChatPromptTemplate } from "@langchain/core/prompts";
import {
RunnablePassthrough,
RunnableSequence,
} from "@langchain/core/runnables";
import { StringOutputParser } from "@langchain/core/output_parsers";
import type { Document } from "@langchain/core/documents";
const prompt = ChatPromptTemplate.fromTemplate(`
Answer the question based only on the context provided.
Context: {context}
Question: {question}`);
const formatDocs = (docs: Document[]) => {
return docs.map((doc) => JSON.stringify(doc)).join("\n\n");
};
// See https://js.langchain.ac.cn/v0.2/docs/tutorials/rag
const ragChain = RunnableSequence.from([
{
context: selfQueryRetriever.pipe(formatDocs),
question: new RunnablePassthrough(),
},
prompt,
llm,
new StringOutputParser(),
]);
await ragChain.invoke("Which movies are rated higher than 8.5?");
The movies rated higher than 8.5 are:
1. The movie directed by Satoshi Kon in 2006, which has a rating of 8.6.
2. The movie directed by Andrei Tarkovsky in 1979, which has a rating of 9.9.
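The formatDocs helper in the chain above serializes each entire Document rather than just its pageContent, which is what keeps fields like rating and year visible to the model. Here is a minimal standalone check of that formatting logic, with a simplified inline Document type for illustration:

```typescript
// Same serialization logic as the formatDocs helper above, with a
// simplified inline type so the snippet stands alone.
type Doc = { pageContent: string; metadata: Record<string, unknown> };

const formatDocs = (docs: Doc[]): string =>
  docs.map((doc) => JSON.stringify(doc)).join("\n\n");

const sample: Doc[] = [
  { pageContent: "Toys come alive", metadata: { year: 1995, genre: "animated" } },
];

console.log(formatDocs(sample));
// {"pageContent":"Toys come alive","metadata":{"year":1995,"genre":"animated"}}
```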
Default search params
You can also pass a searchParams field into the above method, which provides default filters applied in addition to any generated query. The filter syntax is a function that returns a Supabase filter:
import type { SupabaseFilter } from "@langchain/community/vectorstores/supabase";
const selfQueryRetrieverWithDefaultParams = SelfQueryRetriever.fromLLM({
llm: llm,
vectorStore: vectorStore,
documentContents: "Brief summary of a movie",
attributeInfo: attributeInfo,
structuredQueryTranslator: new SupabaseTranslator(),
searchParams: {
filter: (rpc: SupabaseFilter) =>
rpc.filter("metadata->>type", "eq", "movie"),
mergeFiltersOperator: "and",
},
});
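Conceptually, mergeFiltersOperator controls whether a matching document must satisfy both the default filter and the query-generated filter ("and") or either of them ("or"). A sketch of that combination logic, illustrative only and not the library's internals:

```typescript
// Illustrative sketch: combine two metadata predicates the way
// mergeFiltersOperator "and" / "or" combine filters.
type Pred<T> = (row: T) => boolean;

function mergeFilters<T>(a: Pred<T>, b: Pred<T>, op: "and" | "or"): Pred<T> {
  return op === "and"
    ? (row) => a(row) && b(row)
    : (row) => a(row) || b(row);
}

type Meta = { type: string; rating: number };
const isMovie: Pred<Meta> = (m) => m.type === "movie";   // default filter
const highRated: Pred<Meta> = (m) => m.rating > 8.5;     // generated filter

const merged = mergeFilters(isMovie, highRated, "and");
console.log(merged({ type: "movie", rating: 9.9 })); // true
console.log(merged({ type: "movie", rating: 7.7 })); // false
```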
API reference
For detailed documentation of all Supabase self-query retriever features and configurations, head to the API reference.