How to select examples by similarity
This object selects examples based on similarity to the inputs. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs.
The fields of the examples object will be used as parameters to format the examplePrompt passed into the FewShotPromptTemplate. Each example should therefore contain all the fields required by the example prompt you are using.
Tip: Install the required integration packages with your preferred package manager.
- npm: npm install @langchain/openai @langchain/community
- Yarn: yarn add @langchain/openai @langchain/community
- pnpm: pnpm add @langchain/openai @langchain/community
import { OpenAIEmbeddings } from "@langchain/openai";
import { HNSWLib } from "@langchain/community/vectorstores/hnswlib";
import { PromptTemplate, FewShotPromptTemplate } from "@langchain/core/prompts";
import { SemanticSimilarityExampleSelector } from "@langchain/core/example_selectors";
// Create a prompt template that will be used to format the examples.
const examplePrompt = PromptTemplate.fromTemplate(
  "Input: {input}\nOutput: {output}"
);

// Create a SemanticSimilarityExampleSelector that will be used to select the examples.
const exampleSelector = await SemanticSimilarityExampleSelector.fromExamples(
  [
    { input: "happy", output: "sad" },
    { input: "tall", output: "short" },
    { input: "energetic", output: "lethargic" },
    { input: "sunny", output: "gloomy" },
    { input: "windy", output: "calm" },
  ],
  new OpenAIEmbeddings(),
  HNSWLib,
  { k: 1 }
);

// Create a FewShotPromptTemplate that will use the example selector.
const dynamicPrompt = new FewShotPromptTemplate({
  // We provide an ExampleSelector instead of examples.
  exampleSelector,
  examplePrompt,
  prefix: "Give the antonym of every input",
  suffix: "Input: {adjective}\nOutput:",
  inputVariables: ["adjective"],
});
// Input is about the weather, so it should select e.g. the sunny/gloomy example
console.log(await dynamicPrompt.format({ adjective: "rainy" }));
/*
Give the antonym of every input

Input: sunny
Output: gloomy

Input: rainy
Output:
*/
// Input is a measurement, so it should select the tall/short example
console.log(await dynamicPrompt.format({ adjective: "large" }));
/*
Give the antonym of every input

Input: tall
Output: short

Input: large
Output:
*/
API Reference:
- OpenAIEmbeddings from @langchain/openai
- HNSWLib from @langchain/community/vectorstores/hnswlib
- PromptTemplate from @langchain/core/prompts
- FewShotPromptTemplate from @langchain/core/prompts
- SemanticSimilarityExampleSelector from @langchain/core/example_selectors
By default, each field in the examples object is concatenated together, embedded, and stored in the vector store for later similarity search against user queries.
If you only want to embed specific keys (for example, you only want to search for examples whose query is similar to the one the user provides), you can pass an inputKeys array in the final options parameter.
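For example, here is a minimal sketch (reusing the imports and example data from the snippet above; the variable name is only illustrative) that embeds just each example's input field:

// Only the "input" field of each example is embedded for the similarity search;
// the full example is still stored and used when formatting the prompt.
const inputOnlySelector = await SemanticSimilarityExampleSelector.fromExamples(
  [
    { input: "happy", output: "sad" },
    { input: "tall", output: "short" },
  ],
  new OpenAIEmbeddings(),
  HNSWLib,
  { k: 1, inputKeys: ["input"] }
);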
Loading from an existing vector store
You can also use a pre-initialized vector store by passing an instance directly into the SemanticSimilarityExampleSelector constructor, as shown below. You can also add more examples via the addExample method:
// Ephemeral, in-memory vector store for demo purposes
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings, ChatOpenAI } from "@langchain/openai";
import { PromptTemplate, FewShotPromptTemplate } from "@langchain/core/prompts";
import { SemanticSimilarityExampleSelector } from "@langchain/core/example_selectors";
const embeddings = new OpenAIEmbeddings();
const memoryVectorStore = new MemoryVectorStore(embeddings);
const examples = [
  {
    query: "healthy food",
    output: `galbi`,
  },
  {
    query: "healthy food",
    output: `schnitzel`,
  },
  {
    query: "foo",
    output: `bar`,
  },
];

const exampleSelector = new SemanticSimilarityExampleSelector({
  vectorStore: memoryVectorStore,
  k: 2,
  // Only embed the "query" key of each example
  inputKeys: ["query"],
});

for (const example of examples) {
  // Format and add an example to the underlying vector store
  await exampleSelector.addExample(example);
}
// Create a prompt template that will be used to format the examples.
const examplePrompt = PromptTemplate.fromTemplate(`<example>
<user_input>
{query}
</user_input>
<output>
{output}
</output>
</example>`);
// Create a FewShotPromptTemplate that will use the example selector.
const dynamicPrompt = new FewShotPromptTemplate({
  // We provide an ExampleSelector instead of examples.
  exampleSelector,
  examplePrompt,
  prefix: `Answer the user's question, using the below examples as reference:`,
  suffix: "User question: {query}",
  inputVariables: ["query"],
});

const formattedValue = await dynamicPrompt.format({
  query: "What is a healthy food?",
});
console.log(formattedValue);
/*
Answer the user's question, using the below examples as reference:

<example>
<user_input>
healthy food
</user_input>
<output>
galbi
</output>
</example>

<example>
<user_input>
healthy food
</user_input>
<output>
schnitzel
</output>
</example>

User question: What is a healthy food?
*/
const model = new ChatOpenAI({});
const chain = dynamicPrompt.pipe(model);
const result = await chain.invoke({ query: "What is a healthy food?" });
console.log(result);
/*
AIMessage {
  content: 'A healthy food can be galbi or schnitzel.',
  additional_kwargs: { function_call: undefined }
}
*/
API Reference:
- MemoryVectorStore from langchain/vectorstores/memory
- OpenAIEmbeddings from @langchain/openai
- ChatOpenAI from @langchain/openai
- PromptTemplate from @langchain/core/prompts
- FewShotPromptTemplate from @langchain/core/prompts
- SemanticSimilarityExampleSelector from @langchain/core/example_selectors
Metadata filtering
When adding examples, each field is available as metadata in the generated document. If you would like further control over your search space, you can add extra fields to your examples and pass a filter parameter when initializing your selector:
// Ephemeral, in-memory vector store for demo purposes
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings, ChatOpenAI } from "@langchain/openai";
import { PromptTemplate, FewShotPromptTemplate } from "@langchain/core/prompts";
import { Document } from "@langchain/core/documents";
import { SemanticSimilarityExampleSelector } from "@langchain/core/example_selectors";
const embeddings = new OpenAIEmbeddings();
const memoryVectorStore = new MemoryVectorStore(embeddings);
const examples = [
  {
    query: "healthy food",
    output: `lettuce`,
    food_type: "vegetable",
  },
  {
    query: "healthy food",
    output: `schnitzel`,
    food_type: "veal",
  },
  {
    query: "foo",
    output: `bar`,
    food_type: "baz",
  },
];

const exampleSelector = new SemanticSimilarityExampleSelector({
  vectorStore: memoryVectorStore,
  k: 2,
  // Only embed the "query" key of each example
  inputKeys: ["query"],
  // Filter type will depend on your specific vector store.
  // See the section of the docs for the specific vector store you are using.
  filter: (doc: Document) => doc.metadata.food_type === "vegetable",
});

for (const example of examples) {
  // Format and add an example to the underlying vector store
  await exampleSelector.addExample(example);
}
// Create a prompt template that will be used to format the examples.
const examplePrompt = PromptTemplate.fromTemplate(`<example>
<user_input>
{query}
</user_input>
<output>
{output}
</output>
</example>`);
// Create a FewShotPromptTemplate that will use the example selector.
const dynamicPrompt = new FewShotPromptTemplate({
  // We provide an ExampleSelector instead of examples.
  exampleSelector,
  examplePrompt,
  prefix: `Answer the user's question, using the below examples as reference:`,
  suffix: "User question:\n{query}",
  inputVariables: ["query"],
});

const model = new ChatOpenAI({});
const chain = dynamicPrompt.pipe(model);

const result = await chain.invoke({
  query: "What is exactly one type of healthy food?",
});
console.log(result);
/*
AIMessage {
  content: 'One type of healthy food is lettuce.',
  additional_kwargs: { function_call: undefined }
}
*/
API Reference:
- MemoryVectorStore from langchain/vectorstores/memory
- OpenAIEmbeddings from @langchain/openai
- ChatOpenAI from @langchain/openai
- PromptTemplate from @langchain/core/prompts
- FewShotPromptTemplate from @langchain/core/prompts
- Document from @langchain/core/documents
- SemanticSimilarityExampleSelector from @langchain/core/example_selectors
Custom vector store retrievers
You can also pass a vector store retriever instead of a vector store. One way this can be useful is if you want to use a retrieval strategy other than plain similarity search, for example maximal marginal relevance:
/* eslint-disable @typescript-eslint/no-non-null-assertion */
// Requires a vectorstore that supports maximal marginal relevance search
import { Pinecone } from "@pinecone-database/pinecone";
import { OpenAIEmbeddings, ChatOpenAI } from "@langchain/openai";
import { PineconeStore } from "@langchain/pinecone";
import { PromptTemplate, FewShotPromptTemplate } from "@langchain/core/prompts";
import { SemanticSimilarityExampleSelector } from "@langchain/core/example_selectors";
const pinecone = new Pinecone();
const pineconeIndex = pinecone.Index(process.env.PINECONE_INDEX!);
/**
 * Pinecone allows you to partition the records in an index into namespaces.
 * Queries and other operations are then limited to one namespace,
 * so different requests can search different subsets of your index.
 * Read more about namespaces here: https://docs.pinecone.io/guides/indexes/use-namespaces
 *
 * NOTE: If you have namespace enabled in your Pinecone index, you must provide the namespace when creating the PineconeStore.
 */
const namespace = "pinecone";
const pineconeVectorstore = await PineconeStore.fromExistingIndex(
  new OpenAIEmbeddings(),
  { pineconeIndex, namespace }
);

const pineconeMmrRetriever = pineconeVectorstore.asRetriever({
  searchType: "mmr",
  k: 2,
});

const examples = [
  {
    query: "healthy food",
    output: `lettuce`,
    food_type: "vegetable",
  },
  {
    query: "healthy food",
    output: `schnitzel`,
    food_type: "veal",
  },
  {
    query: "foo",
    output: `bar`,
    food_type: "baz",
  },
];

const exampleSelector = new SemanticSimilarityExampleSelector({
  vectorStoreRetriever: pineconeMmrRetriever,
  // Only embed the "query" key of each example
  inputKeys: ["query"],
});

for (const example of examples) {
  // Format and add an example to the underlying vector store
  await exampleSelector.addExample(example);
}
// Create a prompt template that will be used to format the examples.
const examplePrompt = PromptTemplate.fromTemplate(`<example>
<user_input>
{query}
</user_input>
<output>
{output}
</output>
</example>`);
// Create a FewShotPromptTemplate that will use the example selector.
const dynamicPrompt = new FewShotPromptTemplate({
  // We provide an ExampleSelector instead of examples.
  exampleSelector,
  examplePrompt,
  prefix: `Answer the user's question, using the below examples as reference:`,
  suffix: "User question:\n{query}",
  inputVariables: ["query"],
});

const model = new ChatOpenAI({});
const chain = dynamicPrompt.pipe(model);

const result = await chain.invoke({
  query: "What is exactly one type of healthy food?",
});
console.log(result);
/*
AIMessage {
  content: 'lettuce.',
  additional_kwargs: { function_call: undefined }
}
*/
API Reference:
- OpenAIEmbeddings from @langchain/openai
- ChatOpenAI from @langchain/openai
- PineconeStore from @langchain/pinecone
- PromptTemplate from @langchain/core/prompts
- FewShotPromptTemplate from @langchain/core/prompts
- SemanticSimilarityExampleSelector from @langchain/core/example_selectors
Next steps
You've now learned a bit about using similarity in an example selector.
Next, check out this guide on how to use a length-based example selector.