Azure Cosmos DB for NoSQL
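Azure Cosmos DB for NoSQL supports querying items with flexible schemas and has native support for JSON. It now offers vector indexing and search. This feature is designed to handle high-dimensional vectors, enabling efficient and accurate vector search at any scale. You can store vectors directly in documents, alongside your data: each document in your database can contain not only traditional schema-free data but also high-dimensional vectors as additional properties.
Learn how to leverage the vector search capabilities of Azure Cosmos DB for NoSQL from this page. If you don't have an Azure account, you can create a free account to get started.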
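To make the idea of storing vectors next to ordinary properties more concrete, here is a minimal in-memory sketch of a vector search. It is only an illustration: the Item shape, the cosineSimilarity and topK helpers, and the tiny three-dimensional vectors are all hypothetical, and the brute-force scan shown here is not how Azure Cosmos DB implements its vector indexes.

```typescript
// Hypothetical item shape: ordinary JSON properties plus an embedding vector.
interface Item {
  id: string;
  text: string;
  vector: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force top-k search: score every item, sort descending, keep k.
function topK(items: Item[], query: number[], k: number): Item[] {
  return items
    .map((item) => ({ item, score: cosineSimilarity(item.vector, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.item);
}

// Toy data: real embeddings have hundreds or thousands of dimensions.
const items: Item[] = [
  { id: "1", text: "cats", vector: [1, 0, 0] },
  { id: "2", text: "dogs", vector: [0.9, 0.1, 0] },
  { id: "3", text: "taxes", vector: [0, 0, 1] },
];

const nearest = topK(items, [1, 0, 0], 2);
console.log(nearest.map((item) => item.id));
```

A managed vector index avoids scanning every item, which is what makes search practical at large scale; the ranking idea is the same.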
Setup
You first need to install the @langchain/azure-cosmosdb package, using the package manager of your choice:
# npm
npm install @langchain/azure-cosmosdb
# Yarn
yarn add @langchain/azure-cosmosdb
# pnpm
pnpm add @langchain/azure-cosmosdb
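You also need a running Azure Cosmos DB for NoSQL instance. You can deploy a free version on the Azure portal at no cost by following this guide.
Once your instance is running, make sure you have the connection string. You can find it in the Azure portal, under the "Settings / Keys" section of your instance. Then set the environment variable that matches your authentication method (the connection string, or the endpoint if you use managed identity):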
# Use connection string to authenticate
AZURE_COSMOSDB_NOSQL_CONNECTION_STRING=
# Use managed identity to authenticate
AZURE_COSMOSDB_NOSQL_ENDPOINT=
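The two variables correspond to two authentication modes. As a minimal sketch of how an application might decide between them, the hypothetical helper below inspects an environment-like object and reports which mode applies. The resolveCosmosAuth name, the CosmosAuth type, and the choice to prefer the connection string when both variables are set are assumptions of this sketch, not documented behavior of @langchain/azure-cosmosdb.

```typescript
// Environment-like map; in Node.js you would typically pass process.env.
type Env = Record<string, string | undefined>;

// Hypothetical result type describing the selected authentication mode.
type CosmosAuth =
  | { kind: "connectionString"; connectionString: string }
  | { kind: "managedIdentity"; endpoint: string };

// Hypothetical helper: picks an authentication mode from the environment.
// Assumption: the connection string wins when both variables are set.
function resolveCosmosAuth(env: Env): CosmosAuth {
  const connectionString = env.AZURE_COSMOSDB_NOSQL_CONNECTION_STRING;
  if (connectionString) {
    return { kind: "connectionString", connectionString };
  }
  const endpoint = env.AZURE_COSMOSDB_NOSQL_ENDPOINT;
  if (endpoint) {
    return { kind: "managedIdentity", endpoint };
  }
  throw new Error(
    "Set AZURE_COSMOSDB_NOSQL_CONNECTION_STRING or AZURE_COSMOSDB_NOSQL_ENDPOINT"
  );
}

// Example with a literal object standing in for process.env.
// An empty value (as in an unfilled .env template) counts as not set.
const auth = resolveCosmosAuth({
  AZURE_COSMOSDB_NOSQL_CONNECTION_STRING: "",
  AZURE_COSMOSDB_NOSQL_ENDPOINT: "https://my-cosmosdb.documents.azure.com:443/",
});
console.log(auth.kind);
```

Failing early with a clear message when neither variable is set tends to be easier to debug than a connection error at query time.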
Using Azure Managed Identity
If you are using Azure Managed Identity, you can configure the credentials like this:
import { AzureCosmosDBNoSQLVectorStore } from "@langchain/azure-cosmosdb";
import { OpenAIEmbeddings } from "@langchain/openai";
// Create Azure Cosmos DB vector store
const store = new AzureCosmosDBNoSQLVectorStore(new OpenAIEmbeddings(), {
  // Or use environment variable AZURE_COSMOSDB_NOSQL_ENDPOINT
  endpoint: "https://my-cosmosdb.documents.azure.com:443/",
  // Database and container must already exist
  databaseName: "my-database",
  containerName: "my-container",
});
API Reference
- AzureCosmosDBNoSQLVectorStore from @langchain/azure-cosmosdb
- OpenAIEmbeddings from @langchain/openai
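Info
When using Azure Managed Identity and role-based access control (RBAC), you must make sure the database and container have been created beforehand. RBAC does not provide permissions to create databases and containers. You can learn more about the permission model in the Azure Cosmos DB documentation.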
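Usage example
Below is an example that indexes documents from a file into Azure Cosmos DB for NoSQL, runs a vector search query, and finally uses a chain to answer a question in natural language based on the retrieved documents.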
import { AzureCosmosDBNoSQLVectorStore } from "@langchain/azure-cosmosdb";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
import { createRetrievalChain } from "langchain/chains/retrieval";
import { TextLoader } from "langchain/document_loaders/fs/text";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
// Load documents from file
const loader = new TextLoader("./state_of_the_union.txt");
const rawDocuments = await loader.load();
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 0,
});
const documents = await splitter.splitDocuments(rawDocuments);

// Create Azure Cosmos DB vector store
const store = await AzureCosmosDBNoSQLVectorStore.fromDocuments(
  documents,
  new OpenAIEmbeddings(),
  {
    databaseName: "langchain",
    containerName: "documents",
  }
);

// Performs a similarity search
const resultDocuments = await store.similaritySearch(
  "What did the president say about Ketanji Brown Jackson?"
);
console.log("Similarity search results:");
console.log(resultDocuments[0].pageContent);
/*
Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections.
Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service.
One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court.
And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.
*/
// Use the store as part of a chain
const model = new ChatOpenAI({ model: "gpt-3.5-turbo-1106" });
const questionAnsweringPrompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    "Answer the user's questions based on the below context:\n\n{context}",
  ],
  ["human", "{input}"],
]);

const combineDocsChain = await createStuffDocumentsChain({
  llm: model,
  prompt: questionAnsweringPrompt,
});

const chain = await createRetrievalChain({
  retriever: store.asRetriever(),
  combineDocsChain,
});

const res = await chain.invoke({
  input: "What is the president's top priority regarding prices?",
});
console.log("Chain response:");
console.log(res.answer);
/*
The president's top priority is getting prices under control.
*/
// Clean up
await store.delete();
API Reference
- AzureCosmosDBNoSQLVectorStore from @langchain/azure-cosmosdb
- ChatPromptTemplate from @langchain/core/prompts
- ChatOpenAI from @langchain/openai
- OpenAIEmbeddings from @langchain/openai
- createStuffDocumentsChain from langchain/chains/combine_documents
- createRetrievalChain from langchain/chains/retrieval
- TextLoader from langchain/document_loaders/fs/text
- RecursiveCharacterTextSplitter from @langchain/textsplitters
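As a rough illustration of what the chain in the usage example does with the retrieved documents, the sketch below "stuffs" their text into the {context} placeholder of the system prompt and puts the question into {input}. It uses plain strings only. The RetrievedDocument type and the fillTemplate and buildMessages helpers are hypothetical, and the real createStuffDocumentsChain does more than this (for example, it also calls the model).

```typescript
// Hypothetical minimal document shape, mirroring the pageContent field
// used in the usage example.
interface RetrievedDocument {
  pageContent: string;
}

// A chat message as a [role, content] pair, like the prompt definition above.
type Message = [string, string];

const SYSTEM_TEMPLATE =
  "Answer the user's questions based on the below context:\n\n{context}";
const HUMAN_TEMPLATE = "{input}";

// Replaces every occurrence of {name} in the template with the given value.
function fillTemplate(template: string, name: string, value: string): string {
  return template.split(`{${name}}`).join(value);
}

// Joins the retrieved documents with blank lines and fills both templates.
function buildMessages(docs: RetrievedDocument[], input: string): Message[] {
  const context = docs.map((doc) => doc.pageContent).join("\n\n");
  return [
    ["system", fillTemplate(SYSTEM_TEMPLATE, "context", context)],
    ["human", fillTemplate(HUMAN_TEMPLATE, "input", input)],
  ];
}

// Example with two made-up retrieved chunks.
const messages = buildMessages(
  [{ pageContent: "First chunk." }, { pageContent: "Second chunk." }],
  "What is the president's top priority regarding prices?"
);
console.log(messages[0][1]);
```

Because every retrieved chunk is placed in a single prompt, the number and size of chunks returned by the retriever directly affect how much of the model's context window is used.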