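Azure AI Search

Azure AI Search (formerly known as Azure Search and Azure Cognitive Search) is a distributed, RESTful search engine optimized for speed and relevance on production-scale workloads on Azure. It also supports vector search using the k-nearest neighbor (kNN) algorithm, as well as semantic search.

This vector store integration supports full-text search, vector search, and hybrid search for the best ranking performance.

See this page to learn how to use the vector search capabilities of Azure AI Search. If you don't have an Azure account, you can create a free account to get started.

Setup

You first need to install the @azure/search-documents SDK and the @langchain/community package: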

npm install -S @langchain/community @langchain/core @azure/search-documents

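You also need a running Azure AI Search instance. You can deploy a free version on the Azure portal at no cost by following this guide.

Once your instance is running, make sure you have the endpoint and the admin key (query keys can only be used to search documents, not to index, update, or delete them). The endpoint is the URL of your instance, which you can find in the "Overview" section of your instance in the Azure portal. The admin key is in the "Keys" section of your instance. Then set the following environment variables: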

# Azure AI Search connection settings
AZURE_AISEARCH_ENDPOINT=
AZURE_AISEARCH_KEY=

# If you're using Azure OpenAI API, you'll need to set these variables
AZURE_OPENAI_API_KEY=
AZURE_OPENAI_API_INSTANCE_NAME=
AZURE_OPENAI_API_DEPLOYMENT_NAME=
AZURE_OPENAI_API_EMBEDDINGS_DEPLOYMENT_NAME=
AZURE_OPENAI_API_VERSION=

# Or you can use the OpenAI API directly
OPENAI_API_KEY=
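
Before constructing the vector store, it can help to fail fast when the connection settings are missing. The sketch below is one possible way to do that. The `readAzureSearchConfig` helper and the `AzureSearchConfig` type are hypothetical names introduced for illustration, not part of any SDK; only the two variable names come from the settings above.

```typescript
// Hypothetical helper (not part of @langchain/community or the Azure SDK).
interface AzureSearchConfig {
  endpoint: string;
  key: string;
}

// Reads the two Azure AI Search settings from an environment-like object
// and throws a descriptive error if either one is missing or blank.
function readAzureSearchConfig(
  env: Record<string, string | undefined>
): AzureSearchConfig {
  const endpoint = env.AZURE_AISEARCH_ENDPOINT?.trim();
  const key = env.AZURE_AISEARCH_KEY?.trim();

  const missing: string[] = [];
  if (!endpoint) missing.push("AZURE_AISEARCH_ENDPOINT");
  if (!key) missing.push("AZURE_AISEARCH_KEY");

  if (!endpoint || !key) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return { endpoint, key };
}

// Typical usage in Node.js:
//   const config = readAzureSearchConfig(process.env);
```

Passing the environment in as an argument, rather than reading `process.env` inside the function, keeps the helper easy to exercise with fake values.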

About hybrid search

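Hybrid search combines the strengths of full-text search and vector search to provide the best ranking performance. It is enabled by default in the Azure AI Search vector store, but you can select a different search query type by setting the search.type property when creating the vector store.

You can read more about hybrid search and how it may improve your search results in the official documentation.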
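
To give a rough intuition for how two result lists can be merged into one ranking, the sketch below implements Reciprocal Rank Fusion (RRF), the method that Azure's documentation describes for hybrid queries. It is an illustration only and not the service's implementation: the function name, the sample document IDs, and the constant `k = 60` are assumptions made for this example.

```typescript
// Reciprocal Rank Fusion: every result list contributes 1 / (k + rank)
// to the score of each document it contains; scores are then summed.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores: Record<string, number> = {};
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores[id] = (scores[id] ?? 0) + 1 / (k + rank);
    });
  }
  // Highest fused score first.
  return Object.keys(scores).sort((a, b) => scores[b] - scores[a]);
}

// Hypothetical document IDs, best match first in each list.
const fullTextRanking = ["a", "b", "c"];
const vectorRanking = ["c", "a", "d"];
const fused = reciprocalRankFusion([fullTextRanking, vectorRanking]);
// "a" and "c" appear in both lists, so they rank ahead of "b" and "d".
```

Because RRF uses only rank positions, it does not need the full-text and vector scores to be on a comparable scale.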

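In some scenarios, such as retrieval-augmented generation (RAG), you may want to enable semantic ranking in addition to hybrid search to improve the relevance of the search results. You can enable it by setting the search.type property to AzureAISearchQueryType.SemanticHybrid when creating the vector store. Note that semantic ranking is only available in the Basic and higher pricing tiers, and is subject to regional availability.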
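
Since semantic ranking depends on both the pricing tier and the region, an application might choose the query type at startup. The sketch below shows one way to encode the rule stated above. The `PricingTier` labels, the `pickQueryTypeName` helper, and the `regionSupportsSemanticRanking` flag are hypothetical; only the two `AzureAISearchQueryType` member names are taken from this page.

```typescript
// Hypothetical tier labels for illustration; they are not SDK identifiers.
type PricingTier = "free" | "basic" | "standard" | "storage_optimized";

// Member names of AzureAISearchQueryType that are referenced on this page.
type QueryTypeName = "SimilarityHybrid" | "SemanticHybrid";

// Falls back to plain hybrid search when semantic ranking is unavailable.
function pickQueryTypeName(
  tier: PricingTier,
  regionSupportsSemanticRanking: boolean
): QueryTypeName {
  const tierSupportsSemanticRanking = tier !== "free";
  return tierSupportsSemanticRanking && regionSupportsSemanticRanking
    ? "SemanticHybrid"
    : "SimilarityHybrid";
}

// With the real enum imported, the result can index it by member name:
//   search: { type: AzureAISearchQueryType[pickQueryTypeName("basic", true)] }
```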

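You can read more about the performance of semantic ranking combined with hybrid search in this blog post.

Example: index docs, vector search and LLM integration

The example below shows how to index documents from a file into Azure AI Search, run a hybrid search query, and finally use a chain to answer a question in natural language based on the retrieved documents.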

    import {
      AzureAISearchVectorStore,
      AzureAISearchQueryType,
    } from "@langchain/community/vectorstores/azure_aisearch";
    import { ChatPromptTemplate } from "@langchain/core/prompts";
    import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
    import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
    import { createRetrievalChain } from "langchain/chains/retrieval";
    import { TextLoader } from "langchain/document_loaders/fs/text";
    import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

    // Load documents from file
    const loader = new TextLoader("./state_of_the_union.txt");
    const rawDocuments = await loader.load();
    const splitter = new RecursiveCharacterTextSplitter({
      chunkSize: 1000,
      chunkOverlap: 0,
    });
    const documents = await splitter.splitDocuments(rawDocuments);

    // Create Azure AI Search vector store
    const store = await AzureAISearchVectorStore.fromDocuments(
      documents,
      new OpenAIEmbeddings(),
      {
        search: {
          type: AzureAISearchQueryType.SimilarityHybrid,
        },
      }
    );

    // The first time you run this, the index will be created.
    // You may need to wait a bit for the index to be created before you can perform
    // a search, or you can create the index manually beforehand.

    // Performs a similarity search
    const resultDocuments = await store.similaritySearch(
      "What did the president say about Ketanji Brown Jackson?"
    );

    console.log("Similarity search results:");
    console.log(resultDocuments[0].pageContent);
    /*
    Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections.

    Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service.

    One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court.

    And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.
    */

    // Use the store as part of a chain
    const model = new ChatOpenAI({ model: "gpt-3.5-turbo-1106" });
    const questionAnsweringPrompt = ChatPromptTemplate.fromMessages([
      [
        "system",
        "Answer the user's questions based on the below context:\n\n{context}",
      ],
      ["human", "{input}"],
    ]);

    const combineDocsChain = await createStuffDocumentsChain({
      llm: model,
      prompt: questionAnsweringPrompt,
    });

    const chain = await createRetrievalChain({
      retriever: store.asRetriever(),
      combineDocsChain,
    });

    const response = await chain.invoke({
      input: "What is the president's top priority regarding prices?",
    });

    console.log("Chain response:");
    console.log(response.answer);
    /*
    The president's top priority is getting prices under control.
    */
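
To make the chain step in the example above less opaque, the sketch below shows conceptually what "stuffing" documents into a prompt means: the retrieved chunks are joined into one string and substituted for `{context}`. This is a simplified illustration, not LangChain's implementation. The `RetrievedDoc` type, both helper names, and the `"\n\n"` separator are assumptions made for this example.

```typescript
// Minimal stand-in for a retrieved document; only the field used here.
interface RetrievedDoc {
  pageContent: string;
}

// Joins every document's text into a single context string.
function buildContext(docs: RetrievedDoc[], separator = "\n\n"): string {
  return docs.map((doc) => doc.pageContent).join(separator);
}

// Substitutes {context} in the system prompt used in the example above.
function buildSystemPrompt(docs: RetrievedDoc[]): string {
  const template =
    "Answer the user's questions based on the below context:\n\n{context}";
  // A replacer function avoids special handling of "$" in document text.
  return template.replace("{context}", () => buildContext(docs));
}
```

Because every retrieved chunk ends up in a single prompt, the chunk size chosen at indexing time and the number of retrieved documents together determine how much of the model's context window is consumed.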
