BM25

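BM25, also known as Okapi BM25, is a ranking function used in information retrieval systems to estimate how relevant each document is to a given search query.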

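You can use it as a post-processing step in a retrieval pipeline: after an initial set of documents has been retrieved from another source, BM25 reranks them.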
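To make the scoring and reranking idea concrete, here is a minimal, self-contained sketch of Okapi BM25 in TypeScript. It is an illustration only and is not the code used by @langchain/community. The function names (tokenize, bm25Scores, rerank), the simple tokenizer, the parameter values k1 = 1.2 and b = 0.75, and the smoothed IDF variant are all assumptions chosen for this example.

```typescript
// Illustrative Okapi BM25 sketch. NOT the @langchain/community implementation.

// Hypothetical tokenizer: lowercase, split on non-alphanumeric characters.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
}

// Returns one BM25 score per document, in the same order as `docs`.
function bm25Scores(
  docs: string[],
  query: string,
  k1 = 1.2, // term-frequency saturation (assumed value)
  b = 0.75 // strength of length normalization (assumed value)
): number[] {
  const tokenized = docs.map(tokenize);
  const n = tokenized.length;
  if (n === 0) return [];
  const avgLen = tokenized.reduce((sum, d) => sum + d.length, 0) / n;
  const queryTerms = Array.from(new Set(tokenize(query)));

  // Document frequency: how many documents contain each query term.
  const df = new Map<string, number>();
  for (const term of queryTerms) {
    df.set(term, tokenized.filter((d) => d.includes(term)).length);
  }

  return tokenized.map((doc) => {
    let score = 0;
    for (const term of queryTerms) {
      const tf = doc.filter((t) => t === term).length;
      if (tf === 0) continue; // term absent: contributes nothing
      const nq = df.get(term) ?? 0;
      // Smoothed IDF variant that never goes negative (an assumption).
      const idf = Math.log(1 + (n - nq + 0.5) / (nq + 0.5));
      // Longer-than-average documents are penalized via `b`.
      const norm = k1 * (1 - b + (b * doc.length) / avgLen);
      score += (idf * tf * (k1 + 1)) / (tf + norm);
    }
    return score;
  });
}

// Rerank first-stage candidates by BM25 score and keep the top k.
function rerank(candidates: string[], query: string, k: number): string[] {
  const scores = bm25Scores(candidates, query);
  return candidates
    .map((text, i) => ({ text, score: scores[i] }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.text);
}

// Usage: pretend these strings came back from some other retriever.
const reranked = rerank(
  [
    "Buildings are made out of brick",
    "Cars are made out of metal",
    "mitochondria is the powerhouse of the cell",
    "mitochondria is made of lipids",
  ],
  "mitochondria",
  2
);
console.log(reranked);
```

In this sketch, documents that do not contain any query term score 0, and of two documents that match equally often, the shorter one ranks higher because of length normalization.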

Setup

BM25Retriever is exported from @langchain/community. Install the package as shown below.

Tip

For general instructions on installing integration packages, see this section.

yarn add @langchain/community @langchain/core

This retriever uses the Okapi BM25 code from this implementation.

Usage

You can now create a new retriever from previously retrieved documents:

import { BM25Retriever } from "@langchain/community/retrievers/bm25";

const retriever = BM25Retriever.fromDocuments(
  [
    { pageContent: "Buildings are made out of brick", metadata: {} },
    { pageContent: "Buildings are made out of wood", metadata: {} },
    { pageContent: "Buildings are made out of stone", metadata: {} },
    { pageContent: "Cars are made out of metal", metadata: {} },
    { pageContent: "Cars are made out of plastic", metadata: {} },
    { pageContent: "mitochondria is the powerhouse of the cell", metadata: {} },
    { pageContent: "mitochondria is made of lipids", metadata: {} },
  ],
  { k: 4 }
);

// Will return the 4 documents reranked by the BM25 algorithm
await retriever.invoke("mitochondria");
[
  { pageContent: 'mitochondria is made of lipids', metadata: {} },
  {
    pageContent: 'mitochondria is the powerhouse of the cell',
    metadata: {}
  },
  { pageContent: 'Buildings are made out of brick', metadata: {} },
  { pageContent: 'Buildings are made out of wood', metadata: {} }
]
