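How to cache chat model responses

Prerequisites

This guide assumes familiarity with the following concepts: chat models and LLMs.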

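LangChain provides an optional caching layer for chat models. This is useful for two reasons:

It can save you money: if you often request the same completion multiple times, cached responses reduce the number of API calls you make to the LLM provider.

It can speed up your application: a response served from the cache does not have to wait for a round trip to the LLM provider.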

import { ChatOpenAI } from "@langchain/openai";

// To make the caching really obvious, let's use a slower model.
const model = new ChatOpenAI({
  model: "gpt-4",
  cache: true,
});

In-memory cache

The default cache is stored in memory. This means that if you restart your application, the cache will be cleared.

console.time();

// The first time, it is not yet in cache, so it should take longer
const res = await model.invoke("Tell me a joke!");
console.log(res);

console.timeEnd();

/*
  AIMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: "Why don't scientists trust atoms?\n\nBecause they make up everything!",
      additional_kwargs: { function_call: undefined, tool_calls: undefined }
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: "Why don't scientists trust atoms?\n\nBecause they make up everything!",
    name: undefined,
    additional_kwargs: { function_call: undefined, tool_calls: undefined }
  }
  default: 2.224s
*/

console.time();

// The second time it is, so it goes faster
const res2 = await model.invoke("Tell me a joke!");
console.log(res2);

console.timeEnd();
/*
  AIMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: "Why don't scientists trust atoms?\n\nBecause they make up everything!",
      additional_kwargs: { function_call: undefined, tool_calls: undefined }
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: "Why don't scientists trust atoms?\n\nBecause they make up everything!",
    name: undefined,
    additional_kwargs: { function_call: undefined, tool_calls: undefined }
  }
  default: 181.98ms
*/
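
To make the mechanism behind the timings above concrete, the following is a minimal sketch of what an in-memory cache does conceptually. It is not LangChain's implementation: `SimpleMemoryCache`, `cachedCall`, `fakeModelCall`, and the `"model=gpt-4"` key string are hypothetical names invented for illustration, and the fake model is synchronous to keep the sketch short. The assumption shown is that entries are keyed on both the prompt and a description of the model configuration, and that they live in a `Map` that disappears when the process exits.

```typescript
// Hypothetical stand-in for a chat model call; counts how often it is invoked.
let providerCalls = 0;
function fakeModelCall(prompt: string): string {
  providerCalls += 1;
  return `response to: ${prompt}`;
}

// A minimal in-memory cache: a Map that lives only as long as the process.
class SimpleMemoryCache {
  private store = new Map<string, string>();

  // Combine the prompt with a string describing the model configuration,
  // so the same prompt sent to a differently configured model is a miss.
  private key(prompt: string, modelKey: string): string {
    return JSON.stringify([prompt, modelKey]);
  }

  lookup(prompt: string, modelKey: string): string | undefined {
    return this.store.get(this.key(prompt, modelKey));
  }

  update(prompt: string, modelKey: string, value: string): void {
    this.store.set(this.key(prompt, modelKey), value);
  }
}

// Return the cached response if there is one; otherwise call the model and store the result.
function cachedCall(cache: SimpleMemoryCache, prompt: string, modelKey: string): string {
  const hit = cache.lookup(prompt, modelKey);
  if (hit !== undefined) return hit; // cache hit: no provider call
  const fresh = fakeModelCall(prompt);
  cache.update(prompt, modelKey, fresh);
  return fresh;
}

const memoryCache = new SimpleMemoryCache();
const first = cachedCall(memoryCache, "Tell me a joke!", "model=gpt-4");
const second = cachedCall(memoryCache, "Tell me a joke!", "model=gpt-4");

console.log(first === second); // true
console.log(providerCalls); // 1
```

The second call never reaches `fakeModelCall`, which is why a cached request returns so much faster than the first one.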

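Caching with Redis

LangChain also provides a Redis-based cache. This is useful if you want to share the cache across multiple processes or servers. To use it, you'll need to install the ioredis package along with @langchain/community: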

npm install ioredis @langchain/community

yarn add ioredis @langchain/community

pnpm add ioredis @langchain/community

Then, you can pass a cache option when you instantiate the chat model. For example:

import { ChatOpenAI } from "@langchain/openai";
import { Redis } from "ioredis";
import { RedisCache } from "@langchain/community/caches/ioredis";

const client = new Redis("redis://localhost:6379");

const cache = new RedisCache(client, {
  ttl: 60, // Optional key expiration value
});

const model = new ChatOpenAI({ cache });

const response1 = await model.invoke("Do something random!");
console.log(response1);
/*
  AIMessage {
    content: "Sure! I'll generate a random number for you: 37",
    additional_kwargs: {}
  }
*/

const response2 = await model.invoke("Do something random!");
console.log(response2);
/*
  AIMessage {
    content: "Sure! I'll generate a random number for you: 37",
    additional_kwargs: {}
  }
*/

await client.disconnect();
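
The `ttl` option above controls how long a cached entry stays valid. The sketch below illustrates that effect without needing a Redis server. It assumes the value is interpreted as seconds, and `TtlCache` with its injectable clock is a hypothetical helper written for this illustration, not part of LangChain or ioredis. In a real deployment Redis expires the keys on the server side.

```typescript
// Hypothetical TTL cache with an injectable clock, so expiry can be shown without waiting.
class TtlCache {
  private store = new Map<string, { value: string; expiresAt: number }>();
  private ttlSeconds: number;
  private now: () => number;

  constructor(ttlSeconds: number, now: () => number = () => Date.now()) {
    this.ttlSeconds = ttlSeconds;
    this.now = now;
  }

  set(key: string, value: string): void {
    // Record the absolute time (in milliseconds) at which this entry stops being valid.
    this.store.set(key, { value, expiresAt: this.now() + this.ttlSeconds * 1000 });
  }

  get(key: string): string | undefined {
    const entry = this.store.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.store.delete(key); // expired: treat as a miss
      return undefined;
    }
    return entry.value;
  }
}

// Drive the clock by hand instead of sleeping.
let fakeNowMs = 0;
const ttlCache = new TtlCache(60, () => fakeNowMs);

ttlCache.set("Do something random!", "37");

fakeNowMs = 59000; // 59 seconds later: still cached
const beforeExpiry = ttlCache.get("Do something random!");

fakeNowMs = 60000; // 60 seconds later: expired
const afterExpiry = ttlCache.get("Do something random!");

console.log(beforeExpiry); // "37"
console.log(afterExpiry); // undefined
```

Once an entry has expired, the next identical request is a cache miss and goes back to the provider, so a short TTL trades savings for fresher responses.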

API Reference

ChatOpenAI from @langchain/openai
RedisCache from @langchain/community/caches/ioredis

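Caching on the file system

Danger

This cache is not recommended for production use. It is only intended for local development.

LangChain provides a simple file system cache. By default the cache is stored in a temporary directory, but you can also specify a custom directory.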

import { LocalFileCache } from "langchain/cache/file_system";
const cache = await LocalFileCache.create();
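
To show why a file-backed cache survives a restart while the in-memory one does not, here is a small self-contained sketch that uses only Node.js built-in modules. `SimpleFileCache`, the directory prefix `chat-cache-sketch-`, and the `"model=gpt-4"` key string are hypothetical and chosen for illustration. The on-disk layout (one JSON file per hashed key) is an assumption of this sketch, not a description of how `LocalFileCache` stores its data.

```typescript
import * as crypto from "node:crypto";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical file-backed cache: one JSON file per (prompt, modelKey) pair.
class SimpleFileCache {
  private dir: string;

  constructor(dir: string) {
    this.dir = dir;
    fs.mkdirSync(dir, { recursive: true }); // create the directory if it is missing
  }

  // Hash the key so it is always a safe file name.
  private fileFor(prompt: string, modelKey: string): string {
    const hash = crypto
      .createHash("sha256")
      .update(JSON.stringify([prompt, modelKey]))
      .digest("hex");
    return path.join(this.dir, `${hash}.json`);
  }

  lookup(prompt: string, modelKey: string): string | undefined {
    const file = this.fileFor(prompt, modelKey);
    if (!fs.existsSync(file)) return undefined;
    return JSON.parse(fs.readFileSync(file, "utf8")) as string;
  }

  update(prompt: string, modelKey: string, value: string): void {
    fs.writeFileSync(this.fileFor(prompt, modelKey), JSON.stringify(value), "utf8");
  }
}

// Use a fresh temporary directory for the sketch.
const cacheDir = fs.mkdtempSync(path.join(os.tmpdir(), "chat-cache-sketch-"));

const writer = new SimpleFileCache(cacheDir);
writer.update("Tell me a joke!", "model=gpt-4", "Why don't scientists trust atoms?");

// A second instance stands in for the application after a restart.
const reader = new SimpleFileCache(cacheDir);
const restored = reader.lookup("Tell me a joke!", "model=gpt-4");
const missing = reader.lookup("Tell me another joke!", "model=gpt-4");

console.log(restored); // "Why don't scientists trust atoms?"
console.log(missing); // undefined
```

Because the entries live on disk, pointing a new instance at the same directory restores earlier responses, which is convenient during local development.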

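Next steps

You've now learned how to cache model responses to save time and money.

Next, check out the other how-to guides on chat models, like how to get a model to return structured output or how to create your own custom chat model.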
