
How to cache model responses

LangChain provides an optional caching layer for LLMs. This is useful for two reasons:

It can save you money by reducing the number of API calls you make to the LLM provider, if you are often requesting the same completion multiple times. It can also speed up your application by reducing the number of API calls you make to the LLM provider.

npm install @langchain/openai
import { OpenAI } from "@langchain/openai";

const model = new OpenAI({
  model: "gpt-3.5-turbo-instruct",
  cache: true,
});

In-memory cache

The default cache is stored in memory. This means that if you restart your application, the cache will be cleared.
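
Conceptually, an in-memory cache is just a map from prompt to response held in the process's memory. Here is a minimal, hypothetical sketch of the idea — `callModel` below is a stand-in for a real LLM request and is not part of LangChain:

```typescript
// Minimal sketch of prompt-keyed in-memory caching (illustrative only,
// not LangChain's actual implementation).
const cache = new Map<string, string>();
let apiCalls = 0;

// Stand-in for a real LLM request (hypothetical helper).
function callModel(prompt: string): string {
  apiCalls++;
  return `response to: ${prompt}`;
}

function cachedInvoke(prompt: string): string {
  const hit = cache.get(prompt);
  if (hit !== undefined) return hit;  // cache hit: skip the provider call
  const result = callModel(prompt);   // cache miss: call the provider
  cache.set(prompt, result);
  return result;
}

cachedInvoke("Tell me a joke"); // miss: performs the underlying call
cachedInvoke("Tell me a joke"); // hit: served from memory
console.log(apiCalls); // 1
```

Note that a hit requires the exact same prompt: even a small wording change produces a different key and a fresh call.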

console.time();

// The first time, it is not yet in cache, so it should take longer
const res = await model.invoke("Tell me a long joke");

console.log(res);

console.timeEnd();

/*
A man walks into a bar and sees a jar filled with money on the counter. Curious, he asks the bartender about it.

The bartender explains, "We have a challenge for our customers. If you can complete three tasks, you win all the money in the jar."

Intrigued, the man asks what the tasks are.

The bartender replies, "First, you have to drink a whole bottle of tequila without making a face. Second, there's a pitbull out back with a sore tooth. You have to pull it out. And third, there's an old lady upstairs who has never had an orgasm. You have to give her one."

The man thinks for a moment and then confidently says, "I'll do it."

He grabs the bottle of tequila and downs it in one gulp, without flinching. He then heads to the back and after a few minutes of struggling, emerges with the pitbull's tooth in hand.

The bar erupts in cheers and the bartender leads the man upstairs to the old lady's room. After a few minutes, the man walks out with a big smile on his face and the old lady is giggling with delight.

The bartender hands the man the jar of money and asks, "How

default: 4.187s
*/
console.time();

// The second time it is, so it goes faster
const res2 = await model.invoke("Tell me a long joke");

console.log(res2);

console.timeEnd();

/*
A man walks into a bar and sees a jar filled with money on the counter. Curious, he asks the bartender about it.

The bartender explains, "We have a challenge for our customers. If you can complete three tasks, you win all the money in the jar."

Intrigued, the man asks what the tasks are.

The bartender replies, "First, you have to drink a whole bottle of tequila without making a face. Second, there's a pitbull out back with a sore tooth. You have to pull it out. And third, there's an old lady upstairs who has never had an orgasm. You have to give her one."

The man thinks for a moment and then confidently says, "I'll do it."

He grabs the bottle of tequila and downs it in one gulp, without flinching. He then heads to the back and after a few minutes of struggling, emerges with the pitbull's tooth in hand.

The bar erupts in cheers and the bartender leads the man upstairs to the old lady's room. After a few minutes, the man walks out with a big smile on his face and the old lady is giggling with delight.

The bartender hands the man the jar of money and asks, "How

default: 175.74ms
*/

Caching with Momento

LangChain also provides a Momento-based cache. Momento is a distributed, serverless cache that requires zero setup or infrastructure maintenance. Since Momento is compatible with Node.js, browsers, and edge environments, make sure you install the relevant package.

To install for Node.js:

npm install @gomomento/sdk

To install for browsers/edge workers:

npm install @gomomento/sdk-web

Next, you'll need to sign up and create an API key. Once you've done that, pass a cache option when you instantiate the LLM like this:

import { OpenAI } from "@langchain/openai";
import {
  CacheClient,
  Configurations,
  CredentialProvider,
} from "@gomomento/sdk";
import { MomentoCache } from "@langchain/community/caches/momento";

// See https://github.com/momentohq/client-sdk-javascript for connection options
const client = new CacheClient({
  configuration: Configurations.Laptop.v1(),
  credentialProvider: CredentialProvider.fromEnvironmentVariable({
    environmentVariableName: "MOMENTO_API_KEY",
  }),
  defaultTtlSeconds: 60 * 60 * 24,
});
const cache = await MomentoCache.fromProps({
  client,
  cacheName: "langchain",
});

const model = new OpenAI({ cache });
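
The `defaultTtlSeconds: 60 * 60 * 24` option above gives each cached entry a 24-hour time-to-live. The mechanics of a TTL cache can be sketched roughly like this — a simplified illustration with an injectable clock for testability, not Momento's actual internals:

```typescript
// Simplified TTL cache sketch (illustrative only, not Momento's internals).
type Entry = { value: string; expiresAt: number };

class TtlCache {
  private store = new Map<string, Entry>();
  // Injectable clock (milliseconds) so expiry is easy to demonstrate.
  constructor(private ttlSeconds: number, private now: () => number = Date.now) {}

  set(key: string, value: string): void {
    this.store.set(key, { value, expiresAt: this.now() + this.ttlSeconds * 1000 });
  }

  get(key: string): string | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.store.delete(key); // lazily evict expired entries
      return undefined;
    }
    return entry.value;
  }
}

// Simulate time with a manual clock.
let clock = 0;
const ttlCache = new TtlCache(60 * 60 * 24, () => clock);
ttlCache.set("prompt", "cached response");
clock += 1000 * 60 * 60;      // 1 hour later: still fresh
console.log(ttlCache.get("prompt")); // "cached response"
clock += 1000 * 60 * 60 * 24; // 25 hours total: expired
console.log(ttlCache.get("prompt")); // undefined
```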

API Reference

Caching with Redis

LangChain also provides a Redis-based cache. This is useful if you want to share a cache across multiple processes or servers. To use it, you'll need to install the ioredis package:

npm install ioredis

Then, you can pass a cache option when you instantiate the LLM. For example:

import { OpenAI } from "@langchain/openai";
import { RedisCache } from "@langchain/community/caches/ioredis";
import { Redis } from "ioredis";

// See https://github.com/redis/ioredis for connection options
const client = new Redis({});

const cache = new RedisCache(client);

const model = new OpenAI({ cache });
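
Whatever the backing store, a cache backend fundamentally just maps a (prompt, model-settings) pair to a stored result — `RedisCache` above does this against Redis. The shape of such a backend can be sketched against a minimal interface like this (the interface here is a simplified assumption; the real `BaseCache` in `@langchain/core` is async and stores generation objects):

```typescript
// Simplified cache-backend sketch. A Map stands in for a Redis connection.
interface SimpleCache {
  lookup(prompt: string, llmKey: string): string | null;
  update(prompt: string, llmKey: string, value: string): void;
}

class MapCache implements SimpleCache {
  private store = new Map<string, string>();

  // Combine the prompt and model settings into one key, so the same prompt
  // sent to differently configured models does not collide.
  private keyFor(prompt: string, llmKey: string): string {
    return `${llmKey}:${prompt}`;
  }

  lookup(prompt: string, llmKey: string): string | null {
    return this.store.get(this.keyFor(prompt, llmKey)) ?? null;
  }

  update(prompt: string, llmKey: string, value: string): void {
    this.store.set(this.keyFor(prompt, llmKey), value);
  }
}

const backend = new MapCache();
backend.update("Tell me a joke", "gpt-3.5-turbo-instruct", "a joke");
console.log(backend.lookup("Tell me a joke", "gpt-3.5-turbo-instruct")); // "a joke"
console.log(backend.lookup("Tell me a joke", "gpt-4")); // null (different model)
```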

Caching with Upstash Redis

LangChain provides an Upstash Redis-based cache. Like the Redis-based cache, it is useful if you want to share a cache across multiple processes or servers. The Upstash Redis client uses HTTP and supports edge environments. To use it, you'll need to install the @upstash/redis package:

npm install @upstash/redis

You'll also need an Upstash account and a Redis database to connect to. Once you've done that, retrieve your REST URL and REST token.

Then, you can pass a cache option when you instantiate the LLM. For example:

import { OpenAI } from "@langchain/openai";
import { UpstashRedisCache } from "@langchain/community/caches/upstash_redis";

// See https://docs.upstash.com/redis/howto/connectwithupstashredis#quick-start for connection options
const cache = new UpstashRedisCache({
  config: {
    url: "UPSTASH_REDIS_REST_URL",
    token: "UPSTASH_REDIS_REST_TOKEN",
  },
});

const model = new OpenAI({ cache });

API Reference

You can also directly pass in a previously created @upstash/redis client instance:

import { Redis } from "@upstash/redis";
import https from "https";

import { OpenAI } from "@langchain/openai";
import { UpstashRedisCache } from "@langchain/community/caches/upstash_redis";

// const client = new Redis({
//   url: process.env.UPSTASH_REDIS_REST_URL!,
//   token: process.env.UPSTASH_REDIS_REST_TOKEN!,
//   agent: new https.Agent({ keepAlive: true }),
// });

// Or simply call Redis.fromEnv() to automatically load the UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN environment variables.
const client = Redis.fromEnv({
  agent: new https.Agent({ keepAlive: true }),
});

const cache = new UpstashRedisCache({ client });
const model = new OpenAI({ cache });

API Reference

Caching with Cloudflare KV

Info

This integration is only supported in Cloudflare Workers.

If you're deploying your project as a Cloudflare Worker, you can use LangChain's Cloudflare KV-backed LLM cache.

For information on how to set up KV in Cloudflare, see the official documentation.

Note: If you are using TypeScript, you may need to install the types if they aren't already present:

npm install -S @cloudflare/workers-types
import type { KVNamespace } from "@cloudflare/workers-types";

import { OpenAI } from "@langchain/openai";
import { CloudflareKVCache } from "@langchain/cloudflare";

export interface Env {
  KV_NAMESPACE: KVNamespace;
  OPENAI_API_KEY: string;
}

export default {
  async fetch(_request: Request, env: Env) {
    try {
      const cache = new CloudflareKVCache(env.KV_NAMESPACE);
      const model = new OpenAI({
        cache,
        model: "gpt-3.5-turbo-instruct",
        apiKey: env.OPENAI_API_KEY,
      });
      const response = await model.invoke("How are you today?");
      return new Response(JSON.stringify(response), {
        headers: { "content-type": "application/json" },
      });
    } catch (err: any) {
      console.log(err.message);
      return new Response(err.message, { status: 500 });
    }
  },
};

API Reference

Caching on the file system

Danger

This cache is not recommended for production use. It is intended for local development only.

LangChain provides a simple file system cache. By default, the cache is stored in a temporary directory, but you can specify a custom directory if you want.

import { LocalFileCache } from "langchain/cache/file_system";

const cache = await LocalFileCache.create();
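
The idea behind a file system cache is simply that each entry is persisted to disk, so it survives a process restart. A rough sketch of the mechanism (illustrative only; not `LocalFileCache`'s actual on-disk format):

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";
import * as crypto from "node:crypto";

// Illustrative file-backed cache; not LocalFileCache's real format.
class FileCacheSketch {
  constructor(private dir: string) {
    fs.mkdirSync(dir, { recursive: true });
  }

  // Hash the prompt so it is safe to use as a file name.
  private fileFor(prompt: string): string {
    const hash = crypto.createHash("sha256").update(prompt).digest("hex");
    return path.join(this.dir, `${hash}.json`);
  }

  get(prompt: string): string | undefined {
    const file = this.fileFor(prompt);
    if (!fs.existsSync(file)) return undefined;
    return JSON.parse(fs.readFileSync(file, "utf8")).value;
  }

  set(prompt: string, value: string): void {
    fs.writeFileSync(this.fileFor(prompt), JSON.stringify({ value }));
  }
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), "llm-cache-"));
const fileCache = new FileCacheSketch(dir);
fileCache.set("Tell me a joke", "a cached joke");

// A "new process" pointed at the same directory sees the entry.
const fileCache2 = new FileCacheSketch(dir);
console.log(fileCache2.get("Tell me a joke")); // "a cached joke"
```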

Next steps

You've now learned how to cache model responses to save time and money.

Next, check out the other how-to guides on LLMs, like how to create your own custom LLM class.

