
How to cache model responses

LangChain provides an optional caching layer for LLMs. This is useful for two reasons:

- If you often request the same completion multiple times, it can save you money by reducing the number of API calls you make to the LLM provider.
- It can speed up your application by reducing the number of API calls you make to the LLM provider.

Tip

See this section for general instructions on installing integration packages.

```bash
npm install @langchain/openai @langchain/core
```

```typescript
import { OpenAI } from "@langchain/openai";

const model = new OpenAI({
  model: "gpt-3.5-turbo-instruct",
  cache: true,
});
```

In-memory cache

The default cache is stored in memory. This means that if you restart your application, the cache will be cleared.

```typescript
console.time();

// The first time, it is not yet in cache, so it should take longer
const res = await model.invoke("Tell me a long joke");

console.log(res);

console.timeEnd();

/*
  A man walks into a bar and sees a jar filled with money on the counter. Curious, he asks the bartender about it.

  The bartender explains, "We have a challenge for our customers. If you can complete three tasks, you win all the money in the jar."

  Intrigued, the man asks what the tasks are.

  The bartender replies, "First, you have to drink a whole bottle of tequila without making a face. Second, there's a pitbull out back with a sore tooth. You have to pull it out. And third, there's an old lady upstairs who has never had an orgasm. You have to give her one."

  The man thinks for a moment and then confidently says, "I'll do it."

  He grabs the bottle of tequila and downs it in one gulp, without flinching. He then heads to the back and after a few minutes of struggling, emerges with the pitbull's tooth in hand.

  The bar erupts in cheers and the bartender leads the man upstairs to the old lady's room. After a few minutes, the man walks out with a big smile on his face and the old lady is giggling with delight.

  The bartender hands the man the jar of money and asks, "How

  default: 4.187s
*/

console.time();

// The second time it is already in cache, so it goes faster
const res2 = await model.invoke("Tell me a long joke");

console.log(res2);

console.timeEnd();

/*
  A man walks into a bar and sees a jar filled with money on the counter. Curious, he asks the bartender about it.

  The bartender explains, "We have a challenge for our customers. If you can complete three tasks, you win all the money in the jar."

  Intrigued, the man asks what the tasks are.

  The bartender replies, "First, you have to drink a whole bottle of tequila without making a face. Second, there's a pitbull out back with a sore tooth. You have to pull it out. And third, there's an old lady upstairs who has never had an orgasm. You have to give her one."

  The man thinks for a moment and then confidently says, "I'll do it."

  He grabs the bottle of tequila and downs it in one gulp, without flinching. He then heads to the back and after a few minutes of struggling, emerges with the pitbull's tooth in hand.

  The bar erupts in cheers and the bartender leads the man upstairs to the old lady's room. After a few minutes, the man walks out with a big smile on his face and the old lady is giggling with delight.

  The bartender hands the man the jar of money and asks, "How

  default: 175.74ms
*/
```

Caching with Momento

LangChain also provides a Momento-based cache. Momento is a distributed, serverless cache that requires zero setup or infrastructure maintenance. Given Momento's compatibility with Node.js, browser, and edge environments, make sure you install the relevant package.

To install for Node.js:

```bash
npm install @gomomento/sdk
```

To install for browsers/edge workers:

```bash
npm install @gomomento/sdk-web
```

You'll also need to sign up and create an API key. Once you've done that, pass a cache option when you instantiate the LLM like this:

```typescript
import { OpenAI } from "@langchain/openai";
import {
  CacheClient,
  Configurations,
  CredentialProvider,
} from "@gomomento/sdk";
import { MomentoCache } from "@langchain/community/caches/momento";

// See https://github.com/momentohq/client-sdk-javascript for connection options
const client = new CacheClient({
  configuration: Configurations.Laptop.v1(),
  credentialProvider: CredentialProvider.fromEnvironmentVariable({
    environmentVariableName: "MOMENTO_API_KEY",
  }),
  defaultTtlSeconds: 60 * 60 * 24,
});
const cache = await MomentoCache.fromProps({
  client,
  cacheName: "langchain",
});

const model = new OpenAI({ cache });
```

API Reference

Caching with Redis

LangChain also provides a Redis-based cache. This is useful if you want to share the cache across multiple processes or servers. To use it, you'll need to install the ioredis package:

```bash
npm install ioredis
```

Then, you can pass a cache option when you instantiate the LLM. For example:

```typescript
import { OpenAI } from "@langchain/openai";
import { RedisCache } from "@langchain/community/caches/ioredis";
import { Redis } from "ioredis";

// See https://github.com/redis/ioredis for connection options
const client = new Redis({});

const cache = new RedisCache(client);

const model = new OpenAI({ cache });
```

Caching with Upstash Redis

LangChain provides an Upstash Redis-based cache. Like the Redis-based cache, this cache is useful if you want to share the cache across multiple processes or servers. The Upstash Redis client uses HTTP and supports edge environments. To use it, you'll need to install the @upstash/redis package:

```bash
npm install @upstash/redis
```

You'll also need an Upstash account and a Redis database to connect to. Once you've done that, retrieve your REST URL and REST token.

Then, you can pass a cache option when you instantiate the LLM. For example:

```typescript
import { OpenAI } from "@langchain/openai";
import { UpstashRedisCache } from "@langchain/community/caches/upstash_redis";

// See https://docs.upstash.com/redis/howto/connectwithupstashredis#quick-start for connection options
const cache = new UpstashRedisCache({
  config: {
    url: "UPSTASH_REDIS_REST_URL",
    token: "UPSTASH_REDIS_REST_TOKEN",
  },
  ttl: 3600,
});

const model = new OpenAI({ cache });
```

API Reference
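The ttl option above expires cached entries after the given number of seconds. The expiry behavior can be sketched self-containedly as follows (an illustration of the concept, not Upstash's or LangChain's actual implementation):

```typescript
// TTL-based expiry: each entry stores an expiry timestamp, and lookups
// past that timestamp behave as cache misses.
type Entry = { value: string; expiresAt: number };

class TtlCache {
  private store = new Map<string, Entry>();

  constructor(private ttlSeconds: number, private now: () => number = Date.now) {}

  set(key: string, value: string): void {
    this.store.set(key, { value, expiresAt: this.now() + this.ttlSeconds * 1000 });
  }

  get(key: string): string | undefined {
    const entry = this.store.get(key);
    if (entry === undefined) return undefined;
    if (this.now() > entry.expiresAt) {
      this.store.delete(key); // expired: evict and report a miss
      return undefined;
    }
    return entry.value;
  }
}

// Use an injectable clock so expiry is easy to demonstrate:
let fakeTime = 0;
const cache = new TtlCache(3600, () => fakeTime);
cache.set("prompt", "cached response");
console.log(cache.get("prompt")); // logs: cached response
fakeTime += 3601 * 1000; // advance past the TTL
console.log(cache.get("prompt")); // logs: undefined
```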

You can also directly pass in a previously created @upstash/redis client instance:

```typescript
import { Redis } from "@upstash/redis";
import https from "https";

import { OpenAI } from "@langchain/openai";
import { UpstashRedisCache } from "@langchain/community/caches/upstash_redis";

// const client = new Redis({
//   url: process.env.UPSTASH_REDIS_REST_URL!,
//   token: process.env.UPSTASH_REDIS_REST_TOKEN!,
//   agent: new https.Agent({ keepAlive: true }),
// });

// Or simply call Redis.fromEnv() to automatically load the UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN environment variables.
const client = Redis.fromEnv({
  agent: new https.Agent({ keepAlive: true }),
});

const cache = new UpstashRedisCache({ client });
const model = new OpenAI({ cache });
```

API Reference

Caching with Vercel KV

LangChain provides a Vercel KV-based cache. Like the Redis-based cache, this cache is useful if you want to share the cache across multiple processes or servers. The Vercel KV client uses HTTP and supports edge environments. To use it, you'll need to install the @vercel/kv package:

```bash
npm install @vercel/kv
```

You'll also need a Vercel account and a KV database to connect to. Once you've done that, retrieve your REST URL and REST token.

Then, you can pass a cache option when you instantiate the LLM. For example:

```typescript
import { OpenAI } from "@langchain/openai";
import { VercelKVCache } from "@langchain/community/caches/vercel_kv";
import { createClient } from "@vercel/kv";

// See https://vercel.com/docs/storage/vercel-kv/kv-reference#createclient-example for connection options
const cache = new VercelKVCache({
  client: createClient({
    url: "VERCEL_KV_API_URL",
    token: "VERCEL_KV_API_TOKEN",
  }),
  ttl: 3600,
});

const model = new OpenAI({ cache });
```

API Reference

Caching with Cloudflare KV

Info

This integration is only supported in Cloudflare Workers.

If you're deploying your project as a Cloudflare Worker, you can use LangChain's Cloudflare KV-powered LLM cache.

For information on how to set up KV in Cloudflare, see the official documentation.
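When deploying with Wrangler, the KV namespace is typically exposed to the Worker as a binding declared in the project's `wrangler.toml`. A sketch of that configuration, assuming a binding name of `KV_NAMESPACE` and a placeholder namespace ID:

```toml
# wrangler.toml
kv_namespaces = [
  { binding = "KV_NAMESPACE", id = "<your-kv-namespace-id>" }
]
```

The binding name must match the property you read off the `env` object in your Worker code.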

Note: If you are using TypeScript, you may need to install type definitions if they are not already present:

```bash
npm install -S @cloudflare/workers-types
```

```typescript
import type { KVNamespace } from "@cloudflare/workers-types";

import { OpenAI } from "@langchain/openai";
import { CloudflareKVCache } from "@langchain/cloudflare";

export interface Env {
  KV_NAMESPACE: KVNamespace;
  OPENAI_API_KEY: string;
}

export default {
  async fetch(_request: Request, env: Env) {
    try {
      const cache = new CloudflareKVCache(env.KV_NAMESPACE);
      const model = new OpenAI({
        cache,
        model: "gpt-3.5-turbo-instruct",
        apiKey: env.OPENAI_API_KEY,
      });
      const response = await model.invoke("How are you today?");
      return new Response(JSON.stringify(response), {
        headers: { "content-type": "application/json" },
      });
    } catch (err: any) {
      console.log(err.message);
      return new Response(err.message, { status: 500 });
    }
  },
};
```

API Reference

Caching on the file system

Danger

This cache is not recommended for production use. It is only intended for local development.

LangChain provides a simple file system cache. By default the cache is stored in a temporary directory, but you can specify a custom directory if you want.

```typescript
import { LocalFileCache } from "langchain/cache/file_system";

const cache = await LocalFileCache.create();
```
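Conceptually, a file system cache persists each response in a file named after a hash of the prompt, so cached completions survive restarts. A simplified, self-contained sketch of that idea (not LangChain's actual implementation):

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Store each cached response in its own file inside a temporary directory.
const cacheDir = mkdtempSync(join(tmpdir(), "llm-cache-"));

// Derive a stable file name from a hash of the prompt.
const pathFor = (prompt: string): string =>
  join(cacheDir, createHash("sha256").update(prompt).digest("hex") + ".json");

const lookup = (prompt: string): string | undefined =>
  existsSync(pathFor(prompt))
    ? JSON.parse(readFileSync(pathFor(prompt), "utf8"))
    : undefined;

const update = (prompt: string, response: string): void =>
  writeFileSync(pathFor(prompt), JSON.stringify(response));

update("Tell me a long joke", "a very long joke...");
console.log(lookup("Tell me a long joke")); // logs: a very long joke...
console.log(lookup("some other prompt")); // logs: undefined
```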

Next steps

You've now learned how to cache model responses to save time and money.

Next, check out the other how-to guides on LLMs, like how to create your own custom LLM class.

