
How to cache model responses

LangChain provides an optional caching layer for LLMs. This is useful for two reasons:

It can save you money by reducing the number of API calls you make to the LLM provider, if you're often requesting the same completion multiple times. It can also speed up your application by reducing the number of API calls you make to the LLM provider.

npm install @langchain/openai @langchain/core
import { OpenAI } from "@langchain/openai";

const model = new OpenAI({
  model: "gpt-3.5-turbo-instruct",
  cache: true,
});

In-memory cache

The default cache is stored in memory. This means that if you restart your application, the cache will be cleared.

console.time();

// The first time, it is not yet in cache, so it should take longer
const res = await model.invoke("Tell me a long joke");

console.log(res);

console.timeEnd();

/*
A man walks into a bar and sees a jar filled with money on the counter. Curious, he asks the bartender about it.

The bartender explains, "We have a challenge for our customers. If you can complete three tasks, you win all the money in the jar."

Intrigued, the man asks what the tasks are.

The bartender replies, "First, you have to drink a whole bottle of tequila without making a face. Second, there's a pitbull out back with a sore tooth. You have to pull it out. And third, there's an old lady upstairs who has never had an orgasm. You have to give her one."

The man thinks for a moment and then confidently says, "I'll do it."

He grabs the bottle of tequila and downs it in one gulp, without flinching. He then heads to the back and after a few minutes of struggling, emerges with the pitbull's tooth in hand.

The bar erupts in cheers and the bartender leads the man upstairs to the old lady's room. After a few minutes, the man walks out with a big smile on his face and the old lady is giggling with delight.

The bartender hands the man the jar of money and asks, "How

default: 4.187s
*/
console.time();

// The second time it is already in cache, so it goes much faster.
// Note that the prompt must be identical to the first call to get a cache hit.
const res2 = await model.invoke("Tell me a long joke");

console.log(res2);

console.timeEnd();

/*
A man walks into a bar and sees a jar filled with money on the counter. Curious, he asks the bartender about it.

The bartender explains, "We have a challenge for our customers. If you can complete three tasks, you win all the money in the jar."

Intrigued, the man asks what the tasks are.

The bartender replies, "First, you have to drink a whole bottle of tequila without making a face. Second, there's a pitbull out back with a sore tooth. You have to pull it out. And third, there's an old lady upstairs who has never had an orgasm. You have to give her one."

The man thinks for a moment and then confidently says, "I'll do it."

He grabs the bottle of tequila and downs it in one gulp, without flinching. He then heads to the back and after a few minutes of struggling, emerges with the pitbull's tooth in hand.

The bar erupts in cheers and the bartender leads the man upstairs to the old lady's room. After a few minutes, the man walks out with a big smile on his face and the old lady is giggling with delight.

The bartender hands the man the jar of money and asks, "How

default: 175.74ms
*/
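Conceptually, a response cache is just a map from model and prompt to completion: identical requests hit the cache instead of the provider. The following is a minimal, self-contained sketch of the idea only (the `ResponseCache` class and `fakeInvoke` helper are illustrative and not LangChain's actual implementation, which also keys on model parameters and stores richer generation objects):

```typescript
// Illustrative only: a toy response cache keyed by model + prompt.
// LangChain's real caches also key on model parameters.
class ResponseCache {
  private store = new Map<string, string>();

  private key(model: string, prompt: string): string {
    return `${model}::${prompt}`;
  }

  lookup(model: string, prompt: string): string | undefined {
    return this.store.get(this.key(model, prompt));
  }

  update(model: string, prompt: string, completion: string): void {
    this.store.set(this.key(model, prompt), completion);
  }
}

// Simulated usage: the second identical request skips the "API call".
const cache = new ResponseCache();
let apiCalls = 0;

function fakeInvoke(model: string, prompt: string): string {
  const hit = cache.lookup(model, prompt);
  if (hit !== undefined) return hit; // cache hit: no API call
  apiCalls++; // cache miss: call the provider
  const completion = `response to: ${prompt}`;
  cache.update(model, prompt, completion);
  return completion;
}

const a = fakeInvoke("gpt-3.5-turbo-instruct", "Tell me a long joke");
const b = fakeInvoke("gpt-3.5-turbo-instruct", "Tell me a long joke");
console.log(apiCalls); // 1
console.log(a === b); // true
```

This is also why the two `invoke` calls above must use the exact same prompt: any difference in the prompt produces a different cache key and therefore a cache miss.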

Caching with Momento

LangChain also provides a Momento-based cache. Momento is a distributed, serverless cache that requires no setup or infrastructure maintenance. Momento is compatible with Node.js, browser, and edge environments, so make sure to install the appropriate package.

To install the package for **Node.js**:

npm install @gomomento/sdk

To install the package for **browser/edge workers**:

npm install @gomomento/sdk-web

Next, you'll need to sign up and create an API key. Once you've done that, pass a cache option when you instantiate the LLM like this:

import { OpenAI } from "@langchain/openai";
import {
  CacheClient,
  Configurations,
  CredentialProvider,
} from "@gomomento/sdk";
import { MomentoCache } from "@langchain/community/caches/momento";

// See https://github.com/momentohq/client-sdk-javascript for connection options
const client = new CacheClient({
  configuration: Configurations.Laptop.v1(),
  credentialProvider: CredentialProvider.fromEnvironmentVariable({
    environmentVariableName: "MOMENTO_API_KEY",
  }),
  defaultTtlSeconds: 60 * 60 * 24,
});
const cache = await MomentoCache.fromProps({
  client,
  cacheName: "langchain",
});

const model = new OpenAI({ cache });

API Reference
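The `defaultTtlSeconds` option above controls how long a cached entry lives before it expires. As an illustrative sketch of how TTL-based expiry works (the `TtlCache` class below is a toy example, not Momento's implementation):

```typescript
// Illustrative only: a toy TTL cache showing defaultTtlSeconds-style expiry.
class TtlCache {
  private store = new Map<string, { value: string; expiresAt: number }>();

  constructor(private defaultTtlSeconds: number) {}

  // `now` is injectable so expiry is easy to demonstrate and test.
  set(key: string, value: string, now = Date.now()): void {
    this.store.set(key, {
      value,
      expiresAt: now + this.defaultTtlSeconds * 1000,
    });
  }

  get(key: string, now = Date.now()): string | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (now > entry.expiresAt) {
      this.store.delete(key); // lazily evict expired entries
      return undefined;
    }
    return entry.value;
  }
}

const ttlCache = new TtlCache(60 * 60 * 24); // one day, like the Momento example
const t0 = Date.now();
ttlCache.set("prompt", "completion", t0);
console.log(ttlCache.get("prompt", t0 + 1000)); // "completion": still fresh
console.log(ttlCache.get("prompt", t0 + 25 * 60 * 60 * 1000)); // undefined: expired
```

A TTL is a good fit for LLM caching: stale completions are eventually dropped, and the cache's storage footprint stays bounded without manual cleanup.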

Caching with Redis

LangChain also provides a Redis-based cache. This is useful if you want to share a cache across multiple processes or servers. To use it, you'll need to install the ioredis package:

npm install ioredis

Then, you can pass a cache option when you instantiate the LLM. For example:

import { OpenAI } from "@langchain/openai";
import { RedisCache } from "@langchain/community/caches/ioredis";
import { Redis } from "ioredis";

// See https://github.com/redis/ioredis for connection options
const client = new Redis({});

const cache = new RedisCache(client);

const model = new OpenAI({ cache });

Caching with Upstash Redis

LangChain provides an Upstash Redis-based cache. Like the Redis-based cache, this cache is useful if you want to share a cache across multiple processes or servers. The Upstash Redis client uses HTTP and supports edge environments. To use it, you'll need to install the @upstash/redis package:

npm install @upstash/redis

You'll also need an Upstash account and a Redis database to connect to. Once you've done that, retrieve your REST URL and REST token.

Then, you can pass a cache option when you instantiate the LLM. For example:

import { OpenAI } from "@langchain/openai";
import { UpstashRedisCache } from "@langchain/community/caches/upstash_redis";

// See https://docs.upstash.com/redis/howto/connectwithupstashredis#quick-start for connection options
const cache = new UpstashRedisCache({
  config: {
    url: "UPSTASH_REDIS_REST_URL",
    token: "UPSTASH_REDIS_REST_TOKEN",
  },
});

const model = new OpenAI({ cache });

API Reference

You can also directly pass in a previously created @upstash/redis client instance:

import { Redis } from "@upstash/redis";
import https from "https";

import { OpenAI } from "@langchain/openai";
import { UpstashRedisCache } from "@langchain/community/caches/upstash_redis";

// const client = new Redis({
//   url: process.env.UPSTASH_REDIS_REST_URL!,
//   token: process.env.UPSTASH_REDIS_REST_TOKEN!,
//   agent: new https.Agent({ keepAlive: true }),
// });

// Or simply call Redis.fromEnv() to automatically load the UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN environment variables.
const client = Redis.fromEnv({
  agent: new https.Agent({ keepAlive: true }),
});

const cache = new UpstashRedisCache({ client });
const model = new OpenAI({ cache });

API Reference

Caching with Cloudflare KV

Info

This integration is only supported in Cloudflare Workers.

If you're deploying your project as a Cloudflare Worker, you can use LangChain's Cloudflare KV-backed LLM cache.

For information on how to set up KV in Cloudflare, see the official documentation.

Note: If you are using TypeScript, you may need to install the types if they aren't already present:

npm install -S @cloudflare/workers-types
import type { KVNamespace } from "@cloudflare/workers-types";

import { OpenAI } from "@langchain/openai";
import { CloudflareKVCache } from "@langchain/cloudflare";

export interface Env {
  KV_NAMESPACE: KVNamespace;
  OPENAI_API_KEY: string;
}

export default {
  async fetch(_request: Request, env: Env) {
    try {
      const cache = new CloudflareKVCache(env.KV_NAMESPACE);
      const model = new OpenAI({
        cache,
        model: "gpt-3.5-turbo-instruct",
        apiKey: env.OPENAI_API_KEY,
      });
      const response = await model.invoke("How are you today?");
      return new Response(JSON.stringify(response), {
        headers: { "content-type": "application/json" },
      });
    } catch (err: any) {
      console.log(err.message);
      return new Response(err.message, { status: 500 });
    }
  },
};

API Reference

File system cache

Danger

This cache is not recommended for production use. It is only intended for local development.

LangChain provides a simple file system cache. By default the cache is stored in a temporary directory, but you can specify a custom directory if you want.

import { LocalFileCache } from "langchain/cache/file_system";

// Pass a directory path to LocalFileCache.create() to use a custom location.
const cache = await LocalFileCache.create();
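Under the hood, a file-based cache simply persists each entry as a file on disk, so cached completions survive application restarts. A toy, self-contained sketch of the idea (the `ToyFileCache` class is illustrative only, not `LocalFileCache`'s actual implementation):

```typescript
// Illustrative only: a toy file-system cache with one file per key,
// similar in spirit to a file-backed LLM cache.
import { createHash } from "crypto";
import { existsSync, mkdtempSync, readFileSync, writeFileSync } from "fs";
import { tmpdir } from "os";
import { join } from "path";

class ToyFileCache {
  constructor(private dir: string) {}

  private pathFor(key: string): string {
    // Hash the key so arbitrary prompts map to safe file names
    const name = createHash("sha256").update(key).digest("hex");
    return join(this.dir, `${name}.json`);
  }

  get(key: string): string | undefined {
    const p = this.pathFor(key);
    return existsSync(p) ? JSON.parse(readFileSync(p, "utf8")) : undefined;
  }

  set(key: string, value: string): void {
    writeFileSync(this.pathFor(key), JSON.stringify(value));
  }
}

const dir = mkdtempSync(join(tmpdir(), "toy-cache-"));
const fileCache = new ToyFileCache(dir);
fileCache.set("Tell me a long joke", "a completion");
console.log(fileCache.get("Tell me a long joke")); // "a completion"
console.log(fileCache.get("unseen prompt")); // undefined
```

Because every lookup touches the disk, this kind of cache trades speed for persistence, which is one reason it's best suited to local development rather than production.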

Next steps

You've now learned how to cache model responses to save time and money.

Next, check out the other how-to guides on LLMs, like how to create your own custom LLM class.

