How to cache model responses

LangChain provides an optional caching layer for LLMs. This is useful for two reasons:

- It can save you money by reducing the number of API calls you make to the LLM provider, if you're often requesting the same completion multiple times.
- It can speed up your application by reducing the number of API calls you make to the LLM provider.

See this section for general instructions on installing integration packages.
- npm
- Yarn
- pnpm
npm install @langchain/openai @langchain/core
yarn add @langchain/openai @langchain/core
pnpm add @langchain/openai @langchain/core
import { OpenAI } from "@langchain/openai";
const model = new OpenAI({
model: "gpt-3.5-turbo-instruct",
cache: true,
});
In Memory Cache

The default cache is stored in memory. This means that if you restart your application, the cache will be cleared.
console.time();
// The first time, it is not yet in cache, so it should take longer
const res = await model.invoke("Tell me a long joke");
console.log(res);
console.timeEnd();
/*
A man walks into a bar and sees a jar filled with money on the counter. Curious, he asks the bartender about it.
The bartender explains, "We have a challenge for our customers. If you can complete three tasks, you win all the money in the jar."
Intrigued, the man asks what the tasks are.
The bartender replies, "First, you have to drink a whole bottle of tequila without making a face. Second, there's a pitbull out back with a sore tooth. You have to pull it out. And third, there's an old lady upstairs who has never had an orgasm. You have to give her one."
The man thinks for a moment and then confidently says, "I'll do it."
He grabs the bottle of tequila and downs it in one gulp, without flinching. He then heads to the back and after a few minutes of struggling, emerges with the pitbull's tooth in hand.
The bar erupts in cheers and the bartender leads the man upstairs to the old lady's room. After a few minutes, the man walks out with a big smile on his face and the old lady is giggling with delight.
The bartender hands the man the jar of money and asks, "How
default: 4.187s
*/
console.time();
// The second time it is, so it goes faster
const res2 = await model.invoke("Tell me a long joke");
console.log(res2);
console.timeEnd();
/*
A man walks into a bar and sees a jar filled with money on the counter. Curious, he asks the bartender about it.
The bartender explains, "We have a challenge for our customers. If you can complete three tasks, you win all the money in the jar."
Intrigued, the man asks what the tasks are.
The bartender replies, "First, you have to drink a whole bottle of tequila without making a face. Second, there's a pitbull out back with a sore tooth. You have to pull it out. And third, there's an old lady upstairs who has never had an orgasm. You have to give her one."
The man thinks for a moment and then confidently says, "I'll do it."
He grabs the bottle of tequila and downs it in one gulp, without flinching. He then heads to the back and after a few minutes of struggling, emerges with the pitbull's tooth in hand.
The bar erupts in cheers and the bartender leads the man upstairs to the old lady's room. After a few minutes, the man walks out with a big smile on his face and the old lady is giggling with delight.
The bartender hands the man the jar of money and asks, "How
default: 175.74ms
*/
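If you want to share a single cache across several model instances, or hold a reference to it so you can manage it yourself, you can pass an explicit cache object instead of cache: true. The following is a minimal sketch; it assumes InMemoryCache is exported from @langchain/core/caches.

import { OpenAI } from "@langchain/openai";
// Assumes InMemoryCache is exported from "@langchain/core/caches".
import { InMemoryCache } from "@langchain/core/caches";

// One cache instance shared by two models: identical prompts sent to
// either model will hit the same in-memory entries.
const sharedCache = new InMemoryCache();

const modelA = new OpenAI({ model: "gpt-3.5-turbo-instruct", cache: sharedCache });
const modelB = new OpenAI({ model: "gpt-3.5-turbo-instruct", cache: sharedCache });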
Caching with Momento

LangChain also provides a Momento-based cache. Momento is a distributed, serverless cache that requires zero setup or infrastructure maintenance. Given Momento's compatibility with Node.js, browsers, and edge environments, make sure you install the relevant package.

To install for Node.js:
- npm
- Yarn
- pnpm
npm install @gomomento/sdk
yarn add @gomomento/sdk
pnpm add @gomomento/sdk
To install for browser/edge workers:
- npm
- Yarn
- pnpm
npm install @gomomento/sdk-web
yarn add @gomomento/sdk-web
pnpm add @gomomento/sdk-web
Next, you'll need to sign up and create an API key. Once you've done that, pass the cache option when you instantiate the LLM, like this:
import { OpenAI } from "@langchain/openai";
import {
CacheClient,
Configurations,
CredentialProvider,
} from "@gomomento/sdk";
import { MomentoCache } from "@langchain/community/caches/momento";
// See https://github.com/momentohq/client-sdk-javascript for connection options
const client = new CacheClient({
configuration: Configurations.Laptop.v1(),
credentialProvider: CredentialProvider.fromEnvironmentVariable({
environmentVariableName: "MOMENTO_API_KEY",
}),
defaultTtlSeconds: 60 * 60 * 24,
});
const cache = await MomentoCache.fromProps({
client,
cacheName: "langchain",
});
const model = new OpenAI({ cache });
API Reference:
- OpenAI from @langchain/openai
- MomentoCache from @langchain/community/caches/momento
Caching with Redis

LangChain also provides a Redis-based cache. This is useful if you want to share the cache across multiple processes or servers. To use it, you'll need to install the ioredis package:
- npm
- Yarn
- pnpm
npm install ioredis
yarn add ioredis
pnpm add ioredis
Then, you can pass the cache option when you instantiate the LLM. For example:
import { OpenAI } from "@langchain/openai";
import { RedisCache } from "@langchain/community/caches/ioredis";
import { Redis } from "ioredis";
// See https://github.com/redis/ioredis for connection options
const client = new Redis({});
const cache = new RedisCache(client);
const model = new OpenAI({ cache });
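To actually share the cache between processes or servers, point the ioredis client at a shared Redis deployment instead of the default localhost connection. The following is a sketch: the connection URL is a placeholder for your own host and credentials.

import { OpenAI } from "@langchain/openai";
import { RedisCache } from "@langchain/community/caches/ioredis";
import { Redis } from "ioredis";

// ioredis also accepts a connection URL; the host, port, and password here
// are placeholders for a shared Redis instance reachable by every server.
const client = new Redis("redis://:your-password@your-redis-host:6379");

const cache = new RedisCache(client);
const model = new OpenAI({ cache });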
Caching with Upstash Redis

LangChain provides an Upstash Redis-based cache. Like the Redis-based cache, it is useful if you want to share the cache across multiple processes or servers. The Upstash Redis client uses HTTP and supports edge environments. To use it, you'll need to install the @upstash/redis package:
- npm
- Yarn
- pnpm
npm install @upstash/redis
yarn add @upstash/redis
pnpm add @upstash/redis
You'll also need an Upstash account and a Redis database to connect to. Once you've done that, retrieve your REST URL and REST token.
Then, you can pass the cache option when you instantiate the LLM. For example:
import { OpenAI } from "@langchain/openai";
import { UpstashRedisCache } from "@langchain/community/caches/upstash_redis";
// See https://docs.upstash.com/redis/howto/connectwithupstashredis#quick-start for connection options
const cache = new UpstashRedisCache({
config: {
url: "UPSTASH_REDIS_REST_URL",
token: "UPSTASH_REDIS_REST_TOKEN",
},
ttl: 3600,
});
const model = new OpenAI({ cache });
API Reference:
- OpenAI from @langchain/openai
- UpstashRedisCache from @langchain/community/caches/upstash_redis
You can also directly pass in a previously created @upstash/redis client instance:
import { Redis } from "@upstash/redis";
import https from "https";
import { OpenAI } from "@langchain/openai";
import { UpstashRedisCache } from "@langchain/community/caches/upstash_redis";
// const client = new Redis({
// url: process.env.UPSTASH_REDIS_REST_URL!,
// token: process.env.UPSTASH_REDIS_REST_TOKEN!,
// agent: new https.Agent({ keepAlive: true }),
// });
// Or simply call Redis.fromEnv() to automatically load the UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN environment variables.
const client = Redis.fromEnv({
agent: new https.Agent({ keepAlive: true }),
});
const cache = new UpstashRedisCache({ client });
const model = new OpenAI({ cache });
API Reference:
- OpenAI from @langchain/openai
- UpstashRedisCache from @langchain/community/caches/upstash_redis
Caching with Vercel KV

LangChain provides a Vercel KV-based cache. Like the Redis-based cache, it is useful if you want to share the cache across multiple processes or servers. The Vercel KV client uses HTTP and supports edge environments. To use it, you'll need to install the @vercel/kv package:
- npm
- Yarn
- pnpm
npm install @vercel/kv
yarn add @vercel/kv
pnpm add @vercel/kv
You'll also need a Vercel account and a KV database to connect to. Once you've done that, retrieve your REST URL and REST token.
Then, you can pass the cache option when you instantiate the LLM. For example:
import { OpenAI } from "@langchain/openai";
import { VercelKVCache } from "@langchain/community/caches/vercel_kv";
import { createClient } from "@vercel/kv";
// See https://vercel.com/docs/storage/vercel-kv/kv-reference#createclient-example for connection options
const cache = new VercelKVCache({
client: createClient({
url: "VERCEL_KV_API_URL",
token: "VERCEL_KV_API_TOKEN",
}),
ttl: 3600,
});
const model = new OpenAI({ cache });
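In a deployed Vercel project you would normally read the connection details from environment variables rather than hard-coding them. As a sketch, the variable names below (KV_REST_API_URL and KV_REST_API_TOKEN) are assumed to be the ones Vercel provisions for a KV store; check your project settings if yours differ.

import { OpenAI } from "@langchain/openai";
import { VercelKVCache } from "@langchain/community/caches/vercel_kv";
import { createClient } from "@vercel/kv";

// Assumes KV_REST_API_URL and KV_REST_API_TOKEN are set in the environment,
// as they typically are when Vercel provisions a KV store for the project.
const cache = new VercelKVCache({
  client: createClient({
    url: process.env.KV_REST_API_URL!,
    token: process.env.KV_REST_API_TOKEN!,
  }),
  ttl: 3600,
});

const model = new OpenAI({ cache });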
API Reference:
- OpenAI from @langchain/openai
- VercelKVCache from @langchain/community/caches/vercel_kv
Caching with Cloudflare KV

This integration is only supported in Cloudflare Workers.

If you're deploying your project as a Cloudflare Worker, you can use LangChain's Cloudflare KV-powered LLM cache.

For information on how to set up KV in Cloudflare, see the official documentation.

Note: If you are using TypeScript, you may need to install the type definitions if they aren't already present:
- npm
- Yarn
- pnpm
npm install -S @cloudflare/workers-types
yarn add @cloudflare/workers-types
pnpm add @cloudflare/workers-types
import type { KVNamespace } from "@cloudflare/workers-types";
import { OpenAI } from "@langchain/openai";
import { CloudflareKVCache } from "@langchain/cloudflare";
export interface Env {
KV_NAMESPACE: KVNamespace;
OPENAI_API_KEY: string;
}
export default {
async fetch(_request: Request, env: Env) {
try {
const cache = new CloudflareKVCache(env.KV_NAMESPACE);
const model = new OpenAI({
cache,
model: "gpt-3.5-turbo-instruct",
apiKey: env.OPENAI_API_KEY,
});
const response = await model.invoke("How are you today?");
return new Response(JSON.stringify(response), {
headers: { "content-type": "application/json" },
});
} catch (err: any) {
console.log(err.message);
return new Response(err.message, { status: 500 });
}
},
};
API Reference:
- OpenAI from @langchain/openai
- CloudflareKVCache from @langchain/cloudflare
Caching on the file system

This cache is not recommended for production use. It is only intended for local development.

LangChain provides a simple file system cache. By default the cache is stored in a temporary directory, but you can specify a custom directory if you want.
import { LocalFileCache } from "langchain/cache/file_system";

const cache = await LocalFileCache.create();
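If you'd rather keep the cache in a persistent location and then use it with a model, you can pass a directory of your choice to create(). The following is a minimal sketch; it assumes create() accepts an optional directory path, and the path shown is only an illustration.

import { OpenAI } from "@langchain/openai";
import { LocalFileCache } from "langchain/cache/file_system";

// Assumes create() accepts an optional directory path; with no argument,
// a temporary directory is used. The path below is just an example.
const cache = await LocalFileCache.create("./.langchain-cache");

const model = new OpenAI({ cache });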
Next steps

You've now learned how to cache model responses to save time and money.

Next, check out the other how-to guides on LLMs, like how to create your own custom LLM class.