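How to get log probabilities
Prerequisites
This guide assumes familiarity with chat models.
Certain chat models can be configured to return token-level log probabilities, which represent the likelihood of a given token. This guide walks through how to get this information in LangChain.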
OpenAI
Install the @langchain/openai package and set your API key:
npm i @langchain/openai @langchain/core
yarn add @langchain/openai @langchain/core
pnpm add @langchain/openai @langchain/core
For the OpenAI API to return log probabilities, we need to set the logprobs parameter to true. The log probabilities are then included on each output AIMessage as part of its response_metadata:
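import { ChatOpenAI } from "@langchain/openai";

// Setting logprobs: true asks the OpenAI API to return per-token log probabilities.
const model = new ChatOpenAI({
  model: "gpt-4o",
  logprobs: true,
});

const responseMessage = await model.invoke("how are you today?");

// Log the log probabilities of the first five output tokens.
console.log(responseMessage.response_metadata.logprobs.content.slice(0, 5));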
[
  { token: "Thank", logprob: -0.70174205, bytes: [ 84, 104, 97, 110, 107 ], top_logprobs: [] },
  { token: " you", logprob: 0, bytes: [ 32, 121, 111, 117 ], top_logprobs: [] },
  { token: " for", logprob: -0.000004723352, bytes: [ 32, 102, 111, 114 ], top_logprobs: [] },
  { token: " asking", logprob: -0.0000013856493, bytes: [ 32, 97, 115, 107, 105, 110, 103 ], top_logprobs: [] },
  { token: "!", logprob: -0.00030102333, bytes: [ 33 ], top_logprobs: [] }
]
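Raw log probabilities can be hard to read at a glance. The following is a minimal sketch, not part of LangChain, that converts them into ordinary probabilities. It assumes the values are natural logarithms, and the names SampleTokenLogprob, toProbability, and sequenceLogprob are hypothetical helpers introduced only for illustration. The sample data is copied from the output above so the sketch runs without calling the API.

```typescript
// Hypothetical type mirroring one entry of response_metadata.logprobs.content.
interface SampleTokenLogprob {
  token: string;
  logprob: number;
  bytes: number[];
  top_logprobs: unknown[];
}

// Sample data copied from the output above.
const sampleContent: SampleTokenLogprob[] = [
  { token: "Thank", logprob: -0.70174205, bytes: [84, 104, 97, 110, 107], top_logprobs: [] },
  { token: " you", logprob: 0, bytes: [32, 121, 111, 117], top_logprobs: [] },
  { token: " for", logprob: -0.000004723352, bytes: [32, 102, 111, 114], top_logprobs: [] },
  { token: " asking", logprob: -0.0000013856493, bytes: [32, 97, 115, 107, 105, 110, 103], top_logprobs: [] },
  { token: "!", logprob: -0.00030102333, bytes: [33], top_logprobs: [] },
];

// Assumption: logprob is a natural logarithm, so exp() recovers the probability.
const toProbability = (logprob: number): number => Math.exp(logprob);

// Adding log probabilities corresponds to multiplying the probabilities.
const sequenceLogprob = (entries: SampleTokenLogprob[]): number =>
  entries.reduce((sum, entry) => sum + entry.logprob, 0);

// Print each token with its probability as a percentage.
for (const entry of sampleContent) {
  const percent = (toProbability(entry.logprob) * 100).toFixed(2);
  console.log(`${JSON.stringify(entry.token)}: ${percent}%`);
}

// Probability of the model producing these five tokens in sequence.
const jointProbability = toProbability(sequenceLogprob(sampleContent));
console.log(`joint probability: ${jointProbability.toFixed(4)}`);
```

In this sample, almost all of the uncertainty sits in the first token: once the model has chosen "Thank", the remaining tokens are close to certain.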
They are also included as part of streamed message chunks:
let count = 0;
const stream = await model.stream("How are you today?");
let aggregateResponse;

for await (const chunk of stream) {
  // Stop after the first six chunks to keep the output short.
  if (count > 5) {
    break;
  }
  // Merge each chunk into a running aggregate; concat also combines the logprobs.
  if (aggregateResponse === undefined) {
    aggregateResponse = chunk;
  } else {
    aggregateResponse = aggregateResponse.concat(chunk);
  }
  console.log(aggregateResponse.response_metadata.logprobs?.content);
  count++;
}
[]
[
  { token: "Thank", logprob: -0.23375113, bytes: [ 84, 104, 97, 110, 107 ], top_logprobs: [] }
]
[
  { token: "Thank", logprob: -0.23375113, bytes: [ 84, 104, 97, 110, 107 ], top_logprobs: [] },
  { token: " you", logprob: 0, bytes: [ 32, 121, 111, 117 ], top_logprobs: [] }
]
[
  { token: "Thank", logprob: -0.23375113, bytes: [ 84, 104, 97, 110, 107 ], top_logprobs: [] },
  { token: " you", logprob: 0, bytes: [ 32, 121, 111, 117 ], top_logprobs: [] },
  { token: " for", logprob: -0.000004723352, bytes: [ 32, 102, 111, 114 ], top_logprobs: [] }
]
[
  { token: "Thank", logprob: -0.23375113, bytes: [ 84, 104, 97, 110, 107 ], top_logprobs: [] },
  { token: " you", logprob: 0, bytes: [ 32, 121, 111, 117 ], top_logprobs: [] },
  { token: " for", logprob: -0.000004723352, bytes: [ 32, 102, 111, 114 ], top_logprobs: [] },
  { token: " asking", logprob: -0.0000029352968, bytes: [ 32, 97, 115, 107, 105, 110, 103 ], top_logprobs: [] }
]
[
  { token: "Thank", logprob: -0.23375113, bytes: [ 84, 104, 97, 110, 107 ], top_logprobs: [] },
  { token: " you", logprob: 0, bytes: [ 32, 121, 111, 117 ], top_logprobs: [] },
  { token: " for", logprob: -0.000004723352, bytes: [ 32, 102, 111, 114 ], top_logprobs: [] },
  { token: " asking", logprob: -0.0000029352968, bytes: [ 32, 97, 115, 107, 105, 110, 103 ], top_logprobs: [] },
  { token: "!", logprob: -0.00039694557, bytes: [ 33 ], top_logprobs: [] }
]
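Each entry also carries a bytes array holding the UTF-8 bytes of the token. As a rough sketch of how that field might be used, the example below rebuilds the generated text from the bytes of the last aggregated output above. The StreamedTokenBytes type and decodeTokens function are hypothetical names for illustration only, and the null case is an assumption based on the OpenAI API allowing tokens without a byte representation.

```typescript
// Hypothetical type covering only the fields this sketch needs.
interface StreamedTokenBytes {
  token: string;
  bytes: number[] | null; // assumption: null when a token has no byte representation
}

// Token and bytes values copied from the final aggregated output above.
const streamedContent: StreamedTokenBytes[] = [
  { token: "Thank", bytes: [84, 104, 97, 110, 107] },
  { token: " you", bytes: [32, 121, 111, 117] },
  { token: " for", bytes: [32, 102, 111, 114] },
  { token: " asking", bytes: [32, 97, 115, 107, 105, 110, 103] },
  { token: "!", bytes: [33] },
];

// Collect the bytes of every token first, then decode once, so that a
// character split across two tokens is still decoded correctly.
function decodeTokens(entries: StreamedTokenBytes[]): string {
  const allBytes: number[] = [];
  for (const entry of entries) {
    if (entry.bytes === null) {
      continue; // skip tokens that have no byte representation
    }
    for (const byte of entry.bytes) {
      allBytes.push(byte);
    }
  }
  return new TextDecoder("utf-8").decode(new Uint8Array(allBytes));
}

console.log(decodeTokens(streamedContent)); // prints "Thank you for asking!"
```

For plain ASCII output like this, joining the token strings gives the same result. Decoding the combined bytes mainly matters when a multi-byte character, such as an emoji, is spread over several tokens.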
topLogprobs
To see alternative potential generations at each step, you can use the topLogprobs parameter:
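// topLogprobs: 3 also returns the three most likely candidate tokens at each position.
const modelWithTopLogprobs = new ChatOpenAI({
  model: "gpt-4o",
  logprobs: true,
  topLogprobs: 3,
});

const res = await modelWithTopLogprobs.invoke("how are you today?");

// Log the first five tokens together with their alternatives.
console.log(res.response_metadata.logprobs.content.slice(0, 5));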
[
  {
    token: "I'm", logprob: -2.2864406, bytes: [ 73, 39, 109 ],
    top_logprobs: [
      { token: "Thank", logprob: -0.28644064, bytes: [ 84, 104, 97, 110, 107 ] },
      { token: "Hello", logprob: -2.0364406, bytes: [ 72, 101, 108, 108, 111 ] },
      { token: "I'm", logprob: -2.2864406, bytes: [ 73, 39, 109 ] }
    ]
  },
  {
    token: " just", logprob: -0.14442946, bytes: [ 32, 106, 117, 115, 116 ],
    top_logprobs: [
      { token: " just", logprob: -0.14442946, bytes: [ 32, 106, 117, 115, 116 ] },
      { token: " an", logprob: -2.2694294, bytes: [ 32, 97, 110 ] },
      { token: " here", logprob: -4.0194297, bytes: [ 32, 104, 101, 114, 101 ] }
    ]
  },
  {
    token: " a", logprob: -0.00066632946, bytes: [ 32, 97 ],
    top_logprobs: [
      { token: " a", logprob: -0.00066632946, bytes: [ 32, 97 ] },
      { token: " lines", logprob: -7.750666, bytes: [ 32, 108, 105, 110, 101, 115 ] },
      { token: " an", logprob: -9.250667, bytes: [ 32, 97, 110 ] }
    ]
  },
  {
    token: " computer", logprob: -0.015423919, bytes: [ 32, 99, 111, 109, 112, 117, 116, 101, 114 ],
    top_logprobs: [
      { token: " computer", logprob: -0.015423919, bytes: [ 32, 99, 111, 109, 112, 117, 116, 101, 114 ] },
      { token: " program", logprob: -5.265424, bytes: [ 32, 112, 114, 111, 103, 114, 97, 109 ] },
      { token: " machine", logprob: -5.390424, bytes: [ 32, 109, 97, 99, 104, 105, 110, 101 ] }
    ]
  },
  {
    token: " program", logprob: -0.0010724656, bytes: [ 32, 112, 114, 111, 103, 114, 97, 109 ],
    top_logprobs: [
      { token: " program", logprob: -0.0010724656, bytes: [ 32, 112, 114, 111, 103, 114, 97, 109 ] },
      { token: "-based", logprob: -6.8760724, bytes: [ 45, 98, 97, 115, 101, 100 ] },
      { token: " algorithm", logprob: -10.626073, bytes: [ 32, 97, 108, 103, 111, 114, 105, 116, 104, 109 ] }
    ]
  }
]
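One possible use of the alternatives is to spot positions where the sampled token was not the model's most likely choice. The sketch below is an illustration under stated assumptions rather than a LangChain feature: CandidateToken, TokenWithAlternatives, findNonTopTokens, and coveredProbability are hypothetical names, and the sample data is the first two entries of the output above with the bytes field omitted for brevity.

```typescript
// Hypothetical types covering only the fields this sketch needs.
interface CandidateToken {
  token: string;
  logprob: number;
}

interface TokenWithAlternatives extends CandidateToken {
  top_logprobs: CandidateToken[];
}

// First two entries of the output above (bytes omitted).
const contentWithAlternatives: TokenWithAlternatives[] = [
  {
    token: "I'm",
    logprob: -2.2864406,
    top_logprobs: [
      { token: "Thank", logprob: -0.28644064 },
      { token: "Hello", logprob: -2.0364406 },
      { token: "I'm", logprob: -2.2864406 },
    ],
  },
  {
    token: " just",
    logprob: -0.14442946,
    top_logprobs: [
      { token: " just", logprob: -0.14442946 },
      { token: " an", logprob: -2.2694294 },
      { token: " here", logprob: -4.0194297 },
    ],
  },
];

// Returns the entries whose sampled token differs from the most likely candidate.
function findNonTopTokens(entries: TokenWithAlternatives[]): TokenWithAlternatives[] {
  return entries.filter((entry) => {
    const best = entry.top_logprobs.reduce<CandidateToken>(
      (currentBest, candidate) =>
        candidate.logprob > currentBest.logprob ? candidate : currentBest,
      entry
    );
    return best.token !== entry.token;
  });
}

// Total probability covered by the listed alternatives (assumes natural logs).
function coveredProbability(entry: TokenWithAlternatives): number {
  return entry.top_logprobs.reduce((sum, candidate) => sum + Math.exp(candidate.logprob), 0);
}

for (const entry of findNonTopTokens(contentWithAlternatives)) {
  console.log(`sampled ${JSON.stringify(entry.token)} although another token was more likely`);
}
console.log(coveredProbability(contentWithAlternatives[0]).toFixed(3));
```

In the sample, the first position is flagged because "Thank" has a higher log probability than the sampled "I'm", while the second position is not flagged because " just" is already the top candidate.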
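Next steps
You've now learned how to get log probabilities from OpenAI models in LangChain.
Next, check out the other how-to guides on chat models in this section, such as how to get a model to return structured output or how to track token usage.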