
How to get log probabilities

Prerequisites

This guide assumes familiarity with the following concepts:

Certain chat models can be configured to return token-level log probabilities representing the likelihood of a given token. This guide walks through how to get this information in LangChain.

OpenAI

Install the @langchain/openai package and set your API key:

yarn add @langchain/openai

For the OpenAI API to return log probabilities, we need to set the logprobs parameter to true. The log probabilities are then included on each output AIMessage as part of the response_metadata:

import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({
  model: "gpt-4o",
  logprobs: true,
});

const responseMessage = await model.invoke("how are you today?");

responseMessage.response_metadata.logprobs.content.slice(0, 5);
[
  {
    token: "Thank",
    logprob: -0.70174205,
    bytes: [ 84, 104, 97, 110, 107 ],
    top_logprobs: []
  },
  {
    token: " you",
    logprob: 0,
    bytes: [ 32, 121, 111, 117 ],
    top_logprobs: []
  },
  {
    token: " for",
    logprob: -0.000004723352,
    bytes: [ 32, 102, 111, 114 ],
    top_logprobs: []
  },
  {
    token: " asking",
    logprob: -0.0000013856493,
    bytes: [ 32, 97, 115, 107, 105, 110, 103 ],
    top_logprobs: []
  },
  {
    token: "!",
    logprob: -0.00030102333,
    bytes: [ 33 ],
    top_logprobs: []
  }
]
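Since these values are natural-log probabilities, you can convert each one back to a plain probability with Math.exp. A minimal sketch, assuming a logprobs content array shaped like the output above (the TokenLogprob type and toProbability helper are illustrative, not part of LangChain):

```typescript
// Each entry's logprob is a natural logarithm of a probability,
// so Math.exp recovers the probability itself (a value in (0, 1]).
type TokenLogprob = { token: string; logprob: number };

const toProbability = (entry: TokenLogprob): number => Math.exp(entry.logprob);

// Sample values taken from the output above.
const content: TokenLogprob[] = [
  { token: "Thank", logprob: -0.70174205 },
  { token: " you", logprob: 0 },
];

for (const entry of content) {
  console.log(`${entry.token}: ${(toProbability(entry) * 100).toFixed(1)}%`);
}
```

A logprob of 0 corresponds to a probability of exactly 1, i.e. the model considered that token certain given the preceding context.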

They are also part of streamed message chunks:

let count = 0;
const stream = await model.stream("How are you today?");
let aggregateResponse;

for await (const chunk of stream) {
  if (count > 5) {
    break;
  }
  if (aggregateResponse === undefined) {
    aggregateResponse = chunk;
  } else {
    aggregateResponse = aggregateResponse.concat(chunk);
  }
  console.log(aggregateResponse.response_metadata.logprobs?.content);
  count++;
}
[]
[
  {
    token: "Thank",
    logprob: -0.23375113,
    bytes: [ 84, 104, 97, 110, 107 ],
    top_logprobs: []
  }
]
[
  {
    token: "Thank",
    logprob: -0.23375113,
    bytes: [ 84, 104, 97, 110, 107 ],
    top_logprobs: []
  },
  {
    token: " you",
    logprob: 0,
    bytes: [ 32, 121, 111, 117 ],
    top_logprobs: []
  }
]
[
  {
    token: "Thank",
    logprob: -0.23375113,
    bytes: [ 84, 104, 97, 110, 107 ],
    top_logprobs: []
  },
  {
    token: " you",
    logprob: 0,
    bytes: [ 32, 121, 111, 117 ],
    top_logprobs: []
  },
  {
    token: " for",
    logprob: -0.000004723352,
    bytes: [ 32, 102, 111, 114 ],
    top_logprobs: []
  }
]
[
  {
    token: "Thank",
    logprob: -0.23375113,
    bytes: [ 84, 104, 97, 110, 107 ],
    top_logprobs: []
  },
  {
    token: " you",
    logprob: 0,
    bytes: [ 32, 121, 111, 117 ],
    top_logprobs: []
  },
  {
    token: " for",
    logprob: -0.000004723352,
    bytes: [ 32, 102, 111, 114 ],
    top_logprobs: []
  },
  {
    token: " asking",
    logprob: -0.0000029352968,
    bytes: [ 32, 97, 115, 107, 105, 110, 103 ],
    top_logprobs: []
  }
]
[
  {
    token: "Thank",
    logprob: -0.23375113,
    bytes: [ 84, 104, 97, 110, 107 ],
    top_logprobs: []
  },
  {
    token: " you",
    logprob: 0,
    bytes: [ 32, 121, 111, 117 ],
    top_logprobs: []
  },
  {
    token: " for",
    logprob: -0.000004723352,
    bytes: [ 32, 102, 111, 114 ],
    top_logprobs: []
  },
  {
    token: " asking",
    logprob: -0.0000029352968,
    bytes: [ 32, 97, 115, 107, 105, 110, 103 ],
    top_logprobs: []
  },
  {
    token: "!",
    logprob: -0.00039694557,
    bytes: [ 33 ],
    top_logprobs: []
  }
]
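Because log probabilities add where plain probabilities multiply, you can estimate the joint probability of an entire generated sequence by summing the token logprobs and exponentiating. A minimal sketch, again assuming a content array shaped like the output above (sequenceProbability is an illustrative helper, not a LangChain API):

```typescript
type TokenLogprob = { token: string; logprob: number };

// log P(t1, t2, ...) = logprob(t1) + logprob(t2) + ...,
// so the joint probability is exp of the sum.
const sequenceProbability = (content: TokenLogprob[]): number =>
  Math.exp(content.reduce((sum, entry) => sum + entry.logprob, 0));

// Sample values taken from the streamed output above.
const content: TokenLogprob[] = [
  { token: "Thank", logprob: -0.23375113 },
  { token: " you", logprob: 0 },
  { token: " for", logprob: -0.000004723352 },
];

console.log(sequenceProbability(content));
```

Summing logprobs rather than multiplying raw probabilities is also numerically safer for long sequences, since the product of many small probabilities can underflow.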

topLogprobs

To see alternate potential generations at each step, you can use the topLogprobs parameter:

const model = new ChatOpenAI({
  model: "gpt-4o",
  logprobs: true,
  topLogprobs: 3,
});

const responseMessage = await model.invoke("how are you today?");

responseMessage.response_metadata.logprobs.content.slice(0, 5);
[
  {
    token: "I'm",
    logprob: -2.2864406,
    bytes: [ 73, 39, 109 ],
    top_logprobs: [
      { token: "Thank", logprob: -0.28644064, bytes: [ 84, 104, 97, 110, 107 ] },
      { token: "Hello", logprob: -2.0364406, bytes: [ 72, 101, 108, 108, 111 ] },
      { token: "I'm", logprob: -2.2864406, bytes: [ 73, 39, 109 ] }
    ]
  },
  {
    token: " just",
    logprob: -0.14442946,
    bytes: [ 32, 106, 117, 115, 116 ],
    top_logprobs: [
      { token: " just", logprob: -0.14442946, bytes: [ 32, 106, 117, 115, 116 ] },
      { token: " an", logprob: -2.2694294, bytes: [ 32, 97, 110 ] },
      { token: " here", logprob: -4.0194297, bytes: [ 32, 104, 101, 114, 101 ] }
    ]
  },
  {
    token: " a",
    logprob: -0.00066632946,
    bytes: [ 32, 97 ],
    top_logprobs: [
      { token: " a", logprob: -0.00066632946, bytes: [ 32, 97 ] },
      { token: " lines", logprob: -7.750666, bytes: [ 32, 108, 105, 110, 101, 115 ] },
      { token: " an", logprob: -9.250667, bytes: [ 32, 97, 110 ] }
    ]
  },
  {
    token: " computer",
    logprob: -0.015423919,
    bytes: [ 32, 99, 111, 109, 112, 117, 116, 101, 114 ],
    top_logprobs: [
      {
        token: " computer",
        logprob: -0.015423919,
        bytes: [ 32, 99, 111, 109, 112, 117, 116, 101, 114 ]
      },
      {
        token: " program",
        logprob: -5.265424,
        bytes: [ 32, 112, 114, 111, 103, 114, 97, 109 ]
      },
      {
        token: " machine",
        logprob: -5.390424,
        bytes: [ 32, 109, 97, 99, 104, 105, 110, 101 ]
      }
    ]
  },
  {
    token: " program",
    logprob: -0.0010724656,
    bytes: [ 32, 112, 114, 111, 103, 114, 97, 109 ],
    top_logprobs: [
      {
        token: " program",
        logprob: -0.0010724656,
        bytes: [ 32, 112, 114, 111, 103, 114, 97, 109 ]
      },
      {
        token: "-based",
        logprob: -6.8760724,
        bytes: [ 45, 98, 97, 115, 101, 100 ]
      },
      {
        token: " algorithm",
        logprob: -10.626073,
        bytes: [ 32, 97, 108, 103, 111, 114, 105, 116, 104, 109 ]
      }
    ]
  }
]
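Each top_logprobs array lists the highest-probability candidates the model considered at that step, in descending order of probability. A minimal sketch for rendering those alternatives as readable percentages, assuming entries shaped like the output above (the types and the alternatives helper are illustrative, not part of LangChain):

```typescript
type TopLogprob = { token: string; logprob: number };
type TokenLogprob = {
  token: string;
  logprob: number;
  top_logprobs: TopLogprob[];
};

// Render each candidate token with its probability, converting the
// natural-log probability back to a percentage via Math.exp.
const alternatives = (entry: TokenLogprob): string[] =>
  entry.top_logprobs.map(
    (alt) => `${alt.token} (${(Math.exp(alt.logprob) * 100).toFixed(1)}%)`
  );

// First entry from the output above: the sampled token "I'm" was only
// the third most likely candidate at this step.
const first: TokenLogprob = {
  token: "I'm",
  logprob: -2.2864406,
  top_logprobs: [
    { token: "Thank", logprob: -0.28644064 },
    { token: "Hello", logprob: -2.0364406 },
    { token: "I'm", logprob: -2.2864406 },
  ],
};

console.log(alternatives(first));
```

Note that the sampled token need not be the top candidate: at non-zero temperature the model can emit a lower-ranked alternative, as it did here.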

Next steps

You've now learned how to get log probabilities from OpenAI models in LangChain.

Next, check out the other how-to guides on chat models in this section, like how to have a model return structured output and how to track token usage.
