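How to use multimodal prompts
This guide shows how to use prompt templates to format multimodal inputs for a model.
In this example, we will ask a model to describe an image.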
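Prerequisites
This guide assumes familiarity with one of the following package managers. Install the required packages with whichever one you use:
- npm: npm i axios @langchain/openai @langchain/core
- yarn: yarn add axios @langchain/openai @langchain/core
- pnpm: pnpm add axios @langchain/openai @langchain/core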
import axios from "axios";

// Example image hosted on Wikimedia Commons.
const imageUrl =
  "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg";

// Download the image as raw bytes instead of decoded text.
const axiosRes = await axios.get(imageUrl, { responseType: "arraybuffer" });

// Turn the bytes into a binary string, then base64-encode that string.
const base64 = btoa(
  new Uint8Array(axiosRes.data).reduce(
    (data, byte) => data + String.fromCharCode(byte),
    ""
  )
);
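The encoding step above can be wrapped in a small helper that also attaches the MIME type, so the result is a complete data URL rather than a bare base64 string. This is only a sketch: `toDataUrl` is a hypothetical name, not part of axios or LangChain, and it assumes a runtime where `btoa` is available globally (browsers, Deno, recent Node.js versions).

```typescript
// Hypothetical helper (not part of axios or LangChain): encodes raw bytes as
// base64 and prefixes them with a "data:<mime>;base64," header.
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  // btoa expects a "binary string" in which each character is one byte.
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return "data:" + mimeType + ";base64," + btoa(binary);
}

// Two bytes spelling "Hi", used here instead of a real image download.
const sampleDataUrl = toDataUrl(new Uint8Array([72, 105]), "text/plain");
console.log(sampleDataUrl); // data:text/plain;base64,SGk=
```

With a real image, the bytes would come from `new Uint8Array(axiosRes.data)` and the MIME type would match the file, for example `image/jpeg`.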
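import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI } from "@langchain/openai";

// A model that accepts image input.
const model = new ChatOpenAI({ model: "gpt-4o" });

// The user message is a list of content parts. The {base64} placeholder is a
// template variable that is filled in when the chain is invoked.
const prompt = ChatPromptTemplate.fromMessages([
  ["system", "Describe the image provided"],
  [
    "user",
    [{ type: "image_url", image_url: "data:image/jpeg;base64,{base64}" }],
  ],
]);

// Pipe the formatted prompt into the model and run the chain.
const chain = prompt.pipe(model);
const response = await chain.invoke({ base64 });
console.log(response.content);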
The image depicts a scenic outdoor landscape featuring a wooden boardwalk path extending forward through a large field of green grass and vegetation. On either side of the path, the grass is lush and vibrant, with a variety of bushes and low shrubs visible as well. The sky overhead is expansive and mostly clear, adorned with soft, wispy clouds, illuminated by the light giving a warm and serene ambiance. In the distant background, there are clusters of trees and additional foliage, suggesting a natural and tranquil setting, ideal for a peaceful walk or nature exploration.
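To make the role of the `{base64}` placeholder concrete, the following sketch imitates the substitution step with a plain function. It is an illustration only and not LangChain's implementation: `fillTemplate` is a hypothetical name, and the real template engine handles more cases (escaping, multiple message types) than this does.

```typescript
// Hypothetical helper, NOT LangChain's implementation: replaces each {name}
// placeholder in a template string with the matching entry from `values`.
function fillTemplate(
  template: string,
  values: Record<string, string>
): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, name: string) => {
    if (!(name in values)) {
      throw new Error("Missing value for template variable: " + name);
    }
    return values[name];
  });
}

// Same template string as in the prompt above, with a short stand-in payload
// in place of real base64 image data.
const imageUrlTemplate = "data:image/jpeg;base64,{base64}";
const filledImageUrl = fillTemplate(imageUrlTemplate, { base64: "SGk=" });
console.log(filledImageUrl); // data:image/jpeg;base64,SGk=
```

The value passed to `chain.invoke({ base64 })` plays the role of `values` here: its keys have to match the placeholder names used in the template.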
We can also pass in multiple images.
const promptWithMultipleImages = ChatPromptTemplate.fromMessages([
  ["system", "compare the two pictures provided"],
  [
    "user",
    [
      {
        type: "image_url",
        image_url: "data:image/jpeg;base64,{imageData1}",
      },
      {
        type: "image_url",
        image_url: "data:image/jpeg;base64,{imageData2}",
      },
    ],
  ],
]);
const chainWithMultipleImages = promptWithMultipleImages.pipe(model);
const res = await chainWithMultipleImages.invoke({
  imageData1: base64,
  imageData2: base64,
});
console.log(res.content);
The two images provided are identical. Both show a wooden boardwalk path extending into a grassy field under a blue sky with scattered clouds. The scenery includes green shrubs and trees in the background, with a bright and clear sky above.
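For more than two images, the list of content parts can be generated instead of written out by hand. The sketch below assumes the same content-part shape used in the prompts above; `ImagePart` and `buildImageParts` are hypothetical names introduced for illustration and are not LangChain APIs.

```typescript
// Shape of one image content part, mirroring the objects used above.
type ImagePart = { type: "image_url"; image_url: string };

// Hypothetical helper (not a LangChain API): builds one image part per
// template variable name, leaving a {name} placeholder to be filled later.
function buildImageParts(
  variableNames: string[],
  mimeType: string = "image/jpeg"
): ImagePart[] {
  return variableNames.map(
    (name): ImagePart => ({
      type: "image_url",
      image_url: "data:" + mimeType + ";base64,{" + name + "}",
    })
  );
}

const imageParts = buildImageParts(["imageData1", "imageData2", "imageData3"]);
console.log(imageParts.map((part) => part.image_url));
```

The resulting array could be used as the content of the "user" message in `ChatPromptTemplate.fromMessages`, in which case the object passed to `invoke` would need one base64 string per variable name.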