Image Inputs
Send images to vision models through the Agnic AI Gateway
Send images to multimodal models for analysis, description, OCR, and more. Agnic supports multiple image formats and both URL-based and base64-encoded images.
Overview
Image requests go through the /v1/chat/completions API, with the content of a message given as a multi-part array. The url field of an image_url entry can be either a web URL or a base64-encoded data URL.
Multiple images can be sent in separate content array entries. We recommend sending the text prompt first, then the images for best results.
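As a minimal sketch of the ordering recommended above, the helper below assembles a content array with the text prompt first, followed by one image_url entry per image. The name build_content is hypothetical and not part of the Agnic or OpenAI SDKs; the dictionary shapes follow the request examples in this guide.

```python
def build_content(prompt, image_urls):
    """Return a content array: the text prompt first, then one entry per image.

    build_content is a hypothetical helper used for illustration only.
    """
    content = [{"type": "text", "text": prompt}]
    for url in image_urls:
        content.append({"type": "image_url", "image_url": {"url": url}})
    return content


content = build_content(
    "Compare these two images",
    ["https://example.com/image1.jpg", "https://example.com/image2.jpg"],
)
print([part["type"] for part in content])  # ['text', 'image_url', 'image_url']
```

The returned list can be passed as the "content" value of a user message.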
Using Image URLs
For publicly accessible images, send the URL directly:
Python
from openai import OpenAI

client = OpenAI(
    api_key="agnic_tok_YOUR_TOKEN",
    base_url="https://api.agnic.ai/v1"
)

response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)
JavaScript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'agnic_tok_YOUR_TOKEN',
  baseURL: 'https://api.agnic.ai/v1'
});

const response = await client.chat.completions.create({
  model: 'openai/gpt-4o',
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'text',
          text: "What's in this image?"
        },
        {
          type: 'image_url',
          image_url: {
            url: 'https://example.com/image.jpg'
          }
        }
      ]
    }
  ]
});

console.log(response.choices[0].message.content);
cURL
curl https://api.agnic.ai/v1/chat/completions \
  -H "Authorization: Bearer agnic_tok_YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "What is in this image?"},
          {
            "type": "image_url",
            "image_url": {
              "url": "https://example.com/image.jpg"
            }
          }
        ]
      }
    ]
  }'
Using Base64 Encoded Images
For locally stored images, use base64 encoding:
Python
from openai import OpenAI
import base64

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

client = OpenAI(
    api_key="agnic_tok_YOUR_TOKEN",
    base_url="https://api.agnic.ai/v1"
)

# Read and encode the image
image_path = "path/to/your/image.jpg"
base64_image = encode_image(image_path)
data_url = f"data:image/jpeg;base64,{base64_image}"

response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": data_url
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)
JavaScript
import OpenAI from 'openai';
import fs from 'fs';

const client = new OpenAI({
  apiKey: 'agnic_tok_YOUR_TOKEN',
  baseURL: 'https://api.agnic.ai/v1'
});

// Read and encode the image
const imagePath = 'path/to/your/image.jpg';
const imageBuffer = fs.readFileSync(imagePath);
const base64Image = imageBuffer.toString('base64');
const dataUrl = `data:image/jpeg;base64,${base64Image}`;

const response = await client.chat.completions.create({
  model: 'openai/gpt-4o',
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'text',
          text: "What's in this image?"
        },
        {
          type: 'image_url',
          image_url: {
            url: dataUrl
          }
        }
      ]
    }
  ]
});

console.log(response.choices[0].message.content);
Supported Image Formats
| Format | MIME Type | Extension |
|---|---|---|
| PNG | image/png | .png |
| JPEG | image/jpeg | .jpg, .jpeg |
| WebP | image/webp | .webp |
| GIF | image/gif | .gif |
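The base64 examples hardcode image/jpeg in the data URL. As a sketch of how the MIME type could instead be chosen from the file extension, the helper below uses the mapping from the table above. The names MIME_TYPES and to_data_url are hypothetical, and rejecting unknown extensions with a ValueError is a design choice made here, not documented gateway behavior.

```python
import base64
from pathlib import Path

# Extension-to-MIME mapping copied from the supported formats table.
MIME_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
    ".gif": "image/gif",
}


def to_data_url(image_path):
    """Encode a local image file as a base64 data URL (hypothetical helper)."""
    path = Path(image_path)
    suffix = path.suffix.lower()  # treat ".PNG" and ".png" the same
    if suffix not in MIME_TYPES:
        raise ValueError(f"Unsupported image extension: {suffix!r}")
    encoded = base64.b64encode(path.read_bytes()).decode("utf-8")
    return f"data:{MIME_TYPES[suffix]};base64,{encoded}"
```

The result of to_data_url("path/to/your/image.png") can be used as the url value of an image_url entry.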
Multiple Images
Send multiple images in a single request:
# Assumes `client` is configured as in the earlier examples
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Compare these two images"
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image1.jpg"}
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image2.jpg"}
                }
            ]
        }
    ]
)
Compatible Models
Models with vision capabilities include:
| Provider | Models |
|---|---|
| OpenAI | openai/gpt-4o, openai/gpt-4o-mini, openai/gpt-4-turbo |
| Anthropic | anthropic/claude-3.5-sonnet, anthropic/claude-3-opus, anthropic/claude-3-haiku |
| Google | google/gemini-2.0-flash, google/gemini-1.5-pro, google/gemini-1.5-flash |
Check the model's architecture.input_modalities for "image" support.
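The check described above might look like the following sketch. It assumes model metadata is available as a list of dictionaries, each with an id and a nested architecture.input_modalities list. How that metadata is fetched is not covered here, and the sample records are made up for illustration, so the real response shape may differ.

```python
def supports_images(model):
    """Return True if the model metadata lists "image" as an input modality."""
    architecture = model.get("architecture") or {}
    modalities = architecture.get("input_modalities") or []
    return "image" in modalities


# Hypothetical metadata records for illustration only.
models = [
    {"id": "openai/gpt-4o", "architecture": {"input_modalities": ["text", "image"]}},
    {"id": "example/text-only", "architecture": {"input_modalities": ["text"]}},
    {"id": "example/no-metadata"},
]

vision_models = [m["id"] for m in models if supports_images(m)]
print(vision_models)  # ['openai/gpt-4o']
```

Records with missing metadata are treated as not supporting images, which errs on the side of not sending images to a model that may reject them.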
Best Practices
- Use URLs for public images - Avoids the extra payload size that base64 encoding adds
- Compress large images - Reduce payload size while keeping the detail the model needs
- Send text first - Place your prompt before the images
- Check model limits - Different models have different image count limits
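To make the first point concrete: base64 encodes every 3 bytes as 4 characters, so an encoded image is roughly a third larger than the original file. A small sketch of that arithmetic follows; base64_length is a hypothetical helper name, and the zero-filled buffer merely stands in for real image bytes.

```python
import base64


def base64_length(num_bytes):
    """Length in characters of the padded base64 encoding of num_bytes bytes."""
    return 4 * ((num_bytes + 2) // 3)


raw = bytes(300_000)  # stand-in for a 300,000-byte image file
encoded = base64.b64encode(raw)

print(len(raw), len(encoded))  # 300000 400000
```

The data: prefix of a data URL adds a few more characters on top of this.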