PDF Inputs
Send PDF documents to models through the Agnic AI Gateway
Process PDF documents with compatible models through the Agnic AI Gateway. PDFs can be sent as direct URLs or base64-encoded data URLs.
Overview
PDF processing is available via the /v1/chat/completions API using the file content type. This feature works with models that support file input.
When a model supports file input natively, the PDF is passed directly. Otherwise, the PDF is parsed and the text is passed to the model.
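Both ways of supplying a PDF (a public URL or a base64-encoded data URL) end up as the same file content part. The following is a minimal sketch of a helper that builds that part from either kind of source. The name pdf_content_part is hypothetical and is not part of the OpenAI SDK or the Agnic API; the part's shape is taken from the examples in this guide.

```python
import base64
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse


def pdf_content_part(source: str, filename: Optional[str] = None) -> dict:
    """Build a `file` content part from a PDF URL or a local PDF path.

    Hypothetical helper for illustration only.
    """
    if source.startswith(("http://", "https://")):
        # Public URL: pass it through unchanged.
        file_data = source
        default_name = Path(urlparse(source).path).name
    else:
        # Local file: read the bytes and wrap them in a base64 data URL.
        raw = Path(source).read_bytes()
        encoded = base64.b64encode(raw).decode("ascii")
        file_data = f"data:application/pdf;base64,{encoded}"
        default_name = Path(source).name

    return {
        "type": "file",
        "file": {
            # Fall back to a generic name if none can be derived.
            "filename": filename or default_name or "document.pdf",
            "file_data": file_data,
        },
    }
```

The returned dictionary can be placed in the content list of a user message next to a text part, in the same position as the file entries in the examples in this guide.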
Using PDF URLs
For publicly accessible PDFs, send the URL directly:
Python
from openai import OpenAI

client = OpenAI(
    api_key="agnic_tok_YOUR_TOKEN",
    base_url="https://api.agnic.ai/v1"
)

response = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What are the main points in this document?"
                },
                {
                    "type": "file",
                    "file": {
                        "filename": "document.pdf",
                        "file_data": "https://example.com/document.pdf"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

JavaScript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'agnic_tok_YOUR_TOKEN',
  baseURL: 'https://api.agnic.ai/v1'
});

const response = await client.chat.completions.create({
  model: 'anthropic/claude-3.5-sonnet',
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'text',
          text: 'What are the main points in this document?'
        },
        {
          type: 'file',
          file: {
            filename: 'document.pdf',
            file_data: 'https://example.com/document.pdf'
          }
        }
      ]
    }
  ]
});

console.log(response.choices[0].message.content);

cURL
curl https://api.agnic.ai/v1/chat/completions \
  -H "Authorization: Bearer agnic_tok_YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-3.5-sonnet",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Summarize this document"},
          {
            "type": "file",
            "file": {
              "filename": "document.pdf",
              "file_data": "https://example.com/document.pdf"
            }
          }
        ]
      }
    ]
  }'

Using Base64 Encoded PDFs
For local PDF files:
Python
from openai import OpenAI
import base64


def encode_pdf(pdf_path):
    """Return the contents of a PDF file as a base64 string."""
    with open(pdf_path, "rb") as pdf_file:
        return base64.b64encode(pdf_file.read()).decode('utf-8')


client = OpenAI(
    api_key="agnic_tok_YOUR_TOKEN",
    base_url="https://api.agnic.ai/v1"
)

# Read and encode the PDF
pdf_path = "path/to/your/document.pdf"
base64_pdf = encode_pdf(pdf_path)
data_url = f"data:application/pdf;base64,{base64_pdf}"

response = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What are the key findings in this report?"
                },
                {
                    "type": "file",
                    "file": {
                        "filename": "report.pdf",
                        "file_data": data_url
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

JavaScript
import OpenAI from 'openai';
import fs from 'fs';

const client = new OpenAI({
  apiKey: 'agnic_tok_YOUR_TOKEN',
  baseURL: 'https://api.agnic.ai/v1'
});

// Read and encode the PDF
const pdfPath = 'path/to/your/document.pdf';
const pdfBuffer = fs.readFileSync(pdfPath);
const base64Pdf = pdfBuffer.toString('base64');
const dataUrl = `data:application/pdf;base64,${base64Pdf}`;

const response = await client.chat.completions.create({
  model: 'anthropic/claude-3.5-sonnet',
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'text',
          text: 'What are the key findings in this report?'
        },
        {
          type: 'file',
          file: {
            filename: 'report.pdf',
            file_data: dataUrl
          }
        }
      ]
    }
  ]
});

console.log(response.choices[0].message.content);

Compatible Models
Models with PDF/file processing capabilities:
| Provider | Models | Notes |
|---|---|---|
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus | Native PDF support |
| Google | Gemini 1.5 Pro, Gemini 2.0 Flash | Native file support |
Use Cases
- Document summarization - Get key points from long documents
- Data extraction - Pull specific information from reports
- Q&A over documents - Ask questions about PDF content
- Contract analysis - Review legal documents
- Research papers - Analyze academic content
Best Practices
- Use URLs when possible - More efficient for large files
- Provide context - Tell the model what to look for
- Break up large documents - Split very long PDFs if needed
- Check model limits - Different models have different page limits
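As a rough sketch of the "break up large documents" advice, the helper below computes 1-based, inclusive page ranges that each stay within a page budget. The function name page_ranges and the max_pages parameter are hypothetical, and the budget of 100 in the usage note is an arbitrary illustration; this guide does not state any model's actual page limit. Cutting the PDF into those ranges would need a separate PDF library, which is out of scope here.

```python
from typing import List, Tuple


def page_ranges(total_pages: int, max_pages: int) -> List[Tuple[int, int]]:
    """Split pages 1..total_pages into consecutive (first, last) ranges.

    Each range covers at most max_pages pages. Both bounds are inclusive.
    Hypothetical helper for illustration only.
    """
    if total_pages < 0:
        raise ValueError("total_pages must not be negative")
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")

    return [
        (first, min(first + max_pages - 1, total_pages))
        for first in range(1, total_pages + 1, max_pages)
    ]
```

For example, page_ranges(250, 100) yields three ranges: pages 1 to 100, 101 to 200, and 201 to 250. Each range can then be sent as its own request, with a prompt that tells the model which part of the document it is looking at.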
For scanned PDFs or documents with images, use models with strong OCR capabilities like Claude 3 or Gemini 1.5 Pro.
Troubleshooting
PDF not processing?
- Verify the model supports file input
- Check that the PDF is not corrupted
- Ensure the file size is within limits
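The last two checks can be approximated locally before sending a request. The sketch below looks for the %PDF- header and the %%EOF marker that a well-formed PDF file carries, and estimates how large the file becomes as a base64 data URL (base64 turns every 3 bytes into 4 characters). The names base64_size and preflight_pdf and the max_request_bytes parameter are hypothetical; this guide does not state the gateway's actual size limit, so the caller has to supply one. This is a quick sanity check, not full PDF validation.

```python
from pathlib import Path
from typing import List

# Prefix used for PDF data URLs in this guide's base64 examples.
DATA_URL_PREFIX = "data:application/pdf;base64,"


def base64_size(num_bytes: int) -> int:
    """Length of the padded base64 encoding of num_bytes bytes."""
    return 4 * ((num_bytes + 2) // 3)


def preflight_pdf(path: str, max_request_bytes: int) -> List[str]:
    """Return a list of likely problems; an empty list means none found.

    max_request_bytes is an assumed, caller-supplied limit.
    """
    problems = []
    raw = Path(path).read_bytes()

    # A PDF file starts with a "%PDF-" version header.
    if not raw.startswith(b"%PDF-"):
        problems.append("missing %PDF- header")

    # A complete PDF file has a "%%EOF" marker near its end.
    if b"%%EOF" not in raw[-1024:]:
        problems.append("missing %%EOF marker (file may be truncated)")

    # Estimate the size of the data URL that would be sent.
    encoded_size = len(DATA_URL_PREFIX) + base64_size(len(raw))
    if encoded_size > max_request_bytes:
        problems.append(
            f"encoded size {encoded_size} exceeds limit {max_request_bytes}"
        )

    return problems
```

A file that fails the header or marker check is likely corrupted or not a PDF at all; a file that only fails the size check may be better sent as a URL or split into smaller parts.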
Poor extraction results?
- Try a model with better OCR capabilities
- Ensure the PDF has selectable text (not just images)