PDF Inputs

Process PDF documents with compatible models through the Agnic AI Gateway. PDFs can be sent as direct URLs or base64-encoded data URLs.

Overview

PDF processing is available via the /v1/chat/completions API using the file content type. This feature works with models that support file input.

When the selected model supports file input natively, the PDF is passed to it directly. Otherwise, the PDF is parsed first and only the extracted text is passed to the model.

Using PDF URLs

For publicly accessible PDFs, send the URL directly:

Python

from openai import OpenAI

client = OpenAI(
    api_key="agnic_tok_YOUR_TOKEN",
    base_url="https://api.agnic.ai/v1"
)

response = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What are the main points in this document?"
                },
                {
                    "type": "file",
                    "file": {
                        "filename": "document.pdf",
                        "file_data": "https://example.com/document.pdf"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

JavaScript

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'agnic_tok_YOUR_TOKEN',
  baseURL: 'https://api.agnic.ai/v1'
});

const response = await client.chat.completions.create({
  model: 'anthropic/claude-3.5-sonnet',
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'text',
          text: 'What are the main points in this document?'
        },
        {
          type: 'file',
          file: {
            filename: 'document.pdf',
            file_data: 'https://example.com/document.pdf'
          }
        }
      ]
    }
  ]
});

console.log(response.choices[0].message.content);

cURL

curl https://api.agnic.ai/v1/chat/completions \
  -H "Authorization: Bearer agnic_tok_YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-3.5-sonnet",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Summarize this document"},
          {
            "type": "file",
            "file": {
              "filename": "document.pdf",
              "file_data": "https://example.com/document.pdf"
            }
          }
        ]
      }
    ]
  }'

Using Base64 Encoded PDFs

For local PDF files, read the file, base64-encode it, and send it as a data URL in file_data:

Python

from openai import OpenAI
import base64

def encode_pdf(pdf_path):
    with open(pdf_path, "rb") as pdf_file:
        return base64.b64encode(pdf_file.read()).decode('utf-8')

client = OpenAI(
    api_key="agnic_tok_YOUR_TOKEN",
    base_url="https://api.agnic.ai/v1"
)

# Read and encode the PDF
pdf_path = "path/to/your/document.pdf"
base64_pdf = encode_pdf(pdf_path)
data_url = f"data:application/pdf;base64,{base64_pdf}"

response = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What are the key findings in this report?"
                },
                {
                    "type": "file",
                    "file": {
                        "filename": "report.pdf",
                        "file_data": data_url
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

JavaScript

import OpenAI from 'openai';
import fs from 'fs';

const client = new OpenAI({
  apiKey: 'agnic_tok_YOUR_TOKEN',
  baseURL: 'https://api.agnic.ai/v1'
});

// Read and encode the PDF
const pdfPath = 'path/to/your/document.pdf';
const pdfBuffer = fs.readFileSync(pdfPath);
const base64Pdf = pdfBuffer.toString('base64');
const dataUrl = `data:application/pdf;base64,${base64Pdf}`;

const response = await client.chat.completions.create({
  model: 'anthropic/claude-3.5-sonnet',
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'text',
          text: 'What are the key findings in this report?'
        },
        {
          type: 'file',
          file: {
            filename: 'report.pdf',
            file_data: dataUrl
          }
        }
      ]
    }
  ]
});

console.log(response.choices[0].message.content);
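
The URL and base64 variants differ only in the value of file_data, so the choice can be wrapped in a single function. The following is a minimal sketch, not part of the OpenAI SDK or the Agnic API: build_file_part is a hypothetical helper name, and the fallback filename document.pdf for URLs is an assumption made for illustration.

```python
import base64
from pathlib import Path
from typing import Optional


def build_file_part(source: str, filename: Optional[str] = None) -> dict:
    """Return a `file` content part for a PDF URL or a local PDF path.

    Hypothetical helper for illustration; not provided by any SDK.
    """
    if source.startswith(("http://", "https://")):
        # Remote PDF: pass the URL through unchanged.
        file_data = source
        default_name = "document.pdf"  # assumed fallback name
    else:
        # Local PDF: read the bytes and wrap them in a base64 data URL.
        path = Path(source)
        encoded = base64.b64encode(path.read_bytes()).decode("utf-8")
        file_data = f"data:application/pdf;base64,{encoded}"
        default_name = path.name
    return {
        "type": "file",
        "file": {"filename": filename or default_name, "file_data": file_data},
    }


# Build the same message shape used in the examples above.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What are the main points in this document?"},
            build_file_part("https://example.com/document.pdf"),
        ],
    }
]
print(messages[0]["content"][1]["file"]["file_data"])
```

The resulting messages list can be passed to client.chat.completions.create exactly as in the earlier examples; passing a local path instead of a URL produces the data URL form.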

Compatible Models

Models with PDF/file processing capabilities:

Provider	Models	Notes
Anthropic	Claude 3.5 Sonnet, Claude 3 Opus	Native PDF support
Google	Gemini 1.5 Pro, Gemini 2.0 Flash	Native file support

Use Cases

Document summarization - Get key points from long documents
Data extraction - Pull specific information from reports
Q&A over documents - Ask questions about PDF content
Contract analysis - Review legal documents
Research papers - Analyze academic content

Best Practices

Use URLs when possible - Avoids base64 encoding, which makes the request body about a third larger than the file itself
Provide context - Tell the model what to look for
Break up large documents - Split very long PDFs if needed
Check model limits - Different models have different page limits
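
One way to act on "Break up large documents" is to plan the page ranges before splitting. The sketch below only computes the ranges; the actual splitting needs a PDF library of your choice. Both plan_page_ranges and the 100-page limit are hypothetical, since the real limit depends on the model.

```python
from typing import List, Tuple


def plan_page_ranges(total_pages: int, max_pages: int) -> List[Tuple[int, int]]:
    """Split pages 1..total_pages into consecutive (first, last) ranges.

    Each range covers at most max_pages pages; page numbers are 1-based
    and both ends are inclusive.
    """
    if total_pages < 0:
        raise ValueError("total_pages must not be negative")
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    return [
        (start, min(start + max_pages - 1, total_pages))
        for start in range(1, total_pages + 1, max_pages)
    ]


# 100 is an assumed per-request page limit, used only for illustration.
print(plan_page_ranges(250, 100))  # [(1, 100), (101, 200), (201, 250)]
```

Each range can then be written to its own PDF and sent as a separate request, with the prompt stating which part of the document the model is looking at.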

For scanned PDFs or documents with images, use models with strong OCR capabilities like Claude 3 or Gemini 1.5 Pro.

Troubleshooting

PDF not processing?

Verify the model supports file input
Check that the PDF is not corrupted
Ensure the file size is within limits
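
The corruption and size checks above can be run locally before sending a request. This is a rough sketch under stated assumptions: preflight_pdf, base64_size, and the 20 MB MAX_REQUEST_BYTES value are hypothetical and are not documented gateway limits. It looks for the %PDF- header and %%EOF trailer that well-formed PDFs normally carry, and estimates the size of the file after base64 encoding.

```python
from pathlib import Path
from typing import List

# Assumed limit for illustration only; check the gateway and model
# documentation for the real value.
MAX_REQUEST_BYTES = 20 * 1024 * 1024


def base64_size(raw_size: int) -> int:
    """Length in bytes of the padded base64 encoding of raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


def preflight_pdf(path: str, max_bytes: int = MAX_REQUEST_BYTES) -> List[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    data = Path(path).read_bytes()

    # A well-formed PDF normally starts with the "%PDF-" version header.
    if not data.startswith(b"%PDF-"):
        problems.append("file does not start with the %PDF- header")

    # The end-of-file marker should appear near the end; a missing marker
    # often indicates a truncated download.
    if b"%%EOF" not in data[-1024:]:
        problems.append("no %%EOF marker near the end of the file")

    # Base64 turns every 3 bytes into 4 characters, so compare the encoded size.
    encoded = base64_size(len(data))
    if encoded > max_bytes:
        problems.append(f"encoded size {encoded} bytes exceeds limit {max_bytes}")

    return problems


# Example use (the path is a placeholder):
#   for problem in preflight_pdf("path/to/your/document.pdf"):
#       print("warning:", problem)
```

These checks are heuristics: some PDF readers tolerate files that fail them, and passing them does not guarantee that a model can process the document.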

Poor extraction results?

Try a model with better OCR capabilities
Ensure the PDF has selectable text (not just images)
