# Models

One API for hundreds of models

Explore and browse 340+ models and providers through our unified API. Access models from OpenAI, Anthropic, Google, Meta, Mistral, DeepSeek, and dozens more providers, all with a single API key.
View the complete list of all models with live pricing at AI Gateway Pricing.
## Models API

The Models API makes key information about every available LLM freely accessible: query model metadata, pricing, capabilities, and supported parameters programmatically.
### Endpoint
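The model listing is served from a single GET endpoint. The host below is the gateway's common base URL; substitute your deployment's base URL if it differs:

```
GET https://ai-gateway.vercel.sh/v1/models
```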
### Example Request
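A minimal request sketch in Python using only the standard library. The base URL and the bearer-token authorization scheme are assumptions; adjust them to match your gateway configuration:

```python
import urllib.request

# Assumed gateway host; substitute your deployment's base URL.
BASE_URL = "https://ai-gateway.vercel.sh/v1"

def build_models_request(api_key: str) -> urllib.request.Request:
    """Build an authenticated GET request for the /v1/models listing."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )

req = build_models_request("YOUR_API_KEY")
print(req.full_url)      # https://ai-gateway.vercel.sh/v1/models
print(req.get_method())  # GET
# Send it with: urllib.request.urlopen(req)
```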
## API Response Schema

The Models API returns a standardized JSON response format that provides comprehensive metadata for each available model. This schema is designed for reliable integration with production applications.
### Root Response Object
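The root object follows the familiar OpenAI-style list convention, sketched below with the individual model entries elided:

```json
{
  "object": "list",
  "data": [
    { "...": "one Model Object per available model" }
  ]
}
```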
## Model Object Schema

Each model in the `data` array contains the following standardized fields:

| Field | Type | Description |
|---|---|---|
| `id` | string | Unique model identifier used in API requests (e.g., `"openai/gpt-4o"`) |
| `name` | string | Human-readable display name for the model |
| `created` | number | Unix timestamp of when the model was added |
| `description` | string | Detailed description of the model's capabilities |
| `context_length` | number | Maximum context window size in tokens |
| `architecture` | Architecture | Object describing the model's technical capabilities |
| `pricing` | Pricing | Price structure for using this model |
| `top_provider` | TopProvider | Configuration details for the primary provider |
| `per_request_limits` | object \| null | Rate limiting information (`null` if no limits) |
| `supported_parameters` | string[] | Array of supported API parameters for this model |
### Example Model Object
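An illustrative model object following the schema above. The field values here are made up for demonstration and do not reflect a real listing:

```json
{
  "id": "openai/gpt-4o",
  "name": "GPT-4o",
  "created": 1715558400,
  "description": "Multimodal flagship model with text and image input.",
  "context_length": 128000,
  "architecture": {
    "input_modalities": ["text", "image"],
    "output_modalities": ["text"],
    "tokenizer": "o200k_base",
    "instruct_type": "chat"
  },
  "pricing": {
    "prompt": "0.0000025",
    "completion": "0.00001",
    "request": "0",
    "image": "0.003",
    "input_cache_read": "0.00000125",
    "input_cache_write": "0"
  },
  "top_provider": {
    "context_length": 128000,
    "max_completion_tokens": 16384,
    "is_moderated": true
  },
  "per_request_limits": null,
  "supported_parameters": ["tools", "tool_choice", "max_tokens", "temperature", "top_p", "response_format", "stop", "seed"]
}
```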
## Architecture Object

Describes the model's technical capabilities and supported modalities.

| Field | Type | Description |
|---|---|---|
| `input_modalities` | string[] | Supported input types: `"text"`, `"image"`, `"file"`, `"audio"` |
| `output_modalities` | string[] | Supported output types: `"text"`, `"image"`, `"audio"` |
| `tokenizer` | string | Tokenization method used by the model |
| `instruct_type` | string \| null | Instruction format type (`"chat"`, `"completion"`, or `null`) |
### Input Modalities

| Modality | Description |
|---|---|
| `text` | Standard text input (all models) |
| `image` | Image/vision input (e.g., GPT-4o, Claude 3, Gemini) |
| `file` | Document/file input (some models) |
| `audio` | Audio input (e.g., GPT-4o Audio, Gemini) |
## Pricing Object

All pricing values are strings in USD per token. A value of `"0"` indicates the feature is free.

For current pricing on all models, visit AI Gateway Pricing.

| Field | Type | Description |
|---|---|---|
| `prompt` | string | Cost per input token |
| `completion` | string | Cost per output token |
| `request` | string | Fixed cost per API request |
| `image` | string | Cost per image input |
| `input_cache_read` | string | Cost per input token read from cache |
| `input_cache_write` | string | Cost per input token written to cache |
### Calculating Costs
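Because prices are strings denominated in USD per token, a cost estimate is a straightforward multiply-and-sum. A minimal sketch; the price figures below are illustrative, not a real model's pricing:

```python
# Prices are USD-per-token strings, as returned by the Models API.
# These figures are illustrative only.
pricing = {"prompt": "0.0000025", "completion": "0.00001"}

def estimate_cost(pricing: dict, input_tokens: int, output_tokens: int) -> float:
    """Estimate a request's cost in USD from per-token prices."""
    return (float(pricing["prompt"]) * input_tokens
            + float(pricing["completion"]) * output_tokens)

print(f"${estimate_cost(pricing, 1000, 500):.6f}")  # $0.007500
```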
## Top Provider Object

Configuration details for the primary provider serving this model.

| Field | Type | Description |
|---|---|---|
| `context_length` | number | Provider-specific context limit |
| `max_completion_tokens` | number | Maximum tokens in a response |
| `is_moderated` | boolean | Whether content moderation is applied |
## Supported Parameters

The `supported_parameters` array indicates which OpenAI-compatible parameters work with each model:

| Parameter | Description |
|---|---|
| `tools` | Function calling capabilities |
| `tool_choice` | Tool selection control |
| `max_tokens` | Response length limiting |
| `temperature` | Randomness control (0-2) |
| `top_p` | Nucleus sampling threshold |
| `response_format` | Output format specification (e.g., JSON mode) |
| `stop` | Custom stop sequences |
| `frequency_penalty` | Repetition reduction (-2 to 2) |
| `presence_penalty` | Topic diversity (-2 to 2) |
| `seed` | Deterministic outputs |
| `structured_outputs` | JSON schema enforcement |
### Checking Parameter Support
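Checking support is a simple membership test on the `supported_parameters` array. A sketch, using a hypothetical model entry trimmed to the relevant fields:

```python
def supports(model: dict, parameter: str) -> bool:
    """Return True if the model advertises support for an API parameter."""
    return parameter in model.get("supported_parameters", [])

# Hypothetical model entry trimmed to the relevant fields.
model = {
    "id": "openai/gpt-4o",
    "supported_parameters": ["tools", "tool_choice", "max_tokens", "temperature"],
}
print(supports(model, "tools"))              # True
print(supports(model, "frequency_penalty"))  # False
```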
## Token Counts and Pricing

Each model uses its own tokenizer, so identical input can produce different token counts: GPT, Claude, and Llama models all split text into multi-character chunks, but with different vocabularies and segmentation rules. As a result, the same prompt can cost different amounts on different models.

Costs are calculated according to each model's tokenizer. Use the `usage` field in API responses to get accurate token counts:
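For example, an OpenAI-compatible completion response carries a standard `usage` object (the token counts below are illustrative):

```python
# Shape of the `usage` field in an OpenAI-compatible chat response
# (token counts here are illustrative).
response = {
    "usage": {
        "prompt_tokens": 1234,
        "completion_tokens": 567,
        "total_tokens": 1801,
    }
}

usage = response["usage"]
print(usage["prompt_tokens"])      # tokens billed at the model's prompt rate
print(usage["completion_tokens"])  # tokens billed at the completion rate
```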
## Filtering Models
Query specific models or filter by capabilities:
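The listing can be filtered client-side on any schema field. A minimal sketch (the `filter_models` helper and the sample entries are hypothetical, built on the Model Object Schema above):

```python
from __future__ import annotations

def filter_models(models: list[dict], *, input_modality: str | None = None,
                  parameter: str | None = None) -> list[dict]:
    """Filter a /v1/models listing by input modality and/or supported parameter."""
    out = []
    for m in models:
        if input_modality and input_modality not in m["architecture"]["input_modalities"]:
            continue
        if parameter and parameter not in m["supported_parameters"]:
            continue
        out.append(m)
    return out

# Hypothetical entries trimmed to the fields used for filtering.
models = [
    {"id": "openai/gpt-4o",
     "architecture": {"input_modalities": ["text", "image"]},
     "supported_parameters": ["tools", "temperature"]},
    {"id": "example/text-only-model",
     "architecture": {"input_modalities": ["text"]},
     "supported_parameters": ["temperature"]},
]

vision_models = filter_models(models, input_modality="image")
print([m["id"] for m in vision_models])  # ['openai/gpt-4o']
```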
## Model Updates

Models and pricing are updated regularly. For the latest information:

- Browse models visually: AI Gateway Pricing
- Query programmatically: use the `/v1/models` endpoint