Service Tiers
Control cost and latency tradeoffs with service tier selection
The service_tier parameter lets you control cost and latency tradeoffs when sending requests through OpenRouter. You can pass it in your request to select a specific processing tier, and the response will indicate which tier was actually used.
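As a concrete illustration, the sketch below builds a Chat Completions request body that selects a tier and shows how it would be sent. The model slug and API-key placeholder are assumptions for the example, not part of this documentation:

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt: str, tier: str = "auto") -> dict:
    """Build a request body that asks for a specific processing tier."""
    return {
        "model": "openai/gpt-4o",  # assumed model slug for illustration
        "messages": [{"role": "user", "content": prompt}],
        "service_tier": tier,      # auto | default | flex | priority
    }

def send(body: dict, api_key: str) -> dict:
    """POST the request to OpenRouter and return the parsed JSON response."""
    req = urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

body = build_request("Hello", tier="flex")
print(body["service_tier"])
```

Note that the tier you request is not guaranteed; the response's `service_tier` field reports the tier that actually served the request.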
Supported Providers
OpenAI
Accepted request values: `auto`, `default`, `flex`, `priority`
Learn more in OpenAI’s Chat Completions and Responses API documentation. See OpenAI’s pricing page for details on cost differences between tiers.
API Response Differences
The API response includes a service_tier field that indicates which capacity tier was actually used to serve your request. The placement of this field varies by API format:
- Chat Completions API (`/api/v1/chat/completions`): `service_tier` is returned at the top level of the response object, matching OpenAI's native format.
- Responses API (`/api/v1/responses`): `service_tier` is returned at the top level of the response object, matching OpenAI's native format.
- Messages API (`/api/v1/messages`): `service_tier` is returned inside the `usage` object, matching Anthropic's native format.
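A small sketch of reading the field from each format. The sample response dicts are illustrative stand-ins, not captured API output:

```python
def used_tier(response: dict, api_format: str):
    """Return the tier that actually served the request.

    The field's location depends on the API format: the Messages API
    nests it in `usage` (Anthropic-style), while the Chat Completions
    and Responses APIs put it at the top level (OpenAI-style).
    """
    if api_format == "messages":
        return response.get("usage", {}).get("service_tier")
    return response.get("service_tier")

# Illustrative response shapes (assumed, trimmed to the relevant fields)
chat_resp = {"id": "gen-123", "service_tier": "default", "usage": {}}
msg_resp = {"id": "msg-456", "usage": {"service_tier": "priority"}}

print(used_tier(chat_resp, "chat"))      # top-level field
print(used_tier(msg_resp, "messages"))   # nested in usage
```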