OpenAI Vision Pricing: High vs Low Detail Explained

When sending images to the OpenAI API (GPT-4o), developers must specify the detail parameter. This single parameter drastically alters your API bill.

Low Detail Mode

When you set detail: "low", OpenAI disables the tile-based scaling. The model receives a low-resolution 512x512 version of the image.

Cost: A flat 85 tokens per image.
Use Case: Image categorization, detecting broad objects, basic OCR on large text.

High Detail Mode

When you set detail: "high" (or leave it as auto), the tile-based calculation kicks in.

The image is scaled to fit within a 2048 x 2048 square.
The shortest side is scaled to 768px.
It is divided into 512px tiles. Each tile costs 170 tokens, plus the 85 token base fee. A single high-res image can easily consume over 1,000 tokens.

Optimization Strategy

Default to low detail for all initial analysis. Only trigger high detail processing if the initial pass fails to find the required information. You can simulate these exact costs using the Multimodal Calculator.