I've benchmarked 25 image generation models from 6 providers to compare cost and latency using the same prompt:
A photorealistic image of a golden retriever puppy sitting in a sunflower field at golden hour, with a soft bokeh background and warm light. Aspect ratio: 1:1, size: 1024x1024.
The benchmark includes both image-only and image-capable multimodal models available on Vercel AI Gateway.
If you don't have much time to analyze the full report:
- go straight to the winners
- compare images
- check out the cost chart
- or the latency chart
Full Results
April 20, 2026: added prodia/flux-fast-schnell.
April 23, 2026: added bfl/flux-2-klein-4b, bfl/flux-2-klein-9b, openai/gpt-image-2.
May 16, 2026: removed deprecated xai/grok-imagine-image-pro and added recraft/recraft-v4.1 subset.
Total spent: $1.8116689
Latency here is wall-clock time as measured by the benchmark script, so it includes network round trips and can vary between runs. Cost, by contrast, is the value returned by the gateway, so it should be accurate.
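As a rough illustration of what "wall time measured by the script" means, here is a minimal sketch of a timing wrapper. The `timed` helper is a hypothetical name, not taken from the benchmark repository, and the actual script may measure differently.

```typescript
// Hypothetical helper: runs any async task and reports elapsed wall-clock seconds.
// Date.now() is used for portability; performance.now() would give a monotonic clock.
async function timed<T>(
  task: () => Promise<T>,
): Promise<{ result: T; seconds: number }> {
  const start = Date.now();
  const result = await task();
  return { result, seconds: (Date.now() - start) / 1000 };
}

// Usage sketch (generateOnce is a placeholder for whatever call produces the image):
// const { result, seconds } = await timed(() => generateOnce(modelId, prompt));
```

Because the clock starts and stops on the client, the number covers the request, any queueing, generation, and the response download together.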
Cost
Latency
Model Quirks
This benchmark includes provider-specific adaptations, because several models need custom params to work correctly:
- Black Forest Labs Flux models need explicit pixel `width` and `height` instead of relying on `aspectRatio`
- Recraft and OpenAI image models require `size` rather than `aspectRatio`
- `recraft/recraft-v4-pro` needs `2048x2048` for square output, not `1024x1024`
- xAI Grok models support `aspectRatio` but not `size`
- Gemini image responses can contain duplicate files in `result.files`, so the benchmark deduplicates them before saving
- `xai/grok-imagine-image-pro` failed when run last in a long benchmark sequence, so isolated runs may be more reliable for it
- `prodia/flux-fast-schnell`, previously the cheapest model, failed to generate an image this time (reported to Vercel).
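The adaptations above could be expressed roughly like this. It is a sketch under stated assumptions, not the benchmark's actual code: `squareParamsFor`, `dedupeFiles`, and `FileLike` are hypothetical names, the provider is inferred from the model ID prefix, and how each parameter is attached to a real request is deliberately left out.

```typescript
// Parameter shapes named in the list above.
type SizeParams =
  | { width: number; height: number }
  | { size: string }
  | { aspectRatio: string };

// Hypothetical helper: picks square-output parameters from the model ID.
function squareParamsFor(modelId: string): SizeParams {
  const provider = modelId.split("/")[0];
  if (provider === "bfl") return { width: 1024, height: 1024 };
  if (modelId === "recraft/recraft-v4-pro") return { size: "2048x2048" };
  if (provider === "recraft" || provider === "openai") {
    return { size: "1024x1024" };
  }
  // xAI accepts aspectRatio only. Using it as the default for every other
  // provider is an assumption, not something the benchmark states.
  return { aspectRatio: "1:1" };
}

// Minimal stand-in for entries of result.files (assumed to carry base64 data).
interface FileLike {
  base64: string;
  mediaType: string;
}

// Keeps the first occurrence of each distinct payload.
function dedupeFiles<T extends FileLike>(files: T[]): T[] {
  const seen = new Set<string>();
  return files.filter((file) => {
    if (seen.has(file.base64)) return false;
    seen.add(file.base64);
    return true;
  });
}
```

Keeping the quirks in one function means a newly added model only needs a new branch, and the deduplication step stays independent of which provider produced the files.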
Winners
If you only optimize for cost, the three cheapest models in this benchmark are:
- `bfl/flux-2-klein-4b` at $0.014
- `bfl/flux-2-klein-9b` at $0.015
- `google/imagen-4.0-fast-generate-001` and `xai/grok-imagine-image` tied at $0.02
If you optimize for latency, the three fastest models are:
- `bfl/flux-pro-1.1` at 3.6s
- `xai/grok-imagine-image` at 4.1s
- `bfl/flux-2-klein-9b` at 6.2s
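Third place in the cost ranking is shared by two models, so a plain "sort and take three" would drop one of them. A minimal sketch of a ranking that keeps ties together follows; `topPlaces` and `Entry` are hypothetical names, and the costs are the four quoted in the cost ranking.

```typescript
interface Entry {
  model: string;
  value: number; // cost in USD or latency in seconds, lower is better
}

// Groups entries into places by ascending value; equal values share a place.
function topPlaces(entries: Entry[], places: number): string[][] {
  const sorted = [...entries].sort((a, b) => a.value - b.value);
  const result: string[][] = [];
  let current: string[] = [];
  let lastValue = NaN; // NaN never equals a number, so the first entry opens a place
  for (const { model, value } of sorted) {
    if (value !== lastValue) {
      if (result.length === places) break;
      current = [model];
      result.push(current);
      lastValue = value;
    } else {
      current.push(model);
    }
  }
  return result;
}

const costs: Entry[] = [
  { model: "bfl/flux-2-klein-4b", value: 0.014 },
  { model: "bfl/flux-2-klein-9b", value: 0.015 },
  { model: "google/imagen-4.0-fast-generate-001", value: 0.02 },
  { model: "xai/grok-imagine-image", value: 0.02 },
];

const cheapestPlaces = topPlaces(costs, 3);
```

The same function works for latency by passing seconds as `value`.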
I personally think it all depends on your task and prompt. For my particular mass-scale use case, I'm optimizing for both cost and speed, so I choose among the winners above.
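One way to read "optimizing for both" is to look for models that appear in both winner lists. The sketch below does that with the model IDs quoted above; `inBoth` is a hypothetical helper, and this is only one possible selection rule, not necessarily the one I apply in practice.

```typescript
// Model IDs copied from the two winner lists above.
const cheapest = [
  "bfl/flux-2-klein-4b",
  "bfl/flux-2-klein-9b",
  "google/imagen-4.0-fast-generate-001",
  "xai/grok-imagine-image",
];
const fastest = [
  "bfl/flux-pro-1.1",
  "xai/grok-imagine-image",
  "bfl/flux-2-klein-9b",
];

// Hypothetical helper: returns the items of `a` that also occur in `b`,
// keeping the order of `a`.
function inBoth(a: string[], b: string[]): string[] {
  const other = new Set(b);
  return a.filter((item) => other.has(item));
}

const candidates = inBoth(cheapest, fastest);
console.log(candidates);
// prints [ 'bfl/flux-2-klein-9b', 'xai/grok-imagine-image' ]
```

With this run's numbers, that leaves `bfl/flux-2-klein-9b` ($0.015, 6.2s) and `xai/grok-imagine-image` ($0.02, 4.1s): the first is cheaper, the second faster.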
DIY
If you want to run the benchmark yourself, add more models, or adjust the prompt - the script is easy to configure and run: kometolabs/ai-image-generation-cost-analysis
Support My Work
Preparing such research sometimes takes 2-5 full benchmark runs, which adds up. If it saved you time, consider sponsoring me on GitHub or Buy Me a Coffee.

