AI Model Performance Results
See how top AI models perform across creative and technical criteria, as evaluated by expert judges.
| AI Model | Sets | Evaluations | Overall |
|---|---|---|---|
| Flux 1.1 Pro Ultra | 0 | 0 | 0.0 |
| Imagen | 0 | 0 | 0.0 |
| Recraft v3 | 0 | 0 | 0.0 |
| Stable Diffusion 3.5 | 32 | 0 | 0.0 |
| DALL-E 3 | 0 | 0 | 0.0 |
| Reve | 64 | 0 | 0.0 |
| Leonardo Phoenix | 0 | 0 | 0.0 |
| GPT-Image | 0 | 0 | 0.0 |
| Midjourney | 0 | 0 | 0.0 |
| Ideogram 3.0 | 22 | 1 | 3.0 |
About the Evaluation Process
Qualified judges scored each set of AI-generated images on the following criteria, each on a scale of 1 to 5:
- Prompt Adherence: How well the images follow the given prompt
- Technical Quality: Overall technical execution
- Artistic Merit: Aesthetic value and artistic qualities
- Creativity: Originality and creative interpretation
- Consistency: Uniformity across the set of 4 images
- Detail Richness: Level of detail in the generated images
- Style Accuracy: Appropriateness of style for the genre
- Overall Score: General impression and quality
Each image set was evaluated by multiple judges to ensure fair assessment.
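
The page does not say how judges' scores are combined into the Evaluations and Overall columns. As a minimal sketch, assuming Overall is the mean of the judges' Overall Score values for a model, rounded to one decimal place, the aggregation could look like the following. The field names, the `summarize` function, and the `model_a` label are hypothetical and do not come from the site.

```python
from collections import defaultdict
from statistics import mean

# Criterion keys mirror the list above; the exact key names are assumptions.
CRITERIA = (
    "prompt_adherence",
    "technical_quality",
    "artistic_merit",
    "creativity",
    "consistency",
    "detail_richness",
    "style_accuracy",
    "overall_score",
)


def validate(evaluation):
    """Raise ValueError if any criterion score falls outside the 1-5 scale."""
    for name in CRITERIA:
        score = evaluation[name]
        if not 1 <= score <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {score}")


def summarize(evaluations):
    """Group judge evaluations by model and average each criterion.

    Each evaluation is a dict with a 'model' key plus one score per criterion.
    Assumption: 'overall' is the mean of the judges' overall_score values.
    """
    by_model = defaultdict(list)
    for evaluation in evaluations:
        validate(evaluation)
        by_model[evaluation["model"]].append(evaluation)

    summary = {}
    for model, evals in by_model.items():
        per_criterion = {
            name: round(mean(e[name] for e in evals), 1) for name in CRITERIA
        }
        summary[model] = {
            "evaluations": len(evals),
            "overall": per_criterion["overall_score"],
            "criteria": per_criterion,
        }
    return summary


# Illustrative input only: two hypothetical judges scoring one hypothetical model.
sample = [
    dict(dict.fromkeys(CRITERIA, 3), model="model_a"),
    dict(dict.fromkeys(CRITERIA, 4), model="model_a"),
]
print(summarize(sample)["model_a"]["overall"])
```

Keeping the per-criterion means alongside the overall value makes it possible to see which criteria drive a model's score, at the cost of a slightly larger summary structure.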