
Ranking

UI Models Ranking by Sho

My personal ranking of models that make good UI. I'm not only measuring coding ability; the harness, visual judgment, and how well each model avoids AI Slop matter too. By AI Slop I mean AI-generated interfaces that look polished but fail in intent, consistency, or real use.

Living history

UI models, month by month

May serves as the baseline and June shows how the models have evolved. The idea is to update the ranking every month to see which models really improve at interface work, which harness boosts each model, and which keep generating too much AI Slop.

History of the personal ranking of UI models

Historical chart from May to June 2026. July and August 2026 appear as a visual expectation: every model keeps its June position, with no changes.

May 2026:

  1. Opus 4.6
  2. GPT 5.5
  3. Gemini 3 Pro
  4. Gemini 3.5
  5. Sonnet 4.5
  6. Composer 2

June 2026:

  1. GPT 5.5
  2. Composer 2.5
  3. Opus 4.7
  4. Sonnet 4.6
  5. Gemini 3 Pro
  6. Gemini 3.5

Evolution equivalences: Opus 4.6 becomes Opus 4.7, Composer 2 becomes Composer 2.5, Sonnet 4.5 becomes Sonnet 4.6, and GPT 5.5, Gemini 3 Pro, and Gemini 3.5 stay as comparable lines across months.

Position 1 is my main recommendation for achieving good UI with a model. May is the previous reference, and June shows how versions and positions evolved. Opus, Composer, and Sonnet change versions, while GPT 5.5 and the two Gemini models are compared directly with themselves. Dotted lines are a visual expectation through July, with no position changes.
Harness
The environment where the model runs changes the result: Figma, Codex, Cursor, Stitch, or AI Studio.
AI Slop
How much it falls into generic, over-decorated, or intentionless visual patterns.
UI Judgment
Ability to decide hierarchy, composition, density, clarity, and finish.
Constraints
How much it needs tight instructions to reach a usable interface.
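
To make the movement between the two months easier to read, here is a minimal sketch that computes how many positions each model line gained or lost. The rankings and version equivalences are copied from the chart description above; the names `MAY`, `JUNE`, `SUCCESSOR`, and `rank_changes` are only illustrative and not part of any tool I use.

```python
# Rankings as listed in the chart description; position 1 is the top recommendation.
MAY = ["Opus 4.6", "GPT 5.5", "Gemini 3 Pro", "Gemini 3.5", "Sonnet 4.5", "Composer 2"]
JUNE = ["GPT 5.5", "Composer 2.5", "Opus 4.7", "Sonnet 4.6", "Gemini 3 Pro", "Gemini 3.5"]

# Version changes between months; models not listed here keep the same name.
SUCCESSOR = {
    "Opus 4.6": "Opus 4.7",
    "Composer 2": "Composer 2.5",
    "Sonnet 4.5": "Sonnet 4.6",
}


def rank_changes(before, after, successor):
    """Return (old_name, new_name, old_pos, new_pos, delta) for each model line.

    A positive delta means the model line moved up the ranking.
    """
    changes = []
    for old_pos, old_name in enumerate(before, start=1):
        new_name = successor.get(old_name, old_name)
        new_pos = after.index(new_name) + 1
        changes.append((old_name, new_name, old_pos, new_pos, old_pos - new_pos))
    return changes


for old_name, new_name, old_pos, new_pos, delta in rank_changes(MAY, JUNE, SUCCESSOR):
    print(f"{old_name} -> {new_name}: {old_pos} -> {new_pos} ({delta:+d})")
```

Read this way, the Composer line makes the biggest jump (from sixth to second), while Opus and both Gemini models each drop two positions.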

Good UI Models

The models that best turn visual intent into interface

For me, a good UI model isn't the one that spits out the most components. It's the one that understands hierarchy, composition, and constraints, and that also works well within the right harness.

  1. Best with Figma

    #1 · GPT 5.5

    Codex + Figma

    The best choice when the goal is a well-resolved UI with Figma as part of the flow.

    GPT 5.5 ranks first because with Figma it better understands visual intent, structure, and finish. It doesn't just generate a screen: it helps make decisions about composition, hierarchy, and product.

    Figma · Visual judgment · Good polish · Clear structure · Less AI Slop

    Trade-off: I use it from Codex; the harness matters a lot for that quality to translate well into the product.

  2. Least AI Slop

    #2 · Composer 2.5

    Cursor

    The one that usually delivers cleaner, less generic results inside Cursor.

    Composer 2.5 ranks second because it largely avoids AI Slop: that generic, over-decorated, or unintentional look that gives away an interface made by AI without design direction.

    Less AI Slop · Good baseline judgment · Cursor · Clean layouts · Fast iteration

    Trade-off: It doesn't always reach the level of visual judgment that GPT 5.5 achieves with Figma, but it's very consistent.

  3. Best in Cursor

    #3 · Opus 4.7

    Cursor

    Powerful for reasoning about UI, but with mid-level AI Slop if the harness doesn't help.

    Opus 4.7 works best in Cursor because the environment gives it a better harness to review, edit, and fix. In Claude Code it tends to drift toward interfaces with too much AI Slop.

    Reasoning · Cursor as harness · Refinement · Components · Good context

    Trade-off: It needs clear visual direction to avoid falling into overly obvious or artificial decisions.

  4. Fast but Sloppy

    #4 · Sonnet 4.6

    Cursor

    Very useful, but with a higher risk of AI Slop if the visual instructions aren't well defined.

    Sonnet 4.6 ranks below Opus because it shares part of the problem: it can produce functional UI, but with a more generic finish if it doesn't get strong constraints.

    Speed · Cursor · Good support · Iteration · Implementation

    Trade-off: Like Opus, it needs a harness and solid visual direction to avoid sliding into AI Slop.

  5. Complicated UI

    #5 · Gemini 3 Pro

    Cursor

    It doesn't always produce much AI Slop, but its interfaces tend to feel complicated.

    Gemini 3 Pro can avoid part of the generic look, but in exchange it tends to propose interfaces more tangled than necessary. For good UI, simplicity matters a lot.

    Cursor · Less generic · Broad ideas · Exploration · Technical capability

    Trade-off: I find it lacking for UI because it overcomplicates the interface and demands more correction afterward.

  6. Fast with Constraints

    #6 · Gemini 3.5

    Stitch + AI Studio

    Fast when instructions are tightly defined, but weaker at autonomous UI judgment.

    Gemini 3.5 ranks sixth because it can move fast in Stitch and AI Studio, but it needs very precise instructions. If the brief stays open, visual quality drops fast.

    Fast · Stitch · AI Studio · Tight briefs · Exploration

    Trade-off: It works better as an executor with constraints than as the main model for deciding good UI.