Ranking
UI Models Ranking by Sho
My personal ranking of models that make good UI. I'm not only measuring coding ability: the harness, visual judgment, and how well each model avoids AI Slop matter too. By AI Slop I mean AI-generated interfaces that look polished but fail in intent, consistency, or real use.
Living history
UI models, month by month
May serves as the baseline, and June shows how the models have evolved. The idea is to update the ranking month by month to see which models really improve at interface work, which harness boosts each model, and which models keep generating too much AI Slop.
- Harness: the environment where the model runs (Figma, Codex, Cursor, Stitch, or AI Studio), which changes the result.
- AI Slop: how much the model falls into generic, over-decorated, or intentionless visual patterns.
- UI Judgment: the ability to decide hierarchy, composition, density, clarity, and finish.
- Constraints: how much the model needs tight instructions to reach a usable interface.
Good UI Models
The models that best turn visual intent into interface
For me, a good UI model isn't the one that spits out the most components. It's the one that understands hierarchy, composition, and constraints, and that also works well within the right harness.
-
Best with Figma
#1 · GPT 5.5
Codex + Figma
The best choice when the goal is a well-resolved UI with Figma as part of the workflow.
GPT 5.5 ranks first because, paired with Figma, it understands visual intent, structure, and finish better. It doesn't just generate a screen: it helps make decisions about composition, hierarchy, and product.
Figma · Visual judgment · Good polish · Clear structure · Less AI Slop
Trade-off: I use it from Codex; the harness matters a lot for that quality to translate well into the product.
-
Least AI Slop
#2 · Composer 2.5
Cursor
The one that usually delivers cleaner, less generic results inside Cursor.
Composer 2.5 ranks second because it largely avoids AI Slop: that generic, over-decorated, or unintentional look that gives away an interface made by AI without design direction.
Less AI Slop · Good baseline judgment · Cursor · Clean layouts · Fast iteration
Trade-off: It doesn't always reach the level of visual judgment that GPT 5.5 achieves with Figma, but it's very consistent.
-
Best in Cursor
#3 · Opus 4.7
Cursor
Powerful for reasoning about UI, but it produces a moderate amount of AI Slop if the harness doesn't help.
Opus 4.7 works best in Cursor because the environment gives it a better harness to review, edit, and fix. In Claude Code it tends to drift toward interfaces with too much AI Slop.
Reasoning · Cursor as harness · Refinement · Components · Good context
Trade-off: It needs clear visual direction to avoid falling into overly obvious or artificial decisions.
-
Fast but Sloppy
#4 · Sonnet 4.6
Cursor
Very useful, but with more risk of AI Slop if the visual instructions aren't well defined.
Sonnet 4.6 ranks below Opus because it shares part of the problem: it can produce functional UI, but with a more generic finish if it doesn't get strong constraints.
Speed · Cursor · Good support · Iteration · Implementation
Trade-off: Like Opus, it needs a harness and solid visual direction to avoid producing too much AI Slop.
-
Complicated UI
#5 · Gemini 3 Pro
Cursor
It doesn't always produce much AI Slop, but its interfaces tend to feel complicated.
Gemini 3 Pro can avoid part of the generic look, but in exchange it tends to propose interfaces more tangled than necessary. For good UI, simplicity matters a lot.
Cursor · Less generic · Broad ideas · Exploration · Technical capability
Trade-off: I find it lacking for UI because it overcomplicates the interface and demands more correction afterward.
-
Fast with Constraints
#6 · Gemini 3.5
Stitch + AI Studio
Fast when instructions are tightly defined, but weaker at autonomous UI judgment.
Gemini 3.5 ranks sixth because it can move fast in Stitch and AI Studio, but it needs very precise instructions. If the brief stays open, visual quality drops fast.
Fast · Stitch · AI Studio · Tight briefs · Exploration
Trade-off: It works better as an executor with constraints than as the main model for deciding good UI.