The UX problem is not the algorithm problem
Most discussion about AI recommendations treats the problem as technical. Better models, richer feature sets, lower latency, higher precision@k. Those things matter. But they don't explain why users distrust recommendations even when the accuracy is high. They don't explain why the "you might also like" widget gets ignored even when it's right. They don't explain why some recommendation surfaces build engagement and others feel creepy.
Accuracy alone doesn't build trust — it builds tolerance. At SEEK, where I lead product design, we think about job recommendations as a core trust surface. In a native mobile redesign, we saw a 22% lift in apply starts and 23% lift in job views. The redesign was broader than recommendation UX alone, but the qualitative research pointed to a clear pattern: when candidates understood why a job appeared, they were more willing to engage. The algorithm gets you the right answer. The UX gets users to believe it.
The 5 trust signals for recommendation UX
Most recommendation systems are built around a single trust mechanism: accuracy. If the system is right enough times, users learn to trust it. This is a slow and fragile approach. There are five design signals that build genuine trust in a recommendation surface.
1. Legibility — "Why did I see this?"
Users want to understand the reasoning behind a recommendation. The goal is not to audit the model but to calibrate their own response to it. "Because you applied to roles in Melbourne" is more trustworthy than "Recommended for you", not because it's more accurate but because it's more honest.
Legibility doesn't require exposing model internals. It requires identifying the one or two strongest signals that drove a recommendation and surfacing them in plain language. "Similar to jobs you've saved." "New from companies you've followed." "3 of your skills are a strong match." These aren't explanations — they're calibration cues. They help users decide whether to engage, without requiring them to understand the underlying logic.
The visual language matters. Legibility labels should be:
- Adjacent to the recommendation, not buried in a tooltip
- Written in second-person user-facing language ("your skills", "you applied")
- Short enough to scan (under 10 words)
- Honest — don't attribute a recommendation to user behaviour if it's actually driven by business logic
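As a rough illustration of the guidelines above, the sketch below picks the strongest signal behind a recommendation and renders it as a short second-person label. The signal names, the `strength` field, and the template wording are all assumptions made for this example, not a description of any production system.

```python
from dataclasses import dataclass, field

# Hypothetical signal names and templates; a real system would define its own.
TEMPLATES = {
    "saved_similar": "Similar to jobs you've saved",
    "followed_company": "New from companies you've followed",
    "skills_match": "{n} of your skills are a strong match",
    "promoted": "Promoted listing",
}


@dataclass
class Signal:
    name: str
    strength: float  # assumed contribution to the ranking score
    params: dict = field(default_factory=dict)


def legibility_label(signals, max_signals=1):
    """Return a short label naming the strongest known signal(s)."""
    # Honesty rule: a business-driven placement is labelled as such,
    # never attributed to the user's behaviour.
    if any(s.name == "promoted" for s in signals):
        return TEMPLATES["promoted"]
    known = [s for s in signals if s.name in TEMPLATES]
    if not known:
        return "Recommended for you"  # generic fallback when nothing can be named
    top = sorted(known, key=lambda s: s.strength, reverse=True)[:max_signals]
    return "; ".join(TEMPLATES[s.name].format(**s.params) for s in top)


signals = [
    Signal("saved_similar", 0.4),
    Signal("skills_match", 0.7, {"n": 3}),
]
label = legibility_label(signals)  # "3 of your skills are a strong match"
```

Capping the label at one or two signals keeps it scannable, and checking for the promoted flag first means the label cannot credit user behaviour for a paid placement.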
2. Control — Teaching the system
Trust comes from agency. If users can't influence recommendations, they experience the system as something happening to them rather than for them. A feedback mechanism — dismiss, save, "more like this", "not for me" — signals that the user's input is being heard and acted on.
The control surface doesn't need to be complex. It needs to be:
- Available on the recommendation card itself, not a settings menu
- Low-friction (a single tap or click, not a modal)
- Confirmed — users need to see that their feedback had an effect ("Got it — we'll show fewer roles like this")
- Consequential — the system needs to actually change in response
That last point is where most teams fail. They build the dismiss button. They don't build the feedback loop that changes the next session's recommendations. The result is a control surface that looks like agency but delivers none. When users dismiss something and see it again, or provide a preference the system ignores, trust doesn't just stagnate — it drops.
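A minimal sketch of what "consequential" could mean in code, assuming an in-memory feedback store and candidate jobs represented as dictionaries with `id`, `category`, and `score` keys. All names here are hypothetical, and the penalty value is an arbitrary placeholder.

```python
class FeedbackStore:
    """In-memory stand-in for a per-user feedback store (hypothetical)."""

    def __init__(self):
        self.dismissed_ids = set()
        self.muted_categories = set()

    def dismiss(self, job_id, category=None):
        """Record a dismissal and return the confirmation copy to show."""
        self.dismissed_ids.add(job_id)
        if category is not None:
            self.muted_categories.add(category)
            return "Got it. We'll show fewer roles like this."
        return "Got it. We won't show this role again."


def apply_feedback(candidates, store, penalty=0.5):
    """Drop dismissed jobs and down-rank muted categories before display."""
    kept = []
    for job in candidates:
        if job["id"] in store.dismissed_ids:
            continue  # a dismissed item must never reappear
        score = job["score"]
        if job["category"] in store.muted_categories:
            score *= penalty  # "fewer roles like this", not zero
        kept.append({**job, "score": score})
    return sorted(kept, key=lambda j: j["score"], reverse=True)


store = FeedbackStore()
message = store.dismiss("j1", category="graduate")
jobs = [
    {"id": "j1", "category": "graduate", "score": 0.9},
    {"id": "j2", "category": "graduate", "score": 0.8},
    {"id": "j3", "category": "senior", "score": 0.6},
]
next_session = apply_feedback(jobs, store)  # j3 first, then j2; j1 is gone
```

The point of the sketch is the ordering of work: the filter that consumes the feedback exists alongside the button that collects it.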
3. Perceived accuracy — the feeling of being known
Objective accuracy (did the user engage with what was recommended?) is a model metric. Perceived accuracy — the feeling that the system gets you — is a UX metric. They diverge more than you'd expect.
Perceived accuracy is driven by resonance, not precision. A recommendation that is technically less relevant but emotionally resonant will be trusted more than one that is objectively better but feels generic. "Senior Product Designer, remote, health-tech" feels like you. "Product Designer, Melbourne" is technically accurate but shows no sign that the system understands you.
Design for perceived accuracy by:
- Showing the specific match, not just the category ("3 of 5 required skills match")
- Using the user's own language back at them ("Roles matching your search: 'senior designer'")
- Surfacing specificity over breadth — fewer, better recommendations beat many adequate ones
- Allowing users to declare context ("I'm actively looking" vs "just browsing") so the system can tune without inference
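The "3 of 5 required skills match" pattern from the list above can be sketched as a small set comparison. This assumes skills arrive as plain strings and that a case-insensitive exact match is good enough; a real system would likely need synonym handling, which is out of scope here.

```python
def skill_match_label(user_skills, required_skills):
    """Return a specific match label, or None when there is nothing to show."""
    required = {s.strip().lower() for s in required_skills}
    have = {s.strip().lower() for s in user_skills}
    matched = required & have
    if not matched:
        return None  # say nothing rather than show "0 of 5"
    return f"{len(matched)} of {len(required)} required skills match"


label = skill_match_label(
    ["Figma", "Research", "Prototyping", "SQL"],
    ["figma", "prototyping", "research", "design systems", "accessibility"],
)
# label == "3 of 5 required skills match"
```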
4. Recency — Adapting to where I am now
Recommendation systems trained on historical behaviour have a staleness problem. The model knows who you were when you last engaged with the product heavily. It doesn't know who you are now.
For job seekers, this is acute. A candidate who searched for graduate roles two years ago and now has senior experience doesn't want those graduate recommendations. A candidate who bookmarked roles in one city and just moved doesn't want location-mismatched suggestions. The system's failure to adapt is felt as disrespect — "it doesn't know me."
Recency needs active design attention:
- Decay stale signals after inactivity periods (a search from 18 months ago should carry a fraction of the weight it did then)
- Surface "Update your preferences" prompts at natural re-engagement moments
- Let users set context explicitly — "I'm looking for roles starting in 3 months" — and treat that as a strong recency signal
- When recommendations change significantly from a previous session, acknowledge it ("Based on your recent activity, we've updated your recommendations")
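One common way to decay stale signals is an exponential half-life, sketched below. The 180-day half-life is an assumption chosen only to make the arithmetic easy to follow; the right value depends on the product and should be tuned.

```python
def decayed_weight(base_weight: float, age_days: float,
                   half_life_days: float = 180.0) -> float:
    """Halve a signal's weight for every `half_life_days` of age."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return base_weight * 0.5 ** (age_days / half_life_days)


# A search from roughly 18 months ago (540 days) keeps one eighth of its weight.
old_search = decayed_weight(1.0, age_days=540)  # 0.125

# An explicit statement of intent made today counts in full.
stated_intent = decayed_weight(1.0, age_days=0)  # 1.0
```

Treating an explicit context statement as a fresh signal with age zero is one simple way to make it outweigh older inferred behaviour.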
5. Integrity — Is this serving me or optimising against me?
The fifth trust signal is the hardest to design for because it's largely invisible — until it's violated. Integrity is the user's sense that the recommendation system is fundamentally working in their interest, not against it.
Users feel integrity failures quickly. When a user realises that "recommended for you" items are actually paid placements, the entire recommendation surface becomes suspect. When dismissing a recommendation causes it to reappear two sessions later, users learn the system doesn't respect their feedback. When recommendations cluster around items with high business value rather than high user relevance, the implicit contract is broken.
This is where anti-KPIs become essential. An anti-KPI is a counter-metric that shows whether a gain in a primary metric came from real quality or from optimising the wrong behaviour. For every primary metric your recommendation system optimises for (click-through rate, apply rate, revenue per session), name an integrity anti-KPI. If click-through goes up because recommendations became more clickbait-y, that's not a win. If apply rate goes up because the system is surfacing roles that are a poor fit but have a high application conversion rate, that's a problem your primary metric won't catch.
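To make the pairing concrete, here is a minimal sketch that reports click-through rate together with one possible anti-KPI: the share of clicks that led to no further action. The event fields (`clicked`, `applied`, `saved`) and the name `dead_click_rate` are assumptions for illustration.

```python
def ctr_with_anti_kpi(events):
    """Return click-through rate alongside the share of clicks with no follow-up."""
    impressions = len(events)
    clicks = [e for e in events if e["clicked"]]
    # A "dead click" is a click followed by neither an apply nor a save.
    dead_clicks = [e for e in clicks if not e["applied"] and not e["saved"]]
    ctr = len(clicks) / impressions if impressions else 0.0
    dead_click_rate = len(dead_clicks) / len(clicks) if clicks else 0.0
    return {"ctr": ctr, "dead_click_rate": dead_click_rate}


events = [
    {"clicked": True, "applied": True, "saved": False},
    {"clicked": True, "applied": False, "saved": False},
    {"clicked": False, "applied": False, "saved": False},
    {"clicked": False, "applied": False, "saved": False},
]
report = ctr_with_anti_kpi(events)  # {"ctr": 0.5, "dead_click_rate": 0.5}
```

Reporting both numbers side by side makes it harder to celebrate a rise in click-through that is really a rise in dead clicks.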
The design intervention is transparency: label paid placements as paid, make the ranking rationale accessible, and build feedback loops where user outcomes — did they get the job? did the product help? — inform the recommendation model, not just whether they clicked.
What this looks like in practice
At SEEK, job recommendations sit at the centre of the candidate experience. The platform matches millions of active and passive job seekers with hundreds of thousands of live roles. In a native mobile redesign, we saw a 22% lift in apply starts and 23% lift in job views — a system-level result, not attributable to any single change, but one where the qualitative research pointed consistently to trust: candidates who understood why a role appeared were more willing to engage with it.
The five trust signals above are the design vocabulary for that kind of work. Applied to a job platform, they translate into decisions like:
- Legibility labels that name which skills or preferences drove a recommendation, not just "recommended for you"
- Feedback controls on the card itself — dismiss, save, tune — that visibly change the next set of recommendations
- Specificity in the match ("3 of 5 required skills") rather than category-level relevance
- Context-setting that lets candidates signal intent ("actively looking" vs "just browsing") so the system doesn't have to guess
- Explicit separation of promoted listings from organic recommendations, so the trust contract stays intact
None of these are novel ideas in isolation. The discipline is applying them systematically, measuring the right things, and treating the recommendation surface as a relationship to be maintained — not a prediction to be delivered.
The broader principle holds: recommendation accuracy is a floor, not a ceiling. Once the model is good enough, the marginal returns to further model improvement are smaller than the returns to better recommendation UX. Many mature platforms are past that floor. Most haven't invested in what's above it.
The feedback loop that actually improves recommendations
The standard recommendation feedback loop is: user engages → engagement data updates model → model improves future recommendations. This loop works, but it optimises for the wrong thing. Engagement is a proxy for satisfaction, not satisfaction itself. A user who clicks a job recommendation out of uncertainty and doesn't apply is recorded as a positive signal. A user who ignores a recommendation because the title was poorly written, even though the role was perfect, is a missed signal the model never sees.
A better feedback loop has three layers:
Explicit signals — direct feedback from users ("not for me", "save", "more like this"). High-quality but low-volume; most users don't provide them unless the product makes it trivially easy. Design the explicit signal surface to be always available, one tap, and immediately confirmed.
Behavioural signals — how the user engages beyond clicking. Did they dwell on the card? Did they return to it multiple times before acting? Scroll depth, return visits, and time-on-item are higher-fidelity signals than click alone. Instrument for these.
Outcome signals — what happened after the recommendation. For a job platform: did they apply? Did they get an interview? Did they get the job? These signals take longer to collect and may need ATS integrations or employer feedback, but they're the most honest measure of recommendation quality.
The design challenge is closing the loop visibly. Users who provide explicit feedback should see the effect. Users whose engagement patterns shift should see their recommendations adapt. Users who trust the feedback loop provide better signals. Better signals improve the model. The feedback loop is a trust loop.
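One way to picture the three layers is as inputs to a single training label, so that a bare click contributes nothing on its own. The sketch below is illustrative only: the weights, the signal names, and the 30-second and 3-visit caps are placeholders, not recommended values.

```python
# Hypothetical mappings and weights, chosen only to illustrate the layering.
EXPLICIT = {"save": 1.0, "more_like_this": 1.0, "not_for_me": -1.0}
OUTCOME = {"applied": 0.5, "interview": 0.8, "hired": 1.0}
WEIGHTS = {"explicit": 0.3, "behavioural": 0.2, "outcome": 0.5}


def behavioural_score(dwell_seconds, return_visits):
    """Map dwell time and return visits to a 0..1 score with simple caps."""
    dwell = min(dwell_seconds / 30.0, 1.0)
    returns = min(return_visits / 3.0, 1.0)
    return (dwell + returns) / 2


def training_label(explicit=None, dwell_seconds=0.0, return_visits=0, outcome=None):
    """Combine explicit, behavioural, and outcome signals into one label."""
    return (
        WEIGHTS["explicit"] * EXPLICIT.get(explicit, 0.0)
        + WEIGHTS["behavioural"] * behavioural_score(dwell_seconds, return_visits)
        + WEIGHTS["outcome"] * OUTCOME.get(outcome, 0.0)
    )


# A click with no dwell, no explicit feedback, and no outcome scores zero.
click_only = training_label()
# An explicit "not for me" pushes the label negative.
rejected = training_label(explicit="not_for_me")
```

Weighting outcomes most heavily reflects the argument above that they are the most honest measure, while acknowledging they arrive late and sparsely.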
Explainability as a design practice
There's a tendency to treat recommendation explainability as a compliance concern, something added to satisfy what is often described as GDPR's "right to explanation" or EU AI Act requirements. That framing is too narrow. Explainability is a design discipline that improves user experience independent of regulatory pressure.
This view lines up with adjacent work in the field: Nielsen Norman Group has framed AI as a shift in user-interface design where users specify intent rather than step-by-step commands, Google's People + AI Guidebook gives product teams practical guidance for designing human-centred AI systems, and Zhang and Chen's survey on explainable recommendation links recommendation explanations to transparency, trust, satisfaction, and system debugging.
Users who understand why a recommendation appeared are more likely to engage with it, more likely to provide meaningful feedback, and more likely to trust the system over time. Explainability isn't a cost centre. It's a conversion lever.
In practice:
- Work with ML to define the 3–5 signals most responsible for recommendations in your system
- Develop a library of plain-language explanation templates for those signals, and test them with users
- Build the explanation layer as a first-class UI component, not an afterthought tooltip
- Measure whether explanations change engagement, not just whether users read them
FAQ
What makes AI recommendations feel untrustworthy?
The most common causes of distrust are: opaque reasoning (users can't tell why they're seeing a recommendation), lack of control (no way to correct or dismiss), stale data (the system reflects who you were, not who you are), and integrity violations (paid placements presented as organic recommendations, or dismissed items reappearing). Any one of these can erode trust in a recommendation surface that is algorithmically accurate.
How do you design for explainability without overwhelming users?
Select the one or two signals most responsible for a recommendation and surface them in plain language adjacent to the item — not buried in a tooltip or modal. "Because you saved 3 similar roles" is enough. Keep explanation text under 10 words, write in second-person, and test it with real users to find the phrasing that creates trust rather than anxiety.
What's the difference between a feedback button and a real feedback loop?
A feedback button collects user input. A real feedback loop uses that input to change what the user sees — in the same session if possible, across sessions definitely. If a user dismisses a recommendation type and sees it again, or signals a preference the system ignores, the feedback button is UX theatre. Build the loop before you build the button.
How did SEEK approach recommendation UX design?
At SEEK, we treated recommendations as a trust surface rather than a relevance engine. The design focus was on making the reasoning behind recommendations visible — which skills matched, which preferences were reflected — and on building a feedback mechanism that changed recommendations within the same session. Combined with a native mobile redesign, the research showed candidates felt the platform understood them better, and that perception translated to measurable engagement lifts.
What are anti-KPIs for recommendation systems?
Anti-KPIs reveal whether a primary metric improvement came from genuine quality gain or from optimising the wrong behaviour. If click-through rate rises, an anti-KPI is the ratio of clicks that lead to no further action — a sign of clickbait recommendations, not useful ones. If apply rate rises, an anti-KPI is downstream outcome quality: did those applications lead to interviews and hires? Anti-KPIs prevent recommendation systems from gaming their own metrics at the cost of user trust.
When should recommendation explanations change?
When the primary signal behind a recommendation changes. If a recommendation shifts from "based on your search history" to "because similar users engaged with this," the explanation should reflect that. Consistency between what the explanation says and what the model is actually doing is foundational. An explanation that becomes systematically inaccurate is worse than no explanation at all.
About the author
Richard Simms is Principal Product Designer at SEEK and founder of Sentiuma. He writes about AI product design, design measurement, and the UX of intelligent systems at rsimms.com.
