Think Evals Are Just for AI? Think Again
Why product managers need evals and how to put them into practice with examples to get started fast
Your product is growing fast.
Multiple teams ship their parts, each saying: “It works as expected.”
And they’re right. Every part does work well. Automation and CI/CD give you confidence that each piece passes its tests.
So why do you hesitate before releasing?
Because what keeps you up at night isn’t whether components work individually—it’s whether they work together.
You worry the customer experience will break down at the seams. But when you raise the concern, the response is familiar: “It’s too complex to validate. We don’t have time to test all those interactions.”
This is where most product managers stop.
But you don’t have to. There’s a tool you can bring into the process—one that cuts through complexity and keeps customer experience front and center: Evals.
In this article, you'll learn:
What evals are (for any product, not just AI)
How to set them up fast
How to iterate and share results
Why evals save time and protect customer experience
💡 Paid subscribers get the full set of ready-to-use templates (use case matrix + iteration tracker) to jumpstart evals in their own product.
The Problem: Complexity Outpaces Confidence
Modern product environments are becoming more complex by the day:
APIs and interfaces multiplying across systems
AI embedded everywhere—inside your product and in your customers’ workflows
Monetization and pricing automation layered on top
Engineering teams do their part. They break work into components, build clean interfaces, and use automation to hit a 99% pass rate on unit tests.
But as a product manager, you still hesitate. Because it’s your job to picture the complete product in a messy, real-world customer environment. And the truth is: even the strongest product managers can’t hold all those interactions in their head.
What Are Evals (for Any Product, Not Just AI)?
At their core, evals are systematic, repeatable ways to validate outcomes. They first gained traction in AI, where outputs change constantly and drift is real, but the concept is much broader.
The beauty of evals is speed. Instead of exhaustively testing every permutation, you check a small, prioritized set of customer-centered scenarios that reflect what really matters.
And this is why evals belong to product managers:
You know the business context.
You know the customer workflows.
You know the failure modes that keep you up at night.
Evals give you a way to capture that product intuition and turn it into something the whole team can see, run, and learn from.
How to Use Evals in Your Product
Think of evals as a three-step loop: prioritize use cases → set up the environment → iterate.
Eval 1: Prioritize Use Cases
Start with your customer journeys and your worry list. For each item, define what to check. The output is a simple matrix: one row per use case, with a checkbox for each thing to verify.
Examples of use cases:
Customer workflows: onboarding, expansion, renewal
Key actions: feature X → feature Y, exporting a report
Vertical-specific flows: healthcare compliance, developer integrations, enterprise administration
Failure scenarios: API downtime, misfired permissions, edge cases
Prioritize by likelihood and importance. Sometimes your “customer” might even be an internal user, like support or finance.
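If you prefer to keep the matrix in code rather than a spreadsheet, here is a minimal sketch. The 1-to-5 scales, the likelihood × importance score, and the sample rows are all assumptions for illustration, not a prescribed scoring method; use whatever ranking your team already trusts.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    checks: list[str]
    likelihood: int  # assumed scale: 1 (rare) to 5 (happens constantly)
    importance: int  # assumed scale: 1 (minor annoyance) to 5 (could lose the customer)

    @property
    def priority(self) -> int:
        # Simple product of the two scores; an illustrative choice, not a standard.
        return self.likelihood * self.importance


def prioritize(use_cases: list[UseCase], top_n: int) -> list[UseCase]:
    """Return the top_n use cases, highest priority first."""
    return sorted(use_cases, key=lambda u: u.priority, reverse=True)[:top_n]


def render(use_cases: list[UseCase]) -> str:
    """Render the matrix as plain text with one checkbox per check."""
    lines = []
    for u in use_cases:
        lines.append(f"{u.name} (priority {u.priority})")
        lines.extend(f"  [ ] {check}" for check in u.checks)
    return "\n".join(lines)


# Hypothetical rows drawn from the example categories above.
MATRIX = [
    UseCase("Onboarding a new account", ["Invite email arrives", "First login succeeds"], 5, 5),
    UseCase("Exporting a report", ["File downloads", "Totals match the dashboard"], 4, 3),
    UseCase("API downtime during renewal", ["Customer sees a clear error", "No double charge"], 2, 5),
    UseCase("Misfired permissions", ["Viewer cannot edit billing"], 1, 4),
]

if __name__ == "__main__":
    print(render(prioritize(MATRIX, top_n=3)))
```

Capping the list with top_n is the point: the sheet stays short enough to run in one sitting.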
Eval 2: Set Up the Environment
You don’t need a full replica of production—just enough to exercise your critical paths. Use your product sense to strike a balance between cost, speed, and realism.
Examples of eval environments:
Simulated customer data sets
Mock business conditions (orders, licenses, account provisioning)
Staging or other non-production versions of your systems
Manual inputs where automation isn’t practical
The goal isn’t perfection. It’s coverage of the most important seams where things are likely to break.
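To make "just enough environment" concrete, here is a rough sketch of one seam check: does an order placed in billing turn into the right number of seats in provisioning? Everything here is hypothetical. StubBilling, StubProvisioning, the event fields, and the sample orders are stand-ins invented for illustration; in practice you would point the same check at your staging systems.

```python
from dataclasses import dataclass


@dataclass
class Order:
    account: str
    seats: int


class StubBilling:
    """Hypothetical stand-in for a billing system."""

    def place(self, order: Order) -> dict:
        # Emits an event in an assumed shape; a real system defines its own.
        return {"account_id": order.account, "quantity": order.seats}


class StubProvisioning:
    """Hypothetical stand-in for a license provisioning system."""

    def __init__(self) -> None:
        self.seats_by_account: dict[str, int] = {}

    def handle(self, event: dict) -> None:
        account = event["account_id"]
        self.seats_by_account[account] = (
            self.seats_by_account.get(account, 0) + event["quantity"]
        )


def eval_orders_become_seats(orders: list[Order]) -> list[dict]:
    """Push simulated orders across the seam and compare seats per account."""
    billing, provisioning = StubBilling(), StubProvisioning()
    expected: dict[str, int] = {}
    for order in orders:
        provisioning.handle(billing.place(order))
        expected[order.account] = expected.get(order.account, 0) + order.seats
    results = []
    for account, seats in expected.items():
        actual = provisioning.seats_by_account.get(account, 0)
        results.append(
            {"account": account, "expected": seats, "actual": actual, "passed": actual == seats}
        )
    return results


# Simulated customer data: a new purchase, an expansion, and a second customer.
SIMULATED_ORDERS = [Order("acme", 10), Order("acme", 5), Order("globex", 3)]

if __name__ == "__main__":
    for row in eval_orders_become_seats(SIMULATED_ORDERS):
        status = "PASS" if row["passed"] else "FAIL"
        print(f'{status} {row["account"]}: expected {row["expected"]}, got {row["actual"]}')
```

The check itself only looks at the outcome a customer would notice (seats available), which is why it survives swapping the stubs for something more realistic.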
Eval 3: Iterate Quickly
Pair up with someone on your team. One person drives the use cases, the other records findings. Screenshots, recordings, and notes are enough—you’re not writing bug reports yet.
After the pass, review what you captured. Decide which issues to follow up on and which can ship with a note. Then repeat the process with your next set of use cases.
Some teams even run eval workshops, where several people tackle the matrix together. Your eval sheet becomes the agenda.
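If the person recording findings wants something lighter than a bug tracker, a small log like the one below may be enough. This is a sketch under assumptions: the Finding fields, the three decision labels, and the sample entries are made up for illustration and simply mirror the "follow up" versus "ship with a note" choice described above.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    UNDECIDED = "undecided"
    FOLLOW_UP = "follow up"
    SHIP_WITH_NOTE = "ship with a note"


@dataclass
class Finding:
    use_case: str
    note: str
    evidence: str = ""  # e.g. a screenshot or recording filename
    decision: Decision = Decision.UNDECIDED


def group_by_decision(findings: list[Finding]) -> dict[Decision, list[Finding]]:
    """Group findings so the review can walk through one bucket at a time."""
    groups: dict[Decision, list[Finding]] = defaultdict(list)
    for finding in findings:
        groups[finding.decision].append(finding)
    return dict(groups)


# Hypothetical findings from one pass through the matrix.
FINDINGS = [
    Finding("Exporting a report", "Totals differ from dashboard by rounding",
            "export-totals.png", Decision.SHIP_WITH_NOTE),
    Finding("Onboarding a new account", "Invite email took 12 minutes",
            "invite-delay.mov", Decision.FOLLOW_UP),
    Finding("API downtime during renewal", "Error message is a raw status code"),
]

if __name__ == "__main__":
    for decision, items in group_by_decision(FINDINGS).items():
        print(decision.value.upper())
        for item in items:
            print(f"  {item.use_case}: {item.note} ({item.evidence or 'no evidence yet'})")
```

Anything still in the undecided bucket at the end of the review is the agenda for the next conversation.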
Sharing Results
The magic of evals isn’t just in running them—it’s in making results visible.
By sharing notes and screenshots, you create a common reference point for product readiness. Instead of abstract concerns (“I just don’t feel good about this”), the team sees the product through a customer lens.
Yes, there will still be debates about severity and tradeoffs. But now those conversations happen in the open, with shared evidence. Even if you don’t cover 100% of scenarios, the team walks away aligned on the risks and confident in the release.
And because evals are repeatable, you can re-run them after fixes, before major releases, or whenever integrations shift.
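Re-running only pays off if you can see what changed. A minimal sketch, assuming each run is recorded as a mapping from check name to pass or fail (the check names and results below are hypothetical):

```python
def compare_runs(previous: dict[str, bool], current: dict[str, bool]) -> dict[str, list[str]]:
    """Compare two eval runs and list what regressed, what got fixed, and what is new."""
    regressed = sorted(
        check for check, passed in current.items()
        if previous.get(check) is True and not passed
    )
    fixed = sorted(
        check for check, passed in current.items()
        if previous.get(check) is False and passed
    )
    new = sorted(check for check in current if check not in previous)
    return {"regressed": regressed, "fixed": fixed, "new": new}


# Hypothetical results: True means the check passed in that run.
BEFORE_FIX = {
    "Invite email arrives": False,
    "First login succeeds": True,
    "Totals match the dashboard": True,
}
AFTER_FIX = {
    "Invite email arrives": True,
    "First login succeeds": True,
    "Totals match the dashboard": False,
    "No double charge": True,
}

if __name__ == "__main__":
    for label, checks in compare_runs(BEFORE_FIX, AFTER_FIX).items():
        print(f"{label}: {', '.join(checks) or 'none'}")
```

The "regressed" list is the one to share first: it shows where a fix or an integration change broke something that used to work.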
Ready to try evals on your product? Paid subscribers get access to:
A detailed Use Case Matrix template
An Eval Iteration Tracker for progress
Conclusion: Evals Save Time
Evals create a flywheel for iteration. Each cycle:
Surfaces risks early
Turns vague concerns into visible tradeoffs
Gives product managers confidence that “it works” also means “it works for the customer”
In today’s complex environments, product managers can’t rely on test automation alone. You need a way to validate the seams—the places where customer experience is most fragile.
That’s what evals deliver. They don’t slow you down; they speed you up. And once you set them up, they keep paying dividends for the life of the product.
Evals aren’t just for AI. They’re emerging as a core skill for every product manager who wants to ship with confidence.
Bonus: AI evals are a key skill for product managers
Beyond vibe checks: A PM's complete guide to evals. Aman Khan's post in Lenny's Newsletter about evals becoming a defining skill for AI PMs.
Mastering AI Evals: A Complete Guide for PMs. Paweł Huryn and Hamel’s Substack (Hamel Husain) on AI evals as a key product manager skill.
TLDR Product listed Product Management IRL articles recently! This biweekly email provides a consolidated list of recent product management articles.
Connect with Amy on LinkedIn, Threads, Instagram, and Bluesky for product management insights daily.








