Your experiment is lying to you
If your test can’t fail or doesn’t match your risk, the results don’t matter — you’re just burning time. (Part 3 of the Traction Science Series)
Hey friends 👋
Most founders think they’re running experiments.
What they’re actually doing looks more like this:
“Testing” willingness to pay with a free signup form
“Testing” distribution by asking friends on LinkedIn
“Testing” usability when the real risk is whether anyone even wants the damn thing
What they’re actually running? Errands in lab coats.
Worse, those errands in lab coats give you just enough “data” to double down on the wrong idea with confidence.
This is Part 3 of our Traction Science Series. In Part 1, we surfaced your riskiest assumption. In Part 2, we turned it into a clear, falsifiable hypothesis.
Now we’re going to design an experiment that not only gives you the data you need, but also matches the risk — so your results actually mean something and your time isn’t wasted.
Let’s go 👇
Not all tests are created equal.
A “test” doesn’t de-risk anything just because you ran it.
Different risks need different tests.
The wrong test will give you the wrong answer — or worse, the illusion of the right one.
And each risk has its own evidence threshold: the amount and quality of proof you need before it’s safe to move forward.
If you’re selling a $10/month SaaS tool, a lightweight landing page or a few small paid ads might be enough to get directional signal.
If you’re selling a $100K/year enterprise product, you’ll need something far stronger (like a signed pilot agreement or a binding letter of intent) before you can call it “validated”.
The test you pick has to fit both:
The type of risk you’re addressing (desirability, distribution, viability, feasibility), and
The stakes of the decision you’re making
Get either wrong, and you end up with false confidence, wasted time, and a roadmap built on sand.
Why founders get this wrong
When it’s time to design a test, most founders reach for the least terrifying option: the one that doesn’t involve facing actual customers.
The result? “Validation” that isn’t worth the slide it’s written on.
You’ve seen this movie before:
They pick what’s easy, not what’s right. “Let’s run a survey.” Surveys have their place — but they won’t tell you if anyone will actually buy.
They don’t match the test to their stage. “We’ll run a pilot.” You don’t have a clear early adopter yet. Who exactly is that pilot for?
They ignore evidence thresholds. Treating a “like” on LinkedIn the same as a signed contract is how you end up broke with a big following.
They overcomplicate it. Designing a six-month “experiment” when a two-week sprint would tell you 90% of what you need to know.
But two problems kill more experiments than all the others combined:
The two fatal flaws of bad experiments
Flaw #1: tests that can’t fail
Some “experiments” are so safe they can’t fail — which means they can’t teach you anything.
The usual suspects:
Measuring “interest” instead of intent (landing page visits, email signups)
Asking leading questions you already know the answer to
Counting likes, comments, and shares as if they were revenue
It feels like you’re making progress.
You start believing you’ve found traction… but all you’ve really found is people willing to click a button for free.
If you followed Part 2 of this series, you already know the cure:
Falsifiability.
A real test must unambiguously succeed or fail. Anything in between is just noise — usually the result of sloppy design.
If your test can’t fail, it can’t de-risk anything. It’s just another act in the never-ending play of Validation Theatre.
Flaw #2: tests that don’t match your risk
Even if your test can fail, it still might be worthless if it’s aimed at the wrong risk.
Classic mismatch moves:
Testing price sensitivity when you don’t even know who your early adopter is.
Testing distribution by running ads to strangers when your real problem is onboarding and retention.
Testing feasibility by stress-testing your tech stack when the bigger question is “Will anyone care?”
The size mismatch is just as bad: $10K decisions made off $100 experiments, and $100 decisions buried in $10K “pilots.”
This is where proportional testing matters.
Big decisions deserve bigger proof. Small decisions should be cheap and fast. Match the test to the stakes: don’t spend $10K validating a $100 hypothesis.
If the test doesn’t match the risk size and type, you’re just burning time and money in the wrong direction.
The Traction Science framework for real experiments
Different risks require different tests — and different levels of proof.
A fake door test might be enough to sniff out early demand for a $10/month SaaS product.
But if you’re about to bet six months of runway on a $100K enterprise pilot, you’ll need stronger evidence — like a signed LOI or a paid pilot.
The art of good experiment design is choosing the right tool for the job:
Desirability risk → Fake door, concierge, smoke test
Distribution risk → Cold outreach, ad targeting, channel tests
Viability/monetization risk → Pre-orders, pre-payment, pilot sales
Feasibility risk → Prototypes, Wizard of Oz, technical spike
And it’s not just the type of test, but the size.
A $10K decision can be tested with a $1K experiment. A $100 decision should be tested with $10 worth of effort.
That’s the 10x rule: the decision at stake should be worth roughly ten times what you spend testing it.
The wrong test gives you false confidence. The right test gives you the signal you need to make the next decision.
This is where your AI co-founder earns its equity.
Instead of defaulting to the “safe” or “easy” test, ask it to:
Map your hypothesis to the type of risk you’re testing.
Recommend the best experiment formats for that risk.
Sketch the fastest, cheapest version that still gives you a decisive signal.
Define what success and failure look like before you start.
That’s how you design an experiment that’s proportionate, falsifiable, and actually useful.
Prompt #1: Match the risk to the test
Now that you’ve got a hypothesis, the next step is choosing the right way to test it.
Don’t just grab the nearest tactic (landing page, survey, cold emails) and hope it sticks. The whole point is to match the risk type you’re testing with the experiment format that can actually reveal the truth.
This is where your AI co-founder can save you a ton of wasted cycles.



