What sample size do you need for a reliable product test?

How many testers does a product test really need to be statistically reliable? Methodology, formulas and category benchmarks.

Équipe SuperTry · May 5, 2026 · 3 min read

"How many testers do I need?" is the first question DTC brands ask when launching a product test. The honest answer: it depends on what you're trying to measure. Here's how to size a test without overspending or ending up with too few testers.

The rule of 30, and its limits

The common rule of thumb, "at least 30 testers", comes from the Central Limit Theorem: with roughly 30 or more observations, the distribution of the sample mean is close to normal, even when the underlying population isn't normal, as long as that population is not heavily skewed.

But 30 testers are only enough if you're measuring an average score (satisfaction, propensity to buy) with moderate variance.

To detect a rare defect (say, 1 tester in 20 has an allergic reaction), 30 testers are not enough: there is roughly a 21% chance that none of them reacts at all. Measuring how often the defect occurs takes at least 100 testers, often 200.
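
To see why 30 falls short, it helps to compute the chance that a panel sees the defect at all. The sketch below assumes each tester reacts independently with the same probability; the function names are illustrative and not part of any SuperTry tool.

```python
import math


def prob_at_least_one(n: int, p: float) -> float:
    """Chance that at least one of n independent testers shows a defect of rate p."""
    return 1 - (1 - p) ** n


def panel_size_to_observe(p: float, confidence: float = 0.95) -> int:
    """Smallest n for which at least one case shows up with the given confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))


print(round(prob_at_least_one(30, 0.05), 2))  # 0.79
print(panel_size_to_observe(0.05))            # 59
```

Seeing a single case is only the first step. Estimating the defect rate itself with a usable margin of error calls for a noticeably larger panel.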

The simple formula to know

For a proportion (% of satisfied testers, % who would repurchase), required sample size is:

n = (Z² × p × (1−p)) / e²

Z = 1.96 for 95% confidence
p = expected proportion (e.g. 0.7 for 70% satisfaction)
e = acceptable margin of error (e.g. 0.1 for ±10 points)

Concrete example: to estimate a 70% satisfaction rate within ±10 points at 95% confidence, you need:

n = (1.96² × 0.7 × 0.3) / 0.1² ≈ 81 testers

To go down to ±5 points, the count quadruples: 323 testers.
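
The formula fits in a few lines of code. This is a minimal sketch, not SuperTry's built-in calculator; the function name and the rounding up to a whole tester are choices made for the example.

```python
import math


def sample_size_for_proportion(p: float, margin: float, z: float = 1.96) -> int:
    """n = Z² × p × (1 − p) / e², rounded up to a whole tester."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


print(sample_size_for_proportion(0.7, 0.10))  # 81
print(sample_size_for_proportion(0.7, 0.05))  # 323
print(sample_size_for_proportion(0.5, 0.10))  # 97
```

The last call uses p = 0.5, which maximizes p × (1 − p) and is therefore the conservative choice when you have no prior estimate of the proportion.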

Benchmarks by category

Based on SuperTry data across 2,000 campaigns in 2024-2025:

Category	Recommended size (testers)	Why
Packaging test (strong signals)	30-50	Low variance, homogeneous feedback
Cosmetic product test	50-100	Variable skin sensitivities
Food test	80-150	Highly subjective tastes
Health claim test	200+	High variance + regulatory risk
Children / baby test	100+	Safety-critical

The 4 parameters that change everything

1. Type of measure

Continuous measure (1-10 score) → smaller sample suffices
Binary measure (yes/no, bought/not bought) → larger sample required

2. Expected variance

The more heterogeneous the profiles (ages, regions, sensitivities), the larger the sample must be.
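
For an average score, the effect of variance can be read off the standard sample size formula for a mean, n = (Z × σ / e)². That formula is not given in this article, and the standard deviations and the ±0.5 point margin below are illustrative assumptions for a 1-10 score, not SuperTry data.

```python
import math


def sample_size_for_mean(std_dev: float, margin: float, z: float = 1.96) -> int:
    """n = (Z × σ / e)², rounded up to a whole tester."""
    return math.ceil((z * std_dev / margin) ** 2)


print(sample_size_for_mean(std_dev=2.0, margin=0.5))  # 62
print(sample_size_for_mean(std_dev=4.0, margin=0.5))  # 246
```

Doubling the spread of the scores roughly quadruples the required panel, which is why heterogeneous profiles are expensive to test.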

3. Effect size sought

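Detecting a 30-point gap between two versions is easy. Detecting a 5-point gap takes about 36× more testers, because the required sample grows with the inverse square of the gap: (30 / 5)² = 36.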
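
A fuller calculation uses the usual normal-approximation formula for comparing two proportions. The sketch below assumes a two-sided 5% significance level and 80% power, neither of which is specified in this article, and the baseline rate of 50% is chosen only for illustration.

```python
import math
from statistics import NormalDist


def panel_size_per_version(p_a: float, p_b: float,
                           alpha: float = 0.05, power: float = 0.80) -> int:
    """Testers needed per version to detect the gap between two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)          # desired statistical power
    variance = p_a * (1 - p_a) + p_b * (1 - p_b)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p_a - p_b) ** 2)


large_gap = panel_size_per_version(0.50, 0.80)  # 30-point gap
small_gap = panel_size_per_version(0.50, 0.55)  # 5-point gap
print(large_gap, small_gap, round(small_gap / large_gap, 1))
```

The ratio comes out somewhat above the 36× shortcut, because the variance term p × (1 − p) also changes between the two scenarios. The inverse-square rule remains a good first estimate.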

4. Sub-segments

If you want to analyze women 25-34 separately from men 45-54, each sub-segment must reach the minimum size on its own. Reaching it for the overall sample is not enough.
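
When testers are recruited without quotas, the rarest segment you want to analyze sets the overall panel size. The sketch below is only an illustration: the segment shares are hypothetical, and the minimum of 81 is taken from the ±10 point example earlier in this article.

```python
import math


def total_panel_size(min_per_segment: int, segment_share_pct: dict[str, int]) -> int:
    """Overall sample needed so that the rarest analyzed segment reaches the minimum."""
    smallest_share_pct = min(segment_share_pct.values())
    return math.ceil(min_per_segment * 100 / smallest_share_pct)


# Hypothetical panel mix, in percent of all recruited testers.
shares = {"women 25-34": 20, "men 45-54": 10}
print(total_panel_size(81, shares))  # 810
```

Recruiting with a quota per segment avoids this inflation: you then need the minimum size once per segment rather than a panel large enough for the rarest one to fill up by chance.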

The SuperTry method in 3 steps

1. Define the hypothesis: "70% of testers will prefer version A" is a testable hypothesis.
2. Compute the minimum size with the formula above (or our built-in calculator).
3. Run in two waves: recruit the first 50% of the sample. If the results already meet the threshold you set in advance, stop. Otherwise, complete the test with the second wave.

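This approach divides the average budget by 1.5, with no reliability trade-off provided the stopping rule is fixed before the first wave and uses a stricter threshold than a single-look test. Checking results midway with the usual threshold and stopping on a lucky streak inflates the false-positive rate.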
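
The cost of looking at the data twice can be checked with a short simulation. The sketch below is an illustration under stated assumptions, not SuperTry's actual stopping rule: it assumes two waves of equal size, no real difference between versions, and a result summarized as a z-score. The value 2.178 is the standard Pocock boundary for two looks at a 5% two-sided level.

```python
import random


def false_positive_rate(threshold: float, simulations: int = 100_000, seed: int = 0) -> float:
    """Share of no-difference tests wrongly declared 'clear' under a two-wave design."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(simulations):
        wave_1 = rng.gauss(0, 1)  # z-score of the first wave, no real effect
        wave_2 = rng.gauss(0, 1)  # z-score of the second wave, no real effect
        interim_z = wave_1
        final_z = (wave_1 + wave_2) / 2 ** 0.5  # pooled z-score of both equal waves
        # The test is declared "clear" at the interim look or, failing that, at the end.
        if abs(interim_z) > threshold or abs(final_z) > threshold:
            false_positives += 1
    return false_positives / simulations


print(false_positive_rate(1.96))   # usual threshold at both looks: around 0.08
print(false_positive_rate(2.178))  # Pocock threshold for two looks: around 0.05
```

With the usual 1.96 threshold applied at both looks, the error rate drifts well above the nominal 5%. Raising the bar at each look brings it back, at the price of a slightly harder threshold to clear.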

Bottom line

A good sample size is neither "30 by default" nor "as many as possible". It's the result of a calculation tied to what you want to prove. A few minutes of upfront methodology beat weeks of re-testing caused by a poorly calibrated sample.
