Skip to main content

exd telemetry srm

Sample-ratio mismatch check: did your rollout actually serve the split you declared?

Synopsis

exd telemetry srm --flag <key>
[--expected '<key>=<frac>,...']
[--namespace <slug>] [--environment <env>]
[--since <duration> | --from <ts> --to <ts>]
[--source <uri>]...
[--format human|json]
[--thresholds <path>]
[--engine <name>]
[--fail-on-error]

See common flags for shared options.

Description

Runs a chi-square test on the observed variant distribution against an expected distribution. If --expected is omitted, the expected proportions are derived from the manifest's bucket allocation for the flag.

A "passing" rollout has p-value ≥ srm.significance_level (default 0.001). A "failing" rollout — observed split significantly different from expected — emits T001. Below the srm.min_sample_size threshold, the test cannot conclude and emits T002 instead.

Use cases

  • Post-rollout verdict. "We declared 33/33/33; is that what actually shipped?"

    exd telemetry srm --flag onboarding-banner --since 24h
  • Caching layer audit. If a CDN or app cache bypasses some users, the SRM test fires T001 even though the bucketing math is correct — symptom of a delivery problem, not a flag problem.

  • CI gate on rollout health. Fail the deploy if SRM fires:

    exd telemetry srm --flag onboarding-banner --since 1h --fail-on-error
  • Custom expected split. A non-uniform split that doesn't match the manifest (e.g. a "ramp" rollout pushed via segment override):

    exd telemetry srm --flag onboarding-banner --expected 'treat_a=0.10,treat_b=0.10,control=0.80' --since 1h

Subcommand-specific flags

FlagRequiredNotes
--flag <key>yesThe flag under test.
--expected <csv>no<variant>=<fraction> pairs, comma-separated. Fractions must sum to 1.0 ± 0.001. If omitted, derived from the manifest.

Result fields (--format json)

FieldTypeNotes
expectedmapVariant key → expected fraction.
observedmapVariant key → observed count.
nintegerTotal evaluations in the window.
chi_squarefloatTest statistic.
p_valuefloatTwo-sided p-value.
verdictenum"pass" | "fail" | "insufficient".
thresholdfloatThe significance level used.

Diagnostics

  • T001 SRM detected (verdict = "fail")
  • T002 insufficient sample size (verdict = "insufficient")

Example

$ exd telemetry srm --flag onboarding-banner --since 24h
SRM check for 'onboarding-banner' (last 24h, n=198,432, threshold p=0.001)

variant expected observed observed_pct
treat_a 33.3% 91,008 45.9%
treat_b 33.3% 90,521 45.6%
control 33.4% 16,903 8.5%

chi-square = 142,318.4 p < 1e-10 verdict: FAIL

T001 warning: sample-ratio mismatch — observed split deviates from expected.
Likely causes: cache bypass, mid-flight rule change, SDKs on stale manifest (check version-skew).

Exit codes

See telemetry exit codes. With --fail-on-error, a T001 emission upgrades exit code to 1.

See also

  • summary — raw per-variant counts.
  • version-skew — manifest-version distribution behind the split. SRM + skew often correlate.
  • Thresholdssrm.significance_level, srm.min_sample_size defaults.
  • T001, T002.