exd telemetry srm
Sample-ratio mismatch check: did your rollout actually serve the split you declared?
Synopsis
exd telemetry srm --flag <key>
    [--expected '<key>=<frac>,...']
    [--namespace <slug>] [--environment <env>]
    [--since <duration> | --from <ts> --to <ts>]
    [--source <uri>]...
    [--format human|json]
    [--thresholds <path>]
    [--engine <name>]
    [--fail-on-error]
See common flags for shared options.
Description
Runs a chi-square test on the observed variant distribution against an expected distribution. If --expected is omitted, the expected proportions are derived from the manifest's bucket allocation for the flag.
A rollout passes when the p-value is at least srm.significance_level (default 0.001). A rollout fails when the p-value falls below that level, meaning the observed split differs significantly from the expected one; this emits T001. When the sample is smaller than srm.min_sample_size, the test is inconclusive and emits T002 instead.
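The decision rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not exd's implementation: the names `chi2_sf` and `srm_check` are hypothetical, the `min_sample_size` default of 1000 is a placeholder (the real default comes from the thresholds file), the minimum is assumed to apply to the total count n, and the degrees of freedom are assumed to be the number of variants minus one.

```python
import math


def chi2_sf(stat: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with df >= 1 degrees of freedom."""
    if stat <= 0:
        return 1.0
    x = stat / 2.0
    # Closed-form base cases (df = 2 and df = 1), then the recurrence
    # Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))
    while a < df / 2.0:
        q += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return q


def srm_check(observed, expected, significance_level=0.001, min_sample_size=1000):
    """Pearson chi-square test of observed counts against expected fractions.

    min_sample_size=1000 is a placeholder, not the tool's documented default.
    Expected fractions are assumed to be strictly positive.
    """
    n = sum(observed.values())
    if n < min_sample_size:
        # Too little data to conclude anything (the T002 case).
        return {"n": n, "chi_square": None, "p_value": None, "verdict": "insufficient"}
    chi_square = sum(
        (observed.get(variant, 0) - n * frac) ** 2 / (n * frac)
        for variant, frac in expected.items()
    )
    p_value = chi2_sf(chi_square, df=len(expected) - 1)
    verdict = "pass" if p_value >= significance_level else "fail"
    return {"n": n, "chi_square": chi_square, "p_value": p_value, "verdict": verdict}


result = srm_check(
    observed={"treat_a": 91008, "treat_b": 90521, "control": 16903},
    expected={"treat_a": 0.333, "treat_b": 0.333, "control": 0.334},
)
print(result["verdict"])  # fail
```

The standard library has no chi-square distribution, so the sketch computes the tail probability with the incomplete-gamma recurrence; where SciPy is available, `scipy.stats.chisquare` covers the same ground.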
Use cases
- Post-rollout verdict. "We declared 33/33/33; is that what actually shipped?"
  exd telemetry srm --flag onboarding-banner --since 24h
- Caching layer audit. If a CDN or app cache bypasses some users, the SRM test fires T001 even though the bucketing math is correct, which is a symptom of a delivery problem, not a flag problem.
- CI gate on rollout health. Fail the deploy if SRM fires:
  exd telemetry srm --flag onboarding-banner --since 1h --fail-on-error
- Custom expected split. A non-uniform split that doesn't match the manifest (e.g. a "ramp" rollout pushed via segment override):
  exd telemetry srm --flag onboarding-banner --expected 'treat_a=0.10,treat_b=0.10,control=0.80' --since 1h
Subcommand-specific flags
| Flag | Required | Notes |
|---|---|---|
| --flag <key> | yes | The flag under test. |
| --expected <csv> | no | <variant>=<fraction> pairs, comma-separated. Fractions must sum to 1.0 ± 0.001. If omitted, derived from the manifest. |
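The --expected format and its sum constraint can be illustrated with a small parser. This is a sketch of the documented rule only: `parse_expected` is a hypothetical helper, and the handling of duplicate keys and out-of-range fractions is an assumption, since the table does not say how exd treats them.

```python
def parse_expected(spec: str, tolerance: float = 0.001) -> dict:
    """Parse '<variant>=<fraction>,...' and check the fractions sum to 1.0 +/- tolerance."""
    expected = {}
    for pair in spec.split(","):
        key, sep, frac = pair.strip().partition("=")
        if not sep or not key:
            raise ValueError(f"malformed pair: {pair!r}")
        if key in expected:
            # Assumption: a repeated variant key is treated as an error.
            raise ValueError(f"duplicate variant: {key!r}")
        value = float(frac)  # raises ValueError on non-numeric input
        if not 0.0 <= value <= 1.0:
            # Assumption: fractions outside [0, 1] are rejected.
            raise ValueError(f"fraction out of range for {key!r}: {value}")
        expected[key] = value
    total = sum(expected.values())
    # The small epsilon keeps sums that sit exactly on the tolerance boundary
    # from being rejected by floating-point rounding.
    if abs(total - 1.0) > tolerance + 1e-12:
        raise ValueError(f"fractions sum to {total}, expected 1.0 +/- {tolerance}")
    return expected


split = parse_expected("treat_a=0.10,treat_b=0.10,control=0.80")
print(sorted(split))  # ['control', 'treat_a', 'treat_b']
```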
Result fields (--format json)
| Field | Type | Notes |
|---|---|---|
| expected | map | Variant key → expected fraction. |
| observed | map | Variant key → observed count. |
| n | integer | Total evaluations in the window. |
| chi_square | float | Test statistic. |
| p_value | float | Two-sided p-value. |
| verdict | enum | One of "pass", "fail", "insufficient". |
| threshold | float | The significance level used. |
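As a rough illustration of consuming these fields, the sketch below turns a JSON result into per-variant deviations. The payload is hypothetical (a made-up two-variant flag, not captured exd output), and it assumes the fields arrive as one flat JSON object; `variant_deviations` is an illustrative helper name.

```python
import json

# Hypothetical payload built from the field table; values are illustrative only.
SAMPLE_REPORT = """
{
  "expected": {"control": 0.5, "treatment": 0.5},
  "observed": {"control": 5210, "treatment": 4790},
  "n": 10000,
  "chi_square": 17.64,
  "p_value": 2.7e-05,
  "verdict": "fail",
  "threshold": 0.001
}
"""


def variant_deviations(report_json: str) -> dict:
    """Map each variant to (observed share - expected fraction)."""
    report = json.loads(report_json)
    n = report["n"]
    deviations = {}
    for variant, expected_frac in report["expected"].items():
        count = report["observed"].get(variant, 0)
        share = count / n if n else 0.0
        deviations[variant] = share - expected_frac
    return deviations


for variant, delta in variant_deviations(SAMPLE_REPORT).items():
    # Print the deviation in percentage points, signed.
    print(f"{variant}: {delta * 100:+.1f} pp")
```

Reporting the signed deviation per variant shows which arm is over- or under-served, which the single chi_square value does not.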
Diagnostics
- T001: SRM detected (verdict = "fail")
- T002: insufficient sample size (verdict = "insufficient")
Example
$ exd telemetry srm --flag onboarding-banner --since 24h
SRM check for 'onboarding-banner' (last 24h, n=198,432, threshold p=0.001)
variant    expected    observed    observed_pct
treat_a    33.3%       91,008      45.9%
treat_b    33.3%       90,521      45.6%
control    33.4%       16,903      8.5%
chi-square = 55,228.8    p < 1e-10    verdict: FAIL
T001 warning: sample-ratio mismatch — observed split deviates from expected.
Likely causes: cache bypass, mid-flight rule change, SDKs on stale manifest (check version-skew).
Exit codes
See telemetry exit codes. With --fail-on-error, emitting T001 raises the exit code to 1.
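A deploy script could wrap the check along these lines. This is a hedged sketch: `srm_gate_command` and `deploy_allowed` are hypothetical helpers, the command uses only flags from the synopsis, and treating every non-zero exit code as a blocked deploy is an assumption, since the other telemetry exit codes are documented elsewhere.

```python
import subprocess


def srm_gate_command(flag: str, since: str) -> list:
    """Build the argv for an SRM check that exits non-zero on T001."""
    return ["exd", "telemetry", "srm", "--flag", flag, "--since", since, "--fail-on-error"]


def deploy_allowed(flag: str, since: str, runner=subprocess.run) -> bool:
    """Return True only when the SRM check exits with code 0.

    The runner parameter exists so the gate can be exercised without the
    exd binary installed (for example with a stub in tests).
    """
    completed = runner(srm_gate_command(flag, since), capture_output=True, text=True)
    # Assumption: any non-zero exit code blocks the deploy, not just 1.
    return completed.returncode == 0
```

In a pipeline step, calling `deploy_allowed("onboarding-banner", "1h")` and exiting non-zero when it returns False would mirror the CI gate use case.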
See also
- summary: raw per-variant counts.
- version-skew: manifest-version distribution behind the split. SRM and skew often correlate.
- Thresholds: srm.significance_level, srm.min_sample_size defaults.
- T001, T002.