4. Fixtures and automated tests
Lint proves the manifest is well-formed. The schema test in Chapter 3 proves the application sends the right shape of context. Neither proves the manifest does what you mean: that a US user with user.id = u-1 actually resolves to control, that a non-US user falls through to the default, that a missing user.id doesn't sneak into a bucket by accident.
The tool for that is exd fixtures. It walks the flag's rule chain, synthesizes one representative context per rule, and emits (ctx, expected variant, why) rows in a paste-ready form — either as a --format human table you eyeball in your terminal, or as a --format rust/--format typescript const you check into the repo and consume from tests.
Critically, the synthesis is deterministic: same --seed and same manifest always produce byte-identical output. Fixtures live in your repo, the diff is meaningful, and a rule rewrite that changes a variant assignment shows up as a fixture diff in code review — instant regression signal.
Eyeball the generated rows first
Before you redirect anything into a file, run exd fixtures in human format and read what comes out:
$ exd fixtures welcome-banner --env production --manifest marketing
fixture                                      variant  why
──────────────────────────────────────────── ──────── ─────────────────────────────────────────────────────
user.id=u-1, user.country=US                 control  bucket=1832 ∈ [0,3299], country=US satisfies seg
user.id=u-4, user.country=US                 treat_a  bucket=5104 ∈ [3300,6599], country=US satisfies seg
user.id=u-9, user.country=US                 treat_b  bucket=8742 ∈ [6600,9999], country=US satisfies seg
user.id=u-4, user.country=DE                 control  predicate user.country=US fails → fallthrough
user.id=u-7, user.country=US                 control  predicate satisfied, bucket=7204 outside all ranges? — see synth.gap
Five rows for this flag, all matching the resolution walk you saw in Chapter 1:
- Rows 1-3: one user per bucket rule. The why column shows the exact bucket hash, the range it fell into, and that user.country = US satisfies the predicate gate.
- Row 4: same user as row 2 (u-4) but user.country = DE. The predicate fails for every segment, every rule misses, and resolution falls through to the env's variant = "control". This is the geo-gate regression test: if someone weakens the predicate to accept other countries, this row's expected variant changes and the test fails.
- Row 5: the predicate-satisfied / bucket-miss row. A user whose user.country = US satisfies every predicate but whose user.id bucket lands outside every range. There is no such user in the welcome-banner flag, because the three ranges [0,3299], [3300,6599], [6600,9999] together cover the full 0..=9999 hash space, so the synthesizer marks this row's gap explicitly. For a flag where bucket ranges don't fully tile, this row catches the off-by-one and you confirm the fallthrough behavior is what you want.
If any row looks wrong — variant mismatch, bucket arithmetic that doesn't add up, an attribute the synthesizer didn't think to populate — fix the manifest before generating the typed fixture file. The whole point of running human first is that the rows are small enough to fit on a screen and read for sense.
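The tiling claim behind row 5 is plain interval arithmetic, so it can be checked by hand or with a few lines of code. The sketch below is not part of exd: find_gaps is a hypothetical helper, and it assumes inclusive, non-overlapping ranges over a 0..=9999 hash space as described above.

```rust
/// Inclusive bucket range, e.g. (0, 3299).
type Range = (u32, u32);

/// Returns the inclusive sub-ranges of 0..=max that no bucket range covers.
/// Hypothetical helper; ranges may be given in any order.
fn find_gaps(ranges: &[Range], max: u32) -> Vec<Range> {
    let mut sorted = ranges.to_vec();
    sorted.sort();
    let mut gaps = Vec::new();
    let mut next = 0u32; // lowest bucket not yet covered
    for (start, end) in sorted {
        if start > next {
            gaps.push((next, start - 1));
        }
        next = next.max(end + 1);
    }
    if next <= max {
        gaps.push((next, max));
    }
    gaps
}

fn main() {
    // The three welcome-banner ranges tile the whole space: no gaps.
    let tiled = [(0, 3299), (3300, 6599), (6600, 9999)];
    assert!(find_gaps(&tiled, 9999).is_empty());

    // Hypothetical off-by-one: the second range starts one bucket late.
    let off_by_one = [(0, 3299), (3301, 6599), (6600, 9999)];
    println!("gaps: {:?}", find_gaps(&off_by_one, 9999));
}
```

A non-empty result means some users can satisfy every predicate and still miss every bucket, which is exactly the case the row 5 fixture exists to pin down.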
Stretch with --coverage exhaustive
The default mode (rules+fallthrough) emits one row per rule plus the fallthrough and predicate-satisfied/bucket-miss rows. For bucket-bearing rules, add --coverage exhaustive to also emit the boundary rows:
$ exd fixtures welcome-banner --env production --manifest marketing --coverage exhaustive
You'll see two extra rows per bucket range — one id whose bucket hash is exactly start, one whose hash is exactly end + 1. Those are the classic off-by-one cases. Worth running before a release.
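To see why start and end + 1 are the interesting buckets, it can help to model the range lookup directly. This is a rough sketch under assumptions: Rule, variant_for, and boundary_buckets are illustrative names, the id-to-bucket hash is not modeled at all, and the ranges are the inclusive ones from the welcome-banner example.

```rust
/// (start, end, variant) with inclusive bounds. Illustrative type, not an exd API.
type Rule = (u32, u32, &'static str);

/// First rule whose inclusive range contains the bucket, if any.
fn variant_for(bucket: u32, rules: &[Rule]) -> Option<&'static str> {
    rules
        .iter()
        .find(|(start, end, _)| (*start..=*end).contains(&bucket))
        .map(|(_, _, variant)| *variant)
}

/// For each rule, the buckets `start` and `end + 1`, dropping values
/// that fall outside the 0..=max hash space.
fn boundary_buckets(rules: &[Rule], max: u32) -> Vec<u32> {
    rules
        .iter()
        .flat_map(|(start, end, _)| [*start, *end + 1])
        .filter(|bucket| *bucket <= max)
        .collect()
}

fn main() {
    let rules: [Rule; 3] = [
        (0, 3299, "control"),
        (3300, 6599, "treat_a"),
        (6600, 9999, "treat_b"),
    ];
    for bucket in boundary_buckets(&rules, 9999) {
        println!("bucket {bucket} -> {:?}", variant_for(bucket, &rules));
    }
}
```

When ranges tile cleanly, end + 1 of one rule is the start of the next, so each boundary resolves to the neighboring variant; a lookup that returns None at a boundary points to a gap.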
Generate a typed fixture file
Once the human-format rows look right, regenerate as a Rust or TypeScript const and commit it:
Rust — tests/welcome_banner_fixtures.rs
$ exd fixtures welcome-banner --env production --manifest marketing --format rust > tests/welcome_banner_fixtures.rs
// Generated by `exd fixtures welcome-banner --env production --format rust`.
// Manifest fingerprint: marketing@a1b2c3 — rerun if salts, ranges, or rule order change.
pub const WELCOME_BANNER_FIXTURES: &[(&[(&str, &str)], &str)] = &[
    // why: bucket=1832 ∈ [0,3299], user.country=US satisfies welcome-banner-bucket-control
    (&[("user.id", "u-1"), ("user.country", "US")], "control"),
    // why: bucket=5104 ∈ [3300,6599], user.country=US satisfies welcome-banner-bucket-treat-a
    (&[("user.id", "u-4"), ("user.country", "US")], "treat_a"),
    // why: bucket=8742 ∈ [6600,9999], user.country=US satisfies welcome-banner-bucket-treat-b
    (&[("user.id", "u-9"), ("user.country", "US")], "treat_b"),
    // why: predicate user.country=US fails → fallthrough
    (&[("user.id", "u-4"), ("user.country", "DE")], "control"),
];
TypeScript — tests/welcome-banner.fixtures.ts
$ exd fixtures welcome-banner --env production --manifest marketing --format typescript > tests/welcome-banner.fixtures.ts
The TS variant emits the same data as a ReadonlyArray<{ ctx, variant, why }> const.
Consume the fixtures from tests
The fixture const is data; the test is one loop over it.
Rust — tests/welcome_banner.rs
mod welcome_banner_fixtures; // pulls in WELCOME_BANNER_FIXTURES

use welcome_banner_fixtures::WELCOME_BANNER_FIXTURES;
use exd_client::{EvalContext, ExdClient};

async fn client() -> ExdClient {
    ExdClient::builder()
        .namespace_dir("marketing") // lints + loads the namespace directory
        .environment("production")
        .build()
        .await
        .expect("manifest must lint cleanly")
}

#[tokio::test]
async fn welcome_banner_fixtures_resolve_as_expected() {
    let c = client().await;
    for (attrs, expected) in WELCOME_BANNER_FIXTURES {
        let mut b = EvalContext::builder();
        for (k, v) in *attrs {
            b = b.str(*k, *v);
        }
        let got = c.string_flag("welcome-banner", &b.build(), String::new()).await;
        assert_eq!(
            got, *expected,
            "fixture mismatch for {:?}: expected {}, got {}",
            attrs, expected, got
        );
    }
}
TypeScript — tests/welcome-banner.test.ts
import { describe, expect, it } from "vitest";
import { ExdClient } from "@exd/client";
import { loadFileUri } from "@exd/client/server";
import { WELCOME_BANNER_FIXTURES } from "./welcome-banner.fixtures";

describe("welcome-banner", () => {
  it.each(WELCOME_BANNER_FIXTURES)(
    "ctx=$ctx → variant=$variant ($why)",
    async ({ ctx, variant }) => {
      const files = await loadFileUri("file://./marketing");
      const c = await ExdClient.create("marketing", files);
      expect(c.evalString("welcome-banner", "production", ctx, "")).toBe(variant);
    },
  );
});
Both run as part of your existing cargo test / npm test. CI already covers them — no new workflow to wire up.
When synthesis can't pick a value
Some predicate shapes can't be auto-synthesized: deep negation, unbounded numeric domains, conflicting constraints joined by and, unreachable buckets. The row is still emitted, but with TBD placeholders:
// why: predicate too complex to auto-synthesize (deep negation) — fill manually
(&[("user.id", "TBD"), ("user.country", "TBD")], "TBD"),
Exit code stays 0 on TBD rows — synthesis failure is a known limitation, not a tool error. CI consumers that want loud failure on TBD rows can grep for "TBD" in the generated output:
exd fixtures welcome-banner --env production --manifest marketing --format rust \
| grep -q '"TBD"' && { echo "TBD fixture rows present — fill manually"; exit 1; }
The welcome-banner flag has no TBD rows (every predicate is a plain eq), but the pattern is worth knowing about for flags that grow more complex predicates later.
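If you would rather catch placeholders inside the test suite than with grep, a guard over the fixture const is one option. This is a minimal sketch, assuming the tuple shape of the generated Rust const shown earlier; tbd_rows and SAMPLE are hypothetical names, and SAMPLE is made-up data used only to exercise the guard.

```rust
/// Same shape as the generated const: (attribute pairs, expected variant).
type Fixture = (&'static [(&'static str, &'static str)], &'static str);

/// Indices of rows where any attribute value or the expected variant is "TBD".
/// Hypothetical helper, not something exd generates.
fn tbd_rows(fixtures: &[Fixture]) -> Vec<usize> {
    fixtures
        .iter()
        .enumerate()
        .filter(|(_, (attrs, variant))| {
            *variant == "TBD" || attrs.iter().any(|(_, value)| *value == "TBD")
        })
        .map(|(index, _)| index)
        .collect()
}

// Made-up sample: one resolved row and one placeholder row.
const SAMPLE: &[Fixture] = &[
    (&[("user.id", "u-1"), ("user.country", "US")], "control"),
    (&[("user.id", "TBD"), ("user.country", "TBD")], "TBD"),
];

fn main() {
    let unresolved = tbd_rows(SAMPLE);
    println!("rows still needing manual values: {:?}", unresolved);
}
```

In a real test you would pass the generated const instead of SAMPLE and assert that the returned list is empty, so an unfilled row fails the build with its index in the message.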
Regenerate on manifest changes
The generated header carries a Manifest fingerprint: <slug>@<6hex> line. A rule order change, a salt change, a bucket-range change, or a rule rewrite that changes a variant assignment will all show up as a fingerprint shift plus a row-level diff in the fixture file — instant signal in code review.
Common practice: re-run exd fixtures whenever you edit a flag's TOML, and commit the regenerated fixture file alongside the manifest edit in the same PR. The reviewer compares the two diffs and sees both the cause (TOML change) and the effect (which fixtures shifted) without simulating evaluation in their head.
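A reviewer-side script or test can also read the fingerprint out of the header rather than eyeballing it. The sketch below assumes only the header format quoted above (Manifest fingerprint: <slug>@<6hex>); parse_fingerprint is a hypothetical helper, and the second fingerprint value is invented for illustration.

```rust
/// Extracts (slug, hex) from the first line containing
/// "Manifest fingerprint: <slug>@<6hex>". Hypothetical helper.
fn parse_fingerprint(source: &str) -> Option<(&str, &str)> {
    const MARKER: &str = "Manifest fingerprint: ";
    let line = source.lines().find(|l| l.contains(MARKER))?;
    let rest = &line[line.find(MARKER)? + MARKER.len()..];
    // The fingerprint is the first whitespace-delimited token after the marker.
    let token = rest.split_whitespace().next()?;
    let (slug, hex) = token.split_once('@')?;
    let valid = !slug.is_empty()
        && hex.len() == 6
        && hex.chars().all(|c| c.is_ascii_hexdigit());
    valid.then_some((slug, hex))
}

fn main() {
    let committed = "// Manifest fingerprint: marketing@a1b2c3";
    let regenerated = "// Manifest fingerprint: marketing@d4e5f6"; // invented value
    let before = parse_fingerprint(committed);
    let after = parse_fingerprint(regenerated);
    if before != after {
        println!("fingerprint changed: {:?} -> {:?}", before, after);
    }
}
```

Comparing the parsed values from the committed file and a fresh run gives a cheap "fixtures are stale" check; the row-level diff still tells you what actually changed.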
Next
Chapter 5 — Roll out with the testing attribute. The behavior is now pinned by tests; next, take the flag to production safely.