exd telemetry export

Batch query against object-store data. The escape hatch for analyses that aren't expressible as named queries.

Synopsis

exd telemetry export
--source <uri>
--query <sql>
[--engine duckdb|snowflake|bigquery|databricks|redshift]
[--output <path>]
[--format human|json]

Description

For Rung 3 deployments where evaluation records live in object storage (S3, GCS, Azure Blob), export runs an arbitrary SQL query against them via the configured engine (DuckDB by default). The --query value is read from the supplied string or, if it begins with @, from a file.

This is intentionally a power tool. Use the canned queries (summary, srm, etc.) for routine work — they're stable, give T-code diagnostics, and round-trip the same provenance envelope as everything else. Reach for export when:

  • You need a join the catalog doesn't express.
  • You need engine-specific optimizer hints or sampling.
  • You're computing something the built-in queries don't cover (e.g. cohort-based variant decay, attribute-driven funnels).

The SQL surface is NOT stable across versions. Different engines have different dialects; column names in the underlying Parquet are versioned together with the warehouse contract. Pin your queries to a known binary version.

Use cases

  • Cohort analysis. "Of the users who saw treat_a on day 0, what variant did they see on day 7?"

    exd telemetry export --source s3://acme-flags/records/ \
      --query @cohort-day7.sql --engine duckdb --output /tmp/cohort.csv
  • Per-attribute funnel. "Variant breakdown by user.country, for users who also evaluated the pricing flag."

  • One-off audit. "Show me every record with evaluation_reason = 'attribute_type_mismatch' in the last 90 days."

  • Backfill / replay. Re-derive an aggregate from the raw Parquet because the canned query's result wasn't persisted at the time.
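The one-off audit above can be sketched as a query file. This is an illustration, not the documented surface: evaluation_reason and the year/month/day partition columns come from the warehouse contract described below, but the table name records and the date arithmetic (DuckDB dialect, the default engine) are assumptions.

```shell
# Write the audit query to a file so it can be passed as --query @mismatch-audit.sql.
# "records" as the table name is illustrative; the SQL surface is not stable
# across versions, so pin this to a known binary version.
cat > mismatch-audit.sql <<'SQL'
SELECT *
FROM records
WHERE evaluation_reason = 'attribute_type_mismatch'
  AND make_date(year, month, day) >= current_date - INTERVAL 90 DAY
SQL
```

It would then be run with something like `exd telemetry export --source s3://acme-flags/records/ --query @mismatch-audit.sql --output mismatch.csv`.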

Flags

--source <uri>
    Required. Object-store URI (s3://…, gs://…, az://…) or a local glob (file:///path/*.parquet). Engines may accept multiple sources; pass --source multiple times.

--query <sql>
    Required. SQL string. If it begins with @, it is treated as a file path: --query @./queries/cohort.sql.

--engine <name>
    Optional; default duckdb. One of duckdb (local), snowflake, bigquery, databricks, redshift.

--output <path>
    Optional; default stdout. Destination file. Format is inferred from the extension (.json, .csv, .parquet).

--format <fmt>
    Optional. One of human, json; see Output below.

Underlying schema

Records on disk follow the warehouse contract: the Parquet column set mirrors the evaluation record schema, with the addition of partition columns (year, month, day, namespace).
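Assuming a Hive-style partition layout (an assumption — the warehouse contract defines the actual on-disk arrangement), a record file path would look like:

```
s3://acme-flags/records/year=2024/month=06/day=14/namespace=web/part-00000.parquet
```

Filtering on the partition columns in the WHERE clause lets the engine prune partitions instead of scanning every file.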

Output

--format json wraps the query result in the standard provenance envelope, with result.rows carrying the rows. --format human prints a tab-aligned table.
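When scripting against --format json, the rows can be pulled out of the envelope with jq. The envelope below is a simulated stand-in — only result.rows (and error, on failure) are documented here; the rest of the shape is illustrative.

```shell
# Simulated envelope; in practice this would come from
#   exd telemetry export ... --format json
envelope='{"result":{"rows":[{"variant":"treat_a","n":120}]}}'

# Extract just the rows, compact, for downstream tooling.
printf '%s' "$envelope" | jq -c '.result.rows'
# → [{"variant":"treat_a","n":120}]
```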

Stability

  • Set of supported engines: engines may be added in a minor version and removed only in a major version.
  • SQL surface (column names, function set): NOT stable. Pin to a binary version.
  • Output envelope: stable.

Exit codes

See telemetry exit codes. Engine errors produce exit code 3 and include the engine's error message under error in the JSON envelope.
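In a script, the engine-error path can be handled by branching on the exit code and reading error from the envelope. A minimal sketch, assuming jq is available; the source URI and query file q.sql are placeholders:

```shell
# Capture the JSON envelope, then branch on the exit code.
exd telemetry export \
  --source s3://acme-flags/records/ \
  --query @q.sql \
  --format json > out.json
rc=$?

if [ "$rc" -eq 3 ]; then
  # Exit code 3: engine error; the engine's message is under "error".
  jq -r '.error' out.json >&2
fi
```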

See also