How we test
Katalyst follows TDD: new behavior arrives with a failing test first. Two traits
shape the suite beyond that. Tests are the source of truth, and many of our
docs are generated from them: the reference pages come from the check and
inspector registries, and the worked examples come from the example registry,
so a green suite keeps behavior and its documentation honest at once. And we
test at seams, the named interfaces a katalyst run flows through, with a
small set of styles that interlock into end-to-end coverage.
Testing seams
A katalyst run flows through a pipeline of narrow, independently testable
boundaries, such as config loading, frontmatter parsing, and the check engine,
composed behind the CLI. We unit-test each major module under internal/
against its own boundary, with inline literals or a small scaffolded input, so
each assertion stays fast and close to the code that can break it.
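
As a rough illustration of a seam-level unit test with inline literals, here is a minimal sketch. The `splitFrontmatter` function is a hypothetical stand-in, not Katalyst's actual frontmatter API, and the file is assumed to be saved with a `_test.go` suffix and run with `go test`.

```go
package main

import (
	"strings"
	"testing"
)

// splitFrontmatter is a hypothetical stand-in for a frontmatter seam. It
// separates a leading "---" delimited block from the body of a document.
func splitFrontmatter(doc string) (meta, body string, ok bool) {
	if !strings.HasPrefix(doc, "---\n") {
		return "", doc, false
	}
	rest := strings.TrimPrefix(doc, "---\n")
	meta, body, found := strings.Cut(rest, "\n---\n")
	if !found {
		return "", doc, false
	}
	return meta, body, true
}

// TestSplitFrontmatter exercises the seam directly with inline literals,
// so a failure points at this one function rather than at the whole CLI.
func TestSplitFrontmatter(t *testing.T) {
	tests := []struct {
		name, doc, wantMeta, wantBody string
		wantOK                        bool
	}{
		{"with frontmatter", "---\ntitle: x\n---\nbody\n", "title: x", "body\n", true},
		{"no frontmatter", "just body\n", "", "just body\n", false},
		{"unterminated", "---\ntitle: x\n", "", "---\ntitle: x\n", false},
	}
	for _, tc := range tests {
		t.Run(tc.name, func(t *testing.T) {
			meta, body, ok := splitFrontmatter(tc.doc)
			if meta != tc.wantMeta || body != tc.wantBody || ok != tc.wantOK {
				t.Errorf("splitFrontmatter(%q) = (%q, %q, %v), want (%q, %q, %v)",
					tc.doc, meta, body, ok, tc.wantMeta, tc.wantBody, tc.wantOK)
			}
		})
	}
}
```

The table keeps every input visible next to its expectation, which is the main point of preferring inline literals over shared fixtures at this level.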
A behavior is covered when its owning seam has a focused unit test and the CLI or docs tests exercise the composition (below). Push each assertion to the lowest level that can make it; reserve the CLI for what genuinely needs the whole pipeline. Known coverage gaps are tracked in issue #86.
How they interlock
- Unit at each module. Fast, precise tests against one module’s exported API.
- Integration at the CLI. The cmd tests drive the real Cobra root over a temp project, exercising config, storage, the engine, and the checks together. Snapshot the user-facing text; property-test the behavior (exit codes, side effects, query semantics).
- Parity guards at the registries. registry_test asserts every check type and inspector has a descriptor and a library, so a check type or inspector cannot ship orphaned or drift out of sync with the docs.
- Generated docs and dogfood close the loop. docs-gen-check fails on doc drift, and CI runs katalyst check over the project’s own .katalyst/ corpus, validating real content.
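
The parity-guard idea can be sketched roughly as follows. The three maps and the check names in them are hypothetical placeholders for the real registries, whose shape is not described here; only the pattern (every registered type must appear in every companion table) is the point.

```go
package main

import (
	"sort"
	"testing"
)

// Hypothetical stand-ins for the check registry and its companion tables.
// The names and value types are illustrative assumptions only.
var (
	checkTypes  = map[string]bool{"required-field": true, "link-target": true}
	descriptors = map[string]string{"required-field": "desc", "link-target": "desc"}
	libraries   = map[string]string{"required-field": "lib", "link-target": "lib"}
)

// missingFrom returns the sorted keys of want that are absent from have.
func missingFrom(want map[string]bool, have map[string]string) []string {
	var missing []string
	for name := range want {
		if _, ok := have[name]; !ok {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}

// TestRegistryParity fails as soon as a check type is registered without
// its descriptor or library, so the gap cannot reach the generated docs.
func TestRegistryParity(t *testing.T) {
	if m := missingFrom(checkTypes, descriptors); len(m) > 0 {
		t.Errorf("check types without a descriptor: %v", m)
	}
	if m := missingFrom(checkTypes, libraries); len(m) > 0 {
		t.Errorf("check types without a library: %v", m)
	}
}
```

Sorting the missing names keeps the failure message stable across runs, since Go map iteration order is randomized.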
Test styles
Any style can apply anywhere, but each has a usual home. Three carry most of the suite.
- Behavior (property) tests. The default, across the internal/ modules and for CLI behavior: assert what the code does (outputs, exit codes, semantics).
- Text-contract snapshots. At the CLI: pin user-facing output (help, list and show, diagnostics) with golden files under cmd/testdata/snapshots/, and keep the behavior behind that text as property tests. Snapshot the text, property-test the behavior.
- Executable examples that double as docs. In the docs: the internal/examples registry runs a real command over a tiny corpus; a golden test gates the output and cmd/gendocs renders the same run into the published docs, so an example cannot drift. Embed one into a prose page with {{< katalyst-example "id" >}} (output only) or {{< katalyst-example-full "id" >}} (full corpus and command). See internal/examples/AGENTS.md for how to add one.
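
To make the split between the first two styles concrete, here is a small sketch. The `run` function is a hypothetical stand-in for a CLI entry point, and the inline `golden` constant stands in for a golden file; neither reflects Katalyst's real command wiring.

```go
package main

import (
	"fmt"
	"testing"
)

// run is a hypothetical stand-in for a CLI entry point. It returns the
// user-facing text and the process exit code.
func run(args []string) (output string, code int) {
	if len(args) == 0 {
		return "usage: tool <command>\n", 2
	}
	return fmt.Sprintf("ran %s\n", args[0]), 0
}

// Behavior (property) test: asserts the exit code and ignores the wording,
// so rephrasing a message does not break it.
func TestRunExitCodes(t *testing.T) {
	for _, tc := range []struct {
		args []string
		want int
	}{
		{nil, 2},
		{[]string{"check"}, 0},
	} {
		if _, got := run(tc.args); got != tc.want {
			t.Errorf("run(%v) exit code = %d, want %d", tc.args, got, tc.want)
		}
	}
}

// Text-contract test: pins the exact wording and nothing else. In a real
// suite the expected text would live in a golden file, not a constant.
func TestRunUsageText(t *testing.T) {
	const golden = "usage: tool <command>\n"
	if got, _ := run(nil); got != golden {
		t.Errorf("usage text = %q, want %q", got, golden)
	}
}
```

Keeping the two assertions in separate tests means a wording change touches only the snapshot, while a behavior regression fails the property test regardless of what the text says.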
The golden-file workflow
Golden fixtures, snapshots, the generated reference, and the worked examples are generated, never hand-written. Regenerate, then review the diff as the contract before committing:
go test ./cmd -run TestThing -update # snapshots
make docs-gen # generated reference and examples

make docs-gen-check, run in CI, fails if the committed output drifts from its
generators.
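
One common way to back an -update flag like the one above is sketched below. How the real suite wires the flag is an assumption here; `checkGolden` and the test name are hypothetical, and the test writes into a temporary directory only so that the sketch runs on its own.

```go
package main

import (
	"bytes"
	"flag"
	"fmt"
	"os"
	"path/filepath"
	"testing"
)

// update mirrors the -update flag from the command above. Its wiring in
// the real suite is an assumption of this sketch.
var update = flag.Bool("update", false, "rewrite golden files instead of comparing")

// checkGolden compares got with the golden file at path. When rewrite is
// true it writes got to path instead, creating parent directories as needed.
func checkGolden(path string, got []byte, rewrite bool) error {
	if rewrite {
		if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
			return err
		}
		return os.WriteFile(path, got, 0o644)
	}
	want, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	if !bytes.Equal(got, want) {
		return fmt.Errorf("%s: output differs from golden file; rerun with -update and review the diff", path)
	}
	return nil
}

// TestGoldenRoundTrip seeds a golden file, then compares against it. A real
// snapshot test would skip the seeding step and use a committed path.
func TestGoldenRoundTrip(t *testing.T) {
	path := filepath.Join(t.TempDir(), "help.golden")
	got := []byte("usage: tool <command>\n") // placeholder for real command output

	// Simulates a run with -update: the golden file is (re)generated.
	if err := checkGolden(path, got, true); err != nil {
		t.Fatal(err)
	}
	// Ordinary runs compare, unless -update was passed on the command line.
	if err := checkGolden(path, got, *update); err != nil {
		t.Fatal(err)
	}
}
```

Because the helper never edits a golden file unless asked to, the regenerated diff stays the single place where a change to user-facing text is reviewed.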
Testing conventions
Mechanical conventions for writing tests live in the root
AGENTS.md; see its
Testing section. Per-package testdata/AGENTS.md files contain additional
conventions as needed.