Profile an existing wiki with an agent
The by-hand guide has you read inspector evidence and decide the schema. This guide hands that
judgment to an agent: inspect supplies the measurements, the agent supplies
the thresholds, collection-boundary decisions, and the draft. Katalyst is the
instrument; the agent is the profiler.
The split is deliberate. Inspectors are deterministic and never recommend;
deciding that a field present in 94% of files should be required, or that a
directory should be a collection, is the agent’s call. Keep that division
and the loop stays debuggable.
1. Give the agent the raw-store evidence
Run inspect on the directory with --json so the agent gets structured
records, one per inspector, each carrying the unit count n as the
denominator:

    katalyst inspect ./wiki --json

With no project this runs the raw-source layer: file_tree maps the store
and file_content_shape summarizes selected-file content structure. Feed the
output to the agent. Tell it the contract: every record is evidence, not a
recommendation; it must choose its own thresholds and justify them.
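
One way to hand over the evidence together with the contract is to wrap the JSON in a prompt. The sketch below is a minimal illustration, assuming the output parses as a JSON array of records. The function name `build_prompt` and the `inspector` key in the sample are hypothetical, not Katalyst's documented format.

```python
import json

CONTRACT = (
    "Every record below is evidence, not a recommendation. "
    "Choose your own thresholds and justify each one."
)


def build_prompt(inspect_json: str) -> str:
    """Wrap raw `katalyst inspect --json` output in the evidence contract."""
    # Parse only to fail early on truncated output; the record contents are
    # passed through unchanged so the agent sees what inspect measured.
    records = json.loads(inspect_json)
    body = json.dumps(records, indent=2)
    return f"{CONTRACT}\n\nInspector records:\n{body}\n"


# Hypothetical sample; the real record layout may differ.
sample = '[{"inspector": "file_tree", "n": 212}]'
prompt = build_prompt(sample)
```

Putting the contract ahead of the data makes it harder for the agent to read a high percentage as an instruction.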
2. Let the agent cluster, configure, and profile fields
A capable agent then:
- Chooses collection boundaries from the raw-source evidence. file_tree shows the directory and naming map; file_content_shape shows whether an explicit slice shares frontmatter and body conventions. The agent names the collection and drafts .katalyst/storage/* pointing it at the chosen path.
- Profiles the fields by inspecting each new collection: katalyst inspect <collection> --json runs the collection layer, whose object_fields record is the per-field data dictionary (presence, types, values).
- Sets thresholds from that evidence, e.g. fields in ≥95% of items become required, a small stable value set becomes an enum, a consistent type becomes a type constraint, and drafts the .katalyst/schemas/*.
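
The threshold step can be sketched as a small function. This is an illustration only: the input shape (a mapping from field name to a presence count and a list of distinct values), the name `draft_field_rules`, and the default cutoffs are assumptions chosen for the example, not the layout of the object_fields record or values Katalyst prescribes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldRule:
    name: str
    required: bool
    enum: Optional[list]  # None means "no enum constraint"


def draft_field_rules(
    fields: dict,
    n: int,
    required_at: float = 0.95,
    max_enum_values: int = 8,
) -> list:
    """Turn per-field evidence into draft rules using explicit cutoffs."""
    rules = []
    for name, stats in sorted(fields.items()):
        # Presence is the share of the n items that carry this field.
        presence = stats["present"] / n if n else 0.0
        values = stats.get("values", [])
        # A small, non-empty value set is treated as an enum candidate.
        is_enum = 0 < len(values) <= max_enum_values
        rules.append(
            FieldRule(
                name=name,
                required=presence >= required_at,
                enum=sorted(values) if is_enum else None,
            )
        )
    return rules


# Hypothetical evidence for a 200-item collection.
evidence = {
    "title": {"present": 200, "values": []},
    "status": {"present": 188, "values": ["draft", "read", "reading"]},
}
rules = draft_field_rules(evidence, n=200)
```

With these inputs, status is present in 94% of items and stays optional under the 95% cutoff, while title becomes required. Whatever cutoffs the agent picks, it should state them alongside the draft so the decision can be reviewed.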
A prompt that works:
You are profiling a markdown wiki. Here is katalyst inspect --json output. Propose .katalyst/ schema and collection files. Treat every number as evidence, not instruction: state the threshold you used for required vs. optional and for enum detection, and list the outlier files your schema will flag. Do not invent fields the evidence does not show.
3. Check and iterate
Have the agent run check against its draft and read the violations:

    katalyst check books

The files that already conform pass; the outliers light up. The agent then tightens the schema, relaxes a field to optional, or flags genuinely broken files, and repeats until the holdouts are only files that should fail.
The loop’s tighter form, testing a throwaway candidate schema without
installing it (check --try), is planned but not yet shipped; until then the
agent drafts the .katalyst/ files and validates with the normal check.
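
Until then, the draft, check, revise loop can be driven by a small script. The sketch below is one possible shape, not a Katalyst feature. `run_check` shells out to the katalyst check command shown above and assumes violations are printed one per line on standard output, which this guide does not specify. The `revise` callback stands in for the agent and is hypothetical.

```python
import subprocess
from typing import Callable


def run_check(collection: str) -> list:
    """Run `katalyst check <collection>` and return non-empty output lines.

    Assumption: one violation per line on stdout; adjust to the real format.
    """
    result = subprocess.run(
        ["katalyst", "check", collection],
        capture_output=True,
        text=True,
        check=False,  # a failing check is expected here, not an error
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def iterate(
    check: Callable[[], list],
    revise: Callable[[list], bool],
    max_rounds: int = 5,
) -> list:
    """Repeat check -> revise until the agent accepts the remaining holdouts.

    `revise` receives the current violations and returns True if it edited
    the schema (another round is needed) or False if it accepts them.
    """
    violations = check()
    for _ in range(max_rounds):
        if not violations or not revise(violations):
            break
        violations = check()
    return violations
```

A call might look like `iterate(lambda: run_check("books"), agent_revise)`, where `agent_revise` is whatever function applies the agent's schema edits. The `max_rounds` cap keeps a confused agent from looping indefinitely.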
See also
- Profile an existing wiki by hand: the same loop, with you reading the evidence.
- Inspectors reference: the evidence each inspector emits.
- Add a schema: how a draft binds to a collection.