Prompt
Source file: src/prompt/builder.ts
The prompt builder bridges test specifications and LLMs. It takes a test file’s content (any format — Markdown, plain text, or anything an LLM can parse) and wraps it in structured instructions that tell the LLM exactly what to do: read the codebase in the current directory, evaluate each test scenario described in the file, and return structured JSON results.
This is the most critical module for result quality — the LLM only knows what the prompt tells it. The prompt specifies the exact JSON schema for responses, including which fields to include for each status type (passing tests just need an ID and status; failing tests need the expectation, what was actually observed, the file location, and a suggested resolution). If the prompt is unclear or ambiguous, the LLM produces unreliable output that the parser and validator then have to work harder to handle.
buildPrompt(testName, testContent)
Constructs the full prompt sent to the LLM CLI. The prompt is a single string that includes:
- Role assignment — “You are a semantic test evaluator”
- Test file content — the full content of the test file, embedded under a heading
- Instructions — step-by-step directions for the LLM
- Response format — exact JSON schema the LLM must return
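A minimal sketch of how these pieces might be assembled (an illustration only; the actual builder in `src/prompt/builder.ts` may differ in wording and structure, and the template body here is abbreviated):

```typescript
// Sketch of a prompt builder: role assignment, embedded test file
// content, instructions, and response format, joined into one string.
// Hypothetical implementation; not the project's actual code.
function buildPrompt(testName: string, testContent: string): string {
  return [
    "You are a semantic test evaluator. Your task is to evaluate whether the",
    "codebase in the current directory meets ALL test scenarios described in",
    "the file below.",
    "",
    `## Semantic Test File: ${testName}`,
    "",
    testContent,
    "",
    "## Instructions",
    "... (step-by-step directions and the JSON response contract)",
  ].join("\n");
}
```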
Prompt template
```
You are a semantic test evaluator. Your task is to evaluate whether the
codebase in the current directory meets ALL test scenarios described in
the file below.

## Semantic Test File: {testName}

{testContent}

## Instructions

1. Examine the codebase in the current working directory.
2. Identify ALL distinct test scenarios or expectations in the file above. A file may contain one or many tests — look for headings, numbered items, distinct assertions, frontmatter IDs, or any structural markers that define separate test cases.
3. For each test scenario, extract an ID or slug that identifies it. Use whatever identifier is most natural from the file: a heading, a marker, a frontmatter field, a short descriptive slug. If the file contains only one test, use a slug derived from the filename if an id is not explicitly provided.
4. Evaluate each test scenario against the codebase.
5. Respond with ONLY a JSON array (no markdown fencing, no extra text).
```

Expected LLM response contract
The LLM must return a JSON array. Each element describes one test scenario:
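This contract can be modeled as a TypeScript discriminated union (a sketch for illustration; these type names are assumptions, not the project's actual types):

```typescript
// Illustrative types for the LLM response contract. Field names follow
// the JSON examples in this document; type names are hypothetical.
type PassResult = { id: string; status: "pass" };
type FailResult = {
  id: string;
  status: "fail";
  expectation: string; // what the spec requires
  observed: string;    // what the code actually does
  location: string;    // file where the mismatch was found
  resolution: string;  // suggested fix
};
type InvalidResult = { id: string; status: "invalid" };
type SkipResult = { id: string; status: "skip" };
type ErrorResult = { id: string; status: "error"; error: string };

type TestResult =
  | PassResult
  | FailResult
  | InvalidResult
  | SkipResult
  | ErrorResult;

// The LLM must return an array of these.
type LlmResponse = TestResult[];
```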
Passing test
```json
{ "id": "my-test-id", "status": "pass" }
```

Failing test
```json
{
  "id": "my-test-id",
  "status": "fail",
  "expectation": "what the spec requires",
  "observed": "what the code actually does",
  "location": "src/path/to/file.ts",
  "resolution": "how to fix it"
}
```

Invalid scenario
```json
{ "id": "", "status": "invalid" }
```

Skipped scenario
```json
{ "id": "my-test-id", "status": "skip" }
```

Error scenario
When the LLM encounters an error evaluating a test:
```json
{ "id": "my-test-id", "status": "error", "error": "description of what went wrong" }
```

If the file has no testable content:
```json
[{ "id": "", "status": "invalid" }]
```

Design rationale
- The prompt asks for raw JSON (no markdown fencing) to simplify parsing. However, the parser has fallback strategies if the LLM wraps the response in code fences anyway.
- Each test file may contain multiple test scenarios — the LLM identifies and evaluates all of them in a single pass.
- The prompt is passed differently depending on the adapter: Claude receives it as a positional argument, while other CLIs receive it via stdin.
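The fence-stripping fallback mentioned above could look roughly like this (a sketch under assumed behavior; the project's actual parser may use different strategies and function names):

```typescript
// Hypothetical fallback parser: try raw JSON first, then tolerate a
// ```json ... ``` (or plain ```) wrapper the LLM may add anyway.
function extractJson(raw: string): unknown {
  const trimmed = raw.trim();
  try {
    return JSON.parse(trimmed); // happy path: raw JSON as the prompt requested
  } catch {
    // Fallback: strip a markdown code fence and parse what is inside.
    const match = trimmed.match(/```(?:json)?\s*([\s\S]*?)\s*```/);
    if (match) return JSON.parse(match[1]);
    throw new Error("Response is not valid JSON");
  }
}
```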