Prompt
Source file: src/prompt/builder.ts
The prompt builder bridges test specifications and LLMs. It takes a test file’s content (any format — Markdown, plain text, or anything an LLM can parse) and wraps it in structured instructions that tell the LLM exactly what to do: read the codebase in the current directory, evaluate each test scenario described in the file, and return structured JSON results.
This is the most critical module for result quality — the LLM only knows what the prompt tells it. The prompt specifies the exact JSON schema for responses, including which fields to include for each status type (passing tests just need an ID and status; failing tests need the expectation, what was actually observed, the file location, and a suggested resolution). If the prompt is unclear or ambiguous, the LLM produces unreliable output that the parser and validator then have to work harder to handle.
buildPrompt(testName, testContent)
Constructs the full prompt sent to the LLM CLI. The prompt is a single string that includes:
- Role assignment — “You are a semantic test evaluator”
- Test file content — the full content of the test file, embedded under a heading
- Instructions — step-by-step directions for the LLM
- Response format — exact JSON schema the LLM must return
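A minimal sketch of how these pieces might be assembled (an illustration only; the actual builder in `src/prompt/builder.ts` may differ in wording and structure, and the template body here is abbreviated):

```typescript
// Sketch of a prompt builder: role assignment, embedded test file
// content, instructions, and response format, joined into one string.
// Hypothetical implementation; not the project's actual code.
function buildPrompt(testName: string, testContent: string): string {
  return [
    "You are a semantic test evaluator. Your task is to evaluate whether the",
    "codebase in the current directory meets ALL test scenarios described in",
    "the file below.",
    "",
    `## Semantic Test File: ${testName}`,
    "",
    testContent,
    "",
    "## Instructions",
    "... (step-by-step directions and the JSON response contract)",
  ].join("\n");
}
```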
Prompt template
```
You are a semantic test evaluator. Your task is to evaluate whether the
codebase in the current directory meets ALL test scenarios described in
the file below.

## Semantic Test File: {testName}

{testContent}

## Instructions

1. Examine the codebase in the current working directory.
2. Identify ALL distinct test scenarios or expectations in the file above. A file may contain one or many tests — look for headings, numbered items, distinct assertions, frontmatter IDs, or any structural markers that define separate test cases.
3. For each test scenario, extract an ID or slug that identifies it. Use whatever identifier is most natural from the file: a heading, a marker, a frontmatter field, a short descriptive slug. If the file contains only one test, use a slug derived from the filename if an id is not explicitly provided.
4. Evaluate each test scenario against the codebase.
5. Respond with ONLY a JSON array (no markdown fencing, no extra text).
```

Expected LLM response contract
The LLM must return a JSON array. Each element describes one test scenario:
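This contract can be modeled as a TypeScript discriminated union (a sketch for illustration; these type names are assumptions, not the project's actual types):

```typescript
// Illustrative types for the LLM response contract. Field names follow
// the JSON examples in this document; type names are hypothetical.
type PassResult = { id: string; status: "pass" };
type FailResult = {
  id: string;
  status: "fail";
  expectation: string; // what the spec requires
  observed: string;    // what the code actually does
  location: string;    // file where the mismatch was found
  resolution: string;  // suggested fix
};
type InvalidResult = { id: string; status: "invalid" };
type SkipResult = { id: string; status: "skip" };
type ErrorResult = { id: string; status: "error"; error: string };

type TestResult =
  | PassResult
  | FailResult
  | InvalidResult
  | SkipResult
  | ErrorResult;

// The LLM must return an array of these.
type LlmResponse = TestResult[];
```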
Passing test
```json
{ "id": "my-test-id", "status": "pass" }
```

Failing test
```json
{
  "id": "my-test-id",
  "status": "fail",
  "expectation": "what the spec requires",
  "observed": "what the code actually does",
  "location": "src/path/to/file.ts",
  "resolution": "how to fix it"
}
```

Invalid scenario
```json
{ "id": "", "status": "invalid" }
```

Skipped scenario
```json
{ "id": "my-test-id", "status": "skip" }
```

Error scenario
When the LLM encounters an error evaluating a test:
```json
{ "id": "my-test-id", "status": "error", "error": "description of what went wrong" }
```

If the file has no testable content:
```json
[{ "id": "", "status": "invalid" }]
```

Design rationale
- The prompt asks for raw JSON (no markdown fencing) to simplify parsing. However, the parser has fallback strategies if the LLM wraps the response in code fences anyway.
- Each test file may contain multiple test scenarios — the LLM identifies and evaluates all of them in a single pass.
- The prompt is passed differently depending on the adapter: Claude receives it as a positional argument, while other CLIs receive it via stdin.
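The fence-stripping fallback mentioned above could look roughly like this (a sketch under assumed behavior; the project's actual parser may use different strategies and function names):

```typescript
// Hypothetical fallback parser: try raw JSON first, then tolerate a
// ```json ... ``` (or plain ```) wrapper the LLM may add anyway.
function extractJson(raw: string): unknown {
  const trimmed = raw.trim();
  try {
    return JSON.parse(trimmed); // happy path: raw JSON as the prompt requested
  } catch {
    // Fallback: strip a markdown code fence and parse what is inside.
    const match = trimmed.match(/```(?:json)?\s*([\s\S]*?)\s*```/);
    if (match) return JSON.parse(match[1]);
    throw new Error("Response is not valid JSON");
  }
}
```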