
# Running Tests

semtest provides two commands: `run` for executing tests and `init` for scaffolding a new setup.

```sh
# Run all spec files in the configured directory
semtest run

# Run specific test files
semtest run auth-middleware.spec.md api-routes.spec.txt

# Run all specs in a subdirectory
semtest run api/

# Run with full paths
semtest run semantic-tests/auth-middleware.spec.md
```

When file or directory arguments are provided, they’re resolved against cwd first, then against the configured tests directory.
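That lookup order can be sketched as follows. This is an illustration of the described behaviour, not semtest's actual code; `resolveSpec` is a hypothetical helper, and the `exists` predicate is injected so the logic runs without touching the filesystem:

```typescript
import { resolve } from "node:path";

// Sketch of the lookup order: try the argument relative to cwd first,
// then relative to the configured tests directory.
function resolveSpec(
  arg: string,
  cwd: string,
  testsDir: string,
  exists: (p: string) => boolean,
): string | null {
  const fromCwd = resolve(cwd, arg);
  if (exists(fromCwd)) return fromCwd;

  const fromTests = resolve(cwd, testsDir, arg);
  if (exists(fromTests)) return fromTests;

  return null; // not found in either location
}
```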

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| `--timestamp` | boolean | `false` | Generate a timestamped copy of the Markdown report |
| `--include-passing` | boolean | `false` | Include passing tests in the Markdown report |
| `--strict` | boolean | `false` | Exit code 2 if validation issues are found |
| `--skip-validation` | boolean | `false` | Skip post-run validation entirely |
| `--extensions <exts>` | string | (all files) | Comma-separated file extensions (e.g. `.md,.txt`) |
| `--debug` | boolean | `false` | Log raw LLM output to `{output}/debug/` |
| `--timeout <ms>` | number | `0` | Timeout per test in milliseconds (`0` = no timeout) |
| `--junit` | boolean | `false` | Generate JUnit XML report |

All CLI flags can also be set in `semtest.config.ts`:

```ts
import { defineConfig } from "@westopp/semtest";

export default defineConfig({
  tests: "semantic-tests/",
  output: "semantic-test-results/",
  llm: {
    runner: "claude",
    capability: "balanced",
  },
  strict: true,
  debug: true,
  timestamp: true,
  includePassing: false,
  extensions: [".md", ".txt"],
  timeout: 60000,
  junit: true,
});
```

CLI flags always override config file values:

CLI flag > config file > schema default

For example, if the config has `strict: true` but you run `semtest run` without `--strict`, strict mode is still enabled. But if you explicitly pass a flag, it wins.

| Code | Meaning | When |
| --- | --- | --- |
| `0` | Pass | All tests passed |
| `1` | Fail | At least one test failed (but no errors) |
| `2` | Error | LLM subprocess error, parse error, or `--strict` with validation issues |

Precedence: error (2) > fail (1) > pass (0)
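That precedence can be expressed as a fold over all test outcomes (a sketch of the rule above, not semtest's internals):

```typescript
type Outcome = "pass" | "fail" | "error";

// Any error anywhere yields 2; otherwise any failure yields 1; otherwise 0.
function exitCode(results: Outcome[]): 0 | 1 | 2 {
  if (results.includes("error")) return 2;
  if (results.includes("fail")) return 1;
  return 0;
}
```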

When --debug is enabled:

  1. A debug/ directory is created inside the output directory (cleared on each run)
  2. For each test file, a JSON file is written containing all retry attempts
  3. Each attempt includes the raw stdout, stderr, and exitCode from the LLM CLI
For example, `semtest run --debug` writes one JSON file per spec, such as `semantic-test-results/debug/auth-middleware.spec.md.json`.

This is useful for diagnosing unexpected LLM responses or retry behaviour.
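The fields listed above suggest a debug-file structure along these lines. This is a hypothetical sketch inferred from the description; semtest's actual schema may name or nest things differently:

```typescript
// Hypothetical shape of one debug JSON file (illustrative only).
interface DebugAttempt {
  stdout: string;          // raw stdout from the LLM CLI
  stderr: string;          // raw stderr from the LLM CLI
  exitCode: number | null; // null if the process was killed by a signal
}

interface DebugFile {
  attempts: DebugAttempt[]; // one entry per retry attempt
}

const example: DebugFile = {
  attempts: [{ stdout: '{"result": "pass"}', stderr: "", exitCode: 0 }],
};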

When `--timeout <ms>` is set (or `timeout` in config), each LLM subprocess is given a time limit. If the subprocess exceeds the limit:

  1. SIGTERM is sent to the process
  2. After 5 seconds, if still running, SIGKILL is sent
  3. The test result is marked as an error with a timeout message
```sh
semtest run --timeout 60000   # 60 second timeout per test
```

A timeout of 0 (the default) means no time limit.
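The escalation steps above can be sketched in Node.js. This is an illustration of the described SIGTERM-then-SIGKILL behaviour, not semtest's actual implementation; `runWithTimeout` is a hypothetical helper:

```typescript
import { spawn } from "node:child_process";

// Run a command with a time limit; 0 means no limit (matching the default).
// On timeout: send SIGTERM, then SIGKILL 5 seconds later if still running.
function runWithTimeout(
  cmd: string,
  args: string[],
  timeoutMs: number,
): Promise<{ timedOut: boolean; code: number | null }> {
  return new Promise((resolve) => {
    const child = spawn(cmd, args);
    let timedOut = false;

    const timer =
      timeoutMs > 0
        ? setTimeout(() => {
            timedOut = true;
            child.kill("SIGTERM"); // step 1: ask politely
            setTimeout(() => child.kill("SIGKILL"), 5000).unref(); // step 2: force
          }, timeoutMs)
        : null;

    child.on("exit", (code) => {
      if (timer) clearTimeout(timer);
      resolve({ timedOut, code });
    });
  });
}
```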

The `init` command scaffolds a new semtest setup in the current directory:

```sh
semtest init
```

This creates:

| File/Directory | Content |
| --- | --- |
| `semtest.config.ts` | Default config using `defineConfig()` with Claude as the runner |
| `semantic-tests/` | Test directory |
| `semantic-tests/example.spec.md` | Example test file with heading, expectation, and behaviour sections |

If any of these already exist, they are skipped with a message. This command is safe to run in an existing project — it won’t overwrite anything.
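The skip-if-exists behaviour amounts to something like the following (a hypothetical sketch; `scaffold` is not semtest's API):

```typescript
import { existsSync, writeFileSync, mkdtempSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

// Create a file only if it does not already exist; never overwrite.
function scaffold(file: string, content: string): "created" | "skipped" {
  if (existsSync(file)) return "skipped";
  writeFileSync(file, content);
  return "created";
}

// In a fresh temp directory the first call creates the file.
const dir = mkdtempSync(join(tmpdir(), "semtest-demo-"));
scaffold(join(dir, "semtest.config.ts"), "export default {};"); // "created" on a fresh directory
```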