Running Tests
Commands
semtest provides two commands: `run` for executing tests and `init` for scaffolding a new setup.
semtest run
Basic usage

```bash
# Run all spec files in the configured directory
semtest run

# Run specific test files
semtest run auth-middleware.spec.md api-routes.spec.txt

# Run all specs in a subdirectory
semtest run api/

# Run with full paths
semtest run semantic-tests/auth-middleware.spec.md
```

When file or directory arguments are provided, they're resolved against the current working directory first, then against the configured tests directory.
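The resolution order for file and directory arguments can be sketched roughly as follows. This is an illustration only, not semtest's actual implementation: `resolveSpecArg`, the `exists` callback, and the small `join` helper are hypothetical names introduced here, and the sketch assumes the tests directory is given relative to the current working directory.

```typescript
// Hypothetical sketch: try the current working directory first,
// then fall back to the configured tests directory.
type Exists = (path: string) => boolean;

// Minimal POSIX-style join, for illustration only.
function join(base: string, rel: string): string {
  return `${base.replace(/\/+$/, "")}/${rel.replace(/^\/+/, "")}`;
}

function resolveSpecArg(
  arg: string,
  cwd: string,
  testsDir: string,
  exists: Exists,
): string | undefined {
  // 1. Resolve against cwd.
  const fromCwd = join(cwd, arg);
  if (exists(fromCwd)) return fromCwd;

  // 2. Resolve against the tests directory (assumed relative to cwd).
  const fromTests = join(join(cwd, testsDir), arg);
  if (exists(fromTests)) return fromTests;

  return undefined; // not found in either location
}
```

Under these assumptions, both `semantic-tests/auth-middleware.spec.md` and the bare `auth-middleware.spec.md` would resolve to the same file when run from the project root.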
CLI flags
| Flag | Type | Default | Description |
|---|---|---|---|
| `--timestamp` | boolean | `false` | Generate a timestamped copy of the Markdown report |
| `--include-passing` | boolean | `false` | Include passing tests in the Markdown report |
| `--strict` | boolean | `false` | Exit code 2 if validation issues are found |
| `--skip-validation` | boolean | `false` | Skip post-run validation entirely |
| `--extensions <exts>` | string | (all files) | Comma-separated file extensions (e.g. `.md,.txt`) |
| `--debug` | boolean | `false` | Log raw LLM output to `{output}/debug/` |
| `--timeout <ms>` | number | `0` | Timeout per test in milliseconds (`0` = no timeout) |
| `--junit` | boolean | `false` | Generate JUnit XML report |
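As a rough illustration of how a comma-separated `--extensions` value could map onto a file filter, consider the sketch below. The helper names `parseExtensions` and `matchesExtensions` are hypothetical, and details such as whitespace trimming are assumptions rather than documented semtest behaviour.

```typescript
// Hypothetical helper: turn a value like ".md,.txt" into a list of extensions.
function parseExtensions(raw: string | undefined): string[] | undefined {
  if (raw === undefined) return undefined; // flag not passed: no filtering
  return raw
    .split(",")
    .map((ext) => ext.trim()) // assumption: surrounding whitespace is ignored
    .filter((ext) => ext.length > 0);
}

// Hypothetical helper: with no extension list, every file matches (the default).
function matchesExtensions(file: string, exts: string[] | undefined): boolean {
  if (exts === undefined) return true;
  return exts.some((ext) => file.endsWith(ext));
}
```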
Config file options
All CLI flags can also be set in `semtest.config.ts`:

```typescript
import { defineConfig } from "@westopp/semtest";

export default defineConfig({
  tests: "semantic-tests/",
  output: "semantic-test-results/",
  llm: {
    runner: "claude",
    capability: "balanced",
  },
  strict: true,
  debug: true,
  timestamp: true,
  includePassing: false,
  extensions: [".md", ".txt"],
  timeout: 60000,
  junit: true,
});
```

Flag precedence
CLI flags that are explicitly passed always override config file values:

```
CLI flag > config file > schema default
```

A flag that is not passed does not override anything. For example, if the config has `strict: true` but you run `semtest run` without `--strict`, strict mode is still enabled. But if you explicitly pass a flag, it wins.
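One way to picture this precedence chain is a nullish-coalescing fallback, where an option is `undefined` at any level that did not specify it. The function name `resolveOption` is hypothetical and the sketch is not taken from semtest's source.

```typescript
// Hypothetical sketch of "CLI flag > config file > schema default".
// A value is undefined when it was not specified at that level.
function resolveOption<T>(
  cli: T | undefined,
  config: T | undefined,
  schemaDefault: T,
): T {
  return cli ?? config ?? schemaDefault;
}

// Config sets strict: true and --strict was not passed: config value is used.
const strict = resolveOption<boolean>(undefined, true, false);

// --timeout 30000 on the CLI overrides timeout: 60000 in the config.
const timeout = resolveOption<number>(30000, 60000, 0);

// Neither level sets junit: the schema default applies.
const junit = resolveOption<boolean>(undefined, undefined, false);
```

The key point is that an absent flag is treated as "not specified" rather than as `false`, which is why the config value survives.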
Exit codes
| Code | Meaning | When |
|---|---|---|
| `0` | Pass | All tests passed |
| `1` | Fail | At least one test failed (but no errors) |
| `2` | Error | LLM subprocess error, parse error, or `--strict` with validation issues |

Precedence: error (2) > fail (1) > pass (0)
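The precedence rule can be sketched as a small function over per-test outcomes. The names `Outcome` and `exitCodeFor` are hypothetical, and the sketch assumes that validation issues affect the exit code only when strict mode is on.

```typescript
type Outcome = "pass" | "fail" | "error";

// Hypothetical sketch of the precedence error (2) > fail (1) > pass (0).
function exitCodeFor(
  outcomes: Outcome[],
  opts: { strict: boolean; validationIssues: number },
): 0 | 1 | 2 {
  // Assumption: validation issues only count as an error under --strict.
  const strictViolation = opts.strict && opts.validationIssues > 0;
  if (strictViolation || outcomes.some((o) => o === "error")) return 2;
  if (outcomes.some((o) => o === "fail")) return 1;
  return 0;
}
```

In a CI script, this means an exit code of 1 signals genuine test failures, while 2 signals that the run itself could not be trusted.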
Debug mode
When `--debug` is enabled:
- A `debug/` directory is created inside the output directory (cleared on each run)
- For each test file, a JSON file is written containing all retry attempts
- Each attempt includes the raw `stdout`, `stderr`, and `exitCode` from the LLM CLI

```bash
semtest run --debug
```

This is useful for diagnosing unexpected LLM responses or retry behaviour.
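When inspecting these files programmatically, a small helper along the following lines may be handy. Only the field names `stdout`, `stderr`, and `exitCode` come from the description above; the `attempts` wrapper, the `DebugFile` type, and `failedAttempts` are assumptions made for illustration, so check an actual debug file before relying on this shape.

```typescript
// Assumed shape: only stdout, stderr, and exitCode are documented;
// the surrounding structure is a guess.
interface DebugAttempt {
  stdout: string;
  stderr: string;
  exitCode: number | null;
}

interface DebugFile {
  attempts: DebugAttempt[];
}

// Return the 1-based numbers of attempts whose LLM CLI did not exit with 0.
function failedAttempts(file: DebugFile): number[] {
  return file.attempts
    .map((attempt, index) => ({ attempt, attemptNumber: index + 1 }))
    .filter(({ attempt }) => attempt.exitCode !== 0)
    .map(({ attemptNumber }) => attemptNumber);
}
```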
Timeout
When `--timeout <ms>` is set (or `timeout` in config), each LLM subprocess is given a time limit. If the subprocess exceeds the limit:
1. `SIGTERM` is sent to the process
2. After 5 seconds, if still running, `SIGKILL` is sent
3. The test result is marked as an error with a timeout message

```bash
semtest run --timeout 60000  # 60 second timeout per test
```

A timeout of 0 (the default) means no time limit.
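The escalation from `SIGTERM` to `SIGKILL` can be sketched as below. This is a simplified model, not semtest's implementation: `Killable`, `armTimeout`, and the injectable `schedule` parameter are hypothetical, and a real version would wrap an actual child process and clear its timers when the process exits.

```typescript
type Signal = "SIGTERM" | "SIGKILL";

// Hypothetical minimal view of a subprocess.
interface Killable {
  kill(signal: Signal): void;
  isRunning(): boolean;
}

type Schedule = (fn: () => void, delayMs: number) => void;

const KILL_GRACE_MS = 5000; // the 5 second grace period described above

function armTimeout(
  proc: Killable,
  timeoutMs: number,
  onTimeout: () => void,
  schedule: Schedule = (fn, ms) => {
    setTimeout(fn, ms);
  },
): void {
  if (timeoutMs <= 0) return; // 0 means no time limit

  schedule(() => {
    if (!proc.isRunning()) return; // finished in time, nothing to do
    proc.kill("SIGTERM");
    onTimeout(); // e.g. mark the test result as a timeout error

    // Escalate only if the process ignored SIGTERM.
    schedule(() => {
      if (proc.isRunning()) proc.kill("SIGKILL");
    }, KILL_GRACE_MS);
  }, timeoutMs);
}
```

Passing the scheduler in as a parameter is a design choice for this sketch: it lets the escalation logic be exercised without waiting on real timers.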
semtest init
Scaffolds a new semtest setup in the current directory:

```bash
semtest init
```

This creates:
| File/Directory | Content |
|---|---|
| `semtest.config.ts` | Default config using `defineConfig()` with Claude as the runner |
| `semantic-tests/` | Test directory |
| `semantic-tests/example.spec.md` | Example test file with heading, expectation, and behaviour sections |
If any of these already exist, they are skipped with a message. This command is safe to run in an existing project: it won't overwrite anything.
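The skip-if-exists behaviour amounts to a plan like the one sketched here. The names `SCAFFOLD_PATHS`, `InitAction`, and `planInit` are hypothetical; only the three paths come from the table above.

```typescript
// Paths that `semtest init` creates, as listed in the table above.
const SCAFFOLD_PATHS = [
  "semtest.config.ts",
  "semantic-tests/",
  "semantic-tests/example.spec.md",
];

type InitAction = { path: string; action: "create" | "skip" };

// Hypothetical sketch: never overwrite, only create what is missing.
function planInit(existing: Set<string>): InitAction[] {
  return SCAFFOLD_PATHS.map((path): InitAction => ({
    path,
    action: existing.has(path) ? "skip" : "create",
  }));
}
```

For a project that already has a `semtest.config.ts`, such a plan would skip the config and create only the test directory and example spec.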