Discovery

Source file: src/discovery/tests.ts

Discovery is the first active step in the pipeline. Its job is to scan the configured test directory, find all spec files (files following the .spec.* naming convention), read their full contents into memory, and return them as a SemanticTest[] array. This in-memory array is what drives everything downstream — the prompt builder reads test content from it, the executor iterates over it, and reports reference its file paths.

Test files can be in any format (.spec.md, .spec.txt, .spec.json, .spec.yml, or a bare .spec), provided the filename either contains ".spec." or ends with ".spec". The tool is format-agnostic; the LLM parses whatever content it receives.

There are two modes: directory scan (the default, when you run semtest run with no arguments) and specific file/directory resolution (when you pass paths as CLI arguments, like semtest run auth.spec.md api.spec.txt). Both modes produce the same SemanticTest[] output, so the rest of the pipeline doesn’t know or care which mode was used.

`SemanticTest` type

Every discovered test file is represented as:

interface SemanticTest {
  name: string;      // Filename (e.g. "auth-middleware.spec.md", "api-routes.spec.txt")
  filePath: string;  // Absolute path
  content: string;   // Full file content (read into memory)
  group?: string;    // Directory-based group (e.g. "api" for tests in api/ subdirectory)
}

The group field enables directory-based test organisation. Tests in subdirectories of the test root are assigned a group matching their relative directory path. This group is used by the report module to organise results by directory.
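
As a rough illustration of how a consumer such as the report module might use the group field, the sketch below buckets tests by group. The groupTests helper and the "(root)" label for tests without a group are assumptions made for this example; they are not part of the documented API.

```typescript
interface SemanticTest {
  name: string;
  filePath: string;
  content: string;
  group?: string;
}

// Hypothetical helper: bucket tests by their group field.
// Tests without a group are collected under "(root)" (an assumed label).
function groupTests(tests: SemanticTest[]): Map<string, SemanticTest[]> {
  const buckets = new Map<string, SemanticTest[]>();
  for (const test of tests) {
    const key = test.group ?? "(root)";
    const bucket = buckets.get(key) ?? [];
    bucket.push(test);
    buckets.set(key, bucket);
  }
  return buckets;
}
```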

Spec file convention

The isSpecFile() function determines whether a file qualifies as a semantic test:

function isSpecFile(filename: string): boolean

A file is a spec file if either of the following is true:

The filename contains .spec. (e.g. auth.spec.md, config.spec.txt)
The filename ends with .spec (e.g. auth.spec)

This convention allows any file extension while providing a clear naming pattern that distinguishes test specs from other files in the directory.
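
The two rules above could be implemented along these lines. This is a minimal sketch based only on the description here; the actual body of isSpecFile() in src/discovery/tests.ts may differ.

```typescript
// Sketch of the naming check described above (assumed implementation).
function isSpecFile(filename: string): boolean {
  // "auth.spec.md" contains ".spec."; "auth.spec" ends with ".spec".
  return filename.includes(".spec.") || filename.endsWith(".spec");
}

const candidates = ["auth.spec.md", "config.spec.txt", "auth.spec", "README.md", "spec.md"];
const specs = candidates.filter(isSpecFile);
console.log(specs); // ["auth.spec.md", "config.spec.txt", "auth.spec"]
```

Note that "spec.md" is rejected: it neither contains ".spec." nor ends with ".spec".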

Recursive directory walking

Discovery uses a recursive walkDir() function that traverses the entire test directory tree, not just the top level. This means you can organise test files in subdirectories:

semantic-tests/
├── api/
│   ├── routes.spec.md
│   └── middleware.spec.txt
├── config/
│   └── schema.spec.md
└── structure.spec.md

Dotfiles and dotdirectories (names starting with .) are excluded from the walk.
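
A recursive walk with this exclusion rule might look like the sketch below. It uses Node's synchronous fs API for brevity; whether the real walkDir() is synchronous, and its exact signature, are assumptions.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Sketch of a recursive directory walk (assumed implementation).
// Returns the full paths of all regular files below dir.
function walkDir(dir: string): string[] {
  const files: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    // Skip dotfiles and dot-directories such as .git or .env.
    if (entry.name.startsWith(".")) continue;
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      files.push(...walkDir(fullPath)); // descend into subdirectories
    } else if (entry.isFile()) {
      files.push(fullPath);
    }
  }
  return files;
}
```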

Two discovery modes

`discoverTests(testDir, extensions?)`

Used when no specific files are passed on the CLI (the default). It:

Resolves testDir to an absolute path
Recursively walks the directory tree via walkDir()
Filters to files matching the .spec.* convention via isSpecFile()
If extensions is provided, further filters to matching extensions (e.g. [".md"])
Sorts alphabetically by full path
Reads each file’s content into memory
Assigns a group based on the relative directory path (e.g. files in api/ get group: "api")

Throws if the directory doesn’t exist or contains no matching files.
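
The filtering, sorting, and grouping steps above can be sketched as a pure function that takes the already-walked file list and leaves file reading aside. The names planTests and PlannedTest are hypothetical. The sketch also assumes that extensions are compared against path.extname(), that sorting uses plain string comparison, and that files directly in the test root get no group.

```typescript
import * as path from "node:path";

// Hypothetical shape: a SemanticTest before its content has been read.
interface PlannedTest {
  name: string;
  filePath: string;
  group?: string;
}

// Hypothetical helper covering the filter, sort, and group steps.
function planTests(testDir: string, allFiles: string[], extensions?: string[]): PlannedTest[] {
  const rootDir = path.resolve(testDir);
  return allFiles
    .filter((f) => {
      const base = path.basename(f);
      return base.includes(".spec.") || base.endsWith(".spec");
    })
    // Optional extension filter, e.g. [".md"] (assumed to match path.extname).
    .filter((f) => !extensions || extensions.includes(path.extname(f)))
    .sort() // default string comparison on the full path
    .map((f) => {
      const relDir = path.dirname(path.relative(rootDir, f));
      return {
        name: path.basename(f),
        filePath: f,
        // Files directly in the test root are left without a group (assumption).
        ...(relDir === "." ? {} : { group: relDir }),
      };
    });
}
```

Keeping this part free of I/O makes the ordering and grouping rules easy to test in isolation.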

`resolveTests(filePaths, testDir, extensions?)`

Used when specific paths are passed as CLI arguments (e.g. semtest run auth.spec.md api/). Handles both files and directories:

For directories:

Delegates to discoverTests() for the directory
Adjusts groups to include the directory’s relative path from the test root

For files:

Tries to resolve the path as-is (absolute or relative to cwd)
Falls back to looking inside testDir
Warns (but doesn’t error) if the file doesn’t follow the .spec.* convention
Assigns a group based on the relative directory path
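
The two-step lookup for files (path as given, then inside testDir) might be sketched as follows. The locateTestFile name and the injectable exists parameter are assumptions introduced so the logic can be exercised without touching the filesystem.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: returns the absolute path of the test file,
// or undefined if neither candidate location exists.
function locateTestFile(
  input: string,
  testDir: string,
  exists: (p: string) => boolean = fs.existsSync,
): string | undefined {
  const asGiven = path.resolve(input); // absolute, or relative to cwd
  if (exists(asGiven)) return asGiven;
  const insideTestDir = path.resolve(testDir, input); // fallback location
  if (exists(insideTestDir)) return insideTestDir;
  return undefined;
}
```

A caller could turn an undefined result into the "Test file not found" error listed under Error handling.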

Both modes deduplicate results — if the same file is discovered through multiple paths, it only appears once.
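
One simple way to deduplicate is to key each test on its absolute path and keep the first occurrence. The dedupeByPath helper below is a hypothetical sketch of that idea, not necessarily how the module does it.

```typescript
import * as path from "node:path";

interface SemanticTest {
  name: string;
  filePath: string;
  content: string;
  group?: string;
}

// Hypothetical helper: keep the first test seen for each absolute path.
function dedupeByPath(tests: SemanticTest[]): SemanticTest[] {
  const seen = new Set<string>();
  const unique: SemanticTest[] = [];
  for (const test of tests) {
    const key = path.resolve(test.filePath);
    if (seen.has(key)) continue;
    seen.add(key);
    unique.push(test);
  }
  return unique;
}
```

This covers cases such as semtest run api/ api/routes.spec.md, where the same file is reached both through its directory and by name.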

Error handling

Condition	Error or warning
Test directory doesn’t exist	`"Test directory not found: /absolute/path"`
No spec files found	`"No test files found in /absolute/path"`
Specific file not found	`"Test file not found: filename"`
File without `.spec.*` convention	Warning printed to stderr (not an error)