
Discovery

Source file: src/discovery/tests.ts

Discovery is the first active step in the pipeline. Its job is to scan the configured root directory, find all test files matching the testMatch glob patterns, read their full contents into memory (including frontmatter extraction), and return them as a SemanticTest[] array. This in-memory array is what drives everything downstream — the prompt builder reads test content from it, the executor iterates over it, and reports reference its file paths.

There are two modes: glob-based discovery (the default, when you run semtest run with no arguments) and specific file/directory resolution (when you pass paths as CLI arguments, like semtest run auth.spec.md api/). Both modes produce the same SemanticTest[] output, so the rest of the pipeline doesn’t know or care which mode was used.

Every discovered test file is represented as:

interface SemanticTest {
  name: string;                  // Filename (e.g. "auth-middleware.spec.md")
  filePath: string;              // Absolute path
  content: string;               // File content with frontmatter stripped
  rawContent: string;            // Original file content including frontmatter
  frontmatter: FrontmatterData;  // Parsed frontmatter fields
  tags: string[];                // Tags extracted from frontmatter (empty array if none)
  group?: string;                // Directory-based group (e.g. "api" for tests in api/ subdirectory)
}

The group field enables directory-based test organisation. Tests in subdirectories of the root are assigned a group matching their relative directory path. This group is used by the report module to organise results by directory.
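As a rough illustration of the grouping rule, the sketch below derives a group from a file's directory relative to the root. The helper name `groupFor` is hypothetical, and leaving the group undefined for files directly in the root is an assumption based on `group` being optional.

```typescript
import * as path from "node:path";

// Hypothetical helper (not taken from the source): derive the group from the
// test file's directory relative to the root directory.
function groupFor(rootDir: string, filePath: string): string | undefined {
  const relDir = path.relative(rootDir, path.dirname(filePath));
  // Assumption: files directly in the root get no group.
  // Separators are normalised to "/" so groups read the same on every platform.
  return relDir === "" ? undefined : relDir.split(path.sep).join("/");
}
```

With a root of `semtests`, a file at `semtests/api/routes.spec.md` would get the group `"api"` under this sketch.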

The frontmatter field contains parsed YAML frontmatter (tags, timeout, llm, skipPermissionsIfPossible). The tags field is a convenience accessor — equivalent to frontmatter.tags ?? [].
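The relationship between `rawContent`, `content`, and `tags` can be sketched as follows. This is a minimal illustration, not the project's parser: it assumes frontmatter is a leading block delimited by `---` lines, it does not parse the YAML itself, and the field types on `FrontmatterData` as well as the helper names `splitFrontmatter` and `tagsOf` are assumptions.

```typescript
// Field types are assumptions; the source only names the fields.
interface FrontmatterData {
  tags?: string[];
  timeout?: number;
  llm?: string;
  skipPermissionsIfPossible?: boolean;
}

// Hypothetical helper: separate a leading "---" delimited block from the body.
// The returned yaml string would still need a real YAML parser.
function splitFrontmatter(rawContent: string): { yaml: string; content: string } {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(rawContent);
  if (!match) return { yaml: "", content: rawContent };
  return { yaml: match[1] ?? "", content: rawContent.slice(match[0].length) };
}

// The convenience accessor described above: tags default to an empty array.
function tagsOf(frontmatter: FrontmatterData): string[] {
  return frontmatter.tags ?? [];
}
```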

The isTestFile() function determines whether a file qualifies as a semantic test:

function isTestFile(filename: string): boolean

A file is a test file if the filename ends with .spec.md or .test.md. This convention provides a clear naming pattern that distinguishes test specs from other files.
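One way the check could be written, given the signature and the suffix rule above (the actual implementation may differ):

```typescript
// Sketch of the naming check: a file qualifies if it ends in .spec.md or .test.md.
function isTestFile(filename: string): boolean {
  return filename.endsWith(".spec.md") || filename.endsWith(".test.md");
}
```

Under this sketch, `isTestFile("auth-middleware.spec.md")` is true and `isTestFile("README.md")` is false.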

Discovery uses tinyglobby for fast glob matching. The testMatch config field defines which patterns to match (default: ["**/*.spec.md", "**/*.test.md"]), and testPathIgnorePatterns defines directories to exclude (default: ["node_modules", "dist", ".git", "vendor"]). The output directory is also automatically excluded.

semtests/
├── api/
│ ├── routes.spec.md
│ └── middleware.test.md
├── config/
│ └── schema.spec.md
└── structure.spec.md

async discoverTests(options): Promise<SemanticTest[]>


Used when no specific files are passed on the CLI (the default). Takes an options object:

interface DiscoverOptions {
  rootDir: string;
  testMatch: string[];
  testPathIgnorePatterns: string[];
  outputDir: string;
}

It:

  1. Resolves rootDir to an absolute path
  2. Builds ignore patterns from testPathIgnorePatterns and the output directory
  3. Runs glob(testMatch, { cwd, ignore, absolute: true }) via tinyglobby
  4. Sorts results alphabetically by full path
  5. For each file: reads content, parses frontmatter, extracts tags
  6. Assigns a group based on the relative directory path (e.g. files in api/ get group: "api")

Throws an error with the message "No test files found in /absolute/path" if no matching files are found.
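The pure parts of these steps (building ignore patterns, sorting, and failing on an empty result) might look roughly like the sketch below. The helper names are hypothetical. Expanding each ignored directory to a `**/<name>/**` glob, and treating `outputDir` as a plain directory name, are assumptions about how the ignore list is built. The glob call itself is left out.

```typescript
// Assumption: each ignored directory name, plus the output directory,
// is expanded to a "**/<name>/**" glob before being passed to the matcher.
function buildIgnorePatterns(ignoreDirs: string[], outputDir: string): string[] {
  return [...ignoreDirs, outputDir].map((dir) => `**/${dir}/**`);
}

// Sort the matched absolute paths and fail when nothing matched.
function finalizeMatches(rootDir: string, matches: string[]): string[] {
  if (matches.length === 0) {
    throw new Error(`No test files found in ${rootDir}`);
  }
  // Default sort compares UTF-16 code units; the real comparator may differ.
  return [...matches].sort();
}
```

Sorting by full path keeps the order of tests stable between runs, which makes reports easier to compare.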

async resolveTests(filePaths, rootDir, testMatch, testPathIgnorePatterns, outputDir): Promise<SemanticTest[]>


Used when specific paths are passed as CLI arguments (e.g. semtest run auth.spec.md api/). Handles both files and directories:

For directories:

  1. Delegates to discoverTests() for the directory
  2. Adjusts groups to include the directory’s relative path from the root
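The group adjustment for directories could be sketched as below. The helper name `adjustGroup` and the exact joining rule are assumptions: the idea is only that a group computed relative to the passed directory is prefixed with that directory's path relative to the root.

```typescript
// Hypothetical helper: prefix a group computed relative to the passed directory
// with that directory's path relative to the root.
function adjustGroup(dirRelativeToRoot: string, group?: string): string | undefined {
  // Drop empty or missing parts, then join what remains with "/".
  const parts = [dirRelativeToRoot, group].filter((p): p is string => !!p);
  return parts.length > 0 ? parts.join("/") : undefined;
}
```

For example, a file found directly inside `api/` has no group relative to that directory, and would end up with the group `"api"` after adjustment.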

For files:

  1. Tries to resolve the path as-is (absolute or relative to cwd)
  2. Falls back to looking inside rootDir
  3. Warns (but doesn’t error) if the file doesn’t follow the .spec.md or .test.md convention
  4. Parses frontmatter and extracts tags
  5. Assigns a group based on the relative directory path
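The first three steps for files can be sketched as follows. The function names, the parameter order, and the injectable `exists` check are illustrative assumptions. The error message follows the one listed for a missing file, and the warning text is made up for the example.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: try the path as given (absolute, or relative to cwd),
// then fall back to looking inside rootDir.
function resolveTestFile(
  input: string,
  rootDir: string,
  cwd: string = process.cwd(),
  exists: (p: string) => boolean = fs.existsSync,
): string {
  const asGiven = path.resolve(cwd, input);
  if (exists(asGiven)) return asGiven;
  const inRoot = path.resolve(rootDir, input);
  if (exists(inRoot)) return inRoot;
  throw new Error(`Test file not found: ${input}`);
}

// Warn on stderr, without failing, when the naming convention is not followed.
function warnIfUnconventional(filePath: string): void {
  const name = path.basename(filePath);
  if (!name.endsWith(".spec.md") && !name.endsWith(".test.md")) {
    console.warn(`Warning: ${name} does not follow the .spec.md/.test.md convention`);
  }
}
```

Passing `exists` as a parameter keeps the lookup order easy to exercise without touching the file system.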

Both modes deduplicate results using a seen Set of resolved absolute paths — if the same file is discovered through multiple paths (e.g. semtest run semtests/ semtests/auth.spec.md), it only appears once.
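A minimal sketch of that deduplication, assuming each entry carries its resolved absolute path in `filePath` (the helper name `dedupeByPath` is hypothetical):

```typescript
// Keep the first occurrence of each absolute path and drop later duplicates.
function dedupeByPath<T extends { filePath: string }>(tests: T[]): T[] {
  const seen = new Set<string>();
  const unique: T[] = [];
  for (const test of tests) {
    if (seen.has(test.filePath)) continue;
    seen.add(test.filePath);
    unique.push(test);
  }
  return unique;
}
```

Keeping the first occurrence preserves the order in which paths were given on the command line.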

| Condition | Error |
|---|---|
| No matching files found | "No test files found in /absolute/path" |
| Specific file not found | "Test file not found: filename" |
| File without .spec.md/.test.md convention | Warning printed to stderr (not an error) |