Test System
Terminology: This specification uses RFC 2119 keywords (MUST, SHOULD, MAY, etc.) to indicate requirement levels.
This document describes the reference test system in Structyl.
Overview
Structyl provides a language-agnostic reference test system. Test data is stored in JSON format and shared across all language implementations, ensuring consistent behavior.
Non-Goals
The reference test system does NOT provide:
- Perceptual/fuzzy binary comparison — Binary outputs are compared byte-for-byte exactly; no image similarity or fuzzy matching
- Test mutation or fuzzing — Test cases are static JSON files; mutation testing is out of scope
- Coverage measurement — Coverage is delegated to language-specific tooling
- Test generation — Structyl does not mandate or provide test generation tools
- Parallel test execution — Parallelism is at the target level, not individual test case level
Test Data Format
Basic Structure
Every test file has input and output:
{
"input": {
"x": [1.0, 2.0, 3.0, 4.0, 5.0]
},
"output": 3.0
}Test Case Schema
| Field | Required | Type | Description |
|---|---|---|---|
input | Yes | object | Input parameters for the function under test |
output | Yes | any (not null) | Expected output value; null is invalid (see Loading Failure Behavior) |
description | No | string | Optional documentation for the test case |
skip | No | boolean | When true, marks the test as skipped |
tags | No | string[] | Optional categorization for filtering or grouping |
Canonical Identifier: The canonical identifier for a test case is {suite}/{name} (e.g., math/addition). When suite is empty, the identifier is just the name. The forward slash (/) separator is used on all platforms for cross-platform consistency. In pkg/testhelper, use TestCase.ID() to obtain this identifier.
Validation Rules:
- Missing
inputfield: Load fails withtest case {suite}/{name}: missing required field "input" - Missing
outputfield: Load fails withtest case {suite}/{name}: missing required field "output" - Any additional fields beyond those listed above are silently ignored (forward-compatibility)
- Empty
inputobject ({}) is valid
Tag usage: Tags have no built-in semantics in Structyl. Language implementations MAY use tags to filter test execution, group tests in output, or skip tests based on environment capabilities. Tag values are free-form strings; establish conventions per-project.
Tags validation: Tags are intentionally permissive: empty strings, duplicates, and any characters are allowed. This design avoids constraining downstream tooling. Establish per-project conventions for tag naming.
Reserved Field Names
The field names timeout, setup, and teardown are reserved for future specification versions. Users SHOULD NOT use these for custom metadata as they MAY gain normative semantics in future releases. These fields are currently ignored by all loaders.
Loading Failure Behavior
Test loading is all-or-nothing per suite:
| Condition | Behavior | Exit Code |
|---|---|---|
| JSON parse error | Suite load fails | 2 |
Missing required field (input or output) | Suite load fails | 2 |
output field is explicit null | Suite load fails | 2 |
Referenced $file not found | Suite load fails | 2 |
Referenced $file path escapes suite directory (../) | Suite load fails | 2 |
Loading failures are configuration errors (exit code 2), distinct from test execution failures (exit code 1). A loading failure prevents any tests in that suite from executing.
pkg/testhelperlimitation: The public Go package uses*.jsonpattern (immediate directory only), not the recursive**/*.jsonpattern supported by Structyl's internal runner. See the Test Loader Implementation section.
API Differences: Internal vs Public
| Capability | Internal Runner (internal/tests) | Public API (pkg/testhelper) |
|---|---|---|
| Glob patterns | **/*.json (recursive) | *.json (immediate directory only) |
$file references | Full support | ErrFileReferenceNotSupported |
| Binary data | Via file references | Embed as base64 or load separately |
Users requiring recursive patterns or binary file references SHOULD use the internal package or implement project-specific loading. See pkg/testhelper Limitations for workarounds.
Error message format:
structyl: test suite "{suite}": {reason}
file: {path}Error Types (pkg/testhelper)
The pkg/testhelper package returns specific error types for programmatic handling:
| Error Type | Sentinel | Condition |
|---|---|---|
ProjectNotFoundError | ErrProjectNotFound | No .structyl/config.json found in ancestors |
SuiteNotFoundError | ErrSuiteNotFound | Test suite directory does not exist |
TestCaseNotFoundError | ErrTestCaseNotFound | Test case file does not exist |
InvalidSuiteNameError | ErrInvalidSuiteName | Suite name contains .., /, \, or \0 |
InvalidTestCaseNameError | ErrInvalidTestCaseName | Test case name contains .., /, \, or \0 |
The InvalidSuiteNameError and InvalidTestCaseNameError types include a Reason field indicating why the name was rejected:
| Constant | Value | Meaning |
|---|---|---|
ReasonPathTraversal | "path_traversal" | Name contains .. sequence |
ReasonPathSeparator | "path_separator" | Name contains / or \ |
ReasonNullByte | "null_byte" | Name contains null byte (\0) |
Additional sentinels (no struct type):
| Sentinel | Condition |
|---|---|
ErrEmptySuiteName | Empty suite name provided |
ErrEmptyTestCaseName | Empty test case name provided |
ErrFileReferenceNotSupported | $file reference in test case |
Usage:
tc, err := testhelper.LoadTestCase(path)
if err != nil {
if errors.Is(err, testhelper.ErrTestCaseNotFound) {
// Handle missing file
}
var suiteErr *testhelper.SuiteNotFoundError
if errors.As(err, &suiteErr) {
// Access suiteErr.Suite for context
}
}Input Structure
Input MUST be a JSON object (map). The object MAY be empty ({}). Scalar values and arrays as the top-level input MUST NOT be used; the loader MUST reject such inputs with exit code 2 (Configuration Error).
Within the input object, values can be:
- Scalar values: numbers, strings, booleans, null
- Arrays:
[1.0, 2.0, 3.0] - Nested objects:
{"config": {"alpha": 0.05}}
{
"input": {
"x": [1.0, 2.0, 3.0],
"y": [4.0, 5.0, 6.0],
"alpha": 0.05
},
"output": 2.5
}Why object-only? Test inputs represent named parameters. Objects provide named access and align with how most test frameworks structure input.
Output Types
Outputs can be:
- Scalar:
"output": 42 - Array:
"output": [1, 2, 3] - Object:
"output": {"lower": -4, "upper": 0}
{
"input": {
"x": [1, 2, 3, 4, 5],
"y": [3, 4, 5, 6, 7],
"misrate": 0.05
},
"output": {
"lower": -4,
"upper": 0
}
}Binary Data References (Internal Only)
Public API Limitation
The $file reference syntax is only available in Structyl's internal test runner (internal/tests package). The public Go package pkg/testhelper does NOT support this syntax and rejects ANY $file reference with ErrFileReferenceNotSupported. External implementations MUST either embed binary data directly in JSON or use Structyl's internal package.
The validation rules in the table below apply only to the internal runner. The public testhelper package rejects all $file references unconditionally.
For projects using the internal runner, binary data can be referenced via the $file syntax:
{
"input": {
"data": { "$file": "input.bin" },
"format": "raw"
},
"output": { "$file": "expected.bin" }
}File Reference Schema
A file reference is a JSON object with exactly one key $file:
{ "$file": "<relative-path>" }Validation rules (internal runner only):
- The object MUST have exactly one key:
$file - The value MUST be a non-empty string
- Objects with
$fileand other keys are invalid
| Example | Valid | Reason |
|---|---|---|
{"$file": "input.bin"} | ✓ | Correct format |
{"$file": "data/input.bin"} | ✓ | Subdirectory allowed |
{"$file": ""} | ✗ | Empty path |
{"$file": "../input.bin"} | ✗ | Parent reference not allowed |
{"$file": "/etc/passwd"} | ✗ | Absolute paths not allowed |
{"$file": "x.bin", "extra": 1} | ✗ | Extra keys not allowed |
{"FILE": "input.bin"} | ✗ | Wrong key (case-sensitive) |
Implementation note: The validation table above describes semantics for Structyl's internal runner. The public
pkg/testhelperpackage rejects ANY$filereference regardless of object structure—see the warning box in Binary Data References.
Path Resolution: Paths in $file references are resolved relative to the directory containing the JSON test file.
Example:
- Test file:
tests/image-processing/resize-test.json - Reference:
{"$file": "input.bin"} - Resolved path:
tests/image-processing/input.bin
Subdirectory references are permitted:
- Reference:
{"$file": "data/input.bin"} - Resolved path:
tests/image-processing/data/input.bin
Parent directory references (../) and absolute paths (starting with / on Unix or drive letters on Windows) are NOT permitted and will cause a load error. Only relative paths within the suite directory are valid.
Symlink handling: Symlinks are followed during resolution. However, if the resolved target path is outside the suite directory, the reference MUST be rejected.
Path separator normalization: Use forward slashes (/) in $file references for cross-platform portability. Implementations SHOULD normalize path separators internally.
Binary files are stored alongside the JSON file:
tests/
└── image-processing/
├── resize-test.json
├── input.bin
└── expected.binBinary Output Comparison
Binary outputs (referenced via $file) are compared byte-for-byte exactly:
- No byte order normalization (files MUST use consistent endianness)
- No line ending normalization (CRLF and LF are distinct bytes)
- No encoding normalization (UTF-8 BOM presence is significant)
- No tolerance is applied to binary data
For outputs requiring approximate comparison (e.g., images with compression artifacts), test authors MUST either:
- Use deterministic output formats (e.g., uncompressed BMP instead of JPEG)
- Pre-process outputs to a canonical form before comparison
- Extract comparable numeric values into the JSON
outputfield instead
Structyl does not provide perceptual or fuzzy binary comparison.
Test Discovery
Algorithm
- Find project root: Walk up from CWD until
.structyl/config.jsonfound - Locate test directory:
{root}/{tests.directory}/(default:tests/) - Discover suites: Immediate subdirectories of test directory
- Load test cases: Files matching
tests.pattern(default:**/*.json)
Glob Pattern Syntax
The tests.pattern field supports a simplified subset of glob syntax:
| Pattern | Matches |
|---|---|
* | Any sequence of non-separator characters |
**/*.json | All .json files recursively (simplified) |
Simplified Pattern Matching
The internal test loader uses a simplified pattern matching implementation, not a full glob library. The **/*.json pattern recursively finds all .json files but does not support full globstar semantics (e.g., intermediate directory matching like foo/**/bar). For most test organization patterns, this is sufficient.
Examples:
**/*.json- All JSON files in any subdirectory (default)*.json- JSON files matching standard glob on filename only
Directory Structure
tests/
├── center/ # Suite: "center"
│ ├── demo-1.json # Case: "demo-1"
│ ├── demo-2.json # Case: "demo-2"
│ └── edge-case.json # Case: "edge-case"
├── shift/ # Suite: "shift"
│ └── ...
└── shift-bounds/ # Suite: "shift-bounds"
└── ...Naming Conventions
- Suite names: lowercase, hyphens allowed (e.g.,
shift-bounds) - Test names: lowercase, hyphens allowed (e.g.,
demo-1) - No spaces: Use hyphens instead
Output Comparison
Floating Point Tolerance
Configure tolerance in .structyl/config.json:
{
"tests": {
"comparison": {
"float_tolerance": 1e-9,
"tolerance_mode": "relative"
}
}
}Tolerance Modes
| Mode | Formula | Use Case |
|---|---|---|
absolute | |expected − actual| ≤ tolerance | Small values |
relative | |expected − actual| / |expected| ≤ tolerance | General purpose |
ulp | ULP difference ≤ tolerance | IEEE precision |
Note: For relative mode, when expected is exactly 0.0, the formula changes to |actual| <= tolerance to avoid division by zero.
Numeric Type Handling
When comparing expected and actual values:
- JSON-decoded numbers are
float64per Go'sencoding/jsonsemantics - Programmatically-constructed
intactual values are converted tofloat64before comparison - This enables test assertions with integer literals:
Equal(1.0, myIntResult, opts)works even ifmyIntResultisint
This accommodation only applies to the actual parameter; expected values from JSON are always float64.
Array Comparison
{
"tests": {
"comparison": {
"array_order": "strict"
}
}
}| Mode | Behavior |
|---|---|
strict | Order matters, element-by-element comparison |
unordered | Order doesn't matter (multiset comparison); array lengths MUST match, duplicates are counted |
Special Floating Point Values
JSON cannot represent NaN or Infinity directly. Structyl uses special string values as placeholders.
Case Sensitivity
Special float strings are matched exactly. Only these exact strings trigger special handling:
"NaN"— not"nan","NAN", or"Nan""Infinity"or"+Infinity"— not"infinity"or"INFINITY""-Infinity"— not"-infinity"
Lowercase or other variants are treated as regular strings, not special float values.
JSON representation:
| Value | JSON String |
|---|---|
| Positive infinity | "Infinity" or "+Infinity" |
| Negative infinity | "-Infinity" |
| Not a Number | "NaN" |
Example:
{
"input": { "x": [1.0, "Infinity", "-Infinity"] },
"output": "NaN"
}Configuration:
{
"tests": {
"comparison": {
"nan_equals_nan": true
}
}
}Comparison behavior for IEEE 754 special values:
| Comparison | Result |
|---|---|
NaN == NaN | true (configurable via nan_equals_nan: false) |
+Infinity == +Infinity | true |
-Infinity == -Infinity | true |
+Infinity == -Infinity | false |
-0.0 == +0.0 | true |
Test Loader Implementation
Informative Section
This section is informative only. The code examples illustrate one possible implementation approach. Conforming implementations MAY use different designs, APIs, or patterns as long as they satisfy the functional requirements.
pkg/testhelper Limitations
WARNING
The public Go pkg/testhelper package has the following limitations compared to Structyl's internal test runner:
No
$filereferences: File reference resolution is only available in the internal runner. Test cases using$filesyntax SHOULD either use theinternal/testspackage or embed data directly in JSON.No recursive glob patterns:
LoadTestSuiteusesfilepath.Glob("*.json")which matches JSON files in the immediate suite directory only. Thetests.patternconfiguration setting (which supports**recursive patterns) is only used by Structyl's internal runner. To load nested test files withpkg/testhelper, iterate subdirectories manually.
Alternative approaches for binary test data:
- Base64 encoding: Embed binary data as base64-encoded strings in JSON and decode in your test setup
- Separate loading: Implement project-specific file loading alongside
pkg/testhelperthat reads binary files directly from the suite directory - Pre-compute comparisons: For binary output verification, compute checksums (SHA256) in JSON expected output and compare hashes instead of raw bytes
Deprecated Functions
See Stability Policy — Current Deprecations for the complete list of deprecated pkg/testhelper functions and constants with their replacements and removal timeline.
Thread Safety
All loader and comparison functions in pkg/testhelper are safe for concurrent use:
- Loader functions (
LoadTestSuite,LoadTestCase, etc.) perform read-only filesystem operations and can be called concurrently. - Comparison functions (
Equal,Compare,FormatComparisonResult) are pure functions with no shared state. - The
TestCasetype is safe to read concurrently, but callers MUST NOT modify aTestCasewhile other goroutines are reading it.
Copy Semantics Warning
Shallow Copy Behavior
TestCase.Clone() and all With* builder methods perform shallow copies. The Output field is NOT copied—both original and clone share the same reference. Modifying Output on a clone also modifies the original:
clone := original.Clone()
clone.Output.(map[string]interface{})["key"] = "changed"
// Surprise: original.Output["key"] is also changed!Additionally, while Input is shallow-copied at the top level, nested values within Input are shared. Modifying nested Input values affects the original.
Use TestCase.DeepClone() when you need to modify Output or nested Input values independently.
TestCase Validation Methods
The TestCase type provides three validation methods forming a hierarchy of increasing strictness:
| Method | Checks | Use Case |
|---|---|---|
Validate() | Name, Input, Output non-nil | Basic structural validation |
ValidateStrict() | Above + top-level Output type | Programmatic TestCase construction |
ValidateDeep() | Above + recursive type validation for all values | Complex nested structures |
When to use each method:
Validate(): Default choice for programmatically-created TestCase instances. Ensures required fields are present.ValidateStrict(): Use when Output comes from non-JSON sources (e.g., computed values) to verify the top-level type is JSON-compatible.ValidateDeep(): Use for deeply nested structures to ensure all values (including nested maps and arrays) contain only JSON-compatible types.
Loader functions (LoadTestCase, LoadTestSuite) already validate these requirements, so calling validation methods after loading is unnecessary.
// Programmatic TestCase creation
tc := testhelper.TestCase{
Name: "my-test",
Input: map[string]interface{}{"x": 1.0},
Output: map[string]interface{}{"result": 2.0},
}
// Choose validation level based on needs
if err := tc.Validate(); err != nil { // structural only
return err
}
if err := tc.ValidateDeep(); err != nil { // recursive type check
return err
}Panic Behavior
The comparison functions (Equal, Compare, FormatComparisonResult) panic on invalid CompareOptions:
| Condition | Panic |
|---|---|
ToleranceMode not in {"", "relative", "absolute", "ulp"} | Yes |
ArrayOrder not in {"", "strict", "unordered"} | Yes |
FloatTolerance < 0 | Yes |
ToleranceMode == "ulp" and FloatTolerance > math.MaxInt64 | Yes |
This design treats invalid options as programmer errors (fail-fast) rather than runtime conditions.
Stability Note: Panic message format is unstable and MAY change between versions. See stability.md for details. For user-provided options, use one of these approaches:
- Validate before comparison: Call
ValidateOptions(opts)first; if it returnsnil, subsequent comparison calls will not panic - Use error-returning variants:
EqualE,CompareE, andFormatComparisonResultEreturn errors instead of panicking
// Option 1: Validate upfront
if err := testhelper.ValidateOptions(opts); err != nil {
return fmt.Errorf("invalid options: %w", err)
}
result := testhelper.Equal(expected, actual, opts) // safe
// Option 2: Error-returning variant
result, err := testhelper.EqualE(expected, actual, opts)
if err != nil {
return fmt.Errorf("invalid options: %w", err)
}Each language MUST implement a test loader. Required functionality:
- Locate project root via marker file traversal
- Discover test suites by scanning test directory
- Load JSON files and deserialize to native types
- Compare outputs with appropriate tolerance
Example: Go Test Loader
Public API vs Internal Implementation
The example below is illustrative. For the actual public Go API, see the pkg/testhelper package. For internal implementation with full glob support and $file resolution, see internal/tests.
// Illustrative pseudocode - see pkg/testhelper for actual API
package testhelper
import (
"encoding/json"
"path/filepath"
)
type TestCase struct {
Name string
Suite string
Input map[string]interface{}
Output interface{}
}
func LoadTestSuite(projectRoot, suite string) ([]TestCase, error) {
pattern := filepath.Join(projectRoot, "tests", suite, "*.json")
files, err := filepath.Glob(pattern)
if err != nil {
return nil, err
}
var cases []TestCase
for _, f := range files {
tc := loadTestCase(f)
tc.Suite = suite
cases = append(cases, tc)
}
return cases, nil
}
func Equal(expected, actual interface{}, opts CompareOptions) bool {
// Implementation with tolerance handling
}pkg/testhelper Field Mapping
The actual pkg/testhelper.TestCase struct uses the following JSON field mapping:
| JSON Field | Go Field | Go Type | Notes |
|---|---|---|---|
input | Input | map[string]interface{} | Required; must be a JSON object |
output | Output | interface{} | Required; any non-null JSON value |
description | Description | string | Optional |
skip | Skip | bool | Optional; defaults to false |
tags | Tags | []string | Optional |
| — | Name | string | Set by loader from filename (not in JSON) |
| — | Suite | string | Set by loader from directory (not in JSON) |
The Name and Suite fields have json:"-" tags and are populated by the loader functions, not deserialized from JSON. Use LoadTestSuite or LoadTestCaseWithSuite to automatically populate the Suite field.
Example: Python Test Loader
import json
from pathlib import Path
def load_test_suite(project_root: Path, suite: str) -> list[dict]:
suite_dir = project_root / "tests" / suite
cases = []
for f in suite_dir.glob("*.json"):
with open(f) as fp:
data = json.load(fp)
data["name"] = f.stem
data["suite"] = suite
cases.append(data)
return cases
def compare_output(expected, actual, tolerance=1e-9) -> bool:
# Implementation with tolerance handling
passConfiguration
{
"tests": {
"directory": "tests",
"pattern": "**/*.json",
"comparison": {
"float_tolerance": 1e-9,
"tolerance_mode": "relative",
"array_order": "strict",
"nan_equals_nan": true
}
}
}| Field | Default | Description |
|---|---|---|
directory | "tests" | Test data directory |
pattern | "**/*.json" | Glob pattern for test files (internal runner only; pkg/testhelper uses *.json) |
comparison.float_tolerance | 1e-9 | Numeric tolerance |
comparison.tolerance_mode | "relative" | How tolerance is applied |
comparison.array_order | "strict" | Array comparison mode |
comparison.nan_equals_nan | true | NaN equality behavior |
pkg/testhelper Limitation
See pkg/testhelper Limitations for pattern support differences between pkg/testhelper (immediate directory only) and the internal runner (recursive ** patterns).
Test Generation
Structyl does not mandate a specific test generation process. The following approach is RECOMMENDED:
- Generate tests in a consistent language (e.g., the reference implementation)
- Store generated JSON in
tests/ - Commit generated tests to version control
- Re-generate when algorithms change
Example command (project-specific):
structyl cs generate # Project-specific test generationBest Practices
- Use descriptive test names:
negative-values,edge-empty-array - Organize by functionality: One suite per function/feature
- Include edge cases: Empty inputs, boundary values, special cases
- Document expected precision: In suite README or comments
- Version test data: Commit to git, review changes