What an external test pass looks like.
A library that promises data safety only matters if someone has tested it under conditions that look like reality. Here is the methodology, the numbers, and the gaps for iso-code v0.1.1 — reproducible end-to-end in 90 seconds against any repository of your choice.
This is the methodology companion to the post that headlines the bugs we found. If you're evaluating whether to depend on iso-code for a tool you're building, this is the page you want.
What we tested against
The PRD names the target market explicitly: AI coding orchestrators (Claude Code, Cursor, Claude Squad, OpenCode, VS Code Copilot) and the developers who run them on real codebases. The test set therefore had to look like what those tools actually point agents at — not famous OSS infrastructure repos.
| Repo | Size | Why included |
|---|---|---|
| bat | 45 MB | Smallest baseline · Rust CLI · fast iteration |
| flask | 15 MB | Confirms behavior is not Rust-specific · Python · default branch main |
| shadcn-ui | 153 MB | TypeScript pnpm monorepo · pnpm-workspace.yaml |
| git/git | 377 MB | Has sha1collisiondetection as a real submodule |
| next.js | 2.7 GB | Headline target · large pnpm monorepo · default branch canary |
Five repos · four languages · three default-branch names · sizes spanning 180×.
The 15-check lifecycle suite
Each repo was subjected to identical checks. Every check asserts one specific iso-code surface: not "does this work" in general, but "does this specific guarantee hold."
Smoke (8 checks)
Establishes the happy path. If smoke fails, nothing else is meaningful.
- list_initial: `wt list` works on a fresh repo
- create: `wt create` succeeds, with timing
- list_post_create: count increments correctly
- state_json_valid: `.git/iso-code/state.json` parses as well-formed JSON
- worktree_on_disk: the directory actually exists with content
- delete: `wt delete` removes the worktree, with timing
- list_post_delete: count returns to baseline
- gc_dry_run: `wt gc` runs in dry-run mode and prints its summary format
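To show how narrow each check is, here is a minimal Python sketch of what state_json_valid asserts. It is not the harness's actual code: the function name `state_json_is_valid` is hypothetical, and only the path `.git/iso-code/state.json` comes from the check description. It tests well-formedness and nothing else, so it assumes nothing about the keys iso-code stores.

```python
import json
from pathlib import Path


def state_json_is_valid(repo_root: str) -> bool:
    """Return True if <repo_root>/.git/iso-code/state.json parses as JSON.

    Hypothetical helper: checks well-formedness only, not the schema.
    """
    state_path = Path(repo_root) / ".git" / "iso-code" / "state.json"
    try:
        with state_path.open(encoding="utf-8") as fh:
            json.load(fh)
    except (OSError, ValueError):
        # OSError: file missing or unreadable.
        # ValueError: malformed JSON (json.JSONDecodeError subclasses it).
        return False
    return True
```

Calling `state_json_is_valid(".")` from the repo root after a `wt create` would be the equivalent of the check as we understand it.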
Stretch (7 checks)
Exercises the safety invariants — the parts that distinguish iso-code from a thin wrapper around git worktree.
- multi_create_x3: three worktrees simultaneously, state stays consistent
- state_count_after_multi: `active_worktrees` matches reality
- external_remove_reconciled: a worktree removed via raw `git worktree remove` is detected on next `wt list` and moved to `stale_worktrees` with a 30-day TTL, instead of silently lying about its existence
- lock_protection: a `git worktree lock`'d worktree survives `wt delete` with a "locked" error; the directory is preserved
- attach_external: `wt attach` registers a worktree iso-code didn't create
- duplicate_branch_error: creating on an already-checked-out branch is refused cleanly
- missing_path_error: deleting a non-existent path returns "not found", with no panic and no stack trace
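The external_remove_reconciled check is easiest to picture as a set comparison between what the state file records and what git itself reports. The sketch below is our illustration of that idea, not iso-code's implementation. The list names `active_worktrees` and `stale_worktrees` come from the checks above, but the entry shape (`path`, `expires_at`) and the function names are assumptions made for the example. The one fixed point is git's format: each record in `git worktree list --porcelain` output starts with a `worktree <path>` line.

```python
from datetime import datetime, timedelta

STALE_TTL = timedelta(days=30)  # the 30-day TTL named in the check


def paths_from_porcelain(porcelain: str) -> set:
    """Extract worktree paths from `git worktree list --porcelain` output."""
    prefix = "worktree "
    return {
        line[len(prefix):]
        for line in porcelain.splitlines()
        if line.startswith(prefix)
    }


def reconcile(state: dict, porcelain: str, now: datetime) -> dict:
    """Move active entries that git no longer reports into the stale list.

    Assumed (hypothetical) entry shape: {"path": "..."}.
    """
    known = paths_from_porcelain(porcelain)
    active = []
    stale = list(state.get("stale_worktrees", []))
    for entry in state.get("active_worktrees", []):
        if entry["path"] in known:
            active.append(entry)
        else:
            # Keep a record with an expiry instead of dropping it silently.
            stale.append({**entry, "expires_at": (now + STALE_TTL).isoformat()})
    return {**state, "active_worktrees": active, "stale_worktrees": stale}
```

The point of the design is that an entry git no longer knows about is never simply forgotten: it stays visible in the stale list until its TTL runs out.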
Results
All five repos passed all 15 checks. The performance was consistent — create time scales with checkout size, not git-history size, because git worktree add shares the .git/ directory between worktrees instead of copying it.
| Repo | Working tree | Create | Delete | Multi ×3 |
|---|---|---|---|---|
| flask | 2.4 MB | 164 ms | 504 ms | 570 ms |
| bat | 10 MB | 242 ms | 714 ms | 851 ms |
| git/git | 58 MB | 590 ms | 1,285 ms | 2,046 ms |
| shadcn-ui | 90 MB | 884 ms | 1,701 ms | 2,952 ms |
| next.js | 246 MB | 3,483 ms | 4,429 ms | 12,824 ms |
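Multi-create scales near-linearly: creating three worktrees takes roughly 3.3–3.7× a single create. Delete is slower than create on every repo, but the gap narrows as the working tree grows, from about 3× on flask and bat to about 1.3× on next.js. That cost is a filesystem-bound recursive remove, not iso-code overhead. There were no non-linear blowups at any size we tested.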
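The ratios behind those statements can be recomputed directly from the table. The snippet below is plain arithmetic on the published timings; the dictionary layout and the function name `ratios` are ours.

```python
# Timings in milliseconds, copied from the results table.
TIMINGS_MS = {
    "flask": {"create": 164, "delete": 504, "multi3": 570},
    "bat": {"create": 242, "delete": 714, "multi3": 851},
    "git/git": {"create": 590, "delete": 1285, "multi3": 2046},
    "shadcn-ui": {"create": 884, "delete": 1701, "multi3": 2952},
    "next.js": {"create": 3483, "delete": 4429, "multi3": 12824},
}


def ratios(timings: dict) -> dict:
    """Per repo: delete/create and (multi x3)/create, rounded to 2 places."""
    return {
        repo: {
            "delete_vs_create": round(t["delete"] / t["create"], 2),
            "multi3_vs_create": round(t["multi3"] / t["create"], 2),
        }
        for repo, t in timings.items()
    }


for repo, r in ratios(TIMINGS_MS).items():
    print(f"{repo:10s} delete/create={r['delete_vs_create']:.2f} "
          f"multi3/create={r['multi3_vs_create']:.2f}")
```

A multi ×3 ratio close to 3 is what linear scaling looks like. Every repo lands between 3.34 and 3.68, with next.js at the top of that range.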
Fixtures: the safety paths no public repo can exercise
Three of iso-code's promised safety guards can't be triggered by cloning anything off GitHub — public repos either don't have unmerged commits, or aren't shaped the right way, or aren't encrypted. So we wrote fixtures.
- Unmerged-commit guard: a repo with three unpushed, unmerged commits on a feature branch. `wt delete` against that branch must refuse. Result: refused with `unmerged commits on 'feature/unmerged': 4 commit(s) not in upstream — use force to override`. The directory was preserved on disk; data safe. The fetch step gracefully degraded to local-only when no remote was configured, exactly the behavior the PRD specifies.
- Shallow + bare clones: `git clone --depth 1` and `git clone --bare` were both expected to be edge cases. They turned out not to be: both work transparently. `wt list` and `wt create` succeeded against both shapes without any special handling.
- git-crypt locked repo: fixture is built and ready, but skipped on this machine because `git-crypt` isn't installed. Marked as a follow-up: the only README-promised safety guard not yet empirically verified.
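For readers who want to build something like the unmerged-commit fixture themselves, here is one way it could be scripted. This is a sketch under assumptions, not the fixture we shipped: the base branch name `main`, the commit messages, the use of empty commits, and both function names are illustrative. Only the branch name `feature/unmerged`, the three commits, and the absence of a remote come from the description above.

```python
import subprocess

# Identity flags so commits work on machines without a global git config.
GIT = ["git", "-c", "user.name=fixture", "-c", "user.email=fixture@example.invalid"]


def fixture_commands(branch: str = "feature/unmerged", commits: int = 3) -> list:
    """Return the git argv lists that build the fixture, in order."""
    cmds = [
        GIT + ["init", "--quiet"],
        # Pin the initial branch name without relying on `git init -b`.
        GIT + ["symbolic-ref", "HEAD", "refs/heads/main"],
        GIT + ["commit", "--quiet", "--allow-empty", "-m", "base"],
        GIT + ["checkout", "--quiet", "-b", branch],
    ]
    cmds += [
        GIT + ["commit", "--quiet", "--allow-empty", "-m", f"unmerged {i}"]
        for i in range(1, commits + 1)
    ]
    return cmds


def build_fixture(repo_dir: str) -> None:
    """Run the commands inside repo_dir, which must already exist."""
    for argv in fixture_commands():
        subprocess.run(argv, cwd=repo_dir, check=True)
```

No remote is ever added, so any tool that tries to fetch before deciding has to fall back to local information, which is the degraded path the fixture is meant to exercise.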
The headline scenario: 4,972 symlinks
The single most important test in the pass — and the one that validates the bug class iso-code's README is built around.
On shadcn-ui, we created a worktree and ran pnpm install inside it. pnpm uses a global content-addressable store outside the worktree, with thousands of symlinks pointing into it from each project's node_modules. The setup:
| Metric | Value |
|---|---|
| Worktree size on disk | 1.3 GB |
| Top-level node_modules | 1.2 GB |
| node_modules directories | 4 |
| Symlinks in worktree | 4,972 |
| pnpm global store size | 1.1 GB |
Then wt delete:
| Metric | Pre | Post |
|---|---|---|
| Worktree on disk | 1.3 GB | removed |
| pnpm global store | 1.1 GB | 1.1 GB unchanged |
| active_worktrees | 1 | 0 |
| Wall time | — | 12.4 s |
This is exactly where other orchestrators have shipped data-loss bugs: a worktree's node_modules is thousands of symlinks pointing into a single shared directory outside the worktree. A naive "follow symlinks during delete" implementation eats the shared store, breaking every other project on the machine. iso-code cleaned 4,972 symlinks in 12 seconds without touching a single byte of the global store.
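The difference between a safe and an unsafe delete comes down to one rule: unlink a symlink, never descend through it. The Python sketch below illustrates that rule on a plain directory tree. It is not iso-code's code, and the function names are ours.

```python
import os


def count_symlinks(root: str) -> int:
    """Count symlinks under root without following any of them."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            if os.path.islink(os.path.join(dirpath, name)):
                total += 1
    return total


def remove_tree_without_following(root: str) -> None:
    """Delete root recursively; symlinks are unlinked, never traversed."""
    if os.path.islink(root):
        os.unlink(root)  # root is itself a link: drop the link only
        return
    # Bottom-up walk, so each directory is empty by the time we rmdir it.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False, followlinks=False):
        for name in filenames:
            os.unlink(os.path.join(dirpath, name))
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                os.unlink(path)  # remove the link itself, not its target
            else:
                os.rmdir(path)
    os.rmdir(root)
```

With `followlinks=False`, `os.walk` still lists a symlinked directory among `dirnames` but does not recurse into it, which is why the `islink` test is what keeps the shared store out of reach.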
What we deliberately didn't test
An honest test pass names its gaps. These are the surfaces that aren't yet validated and that an adopter should know about.
| Surface | Why not yet |
|---|---|
| git-crypt locked-repo guard | Fixture exists; git-crypt not installed on this machine |
| Concurrent create from two shells | Single-shell harness; needs a 2-terminal coordinator |
| MCP integration with Claude Code / Cursor | Tested via CLI; not yet wired through a real MCP client |
| EcosystemAdapter / DefaultAdapter | No adapter configured in any test |
| Force-delete of unmerged branch | Hinted at by guard error; not yet exercised |
| Disk-limit enforcement | CreateOptions::ignore_disk_limit never set |
| Port-lease allocation | CreateOptions::allocate_port never set |
| Crash-safe state-file write under induced crash | Logic verified by inspection; not stress-tested |
| Windows platform | macOS-only test pass · per PRD, Windows is M3 |
Reproduce it yourself
The whole pipeline is scripted. Anything you read above can be regenerated end-to-end in 90 seconds (after the initial repo clones). The harness is idempotent — it cleans up its own worktrees and branches at start and end, and never touches your main checkout.
```sh
# Prerequisites: wt (iso-code-cli) on PATH, git, python3
git clone https://github.com/sharkdp/bat.git
cd bat
./test-lifecycle.sh .   # 15 checks, sub-3s on most repos
```
Expected output:
```
[smoke]
  [PASS] list_initial (1 entries)
  [PASS] create (242ms)
  [PASS] list_post_create (2 entries)
  [PASS] state_json_valid
  [PASS] worktree_on_disk (10132KB)
  [PASS] delete (714ms)
  [PASS] list_post_delete
  [PASS] gc_dry_run
[stretch]
  [PASS] multi_create_x3 (851ms)
  [PASS] state_count_after_multi (3 active)
  [PASS] external_remove_reconciled (active=2 stale=2)
  [PASS] lock_protection (dir preserved)
  [PASS] attach_external
  [PASS] duplicate_branch_error
  [PASS] missing_path_error
------------------------------------------
summary: 15 passed, 0 failed, 3s wall
```
Exit code is zero on all-pass, non-zero otherwise. Drop it into CI verbatim.
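The exit code is all a CI job needs, but if you also want the names of failing checks in a job summary, the output is regular enough to parse. A minimal sketch: the `[PASS]` marker is taken from the output above, while the `[FAIL]` marker is an assumption by analogy, since a failing run is not shown here. The function name `summarize` is ours.

```python
import re

# Matches "[PASS] check_name" or "[FAIL] check_name" anywhere in a line.
# The [FAIL] form is assumed; only [PASS] appears in the sample output.
PASS_FAIL = re.compile(r"\[(PASS|FAIL)\]\s+(\S+)")


def summarize(output: str) -> dict:
    """Count passed checks and list the names of failed ones."""
    results = PASS_FAIL.findall(output)
    failed = [name for status, name in results if status == "FAIL"]
    return {"passed": len(results) - len(failed), "failed": failed}
```

Section headers such as `[smoke]` and the summary line do not match the pattern, so only individual checks are counted.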
Findings filed back
Three issues were filed against the iso-code repo during this pass. The headline post covers them in detail. They are public, with repros and exact patches:
- #18: CLI silently drops extra positional args (medium · 3-line patch)
- #19: Error message leaks `Option` Debug format (low · 1-line patch)
- #21: Submodules left uninitialized in created worktrees (medium · feature addition)
To exercise the skipped git-crypt fixture, install git-crypt first with `brew install git-crypt`. Anyone can reproduce these numbers locally.