Pipeline: From Gatekeeper to Active Verifier

Traditional CI pipelines act as gatekeepers: if the code compiles, the tests pass, and no known CVEs turn up, it ships. That model worked when every commit had a human behind it who brought business context and made intentional tradeoffs. When an agent generates the code, that implicit judgment layer disappears.

The question shift (verbatim)

Traditional | Agentic
Does it compile? | Does it compile, and does it match the spec it was given?
Do tests pass? | Do tests pass, and did the agent also generate the tests (potentially biased)?
Any known vulns? | Any vulns, and did the agent introduce dependencies that don't exist in any registry?
Does lint pass? | Does the code follow the repo's architectural patterns, not just formatting?
Coverage above threshold? | Does coverage reflect meaningful assertions, or trivial tests?
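
The coverage question can be made concrete with a toy mutant. The sketch below is illustrative only: `apply_discount`, the hand-written mutant, and both tests are hypothetical names, not part of any real pipeline. It shows that a test can execute every line (full coverage) and still pass against deliberately broken code.

```python
# Illustrative sketch: all names here are hypothetical.

def apply_discount(price_cents: int, pct: int) -> int:
    """Return the price after a percentage discount (integer cents)."""
    return price_cents * (100 - pct) // 100


def apply_discount_mutant(price_cents: int, pct: int) -> int:
    """Hand-written mutant: the '-' has been flipped to '+'."""
    return price_cents * (100 + pct) // 100


def trivial_test(fn) -> None:
    # Runs the code, so it counts toward coverage, but checks almost nothing.
    assert fn(10_000, 10) is not None


def meaningful_test(fn) -> None:
    # Pins the actual behaviour: 10% off 100.00 is 90.00.
    assert fn(10_000, 10) == 9_000


def passes(test, fn) -> bool:
    """Return True if the test raises no AssertionError for this implementation."""
    try:
        test(fn)
    except AssertionError:
        return False
    return True


if __name__ == "__main__":
    for test in (trivial_test, meaningful_test):
        survived = passes(test, apply_discount_mutant)
        print(f"{test.__name__}: mutant survived = {survived}")
```

A mutant that survives the test suite is evidence that the covering tests assert too little, which is the signal a coverage threshold alone cannot give.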

Three verification layers

  1. Structural — code matches repo patterns (file placement, dep policies, naming, architecture).
  2. Semantic — code does what it claims, validated against spec acceptance criteria or behavioral diff.
  3. Provenance — every artifact traced to a legitimate source; catches fabricated deps, typosquats, supply-chain attacks.
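
One way to picture the three layers is as independent checks that each return a list of findings. The sketch below is a minimal illustration under stated assumptions: the `Change` record, its fields, and the single rule inside each layer are hypothetical stand-ins, not a description of any existing tool.

```python
# Minimal sketch; the Change record and the rule in each layer are assumptions.
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Change:
    """Hypothetical summary of an agent-authored change."""
    files: List[str]                      # paths touched by the change
    criteria_results: Dict[str, bool]     # acceptance criterion -> passed?
    new_deps: Dict[str, Optional[str]]    # package -> pinned hash, or None


def structural(change: Change) -> List[str]:
    # Assumed repo rule: test modules must live under tests/.
    return [
        f"structural: {path} is a test module outside tests/"
        for path in change.files
        if path.rsplit("/", 1)[-1].startswith("test_")
        and not path.startswith("tests/")
    ]


def semantic(change: Change) -> List[str]:
    # Every acceptance criterion from the spec must have passed.
    return [
        f"semantic: criterion not met: {name}"
        for name, ok in change.criteria_results.items()
        if not ok
    ]


def provenance(change: Change) -> List[str]:
    # Every newly added dependency must carry a pinned hash.
    return [
        f"provenance: {dep} has no pinned hash"
        for dep, pinned in change.new_deps.items()
        if pinned is None
    ]


LAYERS: List[Callable[[Change], List[str]]] = [structural, semantic, provenance]


def verify(change: Change) -> List[str]:
    """Run every layer and collect findings; an empty list means the change passes."""
    findings: List[str] = []
    for layer in LAYERS:
        findings.extend(layer(change))
    return findings
```

Keeping each layer a separate function means a failure report names the layer that failed, rather than collapsing everything into one pass/fail bit.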

Agent-specific threat model

  • Prompt injection via code comments or issue descriptions.
  • Supply chain poisoning when agents add deps autonomously.
  • Scope creep when agents modify workflow files, deploy scripts, or security configs beyond the task.

Safeguards: path-based restrictions, dependency allowlists, signature/provenance verification, prompt-injection scanning.
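
Two of these safeguards, path-based restrictions and dependency allowlists, are simple enough to sketch directly. The patterns and package names below are placeholder assumptions; a real repository would load its own policy from configuration.

```python
# Sketch only: the patterns and the allowlist are placeholder assumptions.
from fnmatch import fnmatchcase
from typing import Iterable, List, Set

# Paths an agent should not modify without explicit human approval.
PROTECTED_PATTERNS: List[str] = [
    ".github/workflows/*",
    "deploy/*",
    "*.pem",
]

# Dependencies the agent may add without review.
ALLOWED_DEPS: Set[str] = {"requests", "pydantic"}


def protected_paths_touched(
    changed_files: Iterable[str],
    patterns: Iterable[str] = PROTECTED_PATTERNS,
) -> List[str]:
    """Return changed files that match any protected pattern.

    fnmatchcase's '*' also matches '/', so 'deploy/*' covers nested paths.
    """
    patterns = list(patterns)
    return [
        path
        for path in changed_files
        if any(fnmatchcase(path, pat) for pat in patterns)
    ]


def deps_outside_allowlist(
    added_deps: Iterable[str],
    allowlist: Set[str] = ALLOWED_DEPS,
) -> List[str]:
    """Return added dependencies that are not on the allowlist, sorted."""
    return sorted(dep for dep in added_deps if dep.lower() not in allowlist)
```

An exact-match allowlist also rejects near-miss names such as a transposed-letter typosquat, since anything not listed is flagged by default.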

Hallucination gates

  • Verify every added package actually exists in its registry (fabricated dep detection).
  • Strict type-checking + integration tests (catches dead/incorrect API usage).
  • Mutation testing + coverage quality analysis (catches self-validating tests).
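
The first gate, fabricated-dependency detection, can be sketched for Python packages as follows. It assumes the dependencies come from PyPI and uses PyPI's JSON endpoint (`https://pypi.org/pypi/<name>/json`), which returns 404 for projects that do not exist. The function names are hypothetical, and other ecosystems would need their own registry lookup.

```python
# Sketch assuming PyPI as the registry; function names are hypothetical.
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Iterable, List


def exists_on_pypi(name: str, timeout: float = 10.0) -> bool:
    """Return True if PyPI serves metadata for this project name."""
    url = f"https://pypi.org/pypi/{urllib.parse.quote(name, safe='')}/json"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        # Any other HTTP failure is not proof of absence, so surface it.
        raise


def fabricated_deps(
    added: Iterable[str],
    exists: Callable[[str], bool] = exists_on_pypi,
) -> List[str]:
    """Return the added package names that the registry does not know."""
    return [name for name in added if not exists(name)]
```

The `exists` parameter is injectable so the gate can be unit-tested without network access, for example `fabricated_deps(["requests", "made-up-pkg"], exists=lambda n: n == "requests")`. Existence alone does not establish trust: a typosquat exists in the registry too, which is why this gate complements rather than replaces allowlists and provenance checks.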

Why this matters

This is the operationalised form of verifiability-frontier: Eric says "verifiable tasks are agent-territory"; Sanchez says "here's how to make the verification layer the actual bottleneck." It also applies ai-generated-code-is-untrusted at the CI layer.

Connects to