Written by - Millan Kaul

Spec-driven development (SDD) positions QA as the enforcer of executable specs, the north star for AI reliability. In AI-native workflows, QA shifts from reactive bug-hunting to proactive spec validation, catching hallucinations before deployment.

Opinion: Without QA owning SDD, AI ships “good enough” disasters; with it, specs become living tests.

QA’s Pivotal Role

QA authors and validates specs as contracts (inputs, outputs, edge cases), then verifies that AI outputs match them.

  • Tools like GitHub Spec Kit help generate tests from specs, reducing manual toil by an estimated 40%.
  • QA metrics: schema compliance, edge-case coverage >90%.
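
The two metrics above can be made concrete in a few lines of Python. This is a minimal sketch, assuming responses are plain dicts and edge cases are tracked by name; `schema_compliance`, `edge_coverage`, and the sample data are hypothetical and not taken from any tool mentioned here.

```python
# Hypothetical metric helpers; names and sample data are illustrative assumptions.

def schema_compliance(responses: list[dict], required_keys: set[str]) -> float:
    """Fraction of responses whose keys exactly match the spec's output keys."""
    if not responses:
        return 0.0
    compliant = sum(1 for r in responses if set(r) == required_keys)
    return compliant / len(responses)


def edge_coverage(spec_edges: set[str], tested_edges: set[str]) -> float:
    """Fraction of edge cases named in the spec that have at least one test."""
    if not spec_edges:
        return 1.0
    return len(spec_edges & tested_edges) / len(spec_edges)


responses = [
    {"id": "u1", "name": "Ada"},
    {"id": "u2", "name": "Lin", "email": "lin@example.com"},  # extra key: non-compliant
]
spec_edges = {"empty_name", "invalid_email", "unicode_name", "duplicate_email"}
tested_edges = {"empty_name", "invalid_email", "unicode_name"}

compliance = schema_compliance(responses, {"id", "name"})
coverage = edge_coverage(spec_edges, tested_edges)
print(f"schema compliance: {compliance:.0%}, edge coverage: {coverage:.0%}")
print("meets >90% edge target:", coverage > 0.90)
```

In this made-up sample, three of four spec edge cases are tested, so the run falls short of the >90% target and QA would ask for the missing case before sign-off.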

QA-Centric Workflow

  • QA Writes Spec: YAML or Markdown with assertions (e.g., “no PII, confidence >0.8”).
  • AI Task Breakdown: prompt-engineered plans, reviewed by QA.
  • Implement + QA Gate: code and prompts must pass the spec evals (LLM-as-judge, unit tests).
  • QA Iteration: human-in-the-loop review of failures.
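
The gate and iteration steps can be sketched as a small function that applies the example assertions (“no PII, confidence >0.8”) and routes failures to a review queue. This is only an illustration under stated assumptions: `qa_gate`, `GateResult`, `review_queue`, and the crude email regex are hypothetical, and an LLM-as-judge check would slot in as one more condition.

```python
import re
from dataclasses import dataclass, field

# Crude email detector, for illustration only; real PII detection needs more than this.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class GateResult:
    passed: bool
    failures: list[str] = field(default_factory=list)


def qa_gate(output_text: str, confidence: float, min_confidence: float = 0.8) -> GateResult:
    """Apply the example spec assertions: no PII, confidence > 0.8."""
    failures = []
    if EMAIL_RE.search(output_text):
        failures.append("no_pii")
    if not confidence > min_confidence:
        failures.append("confidence")
    return GateResult(passed=not failures, failures=failures)


review_queue = []  # stand-in for the human-in-the-loop step

candidates = [
    ("Created user Ada", 0.93),
    ("Created user ada@example.com", 0.95),
    ("Created user Lin", 0.61),
]
for text, confidence in candidates:
    result = qa_gate(text, confidence)
    if not result.passed:
        # Failed outputs are not shipped; a human reviews them with the reasons attached.
        review_queue.append((text, result.failures))

print(review_queue)
```

Returning the list of failed assertions, rather than a bare boolean, gives the reviewer the reason for each rejection.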

Sample: QA Spec → PII-Safe API

The QA spec enforces that no PII leaks into the response.

# qa_spec.yaml

spec:
  endpoint: /users
  input: {email: string@format=email, name: string}
  output: {id: string, name: string}  # No email echo!
  assertions:
    - no_pii: true  # QA guardrail hook
    - schema_match: 100%
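
One way to turn this contract into an executable check is sketched below. It assumes the output section of the spec is mirrored by hand in a Python dict (`OUTPUT_SPEC`) rather than parsed from the YAML file; the `violations` helper is hypothetical and only covers the two assertions shown: exact schema match and no email echo.

```python
# Hand-copied mirror of the spec's output contract (assumption: not parsed from YAML).
OUTPUT_SPEC = {"id": str, "name": str}


def violations(response: dict, input_email: str) -> list[str]:
    """Return human-readable spec violations; an empty list means the response passes."""
    problems = []
    missing = OUTPUT_SPEC.keys() - response.keys()
    extra = response.keys() - OUTPUT_SPEC.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if extra:
        problems.append(f"unexpected fields: {sorted(extra)}")
    for key, expected_type in OUTPUT_SPEC.items():
        if key in response and not isinstance(response[key], expected_type):
            problems.append(f"{key} should be {expected_type.__name__}")
    # no_pii: the submitted email must not be echoed in any output value
    if any(input_email in str(value) for value in response.values()):
        problems.append("email echoed in output")
    return problems


good = violations({"id": "42", "name": "Ada"}, "ada@example.com")
leaky = violations({"id": "ada@example.com", "name": "Ada"}, "ada@example.com")
print(good)
print(leaky)
```

The second call shows why the spec comment matters: a response can match the schema exactly and still leak the email through the `id` field.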

FastAPI + QA Guardrail

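# app.py

from uuid import uuid4

from fastapi import FastAPI, HTTPException
from nemoguardrails import LLMRails, RailsConfig  # QA layer
from pydantic import BaseModel, EmailStr

app = FastAPI()
rails = LLMRails(RailsConfig.from_path("./qa_config"))  # Spec + PII rails

class User(BaseModel):
    email: EmailStr
    name: str

# Placeholder hooks (assumptions, not NeMo Guardrails APIs): replace the bodies
# with calls into `rails` once ./qa_config defines the input and output rails.
def input_allowed(payload: dict) -> bool:
    return "@" not in payload["name"]  # e.g. no email smuggled into the name

def output_allowed(result: dict) -> bool:
    return set(result) == {"id", "name"} and "@" not in result["id"]

@app.post("/users")
async def create(user: User):
    # QA pre-check: block disallowed input before any processing
    if not input_allowed(user.model_dump()):
        raise HTTPException(status_code=400, detail="PII detected")
    # LLM call or business logic goes here; the output is validated afterwards
    result = {"id": str(uuid4()), "name": user.name}  # opaque id, no email echo
    # QA post-check: output must match the spec schema
    if not output_allowed(result):
        raise HTTPException(status_code=500, detail="Spec violation")
    return result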

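Run the QA tests with pytest. Note that pytest has no built-in --spec option that takes a file: an invocation such as pytest --spec qa_spec.yaml only works if the project registers that option itself (for example in conftest.py). Passing tests demonstrate alignment only for the cases the spec actually covers.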

QA Wins

  • Cuts spec drift by an estimated 50% via evals.
  • Guardrails act as extensions of the spec.

QA Checklist

  • Spec coverage: Edges/PII.
  • Auto-eval: >95% pass.
  • CI/CD gate: QA signoff.

QA makes SDD unbreakable: AI follows specs, not whims.