Prompt Unit Tests: 3 Bash Scripts That Catch Regressions Before Deploy

Source: DEV Community
You changed one line in your system prompt and broke three downstream features. No tests caught it because, let's be honest, you don't test your prompts. Here are three dead-simple bash scripts I use to catch prompt regressions before they hit production.

Script 1: The Golden Output Test

This script sends a fixed input to your prompt and diffs the output against a known-good response.

```bash
#!/bin/bash
# test-golden.sh — Compare prompt output against a golden file
set -euo pipefail

PROMPT_FILE="$1"
INPUT_FILE="$2"
GOLDEN_FILE="$3"

# Build the request body with jq so quotes and newlines in the
# prompt or input files are escaped correctly, instead of
# interpolating raw file contents into a heredoc.
PAYLOAD=$(jq -n \
  --rawfile system "$PROMPT_FILE" \
  --rawfile user "$INPUT_FILE" \
  '{
    model: "gpt-4o-mini",
    messages: [
      {role: "system", content: $system},
      {role: "user", content: $user}
    ],
    temperature: 0
  }')

ACTUAL=$(curl -s https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" | jq -r '.choices[0].message.content')

echo "$ACTUAL" > /tmp/prompt-test-actual.txt

if diff -q "$GOLDEN_FILE" /tmp/prompt-test-actual.txt > /dev/null; then
  echo "PASS: output matches golden file"
else
  echo "FAIL: output differs from golden file"
  diff "$GOLDEN_FILE" /tmp/prompt-test-actual.txt
  exit 1
fi
```
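Building the request body with jq rather than string interpolation matters because a system prompt almost always contains quotes and newlines, which would produce invalid JSON if pasted directly into a heredoc. A quick self-contained check of that escaping behavior, using an illustrative file under /tmp (requires jq 1.6+ for `--rawfile`):

```shell
# Write a file containing both a quoted string and a newline,
# i.e. the characters that break naive JSON interpolation.
printf 'Say "hello"\nTwice.' > /tmp/demo-input.txt

# jq -n builds JSON from scratch; --rawfile reads the file verbatim
# into $user and escapes it as a proper JSON string.
PAYLOAD=$(jq -n --rawfile user /tmp/demo-input.txt \
  '{messages: [{role: "user", content: $user}]}')

# Extracting the field recovers the original two-line text unchanged.
echo "$PAYLOAD" | jq -r '.messages[0].content'
```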