ADR-030: CodeQL Summary Job Pattern for Required Status Checks
Status
Accepted - 2025-10-29
Context
Fork repositories experienced PR blocking issues when using the CodeQL workflow with GitHub repository rulesets. Two related problems emerged:
Problem 1: Required Status Check Mismatch
Repository rulesets configured during initialization require a status check named "CodeQL":
{
  "type": "required_status_checks",
  "parameters": {
    "required_status_checks": [
      {"context": "CodeQL"}
    ]
  }
}
However, the CodeQL workflow provided different job names:
- "Check if Code Changes Present" (always runs)
- "Detect Project Languages" (conditional)
- "Analyze Code" (conditional)
When PRs contained only configuration changes (e.g., dependabot.yml, workflows), the CodeQL workflow correctly skipped analysis to save resources. However, no check named "CodeQL" ever reported, leaving PRs in a BLOCKED state indefinitely.
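The mismatch can be illustrated with a small sketch. It assumes, as a simplification, that a required context is satisfied only when a reported check has exactly that name; the function name required_context_reported is hypothetical and is not part of the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if one of the reported check names
# equals the required context exactly (assumed simplification of the rule).
required_context_reported() {
  local required="$1"; shift
  local name
  for name in "$@"; do
    [ "$name" = "$required" ] && return 0
  done
  return 1
}

# Job names the workflow exposed before this change.
if required_context_reported "CodeQL" \
    "Check if Code Changes Present" \
    "Detect Project Languages" \
    "Analyze Code"; then
  echo "requirement satisfied"
else
  echo "no check named CodeQL reported; PR stays blocked"
fi
```

None of the three job names equals "CodeQL", so the sketch takes the blocked branch regardless of whether those jobs passed or were skipped.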
Problem 2: Code Scanning Rule Blocking Config-Only PRs
The default branch protection ruleset included a code_scanning rule:
{
  "type": "code_scanning",
  "parameters": {
    "code_scanning_tools": [{
      "tool": "CodeQL",
      "security_alerts_threshold": "high_or_higher",
      "alerts_threshold": "errors"
    }]
  }
}
This rule requires CodeQL to upload SARIF (Static Analysis Results Interchange Format) results for the PR. For config-only PRs, where analysis is appropriately skipped, no SARIF is uploaded, so the PR is blocked permanently. Even an admin override (gh pr merge --admin) failed.
Impact
- Template sync PRs blocked: Config-only changes (dependabot, workflow updates) couldn't merge
- Dependabot PRs blocked: Dependency updates for .github/ files couldn't merge
- Manual intervention required: Every config-only PR needed manual ruleset modification
- Inconsistent security posture: Some forks removed the rule manually, creating inconsistency
Decision
Implement a two-part solution:
1. Add CodeQL Summary Job
Add a summary job to the CodeQL workflow with the following properties:
- Named "CodeQL": Matches the required status check context exactly
- Always executes: Uses if: always() to run regardless of previous job outcomes
- Smart validation: Reports success for config-only changes, validates analysis results for code changes
- Proper failure handling: Fails if actual analysis fails or is cancelled
CodeQL:
  name: CodeQL  # Job name becomes the check context
  runs-on: ubuntu-latest
  needs: [check-paths, detect-languages, analyze]
  if: always()  # Executes even when analysis skips
  steps:
    - name: Report CodeQL Status
      run: |
        # Validate check-paths succeeded
        if [ "${{ needs.check-paths.result }}" != "success" ]; then
          exit 1
        fi
        # Config-only changes: Report success
        if [ "${{ needs.check-paths.outputs.should-run }}" = "false" ]; then
          echo "✅ CodeQL skipped - only configuration files changed"
          exit 0
        fi
        # Code changes: Validate analysis completed
        if [ "${{ needs.analyze.result }}" = "failure" ] ||
           [ "${{ needs.analyze.result }}" = "cancelled" ]; then
          exit 1
        fi
        echo "✅ CodeQL checks completed successfully"
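To make the decision logic easier to reason about outside a workflow run, the same branching can be sketched as a plain shell function. The function name codeql_summary_status and its positional arguments are hypothetical; they stand in for the needs.* expressions the job reads.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the summary step. Arguments mirror:
#   $1 = needs.check-paths.result
#   $2 = needs.check-paths.outputs.should-run
#   $3 = needs.analyze.result
codeql_summary_status() {
  local check_paths_result="$1" should_run="$2" analyze_result="$3"

  # check-paths must succeed before anything else is evaluated
  if [ "$check_paths_result" != "success" ]; then
    echo "check-paths result: $check_paths_result"
    return 1
  fi

  # Config-only change: analysis was skipped on purpose
  if [ "$should_run" = "false" ]; then
    echo "CodeQL skipped - only configuration files changed"
    return 0
  fi

  # Code change: a failed or cancelled analysis fails the check
  if [ "$analyze_result" = "failure" ] || [ "$analyze_result" = "cancelled" ]; then
    echo "CodeQL analysis result: $analyze_result"
    return 1
  fi

  echo "CodeQL checks completed successfully"
  return 0
}

# The two scenarios this ADR cares about, plus a failing analysis:
codeql_summary_status success false skipped    # config-only PR
codeql_summary_status success true  success    # code PR, analysis passed
codeql_summary_status success true  failure || echo "(check would fail)"
```

One observation from this sketch: with should-run set to true, an analyze result of skipped (for example, if detect-languages failed) falls through to the success branch. Whether that case should also fail the check is not addressed in this ADR.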
2. Remove Code Scanning Rule from Template
Remove the code_scanning rule from .github/rulesets/default-branch.json.
Rationale:
- The rule requires SARIF upload for every PR
- Config-only PRs appropriately skip analysis (no code to scan)
- No SARIF uploaded → PR permanently blocked
- Status checks provide sufficient validation
- CodeQL still runs on code changes and reports findings
Rationale
Why Summary Job Pattern
- Workflow Composability: GitHub Actions doesn't support renaming jobs dynamically
- Single Source of Truth: One job consolidates the status of multiple conditional jobs
- Standard Pattern: Widely used in GitHub Actions for exactly this use case
- Flexibility: Can add more checks in the future without changing ruleset
Why Remove Code Scanning Rule
- SARIF Upload Requirement: The rule fundamentally requires results upload
- No Conditional SARIF: Can't conditionally satisfy the rule (it either has results or doesn't)
- Analysis Already Validated: The status check validates that analysis ran when needed
- Security Findings Still Visible: Results still appear in Security tab when analysis runs
- Most Teams Prefer Triage: Blocking on alert thresholds is often too strict; teams prefer to triage findings
Alternatives Considered
1. Always Upload Empty SARIF for Skipped Analysis
Approach: Generate and upload a minimal valid SARIF file when analysis is skipped
- name: Upload empty SARIF for skipped analysis
  if: needs.check-paths.outputs.should-run == 'false'
  uses: github/codeql-action/upload-sarif@v4
  with:
    sarif_file: empty.sarif
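For context on what this alternative would have involved, here is a minimal sketch of generating such a file. It assumes a SARIF 2.1.0 document with a single run and an empty results array is enough; the tool name under driver is an assumption, and the ADR does not record the file that was actually considered.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal SARIF 2.1.0 document with one run and no results.
# The driver name "CodeQL" is an assumption made for illustration.
cat > empty.sarif <<'EOF'
{
  "version": "2.1.0",
  "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
  "runs": [
    {
      "tool": { "driver": { "name": "CodeQL" } },
      "results": []
    }
  ]
}
EOF
echo "wrote empty.sarif"
```

Even in this small form, the file is a scan record that claims zero findings for code that was never analyzed, which is the "misleading data" concern listed in the cons.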
Pros:
- Satisfies the code_scanning rule
- Maintains alert threshold blocking capability
Cons:
- ❌ Adds complexity and maintenance burden
- ❌ Generates misleading data (empty scan results)
- ❌ Violates principle of least surprise
- ❌ No real security benefit over status checks
Decision: Rejected due to complexity without meaningful security benefit
2. Rename Existing Jobs to "CodeQL"
Approach: Change check-paths job name to "CodeQL"
Pros:
- Simple, no new jobs
Cons:
- ❌ Loses semantic meaning ("CodeQL" doesn't describe what check-paths does)
- ❌ Doesn't solve code_scanning rule problem
- ❌ Confusing when check-paths succeeds but analysis never ran
Decision: Rejected due to loss of clarity
3. Use Workflow-Level Required Checks Only
Approach: Remove job-level requirements, require entire workflow success
Pros:
- Simpler configuration
Cons:
- ❌ GitHub rulesets don't support workflow-level checks (only job names)
- ❌ Not technically feasible with current GitHub features
Decision: Rejected as not possible
4. Keep Code Scanning Rule, Block Config PRs
Approach: Accept that config-only PRs will block and require manual override
Pros:
- Maintains strict security enforcement
Cons:
- ❌ Terrible user experience for legitimate PRs
- ❌ Breaks automation (Dependabot, template sync)
- ❌ Manual intervention doesn't scale
- ❌ Security theater (config files don't need CodeQL)
Decision: Rejected due to operational impact
Consequences
Positive
- ✅ Config-only PRs work: Dependabot, template sync, and workflow updates merge without blocking
- ✅ Security scanning preserved: CodeQL still runs on code changes
- ✅ Findings visible: Results still appear in Security tab
- ✅ Better automation: Template sync works reliably
- ✅ Consistent behavior: All forks behave identically
- ✅ Clear semantics: "CodeQL" check clearly represents CodeQL validation status
Negative
- ❌ Lost alert threshold blocking: Can't require "zero high-severity alerts" before merge
- ❌ Manual security triage: Teams must review findings rather than being blocked
- ❌ Existing forks need update: Forks created before this change need manual ruleset modification
Mitigations
- Security findings still generate notifications and appear in Security tab
- Teams can still review findings before merge (just not forced to)
- Most organizations prefer manual triage over strict blocking (false positives common)
- The summary job can be enhanced later to check alert severity if needed
Implementation
Files Changed
- .github/template-workflows/codeql.yml
  - Added CodeQL summary job
  - Validates previous jobs and reports consolidated status
  - Added build artifact exclusion patterns
- .github/rulesets/default-branch.json
  - Removed code_scanning rule
  - Kept required_status_checks with "CodeQL" context
Build Artifact Exclusion Pattern
Problem: CodeQL language detection was finding generated Python files in build directories (e.g., build-aws/build-info.py), attempting to analyze them, and failing because they're malformed or auto-generated code.
Error Example:
[ERROR] Failed to extract file /home/runner/work/partition/partition/provider/partition-aws/build-aws/build-info.py: 'name'
CodeQL detected code written in Python but could not process any of it.
Solution: Exclude build and generated code directories from both language detection and CodeQL analysis:
Language Detection (lines 102-115):
# Check for Python (exclude build directories and common generated code paths)
if [ -f "setup.py" ] || [ -f "pyproject.toml" ] || [ -f "requirements.txt" ] || \
   find . -name "*.py" \
     -not -path "./.*" \
     -not -path "*/build/*" \
     -not -path "*/build-*/*" \
     -not -path "*/target/*" \
     -not -path "*/dist/*" \
     -not -path "*/__pycache__/*" \
     -not -path "*/.venv/*" \
     -not -path "*/venv/*" \
     -not -path "*/node_modules/*" | grep -q .; then
  LANGUAGES=$(echo "$LANGUAGES" | jq -c '. + ["python"]')
fi
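The effect of the exclusion patterns can be checked locally against a throwaway directory tree. The sketch below reuses the same find filter; the helper name has_python_sources and the src/app.py file are illustrative assumptions, while the build-aws/build-info.py path comes from the error example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a .py file exists outside the excluded paths.
has_python_sources() {
  (
    cd "$1" || exit 2
    find . -name "*.py" \
      -not -path "./.*" \
      -not -path "*/build/*" \
      -not -path "*/build-*/*" \
      -not -path "*/target/*" \
      -not -path "*/dist/*" \
      -not -path "*/__pycache__/*" \
      -not -path "*/.venv/*" \
      -not -path "*/venv/*" \
      -not -path "*/node_modules/*" | grep -q .
  )
}

demo=$(mktemp -d)

# Only a generated file inside a build-* directory: should not count as Python.
mkdir -p "$demo/provider/partition-aws/build-aws"
touch "$demo/provider/partition-aws/build-aws/build-info.py"
if has_python_sources "$demo"; then
  echo "python detected"
else
  echo "python not detected (generated file ignored)"
fi

# Add a real source file (illustrative path): now Python should be detected.
mkdir -p "$demo/src"
touch "$demo/src/app.py"
if has_python_sources "$demo"; then
  echo "python detected"
fi

rm -rf "$demo"
```

Note that this covers only the find branch; a repository with setup.py, pyproject.toml, or requirements.txt at its root is still detected as Python by the first three tests in the original condition.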
CodeQL Configuration (lines 172-189):
- name: Initialize CodeQL
  uses: github/codeql-action/init@v4
  with:
    languages: ${{ matrix.language }}
    queries: security-extended
    build-mode: none
    config: |
      paths-ignore:
        - '**/build/**'
        - '**/build-*/**'
        - '**/target/**'
        - '**/dist/**'
        - '**/__pycache__/**'
        - '**/.venv/**'
        - '**/venv/**'
        - '**/node_modules/**'
        - '**/.pytest_cache/**'
        - '**/.mypy_cache/**'
Benefits:
- Prevents false language detection from build artifacts
- Avoids CodeQL analysis failures on generated code
- Focuses security analysis on actual source code
- Reduces analysis time by skipping irrelevant paths
Migration Path for Existing Forks
Forks created before this change may have the code_scanning rule in their rulesets. To fix:
- Navigate to: Settings → Rules → Rulesets → "Default Branch Protection"
- Edit the ruleset
- Remove the "Code scanning" rule
- Save changes
Or via API:
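The exact command is not reproduced here. What follows is a minimal sketch of one way to do it, assuming the gh CLI (authenticated with admin access to the repository) and jq are available, and that the ruleset update endpoint accepts a body containing only the rules array. The function names and the OWNER/REPO placeholder are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: assumes gh and jq are installed and gh is authenticated.

# Read a ruleset JSON document on stdin and emit a body holding
# every rule except those of type "code_scanning".
strip_code_scanning() {
  jq '{rules: [.rules[] | select(.type != "code_scanning")]}'
}

# Hypothetical wrapper: $1 = OWNER/REPO, $2 = ruleset name as shown in Settings.
remove_code_scanning_rule() {
  local repo="$1" ruleset_name="$2" id

  # Look up the ruleset id by name
  id=$(gh api "repos/$repo/rulesets" \
        --jq ".[] | select(.name == \"$ruleset_name\") | .id")
  if [ -z "$id" ]; then
    echo "ruleset not found: $ruleset_name" >&2
    return 1
  fi

  # Fetch the full ruleset, filter its rules, and write the result back
  gh api "repos/$repo/rulesets/$id" \
    | strip_code_scanning \
    | gh api --method PUT "repos/$repo/rulesets/$id" --input -
}

# Example invocation (not executed here; OWNER/REPO is a placeholder):
# remove_code_scanning_rule OWNER/REPO "Default Branch Protection"
```

It is worth reviewing the output of strip_code_scanning for a given fork before sending the PUT, since rulesets edited by hand may differ from the template.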
Next template sync will provide the updated CodeQL workflow automatically.
Related Decisions
- ADR-010: YAML-safe shell scripting pattern (used in summary job)
- ADR-028: Workflow script extraction pattern (influenced job structure)
- Product Spec: CodeQL workflow specification (dynamic path filtering)
Success Criteria
- ✅ Config-only PRs merge without manual intervention
- ✅ Code changes still trigger CodeQL analysis
- ✅ Security findings appear in Security tab
- ✅ Status check "CodeQL" reports correctly for both scenarios
- ✅ Existing forks can apply fix with documented migration path
References
- GitHub Issue: PR blocking on template sync (2025-10-29)
- Production Testing: danielscholl-osdu/partition, danielscholl-osdu/entitlements
- GitHub Docs: Repository rulesets
- GitHub Docs: CodeQL code scanning