ADR-007: Initialization Workflow Bootstrap Pattern
Status
Accepted - 2025-10-01
Context
During testing with OSDU repositories, we discovered a critical bootstrap problem in the template initialization process:
The Bootstrap Problem: When a new repository is created from this template, it runs the initialization workflow from the template's initial commit, not the current/updated version. This means any fixes or improvements to the initialization workflow (like adding --allow-unrelated-histories to handle merge conflicts) are not available during the actual initialization.
Discovery Timeline:
1. User creates repository from template
2. Initialization workflow runs from commit b40474835cd53d4e78bf20108e18ac6178af6842
3. Workflow fails at: git merge fork_integration --no-ff -m "chore: complete repository initialization"
4. Error: fatal: refusing to merge unrelated histories
5. Current template already has the fix: --allow-unrelated-histories flag
6. But the fix isn't available to the running workflow
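The failure in steps 3 and 4 can be reproduced in a scratch repository. The following is a minimal sketch, not the template's own workflow: the file names and commit messages are placeholders, and only the branch name fork_integration and the merge command come from the timeline above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch repository; file names and commit messages are illustrative only.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "demo"
git config user.email "demo@example.com"

# A commit on main, standing in for the template's content.
echo "template" > template.txt
git add template.txt
git commit -q -m "template: initial commit"

# An orphan branch shares no ancestor with main, like imported upstream code.
git checkout -q --orphan fork_integration
git rm -rf -q .
echo "upstream" > upstream.txt
git add upstream.txt
git commit -q -m "upstream: initial commit"
git checkout -q main

# Without the flag, git refuses to merge the unrelated history.
if merge_err="$(git merge fork_integration --no-ff \
    -m "chore: complete repository initialization" 2>&1)"; then
  plain_merge="succeeded"
else
  plain_merge="refused"
fi

# With --allow-unrelated-histories the same merge goes through.
git merge fork_integration --allow-unrelated-histories --no-ff -q \
  -m "chore: complete repository initialization"
parent_count="$(git rev-list --parents -n 1 HEAD | wc -w)"
parent_count=$((parent_count - 1))

echo "plain merge: $plain_merge"
echo "parents of merge commit: $parent_count"
```

There are no overlapping files in this sketch, so the flagged merge completes without conflicts.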
Additional Discovery: Even with --allow-unrelated-histories, merge conflicts occur in common files:
- .gitignore - Template version vs upstream version
- README.md - Template documentation vs upstream documentation
Solution: use the -X theirs merge strategy option to resolve these conflicts automatically.
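To illustrate the conflict case, the sketch below gives both histories their own .gitignore and README.md and merges with -X theirs. The file contents are made up for the demonstration. Because the two versions share no lines, the merged files end up matching the fork_integration side; files that differ only in part would keep non-conflicting changes from both sides.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch repository; file contents are placeholders.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "demo"
git config user.email "demo@example.com"

# Template side: its own .gitignore and README.md.
echo "template-ignore" > .gitignore
echo "template docs" > README.md
git add .gitignore README.md
git commit -q -m "template: initial commit"

# Upstream side on an unrelated history, using the same file names.
git checkout -q --orphan fork_integration
git rm -rf -q .
echo "upstream-ignore" > .gitignore
echo "upstream docs" > README.md
git add .gitignore README.md
git commit -q -m "upstream: initial commit"
git checkout -q main

# -X theirs resolves conflicting hunks in favor of the branch being merged in.
git merge fork_integration --allow-unrelated-histories --no-ff -X theirs -q \
  -m "chore: complete repository initialization"

readme_after="$(cat README.md)"
ignore_after="$(cat .gitignore)"
echo "README.md: $readme_after"
echo ".gitignore: $ignore_after"
```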
Permission Discovery: The built-in GITHUB_TOKEN lacks permissions to create repository secrets:
- Error: HTTP 403: Resource not accessible by integration
- Solution: Use Personal Access Token (PAT) stored as GH_TOKEN secret
- Fallback: Skip secret creation with warning if PAT not available
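The fallback could look roughly like the sketch below. The helper name set_secret_or_warn and the secret name EXAMPLE_SECRET are hypothetical, and the sketch assumes the workflow step exposes the PAT to the script as the GH_TOKEN environment variable, which the gh CLI reads for authentication.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: create a repository secret only when a PAT is present.
set_secret_or_warn() {
  local name="$1" value="$2"
  if [ -z "${GH_TOKEN:-}" ]; then
    # "::warning::" is the GitHub Actions workflow command for a warning annotation.
    echo "::warning::GH_TOKEN not set; skipping creation of secret ${name}"
    return 0
  fi
  # Only reached when a PAT is available; gh picks up GH_TOKEN from the environment.
  gh secret set "$name" --body "$value"
}

# With no PAT available the helper warns and the step still succeeds.
result="$(GH_TOKEN="" set_secret_or_warn EXAMPLE_SECRET "example-value")"
echo "$result"
```

Returning 0 in the missing-PAT branch keeps initialization going, at the cost of leaving the secret for the user to create manually.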
Root Cause: GitHub Actions runs workflows from the commit that triggered the event. For template-created repositories, this is the initial commit containing the old workflow version, creating a chicken-and-egg problem.
Decision
Implement a self-updating initialization workflow pattern:
- Phase 1: Bootstrap Update - The initialization workflow first updates itself from the template repository
- Phase 2: Execute Initialization - Run the updated workflow logic
This ensures that any fixes or improvements to the initialization process are immediately available to new repositories.
Rationale
Why This Matters
- Fix Propagation: Critical fixes reach new repositories immediately
  - --allow-unrelated-histories for unrelated history errors
  - -X theirs for automatic merge conflict resolution
- Continuous Improvement: Template improvements benefit all future users
- Reduced Support: Users don't encounter already-fixed issues
- Maintainability: Single source of truth for initialization logic
Alternative Approaches Considered
1. Document Manual Workarounds
- Approach: Tell users to manually run missing commands when initialization fails
- Pros: Simple, no code changes needed
- Cons: Poor user experience, requires technical knowledge, defeats automation purpose
- Decision: Rejected - Goes against the template's goal of automation
2. Pre-create All Branches in Template
- Approach: Include fork_upstream and fork_integration branches in template
- Pros: Might avoid unrelated histories issue
- Cons: Pollutes template with upstream-specific content, still doesn't solve workflow updates
- Decision: Rejected - Doesn't address root cause
3. External Initialization Script
- Approach: Use a separate script hosted externally that gets downloaded and run
- Pros: Always runs latest version
- Cons: External dependency, security concerns, complexity
- Decision: Rejected - Adds unnecessary external dependencies
4. Two-Stage Workflow with Self-Update
- Approach: Workflow updates itself before running initialization
- Pros: Self-contained, automatic, always uses latest fixes
- Cons: Slightly more complex workflow logic
- Decision: Accepted - Best balance of automation and reliability
Implementation Details
Proposed Workflow Structure
name: Initialize Fork
on:
  issue_comment:
    types: [created]
jobs:
  update-workflow:
    name: Update initialization workflow
    runs-on: ubuntu-latest
    steps:
      - name: Checkout current repository
        uses: actions/checkout@v5
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
      - name: Fetch latest workflow from template
        run: |
          # Add template as remote
          git remote add template https://github.com/azure/osdu-spi.git
          git fetch template main
          # Update workflows to latest version
          git checkout template/main -- .github/workflows/init.yml
          git checkout template/main -- .github/workflows/init-complete.yml
          # Commit if changes exist
          if git diff --staged --quiet; then
            echo "Workflows are already up to date"
          else
            git config user.name "github-actions[bot]"
            git config user.email "github-actions[bot]@users.noreply.github.com"
            git commit -m "chore: update initialization workflows to latest version"
            git push
          fi
  initialize:
    name: Initialize repository
    needs: update-workflow
    uses: ./.github/workflows/init-complete.yml
    # Now runs with updated workflow
Key Design Elements
- Self-Updating: Workflow fetches its own latest version before executing
- Idempotent: Safe to run multiple times, only updates if changes exist
- Transparent: Users see workflow update commit in history
- Secure: Uses same permissions, no external dependencies
Consequences
Positive
- Automatic Fix Distribution: All template improvements immediately available
- Reduced User Friction: Initialization "just works" with latest fixes
- Simplified Support: No need to maintain fix instructions for old versions
- Better User Experience: Users always get the best version of initialization
- Traceable Updates: Git history shows when workflows were updated
Negative
- Additional Complexity: Two-phase initialization adds complexity
- Extra Commit: Creates an additional commit in repository history
- Potential Conflicts: If user modifies workflows before initialization completes
- Dependency on Template: Requires template repository to remain accessible
Mitigation Strategies
- Clear Documentation: Explain the self-update process in initialization issue
- Error Handling: Graceful fallback if template fetch fails
- Version Checking: Only update if template version is newer
- Conflict Prevention: Check for local modifications before updating
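The error-handling and conflict-prevention mitigations could be combined along the lines of the sketch below. It is an illustration under stated assumptions, not the proposed workflow step: the function name update_from_template is hypothetical, it fetches into FETCH_HEAD instead of adding a named remote, and local scratch repositories stand in for the template and the new repository. Version checking is not covered.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: refresh one file from the template's main branch.
update_from_template() {
  local template_url="$1" path="$2"
  # Conflict prevention: leave locally modified files alone.
  if ! git diff --quiet HEAD -- "$path"; then
    echo "skipped: local modifications in $path"
    return 0
  fi
  # Error handling: an unreachable template is not fatal.
  if ! git fetch -q "$template_url" main 2>/dev/null; then
    echo "skipped: template unavailable"
    return 0
  fi
  git checkout FETCH_HEAD -- "$path"
  if git diff --staged --quiet; then
    echo "up to date"
  else
    echo "updated: $path"
  fi
}

# Scratch "template" repository carrying a newer workflow file.
work="$(mktemp -d)"
git init -q "$work/template"
git -C "$work/template" symbolic-ref HEAD refs/heads/main
mkdir -p "$work/template/.github/workflows"
echo "name: init v2" > "$work/template/.github/workflows/init.yml"
git -C "$work/template" add .
git -C "$work/template" -c user.name=demo -c user.email=demo@example.com \
  commit -q -m "template v2"

# Scratch "new repository" still holding the older workflow file.
git init -q "$work/fork"
cd "$work/fork"
git symbolic-ref HEAD refs/heads/main
git config user.name "demo"
git config user.email "demo@example.com"
mkdir -p .github/workflows
echo "name: init v1" > .github/workflows/init.yml
git add .
git commit -q -m "fork v1"

missing="$(update_from_template "$work/does-not-exist" .github/workflows/init.yml)"
updated="$(update_from_template "$work/template" .github/workflows/init.yml)"
echo "$missing"
echo "$updated"
```

The helper stages the refreshed file but does not commit or push, so the caller decides whether to record the update.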
Success Criteria
- ✅ New repositories always use latest initialization workflow
- ✅ Critical fixes are immediately available:
  - --allow-unrelated-histories for unrelated history errors
  - -X theirs for automatic merge conflict resolution
  - PAT usage for secret creation operations
  - Template file cleanup and repository-specific README generation
- ✅ Users don't encounter previously-fixed initialization issues
- ✅ Workflow updates are transparent in git history
- ✅ Process handles edge cases gracefully (template unavailable, etc.)
- ✅ Template documentation is cleaned up after initialization
Actual Implementation
The bootstrap problem was solved using the Local Actions Pattern instead of the workflow self-update approach described above:
Solution: Extracted Local Actions
Critical initialization logic was extracted to .github/local-actions/merge-with-theirs-resolution/:
# .github/local-actions/merge-with-theirs-resolution/action.sh
git merge "$SOURCE_BRANCH" --allow-unrelated-histories --no-ff -X theirs -m "$COMMIT_MESSAGE"
Why This Works
- Always Available: Local actions are part of the template repository and copied during initialization
- No Bootstrap Problem: Since the action exists in the initial commit, fixes are automatically available
- Cleaner Architecture: Reusable action provides better separation than embedded workflow code
- No External Dependencies: Self-contained within the repository
Usage in init-complete.yml
- name: Complete initialization
  uses: ./.github/local-actions/merge-with-theirs-resolution
  with:
    source_branch: fork_integration
    target_branch: main
    commit_message: "chore: complete repository initialization"
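How the action script consumes these inputs is not shown in this ADR. As a rough sketch, assuming the action exports them as the environment variables SOURCE_BRANCH, TARGET_BRANCH and COMMIT_MESSAGE, a wrapper around the merge command might validate them before merging. The function name and the validation steps are illustrative; only the merge command itself is taken from action.sh above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical wrapper around the merge command from action.sh.
merge_with_theirs_resolution() {
  : "${SOURCE_BRANCH:?SOURCE_BRANCH is required}"
  : "${TARGET_BRANCH:?TARGET_BRANCH is required}"
  : "${COMMIT_MESSAGE:?COMMIT_MESSAGE is required}"
  # Fail early with a clear message if the source branch does not exist.
  if ! git rev-parse --verify --quiet "refs/heads/$SOURCE_BRANCH" >/dev/null; then
    echo "source branch not found: $SOURCE_BRANCH" >&2
    return 1
  fi
  git checkout -q "$TARGET_BRANCH"
  git merge "$SOURCE_BRANCH" --allow-unrelated-histories --no-ff -X theirs -q \
    -m "$COMMIT_MESSAGE"
}

# Scratch repository with two unrelated histories that both carry README.md.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "demo"
git config user.email "demo@example.com"
echo "template docs" > README.md
git add README.md
git commit -q -m "template: initial commit"
git checkout -q --orphan fork_integration
git rm -rf -q .
echo "upstream docs" > README.md
git add README.md
git commit -q -m "upstream: initial commit"

# Same values as the workflow step above.
export SOURCE_BRANCH="fork_integration"
export TARGET_BRANCH="main"
export COMMIT_MESSAGE="chore: complete repository initialization"

# A missing source branch is rejected before any merge is attempted.
if (SOURCE_BRANCH="missing_branch"; merge_with_theirs_resolution) 2>/dev/null; then
  missing_check="accepted"
else
  missing_check="rejected"
fi

merge_with_theirs_resolution
last_subject="$(git log -1 --format=%s)"
readme_after="$(cat README.md)"
echo "missing branch: $missing_check"
echo "merge commit: $last_subject"
```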
Benefits Over Self-Update Approach
- ✅ Simpler: No two-phase initialization complexity
- ✅ Reliable: No dependency on template repository accessibility
- ✅ Clean History: No bootstrap commits
- ✅ Testable: Local actions can be tested independently
- ✅ Maintainable: Single source of truth in .github/local-actions/
Related Decisions
- ADR-028: Workflow Script Extraction Pattern - Documents the local actions pattern used
- ADR-012: Template Update Propagation Strategy - How template improvements reach existing forks
- ADR-006: Two-Workflow Initialization Pattern - This ADR builds upon the two-workflow pattern
- ADR-003: Template Repository Pattern - Aligns with template-based architecture