Brigade Design

This is a living document, and is kept up to date with the current state of Brigade. It is a high-level explanation of the Brigade design.

Brigade is an in-cluster runtime environment. It interprets scripts and executes them, often by invoking resources inside the cluster. In short, Brigade provides event-based scripting of Kubernetes pipelines.

Event-based scripting of pipelines

Terminology

Brigade Run

The Developer’s View

From the developer’s view, Brigade works like this:

A project describes the context in which a Brigade script will run. It may define the following:

One or more scripts may be executed within the context of a project. Brigade assumes that a default script will reside in the project’s VCS repository at the relative path ./brigade.js. Gateways may provide other ways of sending scripts into Brigade.

A Brigade script should have at least one event handler defined. Event handlers are triggered when a gateway emits an event. Events are bound to projects, so an event will only be triggered for the explicitly declared project. (In other words, there are no global events, only project-bound events.)

An event specifies the following things:

(The list above is not exhaustive.)

When Brigade receives an event, it loads the referenced project, then starts a new worker. The worker executes the Brigade script, using as its entry point the event handler for the triggered event (e.g. events.on('pull_request')). The worker processes the script until one of the following occurs:

The fundamental units of a script are event handlers, jobs, and tasks.

An event handler associates a named event with a function that can process the event:

events.on('event name', () => { /* handler */ })

An event handler is explicitly given two pieces of information: the event record and the project record.
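The handler signature can be illustrated without a cluster. The sketch below uses a small stand-in registry rather than Brigade's own library: the `handlers` map, the `fire` function, and the fields on the event and project objects are hypothetical names chosen for illustration only.

```javascript
// Stand-in for Brigade's event registry (hypothetical, for illustration only).
const handlers = new Map();

const events = {
  // Associate a named event with a handler function.
  on(name, handler) {
    handlers.set(name, handler);
  },
};

// Simulates what a worker does: look up the handler for the triggered event
// and call it with the event record and the project record.
function fire(e, project) {
  const handler = handlers.get(e.type);
  if (!handler) {
    throw new Error(`no handler for event "${e.type}"`);
  }
  return handler(e, project);
}

// The handler receives both records as arguments.
events.on("push", (e, project) => {
  return `handling ${e.type} for ${project.name}`;
});

const message = fire({ type: "push" }, { name: "example/project" });
console.log(message);
```

Because events are project-bound, the sketch always passes a project record alongside the event. There is no way to fire a handler without one.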

A typical event handler declares and runs one or more jobs. A job is a discrete unit of work that is associated with a container image.

const myJob = new Job("job-name", "image:tag")

When a job is executed, the container image is pulled from an origin (such as Docker Hub) and run in the cluster. A job specifies the configuration and input for that container, and the container’s output is returned from the job.

A job may declare zero or more tasks. A task is an individual step executed inside of a container. For example, if a container is just a simple Linux container with a shell, multiple shell commands can be run as tasks:

myJob.tasks = [
  "echo hello",
  "echo world"
]
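One way to picture how a task list becomes work inside a container is to join the tasks into a single shell script. The sketch below assumes that approach. `renderTasks` is a hypothetical helper, and the `set -e` line (stop at the first failing task) is an assumption rather than documented Brigade behavior.

```javascript
// Hypothetical helper: turn a job's task list into one shell script.
// Assumption: tasks run in order and the script stops at the first failure.
function renderTasks(tasks, shell = "/bin/sh") {
  const lines = [`#!${shell}`, "set -e", ...tasks];
  return lines.join("\n") + "\n";
}

const script = renderTasks(["echo hello", "echo world"]);
console.log(script);
```

Keeping tasks as plain strings means any container image with a shell can serve as a job image.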

In addition to jobs, scripts may declare groups. Groups are organizing units that execute multiple jobs according to predefined patterns (e.g. all in parallel, or each in series).
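The two patterns mentioned above can be sketched with plain Promises. This is a minimal illustration, not Brigade's implementation: `fakeJob` is a hypothetical stand-in for a real job, and the function names `runEach` and `runAll` are chosen for this sketch.

```javascript
// Stand-in job: run() logs when it starts and ends, then resolves with its name.
// (Hypothetical; a real job would run a container in the cluster.)
function fakeJob(name, log) {
  return {
    name,
    run() {
      log.push(`start ${name}`);
      return new Promise((resolve) => {
        setImmediate(() => {
          log.push(`end ${name}`);
          resolve(name);
        });
      });
    },
  };
}

// Serial pattern: each job starts only after the previous one finishes.
async function runEach(jobs) {
  const results = [];
  for (const job of jobs) {
    results.push(await job.run());
  }
  return results;
}

// Parallel pattern: start every job at once, then wait for all of them.
function runAll(jobs) {
  return Promise.all(jobs.map((job) => job.run()));
}

async function demo() {
  const serialLog = [];
  await runEach([fakeJob("a", serialLog), fakeJob("b", serialLog)]);

  const parallelLog = [];
  await runAll([fakeJob("a", parallelLog), fakeJob("b", parallelLog)]);

  return { serialLog, parallelLog };
}

demo().then(({ serialLog, parallelLog }) => {
  console.log("serial:  ", serialLog.join(", "));
  console.log("parallel:", parallelLog.join(", "));
});
```

In the serial log every job ends before the next one starts. In the parallel log all jobs start before any of them ends.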

When a script is executed, each job runs as its own cluster resource (a Pod). Various storage configurations may provide shared space between jobs in a build, or between multiple instances of the same job in different builds.

The Operator’s View

Operationally speaking, Brigade is a tool for chaining together Kubernetes pods in order to accomplish high-level goals. It is analogous to the way UNIX shell scripts work.

A UNIX shell script defines the workflow around executing one or more lower-level system executables. Similarly, a Brigade script defines a workflow for executing multiple containers within a cluster.

Brigade has several functional concepts.

Design Overview

A Gateway is a workload, typically a Kubernetes Deployment fronted by a Service or Ingress, that transforms a trigger (such as an inbound webhook or an item on a queue) into a Brigade event. The default brigade-gw gateway provides HTTP(S) endpoints that GitHub webhooks can target.

Figure: Service, Trigger, Gateway, Event

The illustration above shows how GitHub translates a Git event into a webhook, which the Brigade Gateway translates into an event to be consumed by the Brigade controller.
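The translation step a gateway performs can be sketched as a pure function. This is an illustration under stated assumptions: `webhookToEvent` and the event field names (`type`, `provider`, `project`, `revision`, `payload`) are hypothetical. The `ref` and `after` fields read from the payload are part of GitHub's push webhook payload.

```javascript
// Hypothetical gateway step: turn an inbound GitHub push webhook into an
// event object. The event field names below are assumptions for this sketch.
function webhookToEvent(eventName, payload, projectID) {
  return {
    type: eventName, // e.g. the value of the X-GitHub-Event header
    provider: "github",
    project: projectID, // events are always bound to one project
    revision: { ref: payload.ref, commit: payload.after },
    payload: JSON.stringify(payload), // keep the raw trigger for the script
  };
}

// Trimmed-down example of a push payload.
const pushPayload = {
  ref: "refs/heads/main",
  after: "0123456789abcdef0123456789abcdef01234567",
  repository: { full_name: "example/project" },
};

const event = webhookToEvent("push", pushPayload, "brigade-example");
console.log(event.type, event.revision.ref);
```

A gateway for another trigger source, such as a message queue, would differ only in how it reads the inbound data. The event it emits keeps the same shape.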

In Brigade, all scripts are executed in the context of a project. Projects are represented as Kubernetes Secrets.

The Controller is a Kubernetes controller that listens for Brigade event objects, and handles these objects by starting workers.

Brigade events are currently specified as Kubernetes Secrets with particular labels. We use Secrets because, at the time of development, Third Party Resources were deprecated and Custom Resource Definitions were not yet final. This aspect of the system may change between the 0.1.0 release of Brigade and the 1.0.0 release.
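To make the idea of an event stored as a labeled Secret concrete, the sketch below builds such an object and shows how a controller might recognize it. The label names, data keys, and the `eventToSecret` and `isBrigadeEvent` helpers are assumptions for illustration, not Brigade's actual schema. The one Kubernetes rule it relies on is that values under a Secret's `data` field are base64-encoded.

```javascript
// Sketch of an event stored as a Kubernetes Secret.
// Label names and data keys are assumptions for illustration.
function eventToSecret(buildID, event) {
  // Values under a Secret's `data` field must be base64-encoded strings.
  const b64 = (s) => Buffer.from(String(s), "utf8").toString("base64");
  return {
    apiVersion: "v1",
    kind: "Secret",
    metadata: {
      name: `brigade-event-${buildID}`,
      labels: {
        heritage: "brigade",
        component: "build",
        project: event.project,
      },
    },
    type: "Opaque",
    data: {
      event_type: b64(event.type),
      payload: b64(event.payload),
    },
  };
}

// A controller could select event objects by kind and label.
function isBrigadeEvent(obj) {
  const labels = (obj.metadata && obj.metadata.labels) || {};
  return (
    obj.kind === "Secret" &&
    labels.heritage === "brigade" &&
    labels.component === "build"
  );
}

const secret = eventToSecret("01abc", {
  type: "push",
  project: "brigade-example",
  payload: '{"ref":"refs/heads/main"}',
});
console.log(secret.metadata.name, isBrigadeEvent(secret));
```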

Brigade Workers are pods that execute Brigade scripts. Each worker handles exactly one Brigade script. Workers are never pooled. A worker runs to completion, to failure, or to timeout. Prior to the 1.0.0 release of Brigade, the simple pods may be replaced by Kubernetes Job objects.
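The three terminal states (completion, failure, timeout) can be modeled with a Promise race. This is a generic sketch of the pattern, not the worker's actual code: `runWithTimeout`, its `timeoutMs` parameter, and the `status` strings are hypothetical.

```javascript
// Sketch: a unit of work ends in exactly one of three states.
// `work` is any function returning a value or a Promise.
function runWithTimeout(work, timeoutMs) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve({ status: "timeout" }), timeoutMs);
  });
  const attempt = Promise.resolve()
    .then(work)
    .then(
      (result) => ({ status: "complete", result }),
      (error) => ({ status: "failed", error: error.message })
    );
  // Whichever settles first decides the outcome; always clear the timer.
  return Promise.race([attempt, timeout]).finally(() => clearTimeout(timer));
}

runWithTimeout(() => "done", 1000).then((outcome) => {
  console.log(outcome.status);
});
```

Failures are mapped to a resolved outcome rather than a rejection, so a caller can treat all three states uniformly.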

Brigade workers handle an event by starting a build, where a build executes a script. A build will create a PVC for shared storage (job-to-job shared filesystem), and will create one or more pods (one per job). The worker will attempt to destroy all destroyable resources once the build has completed. Note that jobs are left in the Complete state (not deleted) so that their logs may be accessed. Cache PVCs are left unattached, and prepared for re-use.

A Brigade Job is started by a worker, and is executed as a pod. A job is run to completion, to error, or to timeout. Its status and results are made available to the calling worker, which in turn provides access to the script.

Alongside the execution of an event-build pipeline, Brigade provides an API server that exposes information about current and past builds, projects, and jobs. The API server is typically fronted by a Service or Ingress.

Reasoning for Certain Design Decisions

History of Brigade

Brigade was designed in March 2016 by the Deis Helm Team (now part of Microsoft).

The first design used Lua instead of JavaScript, and relied on very few Kubernetes resources. Instead, it used a Redis queue for message passing and key/value storage. Other than some proof-of-concept work, the Lua engine never materialized. JavaScript’s popularity made it a better choice.

An original Kubernetes-oriented JavaScript engine was developed several months later. This was intended to be both a stand-alone component and a foundational piece for Brigade. Work was abandoned in favor of the Node.js worker pattern.

In April of 2017, Brigade was designated as the third part of the Helm/Draft/Brigade ecosystem. At this point, it was renamed “Acid” (Acme Continuous Integration & Deployment).

Brigade reached a stability point in September 2017, and was renamed from Acid back to Brigade. Brigade was released publicly under the MIT license in October 2017, as release 0.1.0.