Continuous Integration and Deployment to multiple environments with Terraform Cloud and GitHub Actions
Summary
In this post I cover what CI/CD means and how you can achieve it across multiple different environments (e.g. dev, stage, prod) using the GitHub Actions platform. The infrastructure is managed using Terraform and the state is coordinated using Terraform Cloud.
I detail specific issues I experienced while implementing CI/CD for a real high-scale/high-profile public API, and the workarounds I needed to implement to resolve them.
What is CI/CD?
Let’s start off by answering the question:
What is CI/CD?
CI is short for Continuous Integration.
CD refers to two separate concepts:
- Continuous Delivery
- Continuous Deployment
People often make the mistake of thinking they are interchangeable.
They’re not. But what these terms mean is actually not that complicated:
- Continuous Integration: you’re testing every change (i.e. commit)
- Continuous Delivery: you’re able to ship every change on demand
- Continuous Deployment: you’re shipping every change automatically
Now we have a bit of an understanding of what CI/CD is,
let’s see how I’ve been able to use each of these individual concepts.
API Gateway
I needed to implement a complete CI/CD pipeline for my employer.
They had an API Gateway service which required a mixture of continuous integration, delivery and deployment across three separate ‘environments’ (i.e. a dev environment, a stage environment, and a production environment).
Here is the summary of their usage per environment:
- Dev: Continuous Integration/Deployment
- Stage: Continuous Integration/Deployment
- Production: Continuous Delivery
Let’s dig into each environment to understand how they use CI/CD.
Dev Environment
The development environment is spun-up automatically when pushing to a PR.
It is automatically spun-down when the associated PR is merged.
NOTE: We control which files will trigger the dev environment creation.
The service name is api-gateway-dev-<BRANCH>
.
The service domain is api-gateway-dev-<BRANCH>.edgecompute.app
.
The <BRANCH>
is calculated from github.head_ref.
The Terraform state that manages this environment is stored in a Terraform
Cloud (TFC) project under a dynamically created, and managed,
workspace called api-gateway-dev-<BRANCH>
.
The dev CI workflow has some important steps it must take in order to spin-up a development environment that is isolated between different PR authors:
- Call the TFC API to create a new workspace in the TFC project.
- Cache the result of the workspace creation to avoid API rate limits.
- Also create a workspace environment variable for
FASTLY_API_KEY
.
The last step is required as we’re deploying the service to Fastly.
NOTE: You’ll notice shortly that our stage and production flows use a single service and workspace, where as our dev environment is a bit more involved and requires its own dynamically created workspace and services. This is because we have multiple developers working on multiple features at any one time. So we use the ‘branch’ name as the unique identifier, and by having a different workspace per branch it means a developer can modify the Terraform infrastructure without affecting anyone else.
Stage Environment
There are two scenarios where the staging environment is deployed to:
- Whenever a PR is merged
- Whenever a commit is pushed directly to the
main
branch
NOTE: Again, we control which files will trigger the environment creation.
The service name is api-gateway-stg
.
The service domain is api-gateway-stg.edgecompute.app
.
Once the deployment is complete, the Compute package (i.e. the application code)
is uploaded as an artifact to GitHub and is deleted after 14 days.
The name of the artifact is the hash of the Compute package.
A git tag is pushed for each stage deployment, and the hash is appended to it.
During that 14 day time frame, the artifact can be promoted to production.
The Terraform state that manages this environment is stored in the Terraform Cloud project under the “api-gateway-stage” workspace.
Production Environment
The production environment is deployed to manually (i.e. Continuous Delivery) by triggering the relevant GitHub Actions workflow. The person triggering the deployment will first need to identify the Compute package ‘hash’ (which is appended to each staging tag).
The reason for ‘promoting’ the staging Compute package to production,
rather than recompiling the application code, is to ensure consistency
between the staging and production environments.
We require the person triggering the deployment to manually locate the relevant
Compute package hash. We can’t rely on automation
(e.g git tag --sort=-creatordate
) because at the moment of checking the git
tags, another
person might have merged another PR and thus we would pick up that new staging
tag and deploy the wrong artifact to the production environment.
The reason that is a concern, is because a newly merged PR should have some time
on the staging environment for any unexpected bugs to be identified. At some
point in the future we might have complete confidence in our test suite to move
from Continuous Delivery to Continuous Deployment, but that time isn’t now.
The service name is api-gateway-prd
.
The service domain is api-gateway-prd.edgecompute.app
.
The Terraform state that manages this environment is stored in the Terraform Cloud project under the “api-gateway-production” workspace.
Implementation
So here is where I’m going to display the complete set of GitHub Actions workflow configuration. We’ll review the code and I’ll make comments about specific sections of interest.
Let’s start with dev, then we’ll see stage, and then production.
Dev Config
# We deploy to dev environment when PR is pushed to.
# We name the environment after the branch to avoid collisions.
# When PR is merged we automatically tear-down the dev environment.
name: "API Gateway TF Dev"
# Stop any in-flight CI jobs when a new commit is pushed.
concurrency:
group: api-gateway-dev-${{ github.ref_name }}
cancel-in-progress: true
on:
workflow_dispatch:
pull_request:
types: [opened, edited, closed, reopened, synchronize]
paths:
- .github/workflows/api-gateway-*
- cmd/gateway/**
- internal/gateway/**
- infrastructure/fastly/compute/api-gateway/**
# IMPORTANT: for TF_VAR_* wrap double-quotes in single-quotes.
env:
APP_DIRECTORY: "./cmd/gateway"
CONFIG_DIRECTORY: "./infrastructure/fastly/compute/api-gateway"
PACKAGE_PATH: "/pkg/gateway.tar.gz"
TF_API_TOKEN: "${{ secrets.TF_API_TOKEN }}"
TF_CLOUD_ORGANIZATION: "example"
TF_VAR_backend: '"https://api-dev.example.com"'
TF_VAR_env: '"dev"'
jobs:
create-workspace:
name: "Create Terraform Cloud Workspace"
# The following if statement says: don't run this when the PR is closed
if: ${{ github.event.action != 'closed' && github.actor != 'dependabot[bot]' }}
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Debug Information
uses: ./.github/actions/debug
- name: Cache/Restore Workspace ID
id: cache-workspace-id
uses: actions/cache@v4
with:
path: tfc-workspace-id
key: ${{ runner.os }}-cache-workspace-id-${{github.head_ref}}
# IMPORTANT: We don't want to run the following steps if we get a cache-hit.
# That is because a cache-hit indiciates we've already created the workspace.
# This means we're forced to add a `if:` check on each following step.
- name: Set TF_VAR_branch
if: steps.cache-workspace-id.outputs.cache-hit != 'true'
uses: ./.github/actions/tf-var-branch
- name: Set WORKSPACE_NAME
if: steps.cache-workspace-id.outputs.cache-hit != 'true'
uses: ./.github/actions/tfc-dev-workspace-name
- name: Modify Workspace JSON Payload
if: steps.cache-workspace-id.outputs.cache-hit != 'true'
run: |
payload=./infrastructure/fastly/compute/api-gateway/tfc-api/payloads/workspace.json
jq '.data.attributes.name = "${{ env.WORKSPACE_NAME }}"' \
"$payload" > temp.json && mv temp.json "$payload"
- name: Modify Workspace Variable JSON Payload
if: steps.cache-workspace-id.outputs.cache-hit != 'true'
run: |
payload=./infrastructure/fastly/compute/api-gateway/tfc-api/payloads/workspace-variables.json
jq '.data.attributes.value = "${{ secrets.FASTLY_API_KEY }}"' \
"$payload" > temp.json && mv temp.json "$payload"
# NOTE: A cache is volatile/ephemeral.
# If the cache is invalidated/cleared, then the following step could still
# be run. This would mean the curl request to create the workspace would fail
# (because it could already exist). This is OK, because we check the API
# response for errors and the last step reacts to the failure status. The
# intention is to avoid trying to read from a response.json that doesn't contain
# the required structure (i.e. doesn't contain a .data.id field). As well as
# avoid trying to call the TFC API to create a workspace variable when there was
# an error initially creating the workspace.
- name: Create Dev Workspace
if: steps.cache-workspace-id.outputs.cache-hit != 'true'
run: |
curl -s \
--header "Authorization: Bearer ${{ secrets.TF_API_TOKEN }}" \
--header "Content-Type: application/vnd.api+json" \
--request POST \
--data @infra/fastly/compute/api-gateway/tfc-api/payloads/workspace.json \
https://app.terraform.io/api/v2/organizations/example/workspaces > response.json
if [[ $(jq '.errors | length == 0' response.json) == false ]]; then
jq '.errors' response.json
echo "STEP_STATUS=failed" >> "${GITHUB_ENV}"
fi
- name: Create FASTLY_API_KEY as Workspace Variable
if: ${{ steps.cache-workspace-id.outputs.cache-hit != 'true' && env.STEP_STATUS != 'failed' }}
run: |
workspace_id=$(jq -r '.data.id' response.json)
curl -s \
--header "Authorization: Bearer ${{ secrets.TF_API_TOKEN }}" \
--header "Content-Type: application/vnd.api+json" \
--request POST \
--data @infra/fastly/compute/api-gateway/tfc-api/payloads/variables.json \
"https://app.terraform.io/api/v2/workspaces/$workspace_id/vars"
echo "$workspace_id" > tfc-workspace-id
terraform-plan:
name: "Plan"
runs-on: ubuntu-latest
permissions: # https://docs.github.com/en/actions/using-jobs/assigning-permissions-to-jobs
contents: read
pull-requests: write
needs: create-workspace
# don't run this when the PR is closed (and also only if the create-workspace was successful)
if: ${{ github.event.action != 'closed' && github.actor != 'dependabot[bot]' && success() }}
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Debug Information
uses: ./.github/actions/debug
- name: Install Go and Fastly CLI
uses: ./.github/actions/fastly-cli
- name: Build Compute Package
run: fastly compute build --dir ${{ env.APP_DIRECTORY }} --metadata-show
- name: Calculate Compute Package Hash
run: fastly compute hash-files --skip-build --dir ${{ env.APP_DIRECTORY }}
- name: Copy Compute Package to Terraform workspace
run: cp ${{ env.APP_DIRECTORY }}${{ env.PACKAGE_PATH }} ${{ env.CONFIG_DIRECTORY }}
- name: Set TF_VAR_branch
uses: ./.github/actions/tf-var-branch
- name: Set WORKSPACE_NAME
uses: ./.github/actions/tfc-dev-workspace-name
# NOTE: This job (and all jobs that follow) requires a TFC Workspace.
# The name of the workspace is expected to match WORKSPACE_NAME.
# The creation of the workspace happens in the `create-workspace` job.
#
# But there are scenarios where the workspace might not exist!
# This can happen when there is an error in the `create-workspace` job.
# For example, when calling the TFC API.
#
# Although the API might error, we don't want to mark the job as failed.
# This is because the API could be called and fail for multiple reasons.
# e.g. the cached API result wasn't found and we got a 422 from the API.
#
# A 422 code response from the TFC API thus indicates we tried to create
# a workspace that already exists. So in this scenario we still want the
# remaining jobs to run (as the workspace does exist).
#
# Ultimately, if the error was something else and the workspace thus
# wasn't created, then at worst we'll get an error back from TFC when
# trying to upload the TF configuration.
- name: Upload Configuration
uses: hashicorp/tfc-workflows-github/actions/upload-configuration@v1.2.0
id: plan-upload
with:
workspace: ${{ env.WORKSPACE_NAME }}
directory: ${{ env.CONFIG_DIRECTORY }}
speculative: true
- name: Create Plan Run
uses: hashicorp/tfc-workflows-github/actions/create-run@v1.2.0
id: plan-run
with:
workspace: ${{ env.WORKSPACE_NAME }}
configuration_version: ${{ steps.plan-upload.outputs.configuration_version_id }}
plan_only: true
- name: Get Plan Output
uses: hashicorp/tfc-workflows-github/actions/plan-output@v1.2.0
id: plan-output
with:
plan: ${{ fromJSON(steps.plan-run.outputs.payload).data.relationships.plan.data.id }}
- name: Update PR
uses: actions/github-script@v7
id: plan-comment
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
// Retrieve existing bot comments for the PR
const { data: comments } = await github.rest.issues.listComments({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
});
const botComment = comments.find(comment => {
return comment.user.type === 'Bot' && comment.body.includes('TF Plan Output')
});
const output = `#### Terraform Cloud Plan Output
\`\`\`
Plan: ${{ steps.plan-output.outputs.add }} to add,
${{ steps.plan-output.outputs.change }} to change, \
${{ steps.plan-output.outputs.destroy }} to destroy.
\`\`\`
[Terraform Cloud Plan](${{ steps.plan-run.outputs.run_link }})
`;
if (botComment) {
github.rest.issues.deleteComment({
owner: context.repo.owner,
repo: context.repo.repo,
comment_id: botComment.id,
});
}
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: output
});
terraform-apply:
name: "Apply"
needs: terraform-plan
runs-on: ubuntu-latest
permissions:
contents: read # for Terraform Actions
statuses: write # for Status Action (myrotvorets/set-commit-status-action)
# don't run this when the PR is closed (and also only if the terraform-plan was successful)
if: ${{ github.event.action != 'closed' && github.actor != 'dependabot[bot]' && success() }}
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Debug Information
uses: ./.github/actions/debug
- name: Install Go and Fastly CLI
uses: ./.github/actions/fastly-cli
- name: Build Compute Package
run: fastly compute build --dir ${{ env.APP_DIRECTORY }} --metadata-show
- name: Calculate Compute Package Hash
run: fastly compute hash-files --skip-build --dir ${{ env.APP_DIRECTORY }}
- name: Copy Compute Package to Terraform workspace
run: cp ${{ env.APP_DIRECTORY }}${{ env.PACKAGE_PATH }} ${{ env.CONFIG_DIRECTORY }}
- name: Set TF_VAR_branch
uses: ./.github/actions/tf-var-branch
- name: Set WORKSPACE_NAME
uses: ./.github/actions/tfc-dev-workspace-name
- name: Terraform Apply
id: tf-apply
uses: ./.github/actions/tf-apply
with:
workspace: ${{ env.WORKSPACE_NAME }}
- name: "Terraform Apply Run URL"
uses: myrotvorets/set-commit-status-action@master
if: success()
with:
token: ${{ secrets.GITHUB_TOKEN }}
status: "success"
description: "click 'Details' to view"
context: Terraform Apply Results
targetUrl: "https://app.terraform.io/app/example/workspaces/\
api-gateway/runs/${{ steps.tf-apply.outputs.run_id }}"
# Remove any dev services once the corresponding PR has been merged.
# https://docs.github.com/en/actions/using-workflows/\
# events-that-trigger-workflows#running-your-pull_request-workflow-when-a-pull-request-merges
terraform-destroy:
name: "Cleanup Dev Environment"
runs-on: ubuntu-latest
if: ${{ github.event.pull_request.merged == true && github.actor != 'dependabot[bot]' }}
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Debug Information
uses: ./.github/actions/debug
# NOTE: We still need to install the CLI and compile the Compute package.
# This is because we need to do a two-step apply (see IMPORTANT below).
- name: Install Go and Fastly CLI
uses: ./.github/actions/fastly-cli
- name: Build Compute Package
run: fastly compute build --dir ${{ env.APP_DIRECTORY }} --metadata-show
- name: Calculate Compute Package Hash
run: fastly compute hash-files --skip-build --dir ${{ env.APP_DIRECTORY }}
- name: Copy Compute Package to Terraform workspace
run: cp ${{ env.APP_DIRECTORY }}${{ env.PACKAGE_PATH }} ${{ env.CONFIG_DIRECTORY }}
- name: Set TF_VAR_branch
uses: ./.github/actions/tf-var-branch
- name: Set WORKSPACE_NAME
uses: ./.github/actions/tfc-dev-workspace-name
# IMPORTANT: Deleting a service with a resource-link is a two-step apply.
# We can't pass a -target to the Terraform GH Action as a prerequisite step.
# So this means we'll need to clear any `resource_link` blocks from the config.
# Then, after we've applied the change, we'll need to run a TF destroy.
- name: Remove resource_link blocks from TF config
run: |
while IFS= read -r line; do
if [[ "$line" =~ "resource_link {" || "$clear_line" == true ]]; then
if [[ "$line" =~ "}" ]]; then
clear_line=false
else
clear_line=true
fi
echo ""
else
echo "$line"
fi
done < ${{ env.CONFIG_DIRECTORY }}/main.tf > ./main.tf && \
mv ./main.tf ${{ env.CONFIG_DIRECTORY }}/main.tf
- name: Terraform Apply
uses: ./.github/actions/tf-apply
with:
workspace: ${{ env.WORKSPACE_NAME }}
# NOTE: Destroy config once Resource Links have been removed.
- name: Terraform Apply
if: success()
uses: ./.github/actions/tf-apply
with:
workspace: ${{ env.WORKSPACE_NAME }}
destroy: true
# NOTE: Once all Fastly resources are deleted, we delete the TFC workspace.
- name: Delete Dev Workspace
run: |
curl -s \
--header "Authorization: Bearer ${{ secrets.TF_API_TOKEN }}" \
--header "Content-Type: application/vnd.api+json" \
--request POST \
https://app.terraform.io/api/v2/organizations/example/workspaces/\
${{ env.WORKSPACE_NAME }}/actions/safe-delete > response.json
if [[ $(jq '.errors | length == 0' response.json) == false ]]; then
jq '.errors' response.json
exit 1
fi
OK, the first thing I tripped up on was…
# Stop any in-flight CI jobs when a new commit is pushed.
concurrency:
group: api-gateway-dev-${{ github.ref_name }}
cancel-in-progress: true
I’m used to naming the group
with ${{ github.ref_name }}
. This has
historically always worked fine for me, until I started developing a more
complex CI/CD pipeline where there are lots of different events causing a
workflow file to be triggered …and more importantly, causing a workflow file
to be STOPPED!
Specifically, I ran into an issue where merging a PR should cause the dev environment to be spun-down, but I had a stage workflow also running and that caused by dev workflow to be unexpectedly cancelled, meaning I had infrastructure resources orphaned, and it was a pain to have to go back and manually clean them up.
So to resolve that I had to make sure to prefix each group
with a unique name.
The next thing of interest is…
on:
workflow_dispatch:
pull_request:
types: [opened, edited, closed, reopened, synchronize]
paths:
- .github/workflows/api-gateway-*
- cmd/gateway/**
- internal/gateway/**
- infrastructure/fastly/compute/api-gateway/**
It took me a while to figure out the types
to use. I have a very specific set
of events I want to react to and I discovered that edited
on a pull_request
doesn’t mean “code pushed to the PR”. Nope, for that you need synchronize
.
Also, I wanted to avoid having workflow files triggered unnecessarily, hence I restrict this particular workflow file to changes made to relevant API gateway files. Now, this isn’t perfect because the application code could change in the future to depend on other packages in the codebase (as it’s a mono-repo) and that would mean having to update this list (which as you can imagine is likely to get forgotten about).
The next thing of interest is…
# IMPORTANT: for TF_VAR_* wrap double-quotes in single-quotes.
env:
APP_DIRECTORY: "./cmd/gateway"
CONFIG_DIRECTORY: "./infrastructure/fastly/compute/api-gateway"
PACKAGE_PATH: "/pkg/gateway.tar.gz"
TF_API_TOKEN: "${{ secrets.TF_API_TOKEN }}"
TF_CLOUD_ORGANIZATION: "example"
TF_VAR_backend: '"https://api-dev.example.com"'
TF_VAR_env: '"dev"'
When passing a value to Terraform via an environment variable, you need the
value itself to be a string, but doing export FOO=bar
doesn’t assign "bar"
to the variable, it only assigns bar
without the quotes and that causes
Terraform to treat the value like an undefined variable. So it’s very important
to wrap the string in single quotes.
The next thing of interest is…
# don't run this when the PR is closed
if: ${{ github.event.action != 'closed' && github.actor != 'dependabot[bot]' }}
Specifically, the bit I want to call out is the check for dependabot[bot]
.
This is because when this automated-bot opens a PR on your repo, it won’t have
access to your secrets and so it won’t be able to make TFC API calls (and do
other things) so you might as well skip the CI/CD pipeline workflow.
The next difficulty I ran into was figuring out how best to avoid API rate limits from the Terraform Cloud API. This particular set of workflow files would be run many multiple times a day, and using the TFC API each time would quickly cause us to hit a rate-limit threshold.
To avoid that situation I needed to figure out clever caching patterns. But then, I needed to also account for the fact that a cache is volatile/ephemeral and can cause unexpected error scenarios.
If you take a look at the create-workspace
job, you’ll see that once I
successfully create a new workspace, I’ll then call the API to create an
environment variable that contains a required secret value (for interacting with
the Fastly API) and when that is successful I create a file called
tfc-workspace-id
and it’s that file that I cache and restore on each job run.
When the workflow calls the terraform-plan
job you’ll see I’ve added a comment
about handling error scenarios. Originally, I would depend on the
create-workspace
job and only run terraform-plan
if the other job had
completed successfully. But that didn’t always work and ties back to my comment
about a cache being volatile/ephemeral.
To explain: the terraform-plan
job (and all jobs that follow it) requires a
TFC Workspace. The name of the workspace is expected to match WORKSPACE_NAME
.
The creation of the workspace happens in the create-workspace
job.
Part of the create-workspace
is making API requests, and to make that easier I
pass a JSON file in to be used as the data input rather than manually
configuring it in a single curl
call. But first I have to modify the JSON data
to include the values required (as I don’t want them hardcoded into the JSON):
- name: Modify Workspace JSON Payload
if: steps.cache-workspace-id.outputs.cache-hit != 'true'
run: |
payload=./infrastructure/fastly/compute/api-gateway/tfc-api/payloads/workspace.json
jq '.data.attributes.name = "${{ env.WORKSPACE_NAME }}"' "$payload" \
> temp.json && mv temp.json "$payload"
- name: Modify Workspace Variable JSON Payload
if: steps.cache-workspace-id.outputs.cache-hit != 'true'
run: |
payload=./infrastructure/fastly/compute/api-gateway/tfc-api/payloads/workspace-variables.json
jq '.data.attributes.value = "${{ secrets.FASTLY_API_KEY }}"' \
"$payload" > temp.json && mv temp.json "$payload"
Here’s the payload JSON for both API calls:
{
"data": {
"type": "workspaces",
"attributes": {
"name": "api-gateway-dev-"
},
"relationships": {
"project": {
"data": {
"type": "projects",
"id": "prj-<REDACTED>"
}
}
}
}
}
{
"data": {
"type": "vars",
"attributes": {
"key": "FASTLY_API_KEY",
"value": "",
"description": "API key used by the Fastly Terraform Provider",
"category": "env",
"sensitive": true
}
}
}
But there are scenarios where the workspace might not exist! This can happen
when there is an error in the create-workspace
job. For example, when calling
the TFC API. Although the API might error, we don’t want to mark the job as not
being successful. This is because the API could be called and fail for multiple
reasons. e.g. the cached API result wasn’t found and we got a 422 from the API.
A 422 code response from the TFC API thus indicates we tried to create a workspace that already exists. So in this scenario we still want the remaining jobs to run (as the workspace does exist).
Ultimately, if the error was something else and the workspace thus wasn’t created, then at worst we’ll get an error back from TFC when trying to upload the TF configuration to it.
The last issue worth noting is related to tearing-down the infrastructure when the PR is merged. There was an issue with the Fastly Terraform provider (or more specifically a Fastly API issue) where the removal of a Fastly ‘Config Store’ (which was needed by the application code) has to be done as a two-part terraform apply.
i.e. you have to remove the resource_link
block from the
fastly_service_compute
resource and run terraform apply
, and then you can
trigger a terraform apply -destroy
via the HashiCorp GitHub Action using the
is_destroy
field.
Stage Config
# We deploy to staging environment when PR is merged or we push to `main`.
name: "API Gateway TF Stage"
# Stop any in-flight CI jobs when a new commit is pushed.
concurrency:
group: api-gateway-stg-${{ github.ref_name }}
cancel-in-progress: true
on:
workflow_dispatch:
push:
branches:
- main
paths:
- .github/workflows/api-gateway-*
- cmd/gateway/**
- internal/gateway/**
- infrastructure/fastly/compute/api-gateway/**
# IMPORTANT: for TF_VAR_* wrap double-quotes in single-quotes.
env:
APP_DIRECTORY: "./cmd/gateway"
CONFIG_DIRECTORY: "./infrastructure/fastly/compute/api-gateway"
PACKAGE_PATH: "/pkg/gateway.tar.gz"
TF_API_TOKEN: "${{ secrets.TF_API_TOKEN }}"
TF_CLOUD_ORGANIZATION: "example"
TF_VAR_backend: '"https://api-stage.example.com"'
TF_VAR_env: '"stg"'
TF_WORKSPACE: "api-gateway-stage"
jobs:
terraform-apply:
name: "Apply"
runs-on: ubuntu-latest
permissions:
contents: write
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Debug Information
uses: ./.github/actions/debug
- name: Install Go and Fastly CLI
uses: ./.github/actions/fastly-cli
- name: Build Compute Package
run: fastly compute build --dir ${{ env.APP_DIRECTORY }}
- name: Calculate Compute Package Hash
id: cph
run: |
echo "compute_package_hash=\
$(fastly compute hash-files --skip-build --dir ${{ env.APP_DIRECTORY }} --quiet)"\
>> $GITHUB_OUTPUT
- name: Copy Compute Package to Terraform workspace
run: cp ${{ env.APP_DIRECTORY }}${{ env.PACKAGE_PATH }} ${{ env.CONFIG_DIRECTORY }}
- name: Terraform Apply
uses: ./.github/actions/tf-apply
with:
workspace: ${{ env.TF_WORKSPACE }}
- name: Upload file as artifact
uses: actions/upload-artifact@v4
with:
name: ${{ steps.compute-package-hash.outputs.compute_package_hash }}
path: ${{ env.APP_DIRECTORY }}${{ env.PACKAGE_PATH }}
retention-days: 14
overwrite: true
- name: Create Git Tag
run: |
echo "GIT_STAGE_TAG=stg/$(date +'%Y-%m-%d/%H-%M-%SZ')/\
$(echo ${{ github.triggering_actor }} | tr '[:upper:]' '[:lower:]' | \
tr -d ' ')/${{ steps.cph.outputs.compute_package_hash }}" >> "${GITHUB_ENV}"
- name: Check Git Tag
run: echo "$GIT_STAGE_TAG"
- name: Apply Git Tag
uses: actions/github-script@v7
with:
script: |
github.rest.git.createRef({
owner: context.repo.owner,
repo: context.repo.repo,
ref: 'refs/tags/${{ env.GIT_STAGE_TAG }}',
sha: context.sha
})
The stage config isn’t as complex as the dev workflow, but there are some things of interest to point out, such as uploading our Compute package (which is our application code) to GitHub, which is for the purposes of later promoting that same package to production.
We then use the hash of the Compute package as part of a git tag for staging deploys. Again, this hash is required for the promoting of the staging artifact to the production environment.
Production Config
# We deploy to production environment when workflow is triggered manually.
# Person triggering workflow must provide hash of Compute package to deploy.
name: "API Gateway TF Prod"
# Stop any in-flight CI jobs when a new commit is pushed.
concurrency:
group: api-gateway-prd-${{ github.ref_name }}
cancel-in-progress: true
on:
workflow_dispatch:
inputs:
compute_package_hash:
type: string
required: true
description: hash of compute package (see `git tag`)
# IMPORTANT: for TF_VAR_* wrap double-quotes in single-quotes.
env:
APP_DIRECTORY: "./cmd/gateway"
CONFIG_DIRECTORY: "./infrastructure/fastly/compute/api-gateway"
GH_TOKEN: ${{ github.token }} # Required to use the `gh` CLI
PACKAGE_PATH: "/pkg/gateway.tar.gz"
TF_API_TOKEN: "${{ secrets.TF_API_TOKEN }}"
TF_CLOUD_ORGANIZATION: "example"
TF_VAR_backend: '"https://api.example.com"'
TF_VAR_env: '"prd"'
TF_WORKSPACE: "api-gateway-production"
jobs:
terraform-apply:
name: "Apply"
runs-on: ubuntu-latest
permissions:
actions: read
contents: write
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Debug Information
uses: ./.github/actions/debug
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: go.mod
- name: Download Compute Package to Terraform workspace
run: |
gh run download -n ${{ inputs.compute_package_hash }}
mv gateway.tar.gz ${{ env.CONFIG_DIRECTORY }}
- name: Terraform Apply
uses: ./.github/actions/tf-apply
with:
workspace: ${{ env.TF_WORKSPACE }}
- name: Create Git Tag
run: |
echo "GIT_PROD_TAG=prd/$(date +'%Y-%m-%d/%H-%M-%SZ')/\
$(echo ${{ github.triggering_actor }} | tr '[:upper:]' '[:lower:]' |\
tr -d ' ')/${{ inputs.compute_package_hash }}" >> "${GITHUB_ENV}"
- name: Check Git Tag
run: echo "$GIT_PROD_TAG"
- name: Apply Git Tag
uses: actions/github-script@v7
with:
script: |
github.rest.git.createRef({
owner: context.repo.owner,
repo: context.repo.repo,
ref: 'refs/tags/${{ env.GIT_PROD_TAG }}',
sha: context.sha
})
Again, this workflow isn’t as complex as the dev one, so the only interesting part here is that we’re downloading the artifact from GitHub and then running a Terraform apply to upload the Compute package to our production environment.
Custom Actions
The only other files probably worth showing you are the
.github/actions/tf-var-branch
, .github/actions/tfc-dev-workspace-name
and
.github/actions/tf-apply
steps which I moved into separate action files to
make them easier to reuse.
Here is tf-var-branch
:
name: "Terraform Variable Branch Calculation"
description: "Generates a value for TF_VAR_branch"
runs:
using: composite
steps:
- name: Set TF_VAR_branch
shell: bash
run: |
echo "TF_VAR_branch=\"-$(echo ${{ github.head_ref }} |\
tr '[:upper:]' '[:lower:]' | tr '/' '-' | tr '.' '-' |\
tr '_' '-')\"" >> "${GITHUB_ENV}"
# NOTE: We only set the branch variable for dev (stg/prd will use an empty string).
NOTE: I had to update this action twice. The first was to replace dot characters with a hyphen as TFC complained. The second was to replace underscores with a hyphen as Fastly complained (i.e. an underscore is not a valid character in a domain).
Here is tfc-dev-workspace-name
:
name: "Terraform Cloud Dev Workspace Name"
description: "Generates a workspace name for the PR dev environment"
runs:
using: composite
steps:
- name: Set WORKSPACE_NAME
shell: bash
run: |
# e.g. `"-example"` (set in ../tf-var-branch/action.yml)
branch_quoted='${{ env.TF_VAR_branch }}'
# strip the leading quote (e.g. turn `"-example"` to `-example"`)
branch_strip_leading_quote="${branch_quoted#\"}"
# strip the trailing quote (e.g. turn `-example"` to `-example`)
branch="${branch_strip_leading_quote%\"}"
echo "WORKSPACE_NAME=api-gateway-dev${branch}" >> $GITHUB_ENV
Here is tf-apply
:
# yamllint disable rule:line-length
name: "Terraform Apply"
description: "Uploads TF config, creates a run and applies it"
inputs:
workspace:
description: "The TFC Workspace"
required: true
destroy:
description: "When true, this uses terraform to DESTROY the managed infrastructure"
default: "false"
outputs:
run_id:
description: "The ID of the created run"
value: ${{ steps.apply.outputs.run_id }}
runs:
using: composite
steps:
- name: Upload Configuration
uses: hashicorp/tfc-workflows-github/actions/upload-configuration@v1.2.0
id: apply-upload
with:
workspace: ${{ inputs.workspace }}
directory: ${{ env.CONFIG_DIRECTORY }}
- name: Create Apply Run
uses: hashicorp/tfc-workflows-github/actions/create-run@v1.2.0
id: apply-run
with:
workspace: ${{ inputs.workspace }}
configuration_version: ${{ steps.apply-upload.outputs.configuration_version_id }}
# IMPORTANT: This will DESTROY the dev infrastructure.
is_destroy: ${{ inputs.destroy == true || inputs.destroy == 'true' }}
- name: Apply
uses: hashicorp/tfc-workflows-github/actions/apply-run@v1.2.0
if: fromJSON(steps.apply-run.outputs.payload).data.attributes.actions.IsConfirmable
id: apply
with:
run: ${{ steps.apply-run.outputs.run_id }}
comment: "Apply Run from GitHub Actions CI ${{ github.sha }}"
Conclusion
I’ve not covered every single line of code in the GitHub configuration. I’ve also not included the Terraform config files because they are specific to my use case and so would be considered ‘noise’ as far as understanding the fundamental CI/CD flow we’re trying to learn about here.
But hopefully the explanation of each environment workflow, and then being able to compare that to the actual implementation files for GitHub Actions is enough to help you see how you might be able to implement your own CI/CD workflow across multiple environments.
Good luck out there!