Deployment Process
The Platform Team relies heavily on infrastructure-as-code to make all of its changes. However, there are a number of interdependent parts to maintain. This section provides a high-level overview of what is deployed and where it is configured when deploying the clusters.
More detailed information about the single code base that controls all of our clusters is available here.
Overarching Principles
- Everything-as-code - we want as much as possible to be managed by code and automated pipelines. This helps keep track of changes as they occur and helps reduce the chance of human error when deploying changes.
- All changes are peer-reviewed - all changes must be reviewed by someone else on the team, ensuring no change is made secretly or hidden from others. This also helps catch issues before they become larger ones.
- All work is logged and tracked on the Kanban board - we want to make sure we keep track of the changes being made to the platform. Therefore, any work being done on the platform must have an associated JIRA ticket. We do our best to log time on the tickets to help us budget time and measure our work.
- Lock ourselves out of production - while the Platform Team has full write access on the dvlp and pprd clusters, all team members only have read access on the prod cluster. If changes are needed in prod, they must go through the appropriate review and deployment process.
Making Changes to the Cluster
Changes to the core of the platform are typically made in the eks-cluster project or one of its tributary projects, and follow these steps:
- An issue is created and placed on the Kanban board to track the work being performed.
- The necessary work is done in the code repository on a new branch. Quite often, experiments are performed manually on the dvlp cluster, locally, or on an ephemeral cluster to identify the changes that are needed. Once identified, the experiment is removed, converted into code, and committed.
- A merge request is created.
- A link to the MR is automatically posted in the #it-common-platform-devops channel.
- The MR is peer-reviewed and approved by others. We typically try to have two others look at the changes, but only one is required. The last approver merges the change.
- A pipeline is triggered and the changes are automatically deployed to the dvlp cluster.
- Changes are tested and verified, most often manually.
- When successful, a pre-production maintenance is planned and eventually deployed by targeting a commit with a tag in the form of YYYY-MM-DD_HH-mm-pprd.
- After a period of successful testing in pprd, the release is ready and a new tag is created on the cluster config repo to kick off a prod deployment. The tag should follow the format of YYYY-MM-DD_HH-mm-prod.
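The tag naming convention used in the last two steps can be sketched in a few lines of Python. This is only an illustration, not the team's actual tooling: the helper names `build_deploy_tag` and `parse_deploy_tag` are hypothetical, and the sketch assumes that `mm` in the format means minutes and that the caller supplies the timestamp in whatever time zone the team uses.

```python
import re
from datetime import datetime

# Assumption: only these two environments are deployed via tags;
# dvlp is deployed automatically when a merge request is merged.
TAG_ENVIRONMENTS = ("pprd", "prod")

# Matches YYYY-MM-DD_HH-mm-<env>, for example 2024-03-15_14-30-pprd.
TAG_PATTERN = re.compile(r"(\d{4}-\d{2}-\d{2}_\d{2}-\d{2})-(pprd|prod)")


def build_deploy_tag(when: datetime, environment: str) -> str:
    """Return a tag name in the documented format for the given environment."""
    if environment not in TAG_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    return f"{when.strftime('%Y-%m-%d_%H-%M')}-{environment}"


def parse_deploy_tag(tag: str) -> tuple[datetime, str]:
    """Split a tag into its timestamp and target environment.

    Raises ValueError if the tag does not follow the documented format.
    """
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a deployment tag: {tag!r}")
    # strptime also rejects impossible values such as month 13.
    when = datetime.strptime(match.group(1), "%Y-%m-%d_%H-%M")
    return when, match.group(2)


if __name__ == "__main__":
    tag = build_deploy_tag(datetime(2024, 3, 15, 14, 30), "pprd")
    print(tag)
    print(parse_deploy_tag(tag))
```

A check like `parse_deploy_tag` could be run before pushing a tag to catch typos, since a malformed tag name would presumably not trigger the intended deployment.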