Engineer Onboarding with Breakable Toy

I joined the Azavea team in May of 2018. I had experience in software development and systems administration, but not much experience with production cloud infrastructure. Starting at Azavea gave me an opportunity to step into the shoes of an Operations Engineer for the first time in my professional career.

This was also the first time the Operations team used a Breakable Toy in its onboarding process. My colleagues wanted to design a program that encapsulated the foundational knowledge I’d need to succeed in my position. Implementing this program was a learning experience for me and everyone involved.

Breakable Toy

A Breakable Toy is a training exercise that lets you budget for failure by designing and building a toy system that is similar in toolset, but not in scope, to the systems you build at work. This is great for an Operations team because, unlike software development, much of our time is dedicated to managing client production infrastructure. Without a Breakable Toy, there is no budget for failure.

My colleagues designed a Breakable Toy in which I would build a Django web application and deploy it to Amazon Web Services (AWS). It also departed from our vanilla approach to deployment: we would use the AWS Fargate compute engine rather than a traditional deployment based on Amazon Elastic Container Service (ECS). Fargate was a new AWS service we didn’t have experience with, which gave me the chance to later present my findings at a monthly forum where we discuss architecture decisions at the company.

We separated the Breakable Toy into multiple sections:

  • Set up a base development environment
  • Define services with Docker Compose
  • Assemble an MVC application
  • Add unit tests
  • Serve static files
  • Add a health check endpoint
  • Publish container images
  • Deploy to AWS
  • Wire up a CI pipeline using Jenkins

After completing each milestone, I’d touch base with my colleagues to review anything I had encountered but did not understand.

Developing the Breakable Toy

I began by cloning an internal app template to use as the starting point for a Vagrant- and Docker-based development environment. My first task was to open a pull request that upgraded the Docker and Docker Compose versions. The pull request was also the first step in assuming ownership of the app template, one of my new responsibilities.

Next, I created a new repository for the Breakable Toy. From there I started development by adding a Docker Compose service that used our base Docker image for Django. This allowed me to follow the official Django tutorial, Writing your first Django app. I completed Part 1 by creating a blank project.

Following Part 2 of the tutorial, I added data models and a Docker Compose service that used our PostGIS image. Parts 3 and 4 covered the model-view-controller application design pattern, and I assembled views and data models for the app. Although I didn’t expect to be doing much application development, I now feel more comfortable with a framework that is a core component of one of Azavea’s most popular technology stacks.

Azavea maintains its own flavor of GitHub’s Scripts to Rule Them All (STRTA). This is one of a number of techniques we use to maintain workflow consistency when switching between projects. Using that framework, I wrote unit tests for Django that were executed by a test script. I wrapped the test script in a cibuild script that gets executed in CI and builds a container image.

I researched WhiteNoise, a Django middleware that lets the application serve its own static files in non-development environments. This runs counter to the approach Django recommends for serving static files in production. We proceeded anyway because we knew we were going to put Django behind an AWS CloudFront distribution. WhiteNoise sets cache control headers, and because requests are proxied through the CDN, CloudFront can aggressively cache static assets. This eliminates the need to host static assets out-of-band with object storage solutions like S3.
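
For readers who have not used WhiteNoise, here is a minimal sketch of the kind of settings involved. It follows WhiteNoise’s documented setup rather than the Breakable Toy’s actual settings module, and the STATIC_ROOT path is a hypothetical value chosen for illustration.

```python
# settings.py fragment (illustrative sketch, not the project's real settings)

MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    # WhiteNoise is placed directly after SecurityMiddleware so it can
    # answer static file requests before the rest of the stack runs.
    "whitenoise.middleware.WhiteNoiseMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.middleware.common.CommonMiddleware",
]

STATIC_URL = "/static/"
STATIC_ROOT = "/srv/static"  # hypothetical path; collectstatic writes here

# Content-hashed filenames let WhiteNoise send long-lived Cache-Control
# headers, which is what allows a CDN in front to cache aggressively.
STATICFILES_STORAGE = "whitenoise.storage.CompressedManifestStaticFilesStorage"
```

STATICFILES_STORAGE matches the Django versions current in 2018. On Django 4.2 and later, the same storage backend is configured through the STORAGES setting instead.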

I added a health check endpoint using django-watchman, and then tested it by taking down the PostgreSQL service.
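
To illustrate what such an endpoint does, here is a small framework-free sketch. It is not django-watchman’s API: run_health_checks, database_down, and cache_ok are hypothetical names, and returning 500 on failure is an assumption made for this example. The failing database check stands in for the experiment of taking down the PostgreSQL service.

```python
from typing import Callable, Dict, Tuple


def run_health_checks(
    checks: Dict[str, Callable[[], None]]
) -> Tuple[int, Dict[str, dict]]:
    """Run each named check and return (HTTP status code, per-check report)."""
    report: Dict[str, dict] = {}
    healthy = True
    for name, check in checks.items():
        try:
            check()  # a check signals failure by raising an exception
            report[name] = {"ok": True}
        except Exception as exc:
            healthy = False
            report[name] = {"ok": False, "error": str(exc)}
    # Assumption: any failed dependency makes the whole endpoint unhealthy.
    return (200 if healthy else 500), report


def database_down() -> None:
    """Hypothetical check simulating a stopped PostgreSQL service."""
    raise ConnectionError("could not connect to server")


def cache_ok() -> None:
    """Hypothetical check for a dependency that is reachable."""
    return None


status, report = run_health_checks({"database": database_down, "cache": cache_ok})
print(status, report)
```

A load balancer that polls an endpoint like this can stop routing traffic to a container as soon as one of its dependencies becomes unreachable.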

Deploying the Breakable Toy

I added a cipublish script that tagged and published tested container images to Amazon Elastic Container Registry (ECR). Using existing Azavea projects as reference, I wrote Terraform configuration that codified my app’s infrastructure on AWS. I defined a Fargate-based ECS cluster and used our existing VPC and Amazon Relational Database Service (RDS) modules to set up networking and a PostgreSQL database. I set up log aggregation with Amazon CloudWatch Logs. Finally, I layered in the CloudFront distribution mentioned earlier in front of an Application Load Balancer.
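
As a rough illustration of the tagging step, the sketch below builds the fully qualified name an image needs before it can be pushed to ECR. The registry hostname format is the standard ECR one. The function name, the account ID, the repository name, and the idea of tagging by short Git commit hash are assumptions for this example and are not taken from the actual cipublish script.

```python
def ecr_image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Return the fully qualified ECR image name for a repository and tag."""
    if not tag:
        raise ValueError("an image tag is required")
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    return f"{registry}/{repository}:{tag}"


# Hypothetical values: a placeholder account ID and a short commit hash as tag.
uri = ecr_image_uri("123456789012", "us-east-1", "breakable-toy", "abc1234")
print(uri)  # 123456789012.dkr.ecr.us-east-1.amazonaws.com/breakable-toy:abc1234
```

Tagging by commit hash, if that is the scheme in use, has the advantage that every published image can be traced back to the exact code that passed the tests.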

Although not outlined as part of the Breakable Toy, I wanted to wire up a CI pipeline to see how our Jenkins infrastructure was set up. Because we follow the STRTA practice, this was as easy as writing a Jenkinsfile that executed each script in sequence.

It was awesome to see my project running against production-grade infrastructure.

Retrospective

When I added unit testing, I encountered and resolved a race condition where the Django service wasn’t respecting the PostgreSQL health check defined in Docker Compose. This issue gave me a better understanding of how Docker Compose interacts with Docker.
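
I won’t reproduce the exact fix here, but one common way to guard against this kind of startup race is to have the application wait until the database port accepts connections. The sketch below is a generic, standard-library version of that idea; wait_for_port and its defaults are hypothetical names and values. Docker Compose’s depends_on with a service_healthy condition is another option.

```python
import socket
import time


def wait_for_port(
    host: str, port: int, timeout: float = 30.0, interval: float = 0.5
) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)  # back off briefly before retrying
```

An open port only shows that the server process is listening. It does not prove that PostgreSQL is ready to run queries, so a stricter check would attempt a real database connection.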

When I implemented WhiteNoise, I noticed that changes to static content, like stylesheets, weren’t being reflected in my browser. I traced this to a bug with vboxsf, the shared folder driver we were using with Vagrant. After swapping out vboxsf for the rsync driver, I was able to get the changes to cycle through.

This is the killer feature of a Breakable Toy. Throughout the program, I encountered and worked through scenarios that are part of everyday work but would have been difficult to express in a more traditional “do X, Y, and Z” training program. Not only that, but these issues were catalysts for insightful discussions with my colleagues. We established rapport, and I believe my colleagues were able to gauge a baseline of my strengths and weaknesses.

I’m happy to be a member of the Operations team. The work we do is engaging, and Azavea’s impact and reach are inspiring.