Last year, I wrote about our journey through serverless technologies at FindHotel and closed on the observation that managing AWS infrastructure with bare CloudFormation (CF) files rapidly becomes a hassle. I thought it would be interesting to follow up by sharing how we decided to tackle this issue.
Like many companies these days, our entire infrastructure lives in AWS. The data science & engineering team, to which I belong, focuses on crafting small to medium-sized projects such as a data lake, some HTTP APIs, a data streaming pipeline and several batch jobs for training machine learning models.
All of those projects serve completely different purposes but often rely on similar AWS services. When we started managing these infrastructures with CF, we quickly noticed that we were rewriting the same boilerplate code again and again. Every time one of our projects needed an AWS service that was already used somewhere else, we had to write almost exactly the same CF code, and that began to waste a considerable amount of our time.
This is why we decided to develop our own in-house tool: Humilis. It operates one level of abstraction higher than CF and saves us time by letting us reuse code and deploy the same pieces of infrastructure across different projects. As a beneficial side effect, Humilis spares the user the burden of writing CF templates, whose syntax can be unwieldy and typo-prone, especially as the files grow.
To get started with Humilis, you just write a Humilis environment file: a collection of the CF stacks your application needs, each represented by what we call a Humilis layer. Your infrastructure is thus described as several layers combined into one environment file. Each layer translates into exactly one CF template, and therefore into one CF stack once the layer is deployed.
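To make the idea concrete, here is a minimal sketch of what an environment file could look like. The layer names and parameters below are illustrative assumptions, not the exact syntax of any real layer — refer to each layer's own documentation for its actual options:

```yaml
# environment.yaml — illustrative sketch, not verbatim Humilis syntax
---
my-data-pipeline:
    description: A small two-layer environment
    layers:
        # Each entry becomes one CF template, hence one CF stack
        - layer: vpc
          cidr_block: 10.0.0.0/16    # hypothetical parameter
        - layer: streaming-pipeline
          shard_count: 2             # hypothetical parameter
```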
Under the hood, the Humilis project is organised in two parts:
- A library of layers. Those are all the directories named like “humilis-kinesis-proxy”, “humilis-sam” or “humilis-kms”. Each layer represents a collection of infrastructure resources that can be parametrised and serves a specific purpose. For any part of your architecture that might be reused across projects, it is worth investing a bit of time to implement a layer. Of course, if such a layer has already been built by someone else and is available in the Humilis library, you just have to pick it up, tweak it with some parameters and add it to your environment file. Here are some examples of relevant use cases for a layer: a load balancer, VPC networking and security groups, a database cluster, an auto scaling group, an IAM role management setup, pre-configured Lambda functions.
- The Humilis engine, where the core logic of the project lies. It simply parses the Humilis layers and deploys your infrastructure as follows:
Under the hood of Humilis
The Humilis engine processes the environment file to generate one CF template per Humilis layer and then deploys them to AWS. You thus end up with as many CF stacks in AWS as there are layers in your environment file.
Having several small pieces living in different CF stacks is convenient, as it makes the overall infrastructure easier to maintain. For instance, you won’t have to redeploy the entire architecture every time you want to update it: only the layers involved in the change are modified.
Here is an example of an environment file for a serverless API using four layers: a VPC, a security group, an ElastiCache cluster and a serverless API based on the AWS SAM model.
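As a rough illustration of what such a four-layer file might look like (the layer and parameter names here are hypothetical, sketched from the description above rather than copied from a real environment file):

```yaml
# Illustrative sketch of a four-layer serverless API environment.
# Layer and parameter names are assumptions, not exact Humilis syntax.
---
serverless-api:
    description: A serverless API backed by ElastiCache inside a VPC
    layers:
        - layer: vpc             # networking: VPC, subnets, routing
        - layer: security-group  # access rules for the cache and functions
        - layer: elasticache     # cache cluster used by the API
          node_type: cache.t2.micro
        - layer: sam             # the API itself, via the AWS SAM model
```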
The VPC layer is a perfect illustration of where Humilis shines: reproducing the behaviour expressed in those few lines would require writing around 50 lines of CloudFormation. Since we use a VPC with different parameters in many of our projects, you can guess how much time that layer saves us.
If you are wondering why you should use Humilis or how it could help you, here are a few example use cases:
- A data pipeline based on Kinesis streams and Lambda functions or EC2 consumers
- A serverless API
- An event-driven data lake
- A set of batch jobs sharing some I/O or configuration
But Humilis comes in handy for any project relying on an AWS infrastructure that would benefit from being broken down into smaller, independent and reusable pieces. And we believe that many teams out there could be interested in this idea.
If you are one of them and willing to give Humilis a try, be our guest! You can start by checking the tests inside the code of any layer to get a grasp of how to use it; for instance, the auto scaling group or ElastiCache ones.
Then put the layers together in your own environment file, deploy, and voilà.
To wrap up, Humilis eases AWS deployments while keeping enough flexibility to leverage any service and feature offered by CloudFormation. In that sense, it does not try to replace CF but rather extends it with a human-friendly syntax, templating capabilities, infrastructure versioning and some extra features like cross-stack references.
If anything, it allows you to design your infrastructures as puzzles rather than monoliths. This provides two great benefits: reusability and maintainability.
There are many other projects out there pursuing the same goal with different approaches and levels of complexity; Senza and Troposphere, for example, are worth mentioning. If your DevOps toolbox lacks such a tool, I invite you to take a look at these different solutions, choose your weapon and start making your life easier.