Whether you want to reduce your infrastructure costs, spend less time on ops tasks, simplify your deployment process, scale almost without limit or just play around, you have probably already taken an interest in serverless architectures, or at least heard about them.
Indeed, this emerging technology is now offered by all the major public cloud providers and is being adopted by more and more companies for several use cases: building event-driven applications, serving static websites, parallelising isolated tasks or running lean microservices.
This article focuses on the last of these use cases: we will explain how to choose the right tool to run a serverless microservice exposing an HTTP API.
Within FindHotel's data engineering team, we adopted serverless quite early (around one year ago) to build our near real-time data pipeline. Recently a new project emerged: setting up an API to act as an interface between the machine learning models built by the data scientists and our website (also known as "putting data science in production"). After clarifying the requirements and running some basic cost calculations, a serverless architecture turned out to be a perfect fit.
This is when we started surveying the different solutions for building a serverless API. As we are heavy users of AWS, the study was fully oriented toward this platform, yet most of the insights can be applied to other cloud providers.
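To give an idea of what such a basic cost calculation can look like, here is a minimal sketch. It assumes a pay-per-request pricing model with a per-request fee plus a fee per GB-second of compute. The function name and every number below are illustrative placeholders, not our actual traffic figures or current AWS prices, and the sketch ignores the free tier and API Gateway charges.

```python
def monthly_lambda_cost(requests_per_month, avg_duration_ms, memory_mb,
                        price_per_million_requests, price_per_gb_second):
    """Estimate a monthly bill from request volume, duration and memory size."""
    # Compute time is billed in GB-seconds: duration (s) x memory (GB) per request.
    gb_seconds = requests_per_month * (avg_duration_ms / 1000.0) * (memory_mb / 1024.0)
    request_cost = requests_per_month / 1_000_000 * price_per_million_requests
    compute_cost = gb_seconds * price_per_gb_second
    return request_cost + compute_cost


# Placeholder inputs (assumptions): 10M requests/month, 100 ms each, 512 MB of memory.
estimate = monthly_lambda_cost(
    requests_per_month=10_000_000,
    avg_duration_ms=100,
    memory_mb=512,
    price_per_million_requests=0.20,   # assumed price, check the provider's pricing page
    price_per_gb_second=0.0000167,     # assumed price, check the provider's pricing page
)
print(f"Estimated monthly cost: ${estimate:.2f}")
```

Comparing such an estimate with the cost of servers running around the clock is usually enough to tell whether serverless is worth investigating further.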
When we started the research, serverless was still pretty new, so we thought there would only be a few solutions to assess. Well, we were wrong: there were already many frameworks. The first thing to consider when choosing among all those fancy names is the language you want to use to develop your API. Even though in theory AWS Lambda only supports Node.js, Java (hence Scala or any other JVM language as well) and Python, in practice it is possible to run any language in a Lambda function. It just adds a bit of overhead to bootstrap the process when the function starts, but it is completely feasible. For instance, Sparta and Apex successfully do this for Go.
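The general idea of running another language inside a Lambda function can be sketched as a thin handler in a supported language that hands each event over to an external program. The snippet below is only an illustration of that principle, not a description of how Sparta or Apex are implemented. The `make_handler` helper is hypothetical, and a small Python one-liner stands in for the compiled binary so that the example runs anywhere.

```python
import json
import subprocess
import sys


def make_handler(command):
    """Build a Lambda-style handler that delegates each event to an external program.

    `command` is the argument list of the child process (hypothetical here).
    The child reads one JSON event on stdin and writes one JSON result on stdout.
    """
    def handler(event, context=None):
        completed = subprocess.run(
            command,
            input=json.dumps(event).encode("utf-8"),
            stdout=subprocess.PIPE,
            check=True,  # raise if the child exits with a non-zero status
        )
        return json.loads(completed.stdout.decode("utf-8"))
    return handler


# Stand-in for a compiled binary: a tiny program run with the current interpreter.
CHILD_PROGRAM = (
    "import json, sys; "
    "event = json.load(sys.stdin); "
    "json.dump({'greeting': 'hello ' + event['name']}, sys.stdout)"
)

handler = make_handler([sys.executable, "-c", CHILD_PROGRAM])

if __name__ == "__main__":
    print(handler({"name": "lambda"}))
```

Spawning a process on every invocation is the simplest possible approach and is where the bootstrap overhead comes from. A real shim would more likely keep the child process alive between invocations.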
We were not adventurous with the language and just picked the most popular one we were comfortable with, namely Python. We therefore selected three projects able to deploy and run Python code in Lambda functions and benchmarked them: Zappa, Serverless Framework and Chalice.
After a few weeks playing with those three tools, we obtained the following results:
Zappa caught our attention for several reasons:
- its simple concept: turning a classic WSGI application, whether Flask or Django, into a serverless service.
- native support for the features we needed: custom domain names with auto-renewing TLS certificates, keeping your Lambda function warm automatically, reading variables from S3, and managing API keys to secure endpoints.
- its community and support: an active Slack channel, many companies using it in production, a high number of stars on GitHub, and issues and pull requests quickly resolved or discussed.
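To make the first point more concrete, the sketch below shows the principle of wrapping a WSGI application in a Lambda handler: the incoming event is translated into a WSGI `environ`, the application is called, and its response is turned back into a dictionary. This is a deliberately simplified illustration and not Zappa's actual implementation. The `wsgi_app` function stands in for a Flask or Django application, and only a handful of event fields (`httpMethod`, `path`, `queryStringParameters`, `body`) are handled.

```python
import io
import json
import sys
from urllib.parse import urlencode


def wsgi_app(environ, start_response):
    """A plain WSGI application, standing in for a Flask or Django app."""
    body = json.dumps({"path": environ["PATH_INFO"], "query": environ["QUERY_STRING"]})
    start_response("200 OK", [("Content-Type", "application/json")])
    return [body.encode("utf-8")]


def lambda_handler(event, context=None, app=wsgi_app):
    """Translate a simplified API Gateway-style event into a WSGI call."""
    body = (event.get("body") or "").encode("utf-8")
    environ = {
        "REQUEST_METHOD": event.get("httpMethod", "GET"),
        "SCRIPT_NAME": "",
        "PATH_INFO": event.get("path", "/"),
        "QUERY_STRING": urlencode(event.get("queryStringParameters") or {}),
        "SERVER_NAME": "lambda",
        "SERVER_PORT": "443",
        "SERVER_PROTOCOL": "HTTP/1.1",
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.version": (1, 0),
        "wsgi.url_scheme": "https",
        "wsgi.input": io.BytesIO(body),
        "wsgi.errors": sys.stderr,
        "wsgi.multithread": False,
        "wsgi.multiprocess": False,
        "wsgi.run_once": False,
    }
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the application wants to send back.
        captured["status"] = status
        captured["headers"] = headers

    chunks = app(environ, start_response)
    return {
        "statusCode": int(captured["status"].split()[0]),
        "headers": dict(captured["headers"]),
        "body": b"".join(chunks).decode("utf-8"),
    }


if __name__ == "__main__":
    event = {"httpMethod": "GET", "path": "/hotels",
             "queryStringParameters": {"city": "amsterdam"}}
    print(lambda_handler(event))
```

Because the application itself stays a regular WSGI app, the same code can still be run locally with any WSGI server, which is a large part of the appeal of this approach.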
Serverless Framework seemed promising, with some interesting features like the possibility of having a separate Lambda function with its own code base for each endpoint (a trend recently dubbed "nanoservices"), which is relevant for large applications. Although it was convenient for launching a basic application, we soon felt limited by the Serverless Framework when we moved toward more specific use cases like the ones Zappa allowed us to solve easily.
Chalice had been released just two months before our study and was not mature enough, so we soon discarded it. Moreover, Chalice had fewer features than Zappa and offered to do exactly the same thing, turning a small web application into a serverless one, but in a less convenient way because its authors chose to create their own DSL to define the app. Even though this DSL is very similar to Flask, it is not exactly the same. Being already comfortable with Flask, we naturally went for Zappa.
Our findings led us to build our serverless API on top of Zappa, which satisfies our needs so far. But before choosing a framework there are a few points to consider that could have saved us some time:
- Language of development
- Are you building a new application from scratch, or migrating from a classic architecture to a serverless one? The latter is particularly simple to handle with Zappa, and there are probably frameworks that do the same for other languages, so it is worth checking before taking the leap.
- API requirements: this one is the most important because, even though at first glance all those frameworks look brilliant and very easy to use, only a few of them are advanced enough to be helpful once you have some specific use cases. For instance, questions to consider are: will you need a cache in your API, how will you manage authorization and API keys, and how do you plan to handle a custom domain name?
- Support of the project: as serverless is a very recent technology, any project you pick will undergo frequent and significant changes. This is why the community behind it has to be organized, responsive and easily reachable. No matter how skilled you are, at some point you will have questions to ask or need a bit of help.
- The size of your application: not all serverless frameworks can properly handle a huge code base while meeting demanding performance requirements.
- Integration with the other services of the cloud platform.
This last point is tricky and brings us to one of the limitations of a project like Zappa.
So you can deploy a working API based on AWS Lambda and API Gateway in one line. This is great and I have enjoyed using it, but what are you supposed to do when your infrastructure evolves from the first picture to the second:
Simple PoC, comfortable to handle with a serverless framework
Real-world production application, unworkable with a serverless framework
Managing the Lambda plus API Gateway part with Zappa (or any other framework) and the rest manually or with another tool is neither convenient nor simple. Unfortunately, Terraform and similar infrastructure-as-code tools do not offer any simple way to deploy a serverless API like a framework would.
This is where the recent release of SAM by AWS is, in my opinion, a game changer. Being able to include your Lambda/API Gateway architecture inside a CloudFormation template, and thus manage your whole application as a single stack, takes serverless architectures to a whole new level.
But still, writing and managing CloudFormation templates is not that handy, especially as it involves a lot of repetitive manual tasks and boilerplate code with a high risk of typos.
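As a rough illustration of what "a single stack" means here, the sketch below builds a small SAM template as a Python dictionary and prints it as JSON, a format CloudFormation accepts. It declares one function exposed through an API endpoint next to a DynamoDB table, so that both are managed together. The `build_template` helper, the resource names, the handler path and the code location are all hypothetical and only meant to show the shape of such a template.

```python
import json


def build_template(function_name, handler, code_uri, path, table_name):
    """Return a SAM template (as a dict) with one API-backed function and one table."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Transform": "AWS::Serverless-2016-10-31",  # tells CloudFormation to expand SAM resources
        "Resources": {
            function_name: {
                "Type": "AWS::Serverless::Function",
                "Properties": {
                    "Handler": handler,
                    "Runtime": "python3.12",
                    "CodeUri": code_uri,
                    "Events": {
                        # An "Api" event creates the API Gateway endpoint for this function.
                        "HttpGet": {
                            "Type": "Api",
                            "Properties": {"Path": path, "Method": "get"},
                        }
                    },
                },
            },
            # A regular CloudFormation resource living in the same stack.
            table_name: {
                "Type": "AWS::DynamoDB::Table",
                "Properties": {
                    "AttributeDefinitions": [{"AttributeName": "id", "AttributeType": "S"}],
                    "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
                    "BillingMode": "PAY_PER_REQUEST",
                },
            },
        },
    }


# All names below are hypothetical examples.
template = build_template(
    function_name="PredictionsFunction",
    handler="app.lambda_handler",
    code_uri="./src",
    path="/predictions",
    table_name="PredictionsTable",
)

if __name__ == "__main__":
    print(json.dumps(template, indent=2))
```

The API part and the rest of the infrastructure end up in one document that can be versioned, reviewed and deployed as a whole.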
This is why we are currently developing Humilis, our open source solution for managing AWS stacks through CloudFormation, which gives the user an easier abstraction for deploying large infrastructures and plugging everything together. But I’m getting ahead of myself here: Humilis will certainly be the topic of an upcoming blog post, so stay tuned if you want to learn more about it.