Serverless, for the most part, is the promised land of software development.
It cuts down the time it takes to ship a product and significantly lowers the cost of implementing and running a system. It all but removes the need for operational tasks, and it offers an unbeatable grade of availability, security, and scalability.
According to Gartner, more than 20% of global enterprises will have deployed serverless computing technologies by 2020, so this is the train you do not want to miss. Serverless is all about building an IT system using cloud-provided services and, if necessary, gluing them together with functions like Lambda. As a result, we get a solution that we don’t have to manage and we only pay for what we use.
Serverless isn’t always a bed of roses, though. It has some pain points you should get familiar with so they don’t derail your journey. Often, best practices that we have used for decades to build our systems turn out to be antipatterns in serverless.
That’s not a problem if you’re aware of the conceptual differences. There are also some technical limitations to be aware of, though they’re gradually being mitigated as the technology develops.
Let’s take a closer look at the obstacles you could encounter while building a serverless system. I’ve put them into context, because the good parts vastly outweigh the bad.
Yesterday’s best practice is today’s antipattern
For decades, the following principles were the touchstone of development:
- Strong data consistency
- Data normalization
- Transactions
- Fast immediate responses
- High-performance infrastructure that rarely fails
You can still use these principles today, but they’re not ideal for serverless. Strong data consistency, data normalization, and transactions are the foundation of SQL databases, and SQL databases do not scale well horizontally. So, if you need an ultra-scalable solution, you need a NoSQL database. And this isn’t just at the database level; these principles aren’t right for the rest of the infrastructure either.
In the serverless world, data often flows through the system as responses to specific events. In such a scenario, data is consistent in the end (eventually!). Transactions are also not the best fit in this case. For maximum performance, databases need to be structured for fast reading—that’s why data is often duplicated and denormalized.
Functions, the core and glue of a serverless system, run in containers. With the memory setting, you choose how powerful the hardware is: from 128 MB to 3 GB of memory. That’s not very much, but you don’t need more to run a single function. And if more than one function needs to run at a time, each simply gets its own container.
New guidelines
If the core principles that enabled us to build reliable systems for decades no longer apply, how do we design a new system? The answer: with new practices that enable us to build even more reliable and, above all, scalable systems.
Stateless
A container running Lambda can be reused, but you can’t hold state in the container, because it can be stopped at any time and you never know which container will process the next request.
Favor asynchronous operations
Try to process more resource-heavy tasks in the background. That gives you the option to queue requests, retry, scale, and optimize capacity. If the user is involved, inform them that their request has been received and notify them of the results if necessary.
Microservices and nanoservices
Break the system into small autonomous pieces. Small, independent microservices can easily be managed, monitored, and scaled.
Design for failure and make functions idempotent
Most of the services that trigger Lambda will, for example, retry the call in case of failure. The cloud environment is reliable, but in a distributed system such as serverless you must anticipate failures. Functions therefore have to be idempotent: if the same order is received twice, you shouldn’t end up shipping two products. Check whether the order has already been processed at the beginning of processing, and again just before outputting the results.
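As a minimal sketch of the idea, assuming Node.js with the AWS SDK and a hypothetical `orders` DynamoDB table: a conditional write rejects duplicate deliveries atomically.

```typescript
import { DynamoDB } from "aws-sdk";

const db = new DynamoDB.DocumentClient();

export async function processOrder(order: { id: string }): Promise<void> {
  try {
    // Fails if an item with this id already exists, so a retried
    // event cannot create (and ship) the same order twice.
    await db
      .put({
        TableName: "orders", // hypothetical table
        Item: { id: order.id, status: "PROCESSING" },
        ConditionExpression: "attribute_not_exists(id)",
      })
      .promise();
  } catch (err: any) {
    if (err.code === "ConditionalCheckFailedException") {
      return; // Duplicate delivery: safely ignore.
    }
    throw err;
  }
  // ... actual order processing goes here ...
}
```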
Do as little as possible in Lambda
Functions should be small. That means they should do only one thing and shouldn’t have a lot of dependencies. That keeps them easy to manage and quick to start. A lot can be done without Lambda at all, like letting clients download or upload files directly from or to S3. Another example is having API Gateway write directly to DynamoDB, SQS, SNS, Kinesis, and so on, without Lambda in between.
Prefer NoSQL databases
NoSQL databases are scalable and cheap. In AWS, DynamoDB is the preferred database—but be cautious. NoSQL databases require a different data model and don’t support complex queries. Concepts are very different, and especially hard to grasp for beginners. Access patterns must be very well defined at the very start of the design phase.
Prefer eventual consistency
Data flows through the system, from Lambda to Lambda and from service to service, in response to events. So there is no strong (immediate) consistency; data becomes consistent eventually.
New design patterns for serverless systems
Currently, there is no comprehensive book or course that explains all the patterns in detail. The best source right now is Jeremy Daly’s blog, where he identifies nineteen patterns. Yan Cui’s blog is another great resource when it comes to serverless—you’ll even find some insightful videos of Yan’s talks on YouTube. They both produce a weekly newsletter.
Let’s take a look at two patterns as an example:
The Simple Web Application
This is the most common pattern. CloudFront is used as a CDN for fast delivery of the static files stored on S3, while Cognito handles authentication. API Gateway provides the HTTP endpoint for dynamic content and forwards requests to Lambda. Data is stored in DynamoDB.
Fan-out pattern
The fan-out pattern is also very common, and it reveals the power of serverless and the different mindset you need to take advantage of it. It allows you to create scenarios that would be impossible in a traditional environment. The basic idea is that we distribute work to multiple instances of Lambda. That could mean all Lambdas do the same kind of processing, or each executes its own type of work. There are many variations of this pattern.
Fan-out with Lambda
The simplest variant is fan-out with Lambda, where one Lambda invokes other Lambdas. You can do this synchronously, delivering the results to the user instantly, but it’s more common to work asynchronously. In that case, the initiating Lambda does not wait for the response, so you should configure a dead-letter queue, which enables you to get notified in the event of an error.
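A minimal sketch of the asynchronous case, assuming Node.js with the AWS SDK; the worker function name is hypothetical:

```typescript
import { Lambda } from "aws-sdk";

const lambda = new Lambda();

export async function fanOut(items: unknown[]): Promise<void> {
  // InvocationType "Event" is fire-and-forget: the initiator does not
  // wait for the workers, so failures surface via the dead-letter queue.
  await Promise.all(
    items.map((item) =>
      lambda
        .invoke({
          FunctionName: "my-worker", // hypothetical worker Lambda
          InvocationType: "Event",
          Payload: JSON.stringify(item),
        })
        .promise()
    )
  );
}
```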
Fan-out with SNS
The concept behind this one is simple: Lambdas are subscribed to an SNS topic; you publish a notification, and each subscribed Lambda starts its own processing.
Fan-out with SQS
You put messages into SQS. Then lots of Lambdas are automatically started. Each receives a set of messages. In this case, all Lambdas do the same kind of processing. This pattern also enables batching, because each Lambda can receive several messages in a batch.
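For illustration, a sketch of an SQS-triggered handler; the event shape is what Lambda passes for SQS triggers, while the per-message processing is hypothetical:

```typescript
import { SQSEvent } from "aws-lambda";

// Lambda polls the queue and hands each invocation a batch of messages.
export async function handler(event: SQSEvent): Promise<void> {
  for (const record of event.Records) {
    const message = JSON.parse(record.body);
    // ... hypothetical per-message processing ...
    console.log("processing", message);
  }
}
```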
The challenges of serverless
Cold start
A cold start happens when a new container that runs Lambda needs to boot up. .NET and Java have longer cold starts than other languages, taking up to a few seconds to get up and running, so try to use Python, Node.js, or Go for user-facing services.
If you use a virtual private cloud (VPC), you add another 10 seconds to some cold starts. A VPC is your own network within AWS, where you can keep private services like an SQL database; using a VPC is the only way to access an SQL database you don’t want to open to the public. The 10-second cold start is a consequence of attaching an Elastic Network Interface (ENI). How often this happens depends on the amount of memory you assign to the Lambda: with more memory you get better networking, which means less sharing of ENIs. If you assign 3 GB of memory, the Lambda’s ENI won’t be shared at all, and there will always be a 10-second cold start, but this should soon be resolved by AWS.
Solutions:
- Don’t use a VPC if you don’t need access to private services, or just because it looks more secure. Lambda is secure enough, and this is AWS’s official recommendation.
- Use Node.js, Python, and Go. If you use those and avoid VPCs, you’ll never even notice a cold start.
- Use Jeremy Daly’s Lambda Warmer library, which periodically triggers a Lambda, or several Lambdas, to keep them warm.
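Usage roughly follows the library’s README: a detection call short-circuits warming pings before your real logic runs.

```typescript
// lambda-warmer recognizes the scheduled "warming" event and exits early.
const warmer = require("lambda-warmer");

export async function handler(event: any): Promise<any> {
  if (await warmer(event)) return "warmed"; // warming ping: do no real work
  // ... actual handler logic ...
  return { statusCode: 200 };
}
```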
Observability – monitoring, logging, and tracing
When errors crop up in a serverless system, they’re much harder to resolve than in more traditional systems. Serverless systems are distributed, which means every part of the system produces its own logs, and it can be difficult to make sense of them.
Solutions:
- Use a common correlation ID to identify which logs belong to the same request (see the sketch after this list).
- Use the new CloudWatch Logs Insights feature.
- The X-Ray service is indispensable. It allows you to visualize connected services and trace calls as they flow through different parts of the system, so you can see the architecture, execution times, error rates, and so on.
- Third-party services like Epsagon, Thundra, IOpipe and DataDog are extremely useful.
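A minimal sketch of the correlation-ID idea; the header name is hypothetical, and there are libraries that automate this kind of propagation:

```typescript
import { APIGatewayProxyEvent, Context } from "aws-lambda";

export async function handler(event: APIGatewayProxyEvent, context: Context) {
  // Reuse an upstream correlation ID if one was passed, otherwise fall
  // back to the request ID Lambda already assigns.
  const correlationId =
    event.headers?.["x-correlation-id"] ?? context.awsRequestId;

  // Include the ID in every log line so CloudWatch Logs Insights can
  // stitch together all entries belonging to the same request.
  console.log(JSON.stringify({ correlationId, msg: "request received" }));

  // Pass it on to downstream calls (HTTP headers, SNS/SQS message
  // attributes) so the whole chain shares one ID.
  return {
    statusCode: 200,
    headers: { "x-correlation-id": correlationId },
    body: "",
  };
}
```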
Connecting to SQL databases
One of the main issues with SQL databases is that if you don’t want to give public access to the database, you must use a VPC. We’ve already mentioned the main pitfalls it brings. Another is that opening the connection is expensive. That’s why you have a connection pool in traditional systems.
A connection pool is a set of open connections. You get one when you need one, then you return it to the pool. In serverless, there is nowhere to hold this connection pool; Lambdas come and go. You can either open and close the connection each time, which is expensive, or keep them open and risk running out of available connections.
Solution:
- Use Jeremy Daly’s Serverless MySQL library. At the end of each request, it checks whether it needs to close the connection. It’s not a perfect solution, but it’s an acceptable one.
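Usage roughly follows the library’s README; the connection settings here are placeholders:

```typescript
// serverless-mysql manages connections so Lambdas don't exhaust the pool.
const mysql = require("serverless-mysql")({
  config: {
    host: process.env.DB_HOST, // placeholder connection settings
    database: process.env.DB_NAME,
    user: process.env.DB_USER,
    password: process.env.DB_PASSWORD,
  },
});

export async function handler(): Promise<unknown> {
  const results = await mysql.query("SELECT * FROM orders LIMIT 10");
  // Tells the library the request is done; it decides whether the
  // connection can stay open for container reuse or must be closed.
  await mysql.end();
  return results;
}
```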
DDoS and other attacks = wallet attack
Because serverless scales automatically, a DDoS attack will drive up your bill.
Solutions:
Set the following to limit your use of resources:
- Lambda Concurrent Execution Limit
- Lambda timeout
- API Gateway throttling
CloudFormation
CloudFormation is the AWS solution for infrastructure as code. It’s used by most of the tools you need for serverless deployment, including the most popular one, the Serverless Framework. There’s a limit of 200 resources per CloudFormation stack. That sounds like a lot, but you usually need several resources for each Lambda, so you’ll hit the limit at around 30 to 35 Lambdas. That isn’t much, considering you should favor small Lambdas.
Solutions:
- The best way is to have one stack just for Lambdas that are part of the same microservice.
- Use nested CloudFormation stacks; not ideal, but it works.
- You can use Terraform instead of CloudFormation, though it’s much harder to use than tools like the Serverless Framework.
Common pitfalls to avoid
Don’t call one function from another
Calling one function from another is a viable option in some patterns, like fan-out, but in general you should avoid it. Every extra invocation adds latency and cost, and if the calls are synchronous, you’re paying for two functions running at the same time.
Solutions:
- Share code between functions instead of calling them, for example by using Lambda Layers
- If you do call one function from another, use asynchronous invocation so the initiating function can end right after the call
- For the invocation, use the AWS SDK; don’t make an HTTP call that travels through the whole infrastructure, including API Gateway
Beware of recursive calls and infinite retry
Accidental recursive calls and infinite retries can produce extremely large bills. A very common error is configuring a Lambda to process new files in S3: if you drop the results into the same S3 bucket, they trigger the Lambda again, recursively.
Solutions:
- Mark data that has already been processed, or is the output of a previous process
- Set limits to retries
- If you process files in S3, save the results to a different S3 bucket (see the sketch below)
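A minimal sketch, assuming Node.js with the AWS SDK; the output bucket name and the processing step are hypothetical:

```typescript
import { S3Event } from "aws-lambda";
import { S3 } from "aws-sdk";

const s3 = new S3();

// Triggered by new objects in the input bucket; results go to a
// *different* bucket so the output can never re-trigger this function.
export async function handler(event: S3Event): Promise<void> {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    const key = record.s3.object.key;
    const input = await s3.getObject({ Bucket: bucket, Key: key }).promise();
    const output = input.Body; // ... hypothetical processing ...
    await s3
      .putObject({ Bucket: "my-output-bucket", Key: key, Body: output })
      .promise();
  }
}
```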
Security
AWS takes care of infrastructure security, but you’re in charge of security in the code you write. There are some common errors that are often picked up from oversimplified code samples.
Solutions:
- Set IAM permissions for each function separately, and grant only the permissions it needs. Don’t use asterisk patterns like “s3:*”: that grants all privileges on S3, which you probably don’t need if you only read data.
- Do not save secrets in environment variables or code, where they get checked into the source repository and are difficult to protect. Use AWS Systems Manager Parameter Store or AWS Secrets Manager instead (see the sketch below).
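A minimal sketch of reading a secret at startup, assuming Node.js with the AWS SDK; the parameter name is hypothetical:

```typescript
import { SSM } from "aws-sdk";

const ssm = new SSM();

// Fetch once, outside the handler, so warm containers reuse the value.
const dbPasswordPromise = ssm
  .getParameter({ Name: "/myapp/db-password", WithDecryption: true })
  .promise()
  .then((res) => res.Parameter?.Value);

export async function handler(): Promise<void> {
  const dbPassword = await dbPasswordPromise;
  // ... use the secret; it never appears in code or environment variables
}
```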
Know Lambda’s limits
Lambda gives you up to 3 GB of RAM, a maximum execution time of 15 minutes, and 512 MB of local disk storage.
Solutions:
- Check how much execution time you have left with the method context.getRemainingTimeInMillis() (see the sketch after this list)
- If you’re approaching the limit, you can invoke another Lambda asynchronously to continue the work or use a pattern like fan-out.
- Don’t use Lambda where you need consistently ultra-low latency (under 100 ms); cold starts are still a factor.
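A sketch of the timeout check; the work loop and the 10-second threshold are hypothetical:

```typescript
import { Context } from "aws-lambda";

export async function handler(event: { items: unknown[] }, context: Context) {
  for (const item of event.items) {
    // Stop early if less than ~10 seconds remain, leaving time to hand
    // the rest of the work to another (asynchronous) invocation.
    if (context.getRemainingTimeInMillis() < 10_000) {
      // ... invoke another Lambda with the remaining items ...
      break;
    }
    // ... process item ...
  }
}
```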
Serverless tips and tricks
Separate core logic from Lambda handlers
Handlers are the entry points to function calls. Keep them small and separate the core logic into other modules; that way, you can reuse the code in other Lambdas.
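A minimal sketch of the separation; the file layout and names are hypothetical:

```typescript
// core/orders.ts — pure business logic, no Lambda types, easy to reuse.
export function totalPrice(items: { price: number; qty: number }[]): number {
  return items.reduce((sum, i) => sum + i.price * i.qty, 0);
}

// handler.ts — thin entry point that only adapts the event to the core.
import { APIGatewayProxyEvent, APIGatewayProxyResult } from "aws-lambda";
import { totalPrice } from "./core/orders";

export async function handler(
  event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> {
  const items = JSON.parse(event.body ?? "[]");
  return {
    statusCode: 200,
    body: JSON.stringify({ total: totalPrice(items) }),
  };
}
```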
Dead-letter queue
If you invoke Lambda asynchronously, configure a dead-letter queue so any failed messages go to SQS or SNS and you’ll know that something went wrong.
Optimize function memory
Use a higher memory setting for critical functions; more memory also means more CPU and network capacity.
Connecting to other systems
Serverless is, for practical purposes, infinitely scalable, but that doesn’t mean the services you call are too. To avoid overloading the other system, put an SQS queue or Kinesis stream between them.
Reusing data
Lambda is stateless, but containers are usually reused. So you can temporarily store some data, like decrypted configuration settings or database connections. Just add a reference to the data outside the function handler.
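A sketch, assuming Node.js: anything assigned outside the handler survives as long as the container does. The config loader here is hypothetical.

```typescript
import { DynamoDB } from "aws-sdk";

// Created once per container, then reused across warm invocations.
const db = new DynamoDB.DocumentClient();
let cachedConfig: Record<string, string> | undefined;

export async function handler(): Promise<void> {
  if (!cachedConfig) {
    cachedConfig = await loadConfig(); // hypothetical (e.g. decrypt settings)
  }
  // ... use db and cachedConfig ...
}

async function loadConfig(): Promise<Record<string, string>> {
  return {}; // placeholder
}
```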
Limit default retention policy on CloudWatch
The default retention policy in CloudWatch is for logs never to expire. You probably won’t need those logs forever, and storing them is expensive, so set a retention period on each log group.
Set CloudWatch alarms
You can set alarms on many service metrics, but the most important one is the billing alarm.
CloudFront cache errors
CloudFront caches HTTP 4xx and 5xx error responses for 5 minutes by default. So you may encounter an error, resolve it, and still receive the error message because it’s cached. To remedy that, set the TTL for errors to 0, at least in the development environment.
DynamoDB empty string
Properties of the objects we store in DynamoDB can’t have an empty string as a value.
For this item:
{ id: 123, value: "" }
you get the error: "An AttributeValue may not contain an empty string".
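One workaround, assuming the Node.js AWS SDK: the DocumentClient can convert empty values to DynamoDB NULL for you. The table name here is hypothetical.

```typescript
import { DynamoDB } from "aws-sdk";

// convertEmptyValues turns "" (and empty buffers/sets) into NULL,
// avoiding the "AttributeValue may not contain an empty string" error.
const db = new DynamoDB.DocumentClient({ convertEmptyValues: true });

export async function save(): Promise<void> {
  await db
    .put({ TableName: "my-table", Item: { id: 123, value: "" } })
    .promise();
}
```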
Enable HTTP keep-alive
By default, the AWS SDK for Node.js does not enable HTTP keep-alive, so every request has to establish a new TCP connection with a three-way handshake. The benefit of reusing connections is clearly visible when communicating with DynamoDB.
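As a sketch: newer versions of the Node.js SDK let you set the AWS_NODEJS_CONNECTION_REUSE_ENABLED environment variable to 1; alternatively, configure a keep-alive agent explicitly.

```typescript
import * as https from "https";
import { DynamoDB } from "aws-sdk";

// A keep-alive agent lets warm containers reuse TCP connections
// instead of paying for a new handshake on every request.
const agent = new https.Agent({ keepAlive: true });

const db = new DynamoDB.DocumentClient({
  httpOptions: { agent },
});
```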
Test for scalability
Never assume that everything will scale—you need to test for that! Besides mistakes you might have made when architecting your solution, there are many AWS settings that limit scalability for your protection. To test scalability, you can use Serverless Artillery.
Connecting SQS and Lambda
When using SQS as a Lambda trigger, if you want to configure reserved concurrency, set it to at least five, because polling starts with five concurrent connections.
Marko also runs his own successful blog over on www.serverlesslife.com.
Blog: www.ServerlessLife.com
Email: marko@serverlesslife.com
Twitter: @ServerlessL
GitHub: https://github.com/ServerlessLife
LinkedIn: http://www.linkedin.com/in/marko-serverlesslife