Image

Portrait Mathieu Lienart
from Matthieu Lienart
Cloud Engineer, from Ostermundigen

#knowledgesharing #level 100

Comparing AWS Lambda on ARM vs x86 architectures

In September 2021, AWS announced AWS Lambda Functions powered by AWS Gravition2 ARM processor and stated that:

“Lambda functions powered by Graviton2 are designed to deliver up to 19 percent better performance at 20 percent lower cost”

In this article we investigate the gotchas of this claim.

The benchmarking

In order to test this claim, we used some Python code with the Numpy library to perform some complex matrix calculations to simulate CPU intensive operations for benchmarking.

Using 300x300 float matrices the benchmarking code performs multiple:

1. Matrix multiplications

2. Vector multiplications

3. Matrix Singular Value Decomposition

4. Matrix Cholesky Decomposition

5. And Matrix Eigenvector Decomposition

The AWS Lambda Function was packaged and deployed on both ARM and x86 architectures. We used the AWS Lambda Power Tuning tool to execute both functions with different memory amounts between 128MB and 5120MB.

It should be noted that, when configuring your AWS Lambda Functions, you should stick with memory sizes which are multipliers of 128 (e.g., 128, 256, 384, 512, 1024…). Doing otherwise generates memory management overhead reducing the function performances.

The results

The results at all memory levels are as follows:



If we zoom-in at memory level above 512MB for clarity we get the following results:



We can clearly see two patterns for the CPU intensive operations performed by this benchmark:

1. For memory level <= 512, AWS Lambda on x86 architecture is between 36% and 99% faster than on ARM architecture.

2. If the performance gap between AWS Lambda functions running on ARM vs x86 closes as memory increases, the function running on x86 architecture remains about 16% faster.

ARM vs x86 in the real world

Working on some customer project where we were trying to minimize the response time of an AWS StepFunction orchestrating multiple AWS Lambda functions, we performed some side-by-side comparison. To do so, we deployed the entire process in parallel, one with all AWS Lambda Functions configured with an x86 architecture, one with all the functions configured with an ARM architecture. We observe that in general

• Compute intensive functions were faster on x86 architectures.

• Other functions were faster on ARM architectures.

Migration effort - is it that simple?

In that same blog announcement, AWS states

“The change in architecture doesn’t change the way the function is invoked or communicates its response back.”

This suggests that the migration is as simple as changing the AWS Lambda function architecture parameter from ARM to x86.

Although that might be true in some cases, this is unfortunately not the full picture. If your function uses non default AWS Lambda compiled libraries (e.g., the benchmarking function imports the Python Numpy library which uses compiled C libraries), switching the architecture from x86 to ARM will break the function as the library has to be recompiled for the architecture it is running on.

In such cases, just changing the AWS Lambda function architecture parameter is not enough. You need to repackage your libraries, which, depending on the language and how you package and deploy these libraries can be more or less complicated.

Conclusion

The answer to whether AWS Lambda function executes faster on ARM vs x86 architecture, is as often: “it depends”.

From the benchmark we performed at Axians Amanox, the rule of thumb seems to be that – for now:

The bottom line is that you need to test for your particular applications and test all functions individually depending on the operation they perform. The results will probably show that you will need to run a mix of both.

Image

Fully Automated MLOps Pipeline – Part 2

In our last post, we explored training a forecasting model with SageMaker. Now, we’ll complete the journey by detailing how to monitor its performance and automate retraining, ensuring consistent and reliable predictions.
learn more
Image

Fully Automated MLOps Pipeline – Part 1

In the previous blog post we introduced the architecture and demo of a near real time data ingestion pipeline into Amazon SageMaker Feature Store. In this post and the following one, we will present the fully automated MLOps pipeline.
learn more
Image

Near Real Time Data Ingestion into SageMaker Feature Store

This blog post is the first part of a 3 parts series about testing a fully automated MLOps pipeline for machine learning prediction on near real time timeseries data in AWS. In this first part we focus on the data ingestion pipeline into Amazon SageMaker Feature Store.
learn more
Image

AWS AppConfig for Serverless Applications Demo

Wouldn’t it be nice to decouple application configuration from infrastructure configuration and code? This is where AWS AppConfig (a component of AWS Systems Manager) can help
learn more