Susan Potter

Experience report deploying PureScript to AWS Serverless (Lambda)

Caption: Old car dashboard by Dawid Zawila on Unsplash

This post was adapted from a tweet thread on June 3rd, 2020.

High-Level Summary

  • Deployed a serverless application using AWS API Gateway (REST) and AWS Lambda.
  • Despite concerns about costs, we didn't break the bank and there is room for cost optimization.
  • The Lambda is written in PureScript; it is our second PureScript backend deployed to production in the first half of 2020.
  • Debugging and troubleshooting in a serverless environment require adjustments for developers accustomed to a pre-serverless world.
  • Local development can be simulated using SAM with a localhost endpoint, though it is not necessarily 1-to-1 with production.
  • Emitted metrics to DataDog, but a custom Lambda runtime wasn't feasible due to our security concerns and limited infrastructure dependency review bandwidth before the go-live date.
  • Deployment into AWS was automated in CI/CD using the SAM CLI, which came with challenges and required workarounds.
  • Our existing (pre-serverless) infrastructure involves EC2 instances in autoscaling groups (ASGs) with application load balancers (ALBs) for zero-downtime deploys.
  • Reserved concurrency in AWS Lambda has not been set up yet but is one of the next steps for our post-production cost optimization path.
  • The deployed AWS Lambda function is a 100% write-path HTTP Lambda.
  • A sample Git repository was created on GitHub, providing a barebones Express-based PureScript Lambda and a SAM template.yaml for quicker setup of PureScript serverless applications using familiar HTTP APIs.
  • We deployed a CommonJS output bundle of our PureScript application using the AWS nodejs12.x Lambda runtime provided by AWS.

Costs & Metrics

Cost description       Amount     Units
API Gateway + Lambda   ~15        $ (USD) per day
Invocations (peak)     ~32,000    invocations per minute
Latency                50         milliseconds (P99)
Memory utilization     50-75%     of memory allowance (max 128MB)
CloudWatch Logs        0.40       $ (USD) per day
Reserved Concurrency   N/A        not set up yet

Costs and metrics providing context to our serverless production deployment

Optimization Next Steps

  • Migrate from the REST API Gateway (which we chose to meet the deadline, given our prior familiarity) to the cheaper HTTP API Gateway
  • Experiment with reserved concurrency
  • Pipeline multiple events in one request (application tweak)

Sample PureScript Lambda Repo Walk-through

TODO
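
Until the walk-through is written up properly, here is a minimal sketch of the shape of the thing. The sample repository wires up an Express application so you can reuse familiar HTTP routing APIs; the snippet below skips Express entirely and shows the simplest possible API Gateway proxy-integration handler in PureScript exported for the nodejs12.x runtime. The module name, response fields, and payload are illustrative, not copied from the repository.

```purescript
-- Minimal API Gateway proxy-integration handler in PureScript.
-- Illustrative only: the sample repo routes through Express instead,
-- but the exported-handler shape for the nodejs12.x runtime is the same.
module Handler (handler) where

import Prelude

import Control.Promise (Promise, fromAff)
import Effect.Uncurried (EffectFn1, mkEffectFn1)
import Foreign (Foreign)

-- The response shape API Gateway expects back from a proxy integration.
type ProxyResponse = { statusCode :: Int, body :: String }

-- The nodejs12.x runtime calls `exports.handler(event, context)` and awaits
-- the returned Promise, so we export an uncurried effectful function that
-- produces one. A real write-path handler would decode the event body and
-- persist or enqueue the metric payload before responding.
handler :: EffectFn1 Foreign (Promise ProxyResponse)
handler = mkEffectFn1 \_event -> fromAff do
  pure { statusCode: 202, body: """{"accepted":true}""" }
```

To produce the CommonJS bundle the runtime loads, something like `spago bundle-module --main Handler --to dist/index.js` works (exact flags depend on your spago version), and the SAM function's Handler property then points at `index.handler`.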

Experience Report

Last week marked a milestone in my serverless experimentation with the successful deployment of our second "serverless app" to production. As an infrastructure engineer well-versed in deploying large-scale services to EC2 instances in auto scaling groups and managing their costs effectively, I had initially been apprehensive about the cost implications of venturing into the world of serverless deployments.

During the first week in production, without any cost optimization efforts, we found ourselves spending approximately $15 per day on API Gateway and AWS Lambda invocations. The peak load was higher than anticipated, at around 32,000 invocations per minute, or roughly 530 invocations per second during high load. Performance was reasonable, with a 99th percentile latency of around 50 milliseconds and utilization of 50-75% of the 128MB memory allowance we provisioned for the Lambda.

For the first push to production we chose to ship our application using the REST API Gateway, even though it was a slightly more expensive option. This decision was primarily driven by our familiarity with the REST API Gateway, but in retrospect, we could have easily utilized the newer and more cost-effective HTTP API Gateway since caching was unnecessary for this particular use case (100% writes for a metrics collector backend).
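
For context, in a SAM template the difference is mostly the event source type on the function. A hedged sketch follows; the resource name, paths, and CodeUri layout are illustrative rather than our actual template:

```yaml
Resources:
  MetricsCollectorFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: dist/
      Handler: index.handler
      Runtime: nodejs12.x
      MemorySize: 128
      Events:
        # REST API Gateway event source: what we shipped first out of familiarity.
        RestIngest:
          Type: Api
          Properties:
            Path: /events
            Method: post
        # HTTP API Gateway event source: the cheaper option to migrate to later.
        # HttpIngest:
        #   Type: HttpApi
        #   Properties:
        #     Path: /events
        #     Method: post
```

Note that HTTP APIs use a slightly different default event payload format (version 2.0), so the handler's event decoding may need a tweak as part of the migration.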

Amusingly, while the costs for API Gateway and Lambda invocations seemed substantial, our CloudWatch Logs barely made a dent in our budget, amounting to a mere $0.40 per day.

Despite initial concerns, our costs did not skyrocket, offering some relief. Moreover, we realized there were still opportunities for further cost optimization, considering that a functional bare-bones version was already up and running with fully automated delivery.

This marked the second production backend I had launched in the first half of 2020, both written in PureScript. While the deployment process had its fair share of challenges, particularly conflicting AWS tooling documentation and implementation issues with the SAM CLI, overall it was an enjoyable experience. The first backend deployment had been a different beast altogether, with lower traffic volume but higher complexity due to the inclusion of an authorizer Lambda.

As someone accustomed to pre-serverless cloud deployments, I quickly discovered that debugging and deploying serverless applications required adjustments and a shift in mindset. However, with determination and adaptability, the transition proved surmountable for our usage.

The inability to access certain debugging tools, such as strace, perf, ss, etc., inside the Lambda container running in AWS required a shift toward structured-logging-based troubleshooting. Locally, I eventually found a way to reliably run a simulated API Gateway using SAM with a localhost endpoint, allowing for local development and debugging. This capability was hindered for some time by outstanding SAM CLI bugs at the time of development, which should now be resolved.
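
For anyone reproducing this, the local loop looks roughly like the following; the port is the SAM CLI default, and the path and payload are illustrative:

```bash
# Build the functions declared in template.yaml, then emulate API Gateway
# locally (SAM CLI serves it on http://127.0.0.1:3000 by default).
sam build
sam local start-api

# In another shell, exercise the write path against the local endpoint.
curl -X POST http://127.0.0.1:3000/events \
  -H 'Content-Type: application/json' \
  -d '{"name":"page_view","value":1}'
```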

For our metrics service, we relied on DataDog, which unfortunately necessitated a custom Lambda runtime for full APM capabilities. However, due to security concerns and limited resources for infrastructure dependency review, we opted to leverage the embedded metric format offered by DataDog.
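
Concretely, that means the function just writes metric lines to stdout and lets the log pipeline turn them into DataDog metrics. Below is a hedged sketch of such an emitter, assuming DataDog's documented log-line convention for custom Lambda metrics (MONITORING|epoch|value|type|name|#tags); the module, helper, and tag names are illustrative, and the exact format should be checked against current DataDog docs.

```purescript
-- Hedged sketch: emit a custom metric by printing a log line the DataDog
-- forwarder can parse (no custom runtime or Lambda layer needed).
-- Assumed line format (verify against DataDog docs):
--   MONITORING|<unix_epoch_seconds>|<value>|<metric_type>|<metric_name>|#<tags>
module Metrics (countEvent) where

import Prelude

import Data.DateTime.Instant (unInstant)
import Data.Int (floor)
import Data.Newtype (unwrap)
import Effect (Effect)
import Effect.Console (log)
import Effect.Now (now)

-- Record a count metric, e.g. `countEvent "events.received" 1`.
countEvent :: String -> Int -> Effect Unit
countEvent name value = do
  millis <- unwrap <<< unInstant <$> now
  let epochSeconds = floor (millis / 1000.0)
  log $ "MONITORING|" <> show epochSeconds
      <> "|" <> show value
      <> "|count|" <> name
      <> "|#service:metrics-collector"
```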

Deploying serverless applications in AWS always comes with its fair share of challenges. To achieve our desired level of deployment automation and satisfy our push-button zero downtime requirements, I found myself spending more time navigating the intricacies of SAM CLI's GitHub issues, seeking workarounds and solutions, than actually focusing on app code or deployment automation development. Nevertheless, I remained optimistic, knowing that the eventual payoff would be worth the effort, just as it had been with our existing infrastructure.

Our current infrastructure, which had been serving us well with push-button zero downtime (from an infrastructure perspective) deploys, relied on EC2 instances in autoscaling groups (ASGs) coupled with application load balancers (ALBs) and some custom deploy code. We had developed this infrastructure three years ago and had only made a handful of tweaks since then.

While we haven't yet implemented reserved concurrency in AWS Lambda, we plan to explore its potential benefits after transitioning to the HTTP API Gateway. We recognize that even with our current costs falling within budget, a surge in demand could quickly escalate our expenses. Thus, we aim to tweak various parameters, including reserved concurrency, to ensure cost stability and efficient resource utilization.
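
When we do experiment with it, reserved concurrency is a single property on the SAM function resource; the resource name matches the earlier sketch and the value below is a placeholder to show the knob, not a recommendation:

```yaml
MetricsCollectorFunction:
  Type: AWS::Serverless::Function
  Properties:
    # Caps how many concurrent executions this function may claim, which both
    # protects downstream resources and bounds worst-case Lambda spend.
    ReservedConcurrentExecutions: 100
```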

To assist others looking to delve into the world of PureScript serverless applications and facilitate a quicker start utilizing familiar HTTP APIs from JavaScript, I compiled a sample Git repository on GitHub. This repository showcased a barebones Express-based PureScript Lambda along with a SAM template.yaml file for deployment automation, providing a helpful resource for those embarking on a similar journey.

In conclusion, deploying a pure functional serverless function to AWS using PureScript has proven to be an exciting and enlightening experience. While it required adjustments and overcoming challenges, the costs were manageable and the potential for further optimization is promising. Going forward, I anticipate that the issues with the SAM CLI will lessen as the tooling matures and improves its coverage of CloudFormation primitives and of the options exposed in the Serverless Application Model (SAM) spec.

If you enjoyed this content, please consider sharing via social media, following my accounts, or subscribing to the RSS feed.