If you're asking yourself: "Why should I care about Lambda? How does it help me build a fast website? In what context?", there's an interesting use case from a merchandise store that made real hats and such for Counter Strike Source.
Valve mentioned them on the official CS:S page, and things went haywire. The team restructured into a Lambda-friendly architecture, scaled without breaking a sweat, and ended up paying pennies in costs.
This would have been useful for me during development.
But honestly, the big problem is API Gateway. The product is a mess, the docs are a mess, and the whole system is kind of insane. Lambda is awesome, but API Gateway still seems half-baked. 90% of my development time was spent fighting with APIGW. There are some crazy hacks in there (base58-encoding cookies, regexes matched against base64-encoded status codes) that shouldn't have been necessary.
Still, now that the system works and it's easy to use, I think this is ready for real usage. And I'm sure Amazon will get their stuff together for future releases of APIGW. They probably weren't completely anticipating that people would use it in this way.
Yeah, I fought with APIGW for a while in the beginning because I had to write scripts to create my route structures so I could call Lambda in a fully RESTful manner for my data model. A lot of stuff just didn't work when calling from the AWS CLI. After a lot of trial and error and a lot of bug reporting, I got to a script that would quickly build my APIGW for me, because that web interface is horrible.
To keep your servers in a cached state, you can manually configure a scheduled task that calls your Zappa function every 5 minutes.
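As a sketch of how that scheduled ping could be wired up with CloudWatch Events (the rule name and function ARN are placeholders, and this assumes boto3 is installed with credentials configured):

```python
# Hypothetical sketch: create a CloudWatch Events rule that invokes the
# function every 5 minutes so at least one container stays warm.
RULE_NAME = "zappa-keep-warm"
SCHEDULE = "rate(5 minutes)"

def keep_warm(function_arn):
    import boto3  # assumes boto3 is installed and credentials are configured
    events = boto3.client("events")
    # Create (or update) the schedule, then point it at the function.
    events.put_rule(Name=RULE_NAME, ScheduleExpression=SCHEDULE)
    events.put_targets(
        Rule=RULE_NAME,
        Targets=[{"Id": "keep-warm-ping", "Arn": function_arn}],
    )
    # Note: you'd also need to grant the rule permission to invoke the
    # function (lambda add-permission), omitted here for brevity.
```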
The cost of running the warmer (300s × ~8640 calls ≈ 3M seconds at 512MB) comes out to about $18, and that's not counting the actual requests from end users. It's interesting, but costly as a substitute.
The call should run for less than the 100ms minimum each time, since it's just a ping and the instance is usually already loaded (although AWS will likely recycle the instance from time to time, outside of your control). That's roughly 8640 calls/month, which is about 864 seconds of execution time. You get 800,000 free seconds per month at 512MB. This is totally negligible: even if you were billed above the free tier, it's about 7/10 of a cent per month.
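The arithmetic above, sketched out (using the $0.00001667 per GB-s price from the pricing example quoted in this thread, and ignoring the free tier):

```python
# Back-of-the-envelope cost of a 5-minute keep-warm ping at 512MB.
CALLS_PER_MONTH = 12 * 24 * 30           # every 5 min -> 8640 calls
BILLED_SECONDS = CALLS_PER_MONTH * 0.1   # 100ms minimum per call -> 864s
GB_SECONDS = BILLED_SECONDS * (512 / 1024)  # 432 GB-s
PRICE_PER_GB_S = 0.00001667
cost = GB_SECONDS * PRICE_PER_GB_S       # ~$0.0072, i.e. ~0.7 cents/month
```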
From my understanding of the pricing (I could be wrong), even if the call takes 100ms, because the timeout is set to 300s so the app stays in cache, you'll be charged for the full 300s (the app is "running" for that long).
I rounded the seconds up to 3M, which is roughly 8640 (number of calls a month to keep it in cache) × 300s (the timeout, set so it stays in memory). My understanding of AWS Lambda is that if you keep it up for 300 seconds, you get charged for that much, because it's #reqs × #secs. Here's the example from the pricing page (with my comment at the end of the Total compute line):
The monthly compute price is $0.00001667 per GB-s and the free tier provides 400,000 GB-s.
Total compute (seconds) = 3M requests × 1s = 3,000,000 seconds # (roughly 8640 × 300s ≈ 2.6M seconds in the case of this app)
Total compute (GB-s) = 3,000,000 × 512MB/1024 = 1,500,000 GB-s
Total compute − Free tier compute = Monthly billable compute in GB-s
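For comparison, here is the arithmetic behind the ~$18 figure, under the (disputed) assumption that each ping is billed for the full 300s timeout:

```python
# The ~$18 figure assumes each of ~8640 monthly pings is billed for the
# full 300s timeout, rounded up to the pricing example's 3M seconds.
PRICE_PER_GB_S = 0.00001667
FREE_TIER_GB_S = 400_000

seconds = 3_000_000                 # ~8640 * 300 ≈ 2.6M, rounded up to 3M
gb_s = seconds * (512 / 1024)       # 1,500,000 GB-s
billable = gb_s - FREE_TIER_GB_S    # 1,100,000 GB-s
cost = billable * PRICE_PER_GB_S    # ~$18.34/month
```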
I don't think the timeout is related to the caching. When your function returns, it is over. The caching is not transparent, I don't know how it works under the hood, but it seems like if you just call it every 10 minutes it stays hot.
Would love an AWS engineer to shed some light on this, though.
If you have enough traffic, you will wind up with multiple containers in use at once, so a ping into the general pool running your Lambda function will most likely only keep one of those containers from recycling.
It looks to me like you're misunderstanding the recommendation here, which is to hit a fast endpoint to ensure that lambda keeps it somewhere in their caches. This is similar to what one might do with Heroku.
The idea is that something like actually distributing the code to a front-line server seems to add to boot time, so if the lambda is not warm, it will be a little slower. If you keep it warm by having at least occasional traffic headed to the server, you're able to avoid this penalty.
If you look above, you'll see someone else did the math for you, and you're actually only talking about 864s of execution time, not 3M.
It also sort of looks like you just pulled up the pricing example, which describes 3M requests that take 1s each, and worked back from it, because when you do the math you describe, it works out to ~2.5M.
> Where normal web servers like Apache and Nginx have to sit idle 24/7, waiting for new requests to come in, with Zappa, the server is created after the HTTP request comes in through API Gateway
To me, this sounds very much like reinventing the plain old CGI, just with different names ("Web server" -> "API Gateway", "CGI script" -> "server").
* CGI is the closest UNIX analogue: fork a process, write to STDIN, read from STDOUT. Lambda is actually a framework for running JavaScript/Java/Python code, from which you can call actual binaries.
* Lambda's attractive point is that containers are reused, which means that instead of paying the full price of a new process, your context is already "hot", even for the actual binary you run. That means less latency and less CPU usage, allowing you to scale far more easily.
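A minimal sketch of what container reuse means in practice (the handler name and payload shape are illustrative): module-level code pays the cold-start cost once, and every invocation on a warm container sees the same process state.

```python
import time

BOOTED_AT = time.time()  # runs once per container, at cold start

def handler(event, context=None):
    # On a warm container this age keeps growing between invocations;
    # a fresh (cold) container reports an age near zero.
    return {"container_age_s": time.time() - BOOTED_AT}
```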
I'm no Lambda user, so I'm possibly wrong, but the idea and the execution behind it sure look nice.
Both your statements are correct. Amazon is most likely running LXC containers under the hood, so anyone could duplicate what Lambda does with an evaluator for JavaScript/Python/etc. behind a REST API, containerized in Docker.
You're not completely wrong; the difference is that there's no configuration necessary, no permanent infrastructure, no limitations on scalability, and costs are metered in fractions of a second. AWS just takes care of everything. It's just _python manage.py deploy prod_ and you're done.
Obviously there are still ways that this can be optimized since Django wasn't really designed to be used like this, but it still seems performant enough for me, and the other advantages gained are major.
Response times for a warm server are almost always <200ms, averaging just over 100ms (we ran some tests on Reddit yesterday). In my own tests just now, I was getting <80ms response times consistently. And I'm certain there are ways we can shave this down further.
I get sub-100ms times quite often for light calls. The 'boot-up' call will always be longer, but that's because the service had gone idle. If you are getting heavy usage that is not an issue.
Lambda with the API Gateway (the Request/Response control flow, as opposed to the Event triggered control flow that Lambda uses with other services) is basically like a distributed, virtualized implementation of CGI.
However, being distributed and virtualized is a pretty big deal. In practice this means Lambda apps perform differently than traditional CGI, and also require different app architecture to support them.
I'm toying with the idea of writing an open-source implementation of Lambda on top of Docker (for isolation), Go (for running custom code), and Erlang/Elixir/OTP (for coordination). Are people interested?
Feel free to contact me on the address on my profile...
I think there is a strong need for something like this. It's worth noting that API Gateway is also a huge component of this project, not just Lambda. Lambda alone wouldn't provide the benefits of Zappa.
Interesting! I've been playing around with Serverless/Lambda this weekend (the Node version), but one thing that struck me was that I couldn't use a connection pool for database connectivity. This means that each request to a Lambda-backed API needs to create new connections.
Anyone who wants to share their thoughts about this?
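One common workaround (a sketch, using sqlite3 as a stand-in for a real network database): cache the connection at module level, so a warm container reuses it instead of reconnecting on every request.

```python
import sqlite3

_conn = None  # lives as long as the container does

def get_conn():
    """Open the connection once per container (cold start), then reuse it."""
    global _conn
    if _conn is None:
        _conn = sqlite3.connect(":memory:")
    return _conn
```

This doesn't give you a real pool, but it amortizes the connection cost across all requests served by one warm container.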
The point of this is that you don't need to learn any new frameworks - it works with your existing code. You can deploy your existing apps on Lambda without having to change anything, and you're not locked in to AWS if you want to go back.
I rather doubt the "without changing anything" part. For example, I write static media to a subdirectory of my Django installation. Will that work without any changes? How about Django uploads? How about temporary files? Etc etc.
I used to use Django-nonrel for GAE so I wouldn't be locked in, and guess what: by the time I wanted to move away, I had so much GAE-specific behaviour that I pretty much had to rewrite the app.
Well you really shouldn't be serving static content through Django, you should be serving it through a CDN. If you use something like Django-Storages for uploads, it'll Just Work.
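A hypothetical settings sketch of that setup (the bucket name and CDN host are placeholders; the backend path is django-storages' S3 backend):

```python
# settings.py fragment: serve static files from a CDN and push uploads
# to S3 via django-storages. Bucket and CDN host are placeholders.
DEFAULT_FILE_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"
AWS_STORAGE_BUCKET_NAME = "my-app-media"
STATIC_URL = "https://dxxxxxxxx.cloudfront.net/static/"
```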
I'm not guaranteeing that everything will work out of the box on the first try, but I bet you'll be very close. You will have to make a few design decisions, but if you're making the right ones it should just work. At the very least it should be far, far easier than GAE.
(The one thing you'll have to watch out for are C-extensions, which, for now, require that you do your deployment from an x86_64 machine.)
Fantastic! We had been considering using lambda, but were worried that doing so would irreversibly lock us into AWS. Being able to develop for lambda using standard web frameworks (django already being our favorite) goes a long way in increasing our willingness to use it.
Yep! This should make it super easy, there are no code changes needed so you can go back and forth between Lambda and traditional hosts as often as you please. No "lock in" - although one of Lambda's advantages is how well it ties into the rest of the AWS ecosystem with RDS/S3/CloudFront/etc.
Link: https://www.reddit.com/r/webdev/comments/3oiilb/our_company_...