Scaling to millions of users with Google Sheets as a back end (levels.fyi)
96 points by aliabd on March 2, 2023 | 29 comments


> Even now, one of our most trafficked services today still has a single node.js instance serving 60K requests per hour (topic for another blog post).

So 60k per hour is 1k per minute, which is roughly 16 to 17 per second. That is... not that much? Certainly not really blog post material.
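For anyone who wants to check the arithmetic, here is a minimal sketch of the conversion. The function name is made up for illustration; the only input taken from the article is the 60K requests per hour figure.

```python
def per_hour_to_rates(requests_per_hour: int) -> tuple[float, float]:
    """Convert an hourly request count to per-minute and per-second rates."""
    per_minute = requests_per_hour / 60   # 60 minutes in an hour
    per_second = per_minute / 60          # 60 seconds in a minute
    return per_minute, per_second


# 60K requests per hour is the figure quoted from the article.
per_minute, per_second = per_hour_to_rates(60_000)
print(f"{per_minute:.0f} req/min, {per_second:.1f} req/s")
```

This is an average rate only; it says nothing about peak traffic, which the article does not give.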


> Certainly not really blog post material.

Hi! Can you please link me to the minimum requirements for publishing blog posts on the internet? Much appreciated, thanks!


You'll know you've got great blog post material when you get a ton of attention on HN!


I love your tone.


I think that's the author's point: the industry tends to act as if complex, scalable architectures are inevitable if you have a useful service or want a viable business.

You can get a lot of mileage out of essentially any backend technology running on recent-ish hardware, provided the devs know what they're doing.

> Our backend today is more sophisticated but our philosophy to scaling is simple, avoiding premature optimization.


You could serve that with a shell script running off an uncached floppy.

EDIT: s/of/off/



I remember seek times on floppies being most of a second. Maybe that was just CDs...


I think the main success is that they have succeeded in glorifying their work in this dimension. He'll probably tell the story for 20 years. "You won't believe it dude but I was just using Google Sheets. Can you believe it dude?"


It seems like, 5 years down the road, this is going to turn into a horror story on HN about "I went to this customer, and they had been running their entire company database on a single Google Sheet, and now are panicking because the secretary accidentally just deleted everyone from payroll."


I'm not sure if I'm saying this seriously or not, but:

No worries! In addition to its granular permissions system, which will let you revoke the secretary's access, Sheets also has a robust backup and restore story built in! Just click File -> Version history and roll back the offending change. (You can do a live preview of what the effect will be, to confirm you're rolling back the right change.)

The problem Sheets has is that they built a tool that is too good :)


This is the same problem Excel has.

It's a horror show that you can run a major bank on it... but also not, because you can run it pretty well.


As if the same didn't happen plenty of times with proper DBs.


As a side note: it's nearly impossible to upgrade to a higher Sheets API rate limit.

We tried it all: options in Cloud Console, speaking with support, talking to sales reps. Google simply doesn't want that money and doesn't care if you are hitting limits and want to upgrade.


I run a Google Sheets-based SaaS. The default limit is 60 write req/user/min and 60 read req/user/min.

In my experience, that's plenty!

Especially so since Sheets has a very advanced API relative to, say, Airtable. For example, you can do atomic updates of multiple sheets in a single request. A single request can contain like 100MB of data. (It'll be very slow, and at this point, a person should question their life choices, but it'll work.)
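To stay under a per-minute quota like the 60 req/user/min mentioned above, a client can throttle itself before calling the API. The following is only a sketch under assumptions: the class and method names are hypothetical, it does not call the Sheets API at all, and the real quota accounting on Google's side may differ from this sliding-window approximation.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_calls within any window_seconds span (client-side only)."""

    def __init__(
        self,
        max_calls: int = 60,              # assumed from the quota quoted above
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock
        self.calls: deque[float] = deque()  # timestamps of recent allowed calls

    def try_acquire(self) -> bool:
        """Return True and record the call if it fits in the current window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window_seconds:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False
```

A caller would check `try_acquire()` before each request and sleep or queue the request when it returns False. Batching several changes into one request, as described above, reduces how often the limiter is hit in the first place.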


We hit the 60 read/write req/user/min limit pretty fast. Yes, in theory you can scale by adding more users in GCP and rotating through them.


Probably because the sheets team doesn't want to be in the database business and there are lots of other Google products that let you use them as a database on purpose?


There’s something to be said for knowing a product’s limitations and sticking to them.

It might seem absurd there is no option to pay a bunch of money for “more”, but we also don’t really know what guaranteeing that “more” would cost. It could be that quite a bit of the Sheets infrastructure is built expecting those limitations.


Is Google Sheets designed for this sort of usage? I would have been worried about hitting some undisclosed limits with the service.

A single dedicated server is not that big of an expense either. The site probably could even run on a single high end server as it is now? I guess I wouldn't have considered using something like Sheets given that. Neat idea though.


I guess I'm confused, if you're already running Cloudfront, API Gateway, and lambdas, why you'd bother to make API calls out to Sheets rather than just plunking down a simple DynamoDB? Maybe the data access patterns are more complicated than I'm thinking, or just to capture the built in Forms -> Sheets data entry flow?


Co-founder here. Cloudfront, API Gateway and lambda all came a bit later (weeks to months later). They were incremental improvements and quite easy to set-up. Sure we could have used DynamoDB or any other database but in practice you never just need a database.

Sheets gives us an admin tool, BI tool, Charting, 'ETL', simple schema changes, etc. all-in-one.

Note we actually still use Google Sheets for some things like storing the list of companies. Just today, I needed a quick way to summarize the description of each company into a few words. I installed an add-on for =GPT() formula. Within a few minutes I had done ETL, column addition, made API calls, etc. 0 code.


I thought the same thing. Makes me wonder what the line is between short term thinking and deliberate short cuts to find product fit.


I’d love to see some screenshots of the site back in the early days.


I wonder why JSON over CSV? CSV should be a natural fit for Sheets.

Plus you'd be able to cache them and use diff+patch for quicker updates when reading.
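A rough sketch of the diff+patch idea for cached CSV snapshots follows. It assumes the first column of every row is a unique key (the header row is treated as just another keyed row), and all function names are hypothetical. Nothing here reflects how the site actually works.

```python
import csv
import io


def parse(csv_text: str) -> dict[str, list[str]]:
    """Index CSV rows by their first column (assumed to be a unique key)."""
    rows = csv.reader(io.StringIO(csv_text))
    return {row[0]: row for row in rows if row}


def diff(old: dict[str, list[str]], new: dict[str, list[str]]) -> dict:
    """Compute the rows to add or change and the keys to delete."""
    upserts = {key: row for key, row in new.items() if old.get(key) != row}
    deletes = [key for key in old if key not in new]
    return {"upserts": upserts, "deletes": deletes}


def patch(old: dict[str, list[str]], delta: dict) -> dict[str, list[str]]:
    """Apply a delta produced by diff() to a cached snapshot."""
    result = {key: row for key, row in old.items() if key not in delta["deletes"]}
    result.update(delta["upserts"])
    return result


# Made-up sample data for illustration.
cached = parse("id,company\n1,Acme\n2,Globex\n")
latest = parse("id,company\n1,Acme\n2,Globex Corp\n3,Initech\n")
delta = diff(cached, latest)
updated = patch(cached, delta)
```

A client holding `cached` would only need to download `delta` instead of the whole file. Row order is not preserved by this keyed approach, so it only fits data where order does not matter.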


Great writeup, especially the diagrams. v0's manual update handling may seem like anathema to engineers, but is perfect for navigating product/market fit.

Though, now that I think about it, things like AirTable do the database thing better than Sheets does WRT being an app backend.


So when they say they're running without a server, it means they are running on Google servers? I thought serverless meant... no servers.


Serverless means I can't ssh into the server.


Serverless generally means that you don't have a server / servers in the sense that you don't have some specific dedicated server(s) that you can point to running your code, not that there aren't any servers involved in making your code do its thing.


"Serverless" is just the latest buzzword, like "cloud"; there's a server involved somewhere (if the app isn't completely P2P, anyway).



