Right.. so now your developers don't need to understand how to configure and tun...

scarface74 · on April 5, 2019

You did just read our use case didn’t you? Yes we could overprovision a single server with 5x the resources for the once a week indexing.

We could also have 4 other servers running all of the time even when we weren’t demoing anything in our UAT environment.

We could also not have any redundancy and separate out the reads and writes.

No one said the developers didn’t need to understand how to do it. I said we didn’t have to worry about maintaining infrastructure and overprovisioning.

We also have bulk processors that run messages at a trickle based on incoming instances during the day but at night and especially st the end of the week, we need 8 times the resources to meet our SLAs. Should we also overprovision that and run 8 servers all of the time?

nilkn · on April 5, 2019

> Yes we could overprovision a single server with 5x the resources for the once a week indexing.

You assume this is bad, but why? Your alternative seems to be locking yourself into expensive and highly proprietary vendor solutions that have their own learning curve and a wide variety of tradeoffs and complications.

The point being made is that you are still worrying about maintaining infrastructure and overprovisioning, because you're now spending time specializing your system and perhaps entire company to a specific vendor's serverless solutions.

To be clear, I don't have anything against cloud infrastructure really, but I do think some folks really don't seem to understand how powerful simple, battle-tested tools like PostgreSQL are. "Overprovisioning" may be way less of an issue than you imply (especially if you seriously only need 8 machines), and replication for PostgreSQL is a long solved problem.

scarface74 · on April 5, 2019

You assume this is bad, but why? Your alternative seems to be locking yourself into expensive and highly proprietary vendor solutions that have their own learning curve and a wide variety of tradeoffs and complications.

So having 4-8 times as many servers that unlike AWS, we would also have five times (1 master and 8 slaves) as much storage is better than the theoretical “lock-in”? You realize with the read replicas, you’re only paying once for storage since they all use the same (redundant) storage?

Where is the “lock-in”? It is Mysql. You use the same tools to transfer data from Aurora/MySQL that you would use to transfer data from any other MySQL installation.

But we should host our entire enterprise on digital ocean or linode just in case one day we want to move our entire infrastructure to another provider?

Out of all of the business risks that most companies face, lock-in to AWS is the least of them.

The point being made is that you are still worrying about maintaining infrastructure and overprovisioning, because you're now spending time specializing your system and perhaps entire company to a specific vendor's serverless solutions.

How are we “specializing our system”? We have one connection string for read/writes and one for just reads. I’ve been doing the same thing since the mid 2000s with MySQL on prem. AWS simply load balances rate readers and adds more as needed.

And you don’t see any issue on spending 5 to 9 times as much on both storage and CPU? You do realize that while we are just talking about production databases, just like any other we company we have multiple environments some of which are only used sporadically. Those environments are mostly shut down - including the actually database server until we need them and scales up with reads and writes with Aurora Serverless when we do need it.

You have know idea how easy it is to set the autoscsling read replica up do you?

polote · on April 5, 2019

Usually you pay much more and have more servers when you use AWS than a basic server provider

scarface74 · on April 5, 2019

I’m making up numbers just to make the math easier.

If for production we need 5x of our baseline capacity to handle peak load are you saying that we could get our server from a basic server provider for 4 * 0.20 ( 1/5 of the time we need to scale our read replicas up) + 1?

Are you saying that we could get non production servers at 25% of the cost if they had to run all of the time compared to Aurora Serverless where we aren’t being charged at all for CPU/Memory until a request is made and the servers are brought up. Yes there is latency for the first request - but these are our non production/non staging environments.

Can we get point in time recovery?

And this is just databases.

We also have an autoscaling group of VMs based on messages in a queue. We have one relatively small instance that handles the trickle of messages that come during the day in production that can scale up to 10 at night when we do bulk processing. This just in production. We have no instances running when the queue is empty in non production environments. Should we also have enough servers to having 30-40 VMs running with only 20% utilization?

Should we also set up our own servers for object storage across multiple data centers?

What about our data center overseas close to our offshore developers?

If you have more servers on AWS you’re doing it wrong.

We don’t even manage build servers. When we push our code to git, CodeBuild spins up either prebuilt or custom Docker containers (on servers that we don’t manage) to build and run unit tests on our code based on a yaml file with a list of Shell commands.

It deploys code as lambda to servers we don’t manage. AWS gives you such a ridiculously high amount of lambda usage in the always free tier it’s ridiculous. No, our lambdas don’t “lock us in”. I deploy standard NodeJS/Express, C#/WebAPI, and Python/Django code that can be deployed to either lsmbda or a VM just by changing a single step in our deployment pipeline.

pcnix · on April 5, 2019

Basic replication is maybe close to what you could call solved, but I'd say that there's still complications like georeplicating multi master write machines are still quite complicated, and need a dedicated person to manage. Hiring being what it is, it might just be easier to let Amazon hire that person for you and pay Amazon directly.

I see cloud services as a proxy to hire talented devops/dba people, and efficiently multiplex their time across several companies, rather than each company hiring mediocre devops/dba engineers. That said, I agree that for quite a few smaller companies, in house infrastructure will do the job almost as well as managed services, at much cheaper numbers. Either way, this is not an engineering decision, it's a managerial one, and the tradeoffs are around developer time, hiring and cost.

stephenr · on April 5, 2019

> I see cloud services as a proxy to hire talented devops/dba people, and efficiently multiplex their time across several companies

It's only a proxy in the sense that it hides them (the ops/dbas) behind a wall, and you can't actually talk directly to them about what you want to do, or what's wrong.

If you don't want to hire staff directly, consulting companies like Percona will give you direct, specific advice and support.

scarface74 · on April 5, 2019

If something goes wrong, we can submit a ticket to support and chat/call a support person immediately at AWS. We have a real business that actually charges our (multi million dollar business) customers enough to pay for a business level support.

But in your experience, what has “gone wrong” with AWS that you could have fixed yourself if you were hosting on prem?

scarface74 · on April 5, 2019

Basic replication is maybe close to what you could call solved,

Locally hosted basic synchronous read replicas are a solved a problem?

mcguire · on April 5, 2019

"No one said the developers didn’t need to understand how to do it. I said we didn’t have to worry about maintaining infrastructure and overprovisioning."

If you do not do something regularly, you tend to lose the ability to do it at all. Personally, and especially organizationally.

scarface74 · on April 5, 2019

There is a difference between maintaining MySQL servers and the underlying operating system and writing efficient queries, optimizing indexes, knowing how to design a normalized table and knowing when to denormalize, looking at the logs to see which queries are performing slowly etc. using AWS doesn’t absolve you from knowing how to use AWS.

There is no value add in the “undifferentiated heavy lifting”. It is not a companies competitive advantage to know how to do the grunt work of server administration - unless it is. Of course Dropbox or Backblaze have to optimize their low profit margin storage business.

indigo945 · on April 5, 2019

Why not run eight servers all the time? If you are running at a scale where that is a cost you notice at all, you are not only in a very early stage, you're actually not even a company.

zbentley · on April 5, 2019

There are many, MANY software companies whose infrastructure needs are on the order of single digits of normal-strength servers and who are profitable to the tune of millions of dollars a year. These aren’t companies staffed with penny-pinching optimization savants; some software, even at scale, just doesn’t need that kind of gear.

user5994461 · on April 5, 2019

A multi million dollars tech company with tens of employees cannot run a sane infrastructure with a single digit of servers.

For a trivial website in production: 2 web servers + 1 database + 1 replica.

For internal tooling: 1 CI and build server + 2 development and testing servers + 1 storage, file share, ftp server + 1 backup server.

For desktop support: At least 1 server for DHCP, DNS, Active Directory + firewall + router.

That's already 10 servers and not counting networking equipment. Less than that and you're cutting corners.

scarface74 · on April 5, 2019

Web servers - lambda

Build server - CodeBuild you either run with prebuilt Docker containers or you use a custom built Docker container that automatically gets launched when you push your code to GitHub/CodeCommit. No server involved.

Fileshare - a lot of companies just use Dropbox or OneDrive. No server involved

FTP - managed AWS SFTP Service. No server involved.

DHCP - Managed VPN Service by AWS. No server involved.

DNS - Route 53 and with Amazon Certificate Manager it will manage SSL certificates attached to your load balancer and CDN and auto renew. No servers involved.

Active Directory - Managed by AWS no server involved.

Firewall and router - no server to Manage. You create security groups and attach them to your EC2 instances, databases, etc.

You set your routing table up and attach it to your VMs.

Networking equipment and routers - again that’s a CloudFormatiom template or go old school and just configuration on a website.

user5994461 · on April 5, 2019

[flagged]

scarface74 · on April 5, 2019

Yes I realize SFTP is not FTP. But I also realize that no one in their right mind is going to deliver data over something as insecure as FTP in 2019.

We weren’t allowed to use regular old FTP in the early 2000s when I was working for a bill processor. We definitely couldn’t use one now and be compliant with anything.

I was trying to give you the benefit of a doubt.

Amateur mistake that proves you have no experience running any this.

If it doesn’t give you a clue the 74 in my name is the year I was born. I’ve been around for awhile. My first internet enabled app was over the gopher protocol.

How else do you think I got shareware from the info-Mac archives over a 7 bit line using the Kermit protocol if not via ftp? Is that proof enough for you or do I need to start droning on about how to optimize 65C02 assembly language programs by trying to store as much data in the first page of memory because reading from the first page took two clock cycles on 8 bit Apple //e machines instead of 3?

We don’t “share” large files. We share a bunch of Office docs and PDF’s as do most companies.

Yes, you do have to run DNS, Active Directory + VPN. You said you couldn’t do it without running “servers”.

No we don’t have servers called

SFTP-01

ADFS-01

Etc.

either on prem or in the cloud.

Even most companies that I’ve worked for that don’t use a cloud provider have their servers at a colo.

We would still be using shares hosted somewhere not on prem. How is that different from using one of AWS storage gateway products.

scarface74 · on April 5, 2019

9 servers (8 reads and 1 writer) running all of the time with asynchronous replication (as opposed to synchronous replication) with duplicate data - yes the storage is shared between all of the replicas.

Not to mention the four lower environments some of which the databases are automatically spun up from 0 and scaled up as needed (Aurora Serverless)

Should we also maintain those same read replicas servers in our other environments when we want to do performance testing?

Should we maintain servers overseas for our outsourced workers?

Here we are just talking about Aurora/MySQL databases. I haven’t even gotten into our VMs, load balancer, object store (S3), queueing server (or lack there of since we use SQS/SNS), our OLAP database (Redshift - no we are not “locked in” it users standard Postgres drivers), etc.

AWS is not about saving money on like for like resources as you would on bare metal, but in the case of databases where your load is spiky you do. It’s about provisioning resources as needed when needed and not having to either pay as many infrastructure folks. Heck before my manager who hired me and one other person came in, the company had no one onsite that had any formal AWS expertise. They completely relied on a managed service provider - who they pay much less than they would pay for one dedicated infrastructure guy.

I’m first and foremost a developer/lead/software architect (depending on which way the wind is blowing at any given point in my career), but yes I have managed infrastructure on prem as part of my job years ago, including replicated MySQL servers. There is absolutely no way that I could spin up and manage all of the resources I need for a project and develop at the efficacy level at a colo as I can with just a CloudFormation template with AWS.

I’ve worked at a company that rented stacks of servers that sat idle most of the time but we used to simulate thousands of mobile connections to our backend servers - we did large B2B field services deployments. Today, it would be running a Pythom script that spun up an autoscaling group of VMs to whatever number we needed.