Their words, not mine; the first header: "It's already on your machine". We can belabor it, but the domain is 'justuse'. No room for 'except' [unless you're reasonable, of course].
The 'egregious' things are charging to share what fits very well in SCM (preventing real automation)... and breaking because of an online-first/only design. It makes sense to require that the endpoint I'm actually talking to is up. Why would Postman need AWS/us-east-1 [0] for a completely unrelated API? Joyful rent-seeking.
cURL, your suggestion (hurl), or HTTPie all make far more sense. Store what they need in the place where you already store stuff. Profit, for free: never go down. Automate/take people out of the button-pushing pattern for a gold star.
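To make that concrete, here's a minimal sketch; the endpoint, path, and API_TOKEN variable are all made up for illustration, and the whole 'collection' is just a script that lives in the repo:

    #!/usr/bin/env bash
    # get-order.sh -- a hypothetical request, kept in git next to everything else
    # Usage: API_TOKEN=... ./get-order.sh 42
    set -euo pipefail

    BASE_URL="${BASE_URL:-https://api.example.com}"   # assumption: your API's base URL
    ORDER_ID="${1:?usage: get-order.sh <order-id>}"

    curl --fail --silent --show-error \
      --header "Authorization: Bearer ${API_TOKEN:?set API_TOKEN}" \
      --header "Accept: application/json" \
      "${BASE_URL}/v1/orders/${ORDER_ID}"

Nothing to install beyond what's already on the machine, and cron/CI can run it without a human pushing the button.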
Ditto, enjoy the catharsis. 'Don't take it personally' is good advice; I'll try to give a less aggressive point of view. All of this has come to mind [but not repeated, out of kindness or laziness, whichever].
So, to start: someone wants me to install Postman/similar and pay real money to share and make a request? Absolutely not. I can read the spec from Swagger, or whatever, too... and write down what was useful [for others]. We all have cURL or some version of Python.
Surely a few phrases of text that are worth planning to save, and paying for [at least twice: you to research them, them to store them], are worth putting into source control. It's free, and even pays dividends. How? Automation that works faster than a human pushing a button. Or creates more buttons!
The wisdom of pipes! I'd share these workflows the exact same way we share the others [i.e.: BASH, Ansible]: Git. It needs nothing more than a directory, though an SSH daemon is quite nice.
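A sketch of the 'nothing more than a directory' bit; the paths, hostname, and repo name are hypothetical:

    # On any box you can SSH into:
    git init --bare /srv/git/api-requests.git

    # On each workstation:
    git clone user@server:/srv/git/api-requests.git
    cd api-requests
    git add get-order.sh
    git commit -m "Add order lookup request"
    git push origin main   # or master, whatever your default branch is

No registry, no SaaS, no account tier. Anyone with SSH access has the 'collection'.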
Those of us who can survive without desperate monetization plays are worth quite a lot, actually. They say 'jury rig', we say 'engineer'.
Call me crazy, because this is, but perhaps it's their "Room 641A". The purpose of a system is what it does; no point arguing 'should' against reality, etc.
They've been charging a premium for, and marketing, "Availability" for decades at this point. I worked for a competitor and made a better product: it could endure any of the zones failing.
It's possible that you really could endure any zone failure. But I take these claims, which people make all the time, with a grain of salt: unless you're working at AWS scale (basically just 3 companies) and have actually run for years and seen every kind of failure mode, a claim of higher availability isn't something that can be accurately evaluated.
(I'm assuming by zone you mean the equivalent of an AWS region, with multiple connected datacenters)
Yes, equivalent. Did endure, repeatedly. Demonstrated to auditors to maintain compliance. They would pick the zone to cut off. We couldn't bias the test. Literal clockwork.
I'll let people guess for the sport of it; here's the hint: there were at least 30 of them, composed of Real Datacenters. Thanks for the doubt, though. Implied or otherwise.
Just letting you know how this response looks to other people: Anon1096 raises legitimate objections, and their post seems very measured in its concerns, not even directly criticizing you. But your response here is very defensive, and a bit snarky. Really, I don't think you even respond directly to their concerns: they say they'd want to see scale equivalent to AWS because that's the best way to see the wide variety of failure modes, but you mostly emphasize the auditors, which is good but not a replacement for the massive real load and the issues that come along with it. It feels miscalibrated to Anon's comment. As a result, I actually trust you less. If you can respond to Anon's comment without being quite as sassy, I think you'd convince more people.
I appreciate the feedback, truly. Defensive and snarky are both fair, though I'm not trying to convince. The business and practices exist, today.
At risk of more snark [well-intentioned]: Clouds aren't the Death Star, they don't have to have an exhaust port. It's fair the first one does... for a while.
Ya, I totally believe that cloud platforms don't need a single point of failure. In fact, seeing the vulnerability makes me excited, because I realize there is _still_ potential for innovation in this area! To be fair it's not my area of expertise, so I'm very unlikely to be involved, but it's still exciting to see more change on the horizon :)
What company did you do it with, can you say? Definitely, they may have been an early mover, but they can (and I'll say will!) still be displaced eventually; that's how business goes.
It's fine if someone guesses the well-known company, but I can't confirm/deny; like privacy a bit too much/post a bit too spicy. This wasn't a darling VC thing, to be fair. Overstated my involvement with 'made' for effect. A lot of us did the building and testing.
Definitely, that makes sense. Ya no worries at all, I think we all know these kinds of things involve 100+ human work-years, so at best we all just have some contribution to them.
> think we all know these kinds of things involve 100+ human work-years
No kidding! The customers differ, business/finance/governments, but the volume [systems/time/effort] was comparable to Amazon's. The people involved in the audits were practically consumed for a whole quarter, if memory serves. Not necessarily by the testing itself: first planning, then sharing the plan, then dreading the plan.
Anyway, I don't miss doing this at all. I didn't mean to imply mitigation is trivial, just feasible :) 'AWS scale' is all the more reason to do business continuity/disaster recovery testing! I guess I find the fact that it's surprising, surprising.
Competitors have an easier time avoiding the creation of a Gordian Knot with their services... when they aren't making a new one every week. There are significant degrees to PaaS, a little focus [not bound to a promotion packet] goes a long way.
Yes, it was something we would do to maintain certain contracts. Sounds crazy, isn't: they used a significant portion of the capacity, anyway. They brought the auditors.
Real People would notice/care, but financially, it didn't matter. Contract said the edge had to be lost for a moment/restored. I've played both Incident Manager and SRE in this routine.
edit: Less often we'd do a more thorough test: power loss/full recovery. We'd disconnect more regularly given the simplicity.
If you go far up enough the pyramid, there is always a single point of failure. Also, it's unlikely that 1) all regions have the same power company, 2) all of them are on the same payment schedule, 3) all of them would actually shut off a major customer at the same time without warning, so, in your specific example, things are probably fine.
No. It’s just that in my entire career when anyone claims that they have the perfect solution to a tough problem, it means either that they are selling something, or that they haven’t done their homework. Sometimes it’s both.
For what's left of your career: sometimes it's neither. You're confused: perfection? Where? A past employer, which I've deliberately not named, is selling something; I've moved on. Their cloud was designed with multiple-zone regions and, importantly, realizes the benefit: it respects the boundaries. Amazon, and you, apparently have not.
Yes, everything has a weakness. Not every weakness is comparable to 'us-east-1'. Ours was billing/IAM. Guess what? They lived in several places with effective and routinely exercised redundancy. No single zone held this much influence. Service? Yes, that's why they span zones.
Said in the absolute kindest way: please fuck off. I have nothing to prove or, worse, sell. The businesses have done enough.
Yea, let's play along. Our CEO is personally choosing not to pay an entire class of partners across the planet. Are we even still in business? I'm much more worried about being paid than about this line of questioning.
A Cloud with multiple regions, or zones for that matter, that depend on one is a poorly designed Cloud; mine didn't, AWS does. So, let's revisit what brought 'whatever1', here:
> Your experiment proves nothing. Anyone can pull it off.
Fine, our overseas offices are different companies and bills are paid for by different people.
Not that "forgot to pay" is going to result in a cut off - that doesn't happen with the multi-megawatt supplies from multiple suppliers that go into a dedicated data centre. It's far more likely that the receivers will have taken over and will pay the bill by that point.
Was that competitor priced competitively with AWS? I think of the project management triangle here - good, fast, or cheap - pick two. AWS would be fast and cheap.
Yes, good point. Pricing is a bit higher. As another reply pointed out, there are ~three that work at the same scale. This was one of them; another hint, I guess: it's mostly B2B. Normal people don't typically go there.
Azure, from my experience with it, has stuff go down a lot and degrade even more often, and seems to either not admit the degradation happened or rely on 1000 pages of fine-print SLA docs to prove you don't get any credits for it. I suppose that isn't the same as "lose a region" resiliency, so it could still be them, given the poster said it is B2B-focused and Azure is subject to a lot of exercises like this from its huge enterprise customers. FWIW I worked as an IaC/devops engineer with the largest tenant in one of the non-public Azure clouds.
My $3/mo AWS instance is far cheaper than any DIY solution I could come up with, especially when I have to buy the hardware and supply the power/network/storage/physical space. Not to mention it's not worth my time to DIY something like that in the first place.
False equivalence/moving goalposts IMO... I was only refuting your claim of "AWS is not cheap", as if it's somehow impossible for it to be cheap... which I'm saying isn't the case.
Sorry to jump in y'alls convo :) AWS is cheaper than the Cloud we built... I just don't think it's significant. Ours cost more because businesses/governments would pay it, not because it was optimal.
Price is beside my original point: Amazon has enjoyed decades for arbitrage. This sounds more accusatory than intended: the 'us-east-1' problem exists because it's allowed/chosen. Created in 2006!
Now, to retract that a bit: I could see technical debt/culture making this state of affairs practical, if not inevitable. Correct? No; if I were Papa Bezos I'd be incredibly upset that my Supercomputer is so hamstrung. I think even the warehouses were impacted!
The real differentiator was policy/procedure. Nobody was allowed to create a service or integration with this kind of blast radius. Design principles, to say the least. Fault zones and availability zones exist for a reason beyond capacity, after all.
Right, like I said: crazy. Anything production with certain other clouds must be multi-AZ. Both reinforced by culture and technical constraints. Sometimes BCDR/contract audits [zones chosen by a third party at random].
The disconnect case was simple: breakage was as expected. The island was lost until we drew it on the map again. Things got really interesting when it was a full power-down and back on.
Were the docs/tooling up to date? Tough bet. Much easier to fix BGP or whatever.
Their auction systems are interesting to dig through, but to your point, everything fails. Especially these older auction machines. Great price/service, though: less than an hour for more than one ad-hoc RAID card replacement.
Yeah, I really want one of their dedicated servers, but it's a bit too expensive for what I use it for. Plus, my server is too much of a pet, so I'm spoiled on the automatic full-machine backups.
Pulp is a popular project for a 'one stop shop', I believe. Personally, I've always used project-specific solutions, like the CNCF's 'distribution/distribution' for containers. That allows for pull-through caching with relatively little setup work.
Pull-through caches are still useful even when the upstream is down... assuming the image(s) were pulled recently. The HEAD to upstream will obviously fail [when checking currency], but the software is happy to serve what it has already pulled.
Depends on the implementation, of course: I'm speaking to 'distribution/distribution', the reference. Harbor or whatever else may behave differently, I have no idea.
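For what it's worth, a rough sketch of that pull-through setup with the reference registry, assuming Docker Hub as the upstream; the port, paths, and container name are illustrative:

    # Minimal config for the reference registry acting as a pull-through cache
    cat > config.yml <<'EOF'
    version: 0.1
    storage:
      filesystem:
        rootdirectory: /var/lib/registry
    http:
      addr: :5000
    proxy:
      remoteurl: https://registry-1.docker.io
    EOF

    # registry:2 is the image built from distribution/distribution
    docker run -d --name hub-mirror -p 5000:5000 \
      -v "$PWD/config.yml:/etc/docker/registry/config.yml" \
      -v registry-cache:/var/lib/registry \
      registry:2

    # Then point clients at it, e.g. in /etc/docker/daemon.json:
    #   { "registry-mirrors": ["http://localhost:5000"] }

The proxy.remoteurl bit is what makes it a pull-through cache: anything pulled once is served from the local store afterwards, which is exactly the 'upstream is down but I already have it' case above.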
0: https://news.ycombinator.com/item?id=45645172