
Or you could, you know, host a Docker registry and reupload those images to something you control. Worst case scenario, in 30 days, nothing is gone from Docker and you can just spin it down.
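A minimal sketch of that re-hosting, assuming the stock `registry:2` image and an illustrative registry address (`localhost:5000` here so the example is self-contained; in practice you'd use an internal hostname):

```shell
# Sketch: re-home a Docker Hub image into a registry you control.
# The registry host ("localhost:5000") and the image are illustrative.

# One-time: run a private registry using the stock registry:2 image.
docker run -d --name mirror -p 5000:5000 registry:2

SRC="docker.io/library/nginx:1.25"
# Re-tag under your own registry, dropping the docker.io/ prefix.
DST="localhost:5000/${SRC#docker.io/}"

docker pull "$SRC"
docker tag "$SRC" "$DST"
docker push "$DST"
echo "mirrored as: $DST"
```

Deployments then reference the `localhost:5000/...` name instead of Docker Hub, and if the 30 days pass and nothing on Hub actually disappears, the registry can simply be spun down.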

Your job as an SRE is not to look at things and go "oh well, nothing we can do lol".



I didn't read it that way. They were stating the realities: assumptions were made, those assumptions are now invalid, and they are working on alternatives. 30 days is a short deadline for something like this, and Docker as an organization is behaving poorly.

All of that seems pretty true, and frankly no one should support a company that does something like this. I get that they need to figure out how to make money, but time has shown the worst way to do that is to screw over customers or potential customers.

I, like the poster, will never trust Docker, and will never use their tooling or formats. Podman all the way.


Earlier events already had us slowly switching out Docker for Podman, and the tooling is more similar than I had expected. Half of the work is ensuring the images are explicitly prefixed with docker.io/

And this week, it turns out, that makes the now-problematic spots a lot more greppable.
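That greppability is easy to see with a toy manifest (the file path and contents here are purely illustrative):

```shell
# Toy manifest, just for illustration:
cat > /tmp/deploy.yaml <<'EOF'
image: docker.io/library/nginx:1.25
image: registry.internal:5000/library/redis:7
EOF

# With explicit prefixes, the remaining Docker Hub references
# are one grep away:
grep -n 'docker\.io/' /tmp/deploy.yaml
# -> 1:image: docker.io/library/nginx:1.25
```

Unqualified names (`nginx:1.25` with no registry prefix) are exactly what makes this kind of audit hard, which is half the argument for fully-qualified references in the first place.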


Yes, that involves ripping out Docker Hub everywhere. It's a significant chunk of work, not something easily fit into 30 days on a team that is already strapped for resources with more work than we can do.


Setting up Harbor as a Docker proxy-cache is actually quite simple.
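A sketch of what using it looks like, assuming a Harbor instance at `harbor.example.com` with a proxy-cache project named `dockerhub` configured (in the Harbor UI) to front Docker Hub; all of these names are hypothetical:

```shell
# Instead of pulling straight from Docker Hub...
ORIG="docker.io/library/nginx:1.25"
# ...rewrite the reference to go through the proxy-cache project:
CACHED="harbor.example.com/dockerhub/${ORIG#docker.io/}"
echo "$CACHED"   # harbor.example.com/dockerhub/library/nginx:1.25

# Harbor fetches from Docker Hub on the first pull and serves
# subsequent pulls from its own cache:
docker pull "$CACHED"
```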



I'm not familiar with how Docker works, so forgive the ignorance. I thought the point of docker images was portability? Is it not just taking the references and pointing to a new instance under your control?


Most production workloads do not use Docker directly, but rather use it as a sort of "installation format" that other services schedule (spin up, connect, spin down, upgrade). One typical default is to always try to pull a fresh image, even if the requested version is available in the node-local cache. On one hand, this prevents inconsistencies where, in the event of repository downtime, services would fail to start only on certain nodes (the ones without a cached copy). On the other hand, it blocks service startup altogether. With such a setup, the availability of the registry is mission-critical for continuous operation.

Some people think it is a perfectly reasonable idea to set the defaults to always pull, point at the latest version, and not have a local cache/mirror. Judging from the number of upvotes on the OP, depending on a third-party remote with no SLA to be always available for production workloads seems to be the default.
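In Kubernetes terms, the default being described is the `imagePullPolicy` knob; a sketch (the pod spec is illustrative):

```shell
# Kubernetes expresses this trade-off as imagePullPolicy:
#   "Always"       -> contact the registry on every container start
#                     (the default for :latest / untagged images)
#   "IfNotPresent" -> use the node-local cache when the image is there
cat > /tmp/pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
  - name: web
    image: docker.io/library/nginx:1.25
    imagePullPolicy: IfNotPresent
EOF
grep -n 'imagePullPolicy' /tmp/pod.yaml
```

With "Always" and no mirror, the registry becomes a hard runtime dependency; with "IfNotPresent" and pinned tags, nodes that already hold the image keep starting workloads through an outage.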


I'm not too familiar with docker myself, but gitlab's selfhosted omnibus includes a container registry that Just Works™ for our small team.


That’s unplanned work. There’s other work needing to be done as well.


And a sudden fire is also unplanned work, but that's still your work. If this is such a threat, then maybe shift priorities around.


Are we not allowed to complain about unnecessary unplanned work being foisted on us with 30 days notice?

That seems like an entirely relevant complaint for this forum but from your first reply, you’re acting like somehow it’s the greatest offense in the world that someone pointed this out.


It’s not like using cloud services without suitable contractual agreements isn’t a known risk?


Sure. It’s a risk. But that doesn’t somehow make this work expected and planned, nor invalidate the original comment.

It could have happened at any time. But it’s also been running for a decade now so there’s an expectation that things will continue rather than have the rug pulled with 30 days notice.


Come on, 30 days' notice is a walk in the park. Besides, OP themselves described the work as changing a few URLs and maybe spinning up a new server. It's quite literally a one- or two-day job, unless you're at a company the size of Amazon (in which case, luckily for you, you're not the only SRE, so it's still just a few days).

> The best I can come up with, at the moment, is waiting for each organization to make some sort of announcement with one of "We've paid, don't worry", "We're migrating, here's where", or "We've applied to the open source program". And if organizations don't do that... I mean, 30 days isn't enough time to find alternatives and migrate.

This is the original comment. The best they can come up with is... do nothing and wait to see if the smoke turns into a fire? I've seen better uses of time. 30 days is enough time to find an alternative, migrate, _and_ get regular coffee breaks too.


> Come on, 30 days notice is a walk in the park

Sure, maybe in a small business or startup, and even then I'd contend it's not quite as easy as all that.

When you're dealing with anything larger, say involving multiple teams, organisations, and priorities, 30 days is an insanely short window for figuring out what your actual route forward is (and, if you're provisioning something new, making sure you're allowed to and have any relevant sign-offs, etc.)

This particular situation with Docker doesn't affect us, but if it did, this would have some serious knock-on implications. The teams in my org are already busy with things that need to GA by certain dates or there will be financial implications. It's not "tire fire" territory, but in most cases it's solid "don't waste time" territory. There's always flex in the schedule, but the closer to a GA date you get, the more rigid the schedule has to be.


If you're unable to take on a task that has a 30 day deadline in your org, regardless of size, you're experiencing a good amount of bloat.


You're absolutely right! But bloat is also incredibly common, especially when "task" in this case might describe "multi-team project." Can it get done in 30 days? Of course! Might you already have dozens of high-priority projects to deliver in the next 30 days with stakeholders screaming at you every day for updates? Absolutely!


You should read The Phoenix Project to understand that "have dozens of high-priority projects to deliver in the next 30 days" is a consequence of poor management, not a given even for large organizations.


I've read TPP. What I'm saying is, most companies are Parts Unlimited before their awakening, and a lot of us are Brent. I think the reason that book resonated so well is that the dysfunction described there is a lived reality for most IT workers.


Fair, but it’s also fair to call that out, I think!

And I do try to do less Brent things, I think there was a degree of amnesty given to Brent that we can improve on. We can all be more Bill, even if we’re still SMEs.

Something something managing up, or getting the kinds of Bill jobs to make the changes.


> Sure, maybe in a small business or startup, and even then I'd content not quite as easy as all that.

Not enough time even in this case; for some (apathetic) people, 30 days isn't actually 30 days.

Every now and then I'll come across the kind of small org where the one person (euphemistically and sarcastically) referred to as SRE only checks email once a month because they've "evolved beyond primitive tech" or wtf ever, then gets mad about it like it was a conspiracy rather than self-sabotage.

90 days minimum for overt assaults on stuff that some people may require to keep their doors open. This kind of shit is enough to possibly mess with people's livelihoods in edge cases. Personally I never used Docker. I'm kinda paranoid, so "free" stuff not backed by some kind of legal guarantee like open source licensing always seems sorta shady.


At this size you should have a local registry that acts as a transparent cache. If you don't, then get one right now. What happens if Docker's servers are down for whatever reason? Does your whole process break?
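For reference, the stock registry image can run as exactly such a transparent cache (a documented `registry:2` feature); the host and port here are assumptions:

```shell
# Run a pull-through cache in front of Docker Hub:
docker run -d --name hub-cache -p 5000:5000 \
  -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
  registry:2

# Then point the Docker daemon at it, e.g. via /etc/docker/daemon.json:
DAEMON_JSON='{ "registry-mirrors": ["http://localhost:5000"] }'
echo "$DAEMON_JSON"

# Pulls now flow through the cache, and already-cached images keep
# working even while Docker Hub is unreachable.
```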


Sorry, didn't mean to imply that it is actually affecting us or even a concern. It isn't. I was just calling out that 30 days isn't just "simple" as the parent poster was asserting.


30 days is nowhere near enough time for people with real jobs that have other things to do rather than drop everything to do this. Once again, completely needlessly.

You’re making a mountain out of an entirely valid complaint.

Quoting your own profile, stay mad.


“Tell me you’ve never worked an enterprise tech job without telling me you’ve never worked an enterprise tech job.”

My next 30 days are already accounted for, and will already include disruptions that actually come from the area of work.


Heh. "But shouldn't an enterprise have all of these things figured out and mirrored and also pay money to Docker Inc"?

Should? Certainly. But guess what kind of emergencies it takes to get these things finally prioritized and what kinda mad scramble ensues from there to kinda hold it together.


Looks like some of the planned work's getting pushed back a little bit. That's the company paying the price for saving money earlier in exchange for taking on more risk. Decent chance the move was still net-beneficial for them.

Either way, it's the company reaping the benefits and paying the costs, not me.

It would have been nice and more confidence-generating for Docker to make this a 90-day notice rather than 30, but I'm not going to get upset that some of the work my company wants to do will get done slightly later for reasons having to do with their own penny-pinching and some 3rd party's somewhat-rude notice period for termination of service to entities that aren't even us and who weren't paying them. Job's a job. You want me to fix this and delay some other thing, or let it break and do some other thing instead? Fix it? Cool, no problem.


So, you're 100% booked with no room for anything? Congratulations on your management for understaffing your team and expecting you to do 120% if something happens, they're saving up quite a bit of money.

You might want to start respecting your free time though, because they clearly don't give a shit about you.


Did you stop reading that sentence halfway through? There's room for disruptions in that schedule. But that room isn't infinite and this is going to add a lot more disruption on top of the existing expected amount.


They didn’t ask for a perfectly reasonable explanation.


What if you already have important planned and unplanned urgent work occupying all your SRE'S for the month? On a team or org that's already running thin? Surely you've been there.


I have. And it was also my job to say to management "hey, there's a very preoccupying fire right there, and it will delay this less important thing. If you're unhappy about it, send me an email explicitly telling me to drop the very preoccupying fire."

"Everybody has a plan until they get punched in the mouth" also applies to tech.


I think this thread was started by the manager that has to hear that push back, hence the headaches.


Getting punched in the mouth is pretty far from a walk in the park.


In this case it is entirely planned work: Anyone depending on docker.io chose to make their processes dependent on online endpoints with whose operators you have no business relationship. An unpaid third-party service going offline should be far from unexpected and if you rely on it you better be ready to cope without notice.

This is like complaining that you have to put out a fire because, rather than fixing the sparking cables, you have been relying on your neighbor to put them out before they become noticeable, and he only gave you short notice that he'd be going on vacation.


> Anyone depending on docker.io chose to make their processes dependent on online endpoints with whose operators you have no business relationship.

This does not somehow make the work “planned”. That has a specific definition and this ain’t it.

Some people may have called it out as a risk when it was implemented. But that still doesn’t mean it’s planned.

Someone may have included an explicit report on how to deal with it at that time. That still doesn’t make it planned.

Also, just because it’s known to be a risk and may have a chance to happen in the future does not make it expected either. Nor planned.


If someone were to set my server room on fire, I'd be equally annoyed about them.


Yeah, but with a sudden fire there's a good chance there's at least a rudimentary disaster recovery plan lying around to kick things off. How many ppl have a DRP for "docker bends open source over and goes smash?"


You’re missing the mark. It’s about risk and expectation management, and one of the risks just blew up in an unexpected way.


Yes, that’s what unplanned work means.


I'm sure docker will happily hold off on this work until you can fit it into your OKR planning next quarter. /s


This exactly. We have pipelines designed specifically for this reason. We pull, patch, and perform minor edits to images we use. We then version lock all-the-things for consistency.

Not saying this is good news, but in the Enterprise, you have to plan for shit like this.


Imagine you shipped software that included references to docker hub images. That software will no longer work if any of the referenced images are deleted from docker hub. This will be the case with any helm charts that reference images that are deleted from docker hub.

Some of those charts will not have variables that let you override the docker images and tags, so some of those will not be usable without creating a new release.

This is one of the primary reasons to vendor your third party docker images into a docker registry that you control.
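A sketch of that override, assuming the chart follows the common `image.repository` / `image.tag` value convention (the chart path, value names, and registry host are all hypothetical; as noted above, charts without such values need a new release):

```shell
UPSTREAM="docker.io/library/redis:7"
TAG="${UPSTREAM##*:}"                                   # -> 7
REPO_PATH="${UPSTREAM%:*}"                              # -> docker.io/library/redis
REPO="registry.internal:5000/${REPO_PATH#docker.io/}"   # -> registry.internal:5000/library/redis

# Point the release at the vendored copy instead of Docker Hub:
helm install my-redis ./charts/redis \
  --set image.repository="$REPO" \
  --set image.tag="$TAG"
```

The `docker pull` / `docker tag` / `docker push` of the upstream image into `registry.internal:5000` has to happen first, of course; the override only changes where the chart looks.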


"Don't release software that can pull code from random services on the internet then execute it without making that configurable" has been standard since the internet was available, just about.

Vendor your helm charts if they are production critical. Vendor the docker images if they are production critical. Vendor the libraries if they are critical.

As an added bonus, you even help making a saner internet where you don't pull left-pad three billion times a month.


Yes vendor them all too.



