I have to be contrarian here. The students were right. You didn't need to learn to implement backprop in NumPy. Any leakiness in BackProp is addressed by researchers who introduce new optimizers. As a developer, you just pick the best one and find good hparams for it.
> Any leakiness in BackProp is addressed by researchers who introduce new optimizers
> As a developer, you just pick the best one and find good hparams for it
It would be more correct to say: "As a developer, (not researcher), whose main goal is to get a good model working — just pick a proven architecture, hyperparameters, and training loop for it."
Because just picking the best optimizer isn't enough. Some of the issues in the article come from the model design, e.g. sigmoids, ReLU, RNNs. And some of the issues need to be addressed in the training loop, e.g. gradient clipping isn't enabled by default in most DL frameworks.
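To make the training-loop point concrete, here is a minimal sketch of gradient clipping by global norm in plain Python. The function names (`clip_by_global_norm`, `sgd_step`) and the flat list-of-floats gradient are assumptions made for illustration, not any framework's API. In PyTorch the comparable utility is `torch.nn.utils.clip_grad_norm_`, which you have to call yourself between `backward()` and the optimizer step.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale grads so their combined L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads), total_norm      # small gradients pass through unchanged
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm

def sgd_step(params, grads, lr=0.1, max_norm=1.0):
    """One plain SGD update with clipping applied explicitly before the step."""
    clipped, _ = clip_by_global_norm(grads, max_norm)
    return [p - lr * g for p, g in zip(params, clipped)]

# An "exploding" gradient with L2 norm 50 is scaled back to roughly norm 1.
exploding = [30.0, -40.0]
clipped, norm_before = clip_by_global_norm(exploding, max_norm=1.0)
new_params = sgd_step([0.0, 0.0], exploding)
```

The direction of the update is preserved; only its length is capped, so a single bad batch can't throw the parameters arbitrarily far.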
And it should be noted that the article is addressing people on the academic / research side, who would benefit from a deeper understanding.
The problem isn't with backprop itself or the optimizer - it's potentially in (the derivatives of) the functions you are building the neural net out of, such as the Sigmoid and ReLU examples that Karpathy gave.
Just because the framework you are using provides things like ReLU doesn't mean you can assume someone else has done all the work and you can just use these and expect them to work all the time. When things go wrong training a neural net you need to know where to look, and what to look for - things like exploding and vanishing gradients.
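A rough sketch of what "where to look" means, using only the standard library. The helper names are made up for illustration; the math is just the textbook derivatives of sigmoid and ReLU, multiplied together the way the chain rule does during backprop.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of sigmoid: s * (1 - s), which peaks at 0.25 when x == 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def chained_grad(local_grads):
    """Backprop multiplies local derivatives together along the path."""
    prod = 1.0
    for g in local_grads:
        prod *= g
    return prod

# A saturated sigmoid passes back almost no gradient.
saturated = sigmoid_grad(10.0)

# Even in the best case (every unit at x == 0), ten stacked sigmoids
# shrink the gradient by 0.25 ** 10, i.e. roughly a millionfold.
best_case_depth10 = chained_grad([sigmoid_grad(0.0)] * 10)

# A ReLU whose input is negative passes back exactly zero ("dead" unit).
dead = relu_grad(-3.0)
```

None of this is a bug in backprop or the optimizer; it falls straight out of the derivatives of the building blocks.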
It's for a CS course at Stanford, not a PyTorch boot camp. It seems reasonable to expect some level of academic rigour and a need to learn and demonstrate understanding of the fundamentals. If researchers aren't learning the fundamentals in courses like these, where are they learning them?
You've also missed the point of the article: if you're building novel model architectures, you can't magic away the leakiness. You need to understand the backprop behaviours of the building blocks you use to achieve a good training run. Ignore these and what could be a good model architecture with some tweaks will either entirely fail to train or produce disappointing results.
Perhaps you're working at a level of bolting pre-built models together or training existing architectures on new datasets, but this course operates below that level to teach you how things actually work.
The leading European e-commerce company, Zalando, with 50m users, is now using the leading European AI platform, Hopsworks, to power their real-time AI. Zalando are Databricks' largest EU customer, but they are using Hopsworks instead for operational AI.
You would never hear it, though, as European IT press only promotes SV startups.
I don’t understand why both Azure and AWS have local SSDs that are an order of magnitude slower than what I can get in a laptop. If Hetzner can do it, surely so can they!
Not to mention that Azure now exposes local drives as raw NVMe devices mapped straight through to the guest with no virtualisation overheads.
Do you think they will let the Democrats take control given the risk to them if they take control?
I see gerrymandering after the Supreme Court annuls the Voting Rights Act. And then more shenanigans for a third term.
That's why I premised this with "If America survives at all". There is definitely a possibility that the whole country just falls apart. A constitutional convention is more of a best case scenario.
Gerrymandering is only relevant for congressional House elections; it can't protect the Senate and doesn't influence the presidency. Usually one party will take control of all three branches in a huge swing in power; the House is just the first to flip, usually because it is re-elected every 2 years.
> constitutional convention is more of a best case scenario
Constitutional Convention is the abort button. It means giving a group of people basically limitless power to amend our Constitution, which in practice, means to do anything to the law. If we called one today, with most states in Republican hands [1], we’d be essentially handing complete control of our government—over and above the Constitution—to the GOP.
> Constitutional Convention is the abort button. It means giving a group of people basically limitless power to amend our Constitution
No, it doesn’t.
It gives a group of people basically limitless power to propose Amendments to the Constitution.
Any Amendments so proposed still require 3/4 of states to ratify them, either by votes of their legislature or by ratification conventions called in the states (at the option of Congress when calling the Convention at the request of states.)
Unless by "group of people" you mean not just the people in the national convention, but the people in the state legislatures or conventions, as well. But, at that point, you might as well say that by including an amendment process, the Constitution itself “gives a group of people basically limitless power to amend our Constitution”.
> It gives a group of people basically limitless power to propose Amendments to the Constitution
Sorry, I actually missed this. Thank you for clarifying. (I mixed it up with the New York State process, where the Convention's proposals go straight to popular ratification.)
Perhaps folks think that is not a "valid point" here because it is off topic, seeking to distract from the topic of whether this particular guilty person should be punished.
Saying "so and so did it too and nothing happened" may be correct, but doesn't address the topic. If you're saying that, how does it apply to the topic (the Binance founder)?
Are you saying that you're ok with the other people getting away with it, and thus you're ok with this guy also getting away with it via this purchased pardon?
Or are you saying those other people should have been punished, and thus this pardon was wrong to sell?
It’s of course on topic to talk about the bigger picture of whether people in general are charged with these specific types of crimes or should be.
I hate the whole fallacy callout stuff in general. God didn’t create them, half barely work, none work in every situation, and they’re just abused to death by people to shut down conversation in a shallow way.
Saying "so and so did a thing too and nothing happened" may be correct, but doesn't address the topic. If you're saying that, how does it apply to the topic (the Binance founder)?
In that scenario, are you saying that you're ok with the other people getting away with it, and thus you're ok with this guy also getting away with it via this purchased pardon?
Or are you saying those other people should have been punished, and thus this pardon was wrong to sell?
Without tying it back to the topic like that, the reply is only tangentially related, like replying "I go to a bank" to any topic that mentions or involves banks. Like, ok, great, at least it's not insulting posters, but not super constructive in discussing the topic (the Binance founder's crimes and pardon).
Controlling the narrative is often the point of using rhetorical fallacies.
Like the intolerance of intolerance, there is discretion over what are acceptable intolerances. With whataboutism there is discretion over what are acceptable appeals to hypocrisy.
Whataboutism is merely saying your appeal to hypocrisy is invalid. Which would hold more weight with me if the side saying it never made appeals to hypocrisy of their own. Otherwise they’re being hypocritical about making appeals to hypocrisy.
On the substance, I hate what Trump has done. I would not take the position that what Trump did is ok because of what Biden did.
An "appeal to hypocrisy" is always invalid because that term is essentially synonymous with whataboutism and 'tu quoque' fallacies.
The underlying failure of all of them, and why they are fallacies, is that the guilt of one side of an issue has nothing to do with the hypocrisy of the other. Thus, trying to pivot from the former to the latter is a distraction, rather than a genuine attempt to discuss the topic (which is the former).
> Which would hold more weight with me if the side saying it never made appeals to hypocrisy of their own.
"The side"? Dude, I'm a person, not a side, and you barely know anything about me, much less what I've done and do. The accusations of whataboutism weren't made by nebulous, ethereal concepts, they were made by people.
You can't pick and choose different behaviors of different people and lump them together as if they are the same person, then claim that the differing behaviors indicate some sort of hypocrisy or other conflict. What you're describing is diversity of thought among different people.
Even if you could, it's not even a good discussion, because then others could respond that your meta-criticism is itself hypocritical in the same way that you're responding to criticism by claiming that it is hypocritical. So you keep adding layers until the actual topic (the original criticism) is long forgotten. In fact, that is why whataboutism is used: its users don't want to focus on the original criticism.
Sounds like what I said, a form of narrative control. A charge I made of both appeals to hypocrisy and appeals to whataboutism.
My personal preference would be logical and factual discussions only but I accept that’s not the world we live in.
The deleterious effect of the additional layers is ameliorated by the nesting of information on HN, you don’t have to keep digging if you don’t want to.
> Sounds like what I said, a form of narrative control.
Yes, whataboutism is both a rhetological fallacy and a form of "narrative control".
> A charge I made of both appeals to hypocrisy and appeals to whataboutism.
"Appeal to whataboutism" isn't a thing. It's just called "whataboutism", and since whataboutism and "appeal to hypocrisy" (seems synonymous with whataboutism) are both fallacies, pointing them out is just called "pointing out fallacies". Fallacies don't need any 'appeals' or arguments made against them, because they are already fallacious, that's why we call them fallacies.
And yes, pointing out fallacious arguments could be called "narrative control", too ;) So could be saying anything! After all, anyone saying anything is trying to "control the narrative" to include that thing. What a silly, needlessly conspiratorial neologism for a uselessly vague concept!
The discretion of the validity of pointing out hypocrisy is the core of the issue. I would 100% agree with you if all discussions are purely logical, but they’re not, so I don’t.
The introduction of whataboutism into the lexicon was to counter Russian appeals to hypocrisy. This was linked to Trump in an effort to discredit both. Those of us who have long memories do remember a time when pointing out the hypocrisy of the West was considered a valid thing to do. See the work of Noam Chomsky as an example.
A Tu Quoque fallacy, or as you put it, "pointing out hypocrisy", is a fallacy. Some people do not like logical discussions, so they use that fallacy anyways. That is totally ok, I'm not the boss of them.
But, it will likely be pointed out that their fallacious arguments are fallacious, and at that point, they can choose to make valid arguments, or to continue their string of failures by unconvincingly making more fallacious ones.
> The introduction of whataboutism into the lexicon was to counter Russian appeals to hypocrisy
Cool! "Appeal to hypocrisy", or Tu Quoque, is a fallacy and any arguments invoking it are accordingly fallacious. Coining another synonym ("whataboutism") doesn't change things. Those making fallacious arguments can try to make valid arguments (if there are any) for their case next time.
> Those of us who have long memories do remember a time when pointing out the hypocrisy of the West was considered a valid thing to do
I've got a long memory, too, and the Tu Quoque fallacy was never a valid defense, no matter what you called it. That's what makes it a fallacy.
It's a market regulation failure. Which results in a failed market, with the cloud infra provider also providing data services. 20 years ago, there were 20+ widely used operational databases. Now, it's like DynamoDB with like half the market.
How should this have played out in a regulated market? DynamoDB gets released, then what? Has limits on the market share it's allowed to steal?
Should we similarly cap, say, front-end frameworks on market penetration / growth? Is React too big to fail? Do we need to force some of its users to use something else?
Is it throughput and latency that are the etcd bottlenecks?
Our database, RonDB, is an in-memory open-source database (a fork of MySQL Cluster). We have scaled it to 100m reads/sec on AWS hardware (not even top of the line). Might be an interesting project to implement an open-source etcd shim on top of it?
The setting is configurable, but by default, etcd's Raft implementation requires a voting node to write to disk before it makes a vote, as in actually flushing to disk, not just writing to the file cache. Since you need a majority vote before a client can get a response, this is why it's strongly recommended that you use the fastest possible disks and keep the nodes geographically close to each other, and why etcd's default storage is only 2GB per node.
All in all, it was a poor choice for Kubernetes to use this as its backend in the first place. Apparently, Google uses its own shim, but there is also kine, which was created a long time ago for k3s and allows you to use an RDBMS. k3s used SQLite as its default originally, but any API-equivalent database would work.
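For anyone who hasn't seen the difference between "written" and "flushed", here is a toy sketch. It is not etcd's code: each "node" is just a local file, and `durable_append` and `replicate` are hypothetical names. The point is only that every acknowledgement waits on `os.fsync`, and the client waits on a majority of acknowledgements.

```python
import os
import tempfile

def durable_append(path, record, sync=True):
    """Append a record; with sync=True, return only after fsync has completed."""
    with open(path, "ab") as f:
        f.write(record)
        f.flush()                 # Python buffer -> OS file cache
        if sync:
            os.fsync(f.fileno())  # OS file cache -> storage device
    return True                   # the node's acknowledgement

def replicate(paths, record, sync=True):
    """Report success once a majority of node logs have acknowledged."""
    acks = sum(1 for p in paths if durable_append(p, record, sync))
    quorum = len(paths) // 2 + 1
    return acks >= quorum

record = b"put /registry/pods/a\n"
with tempfile.TemporaryDirectory() as tmp:
    node_logs = [os.path.join(tmp, f"node{i}.wal") for i in range(3)]
    committed = replicate(node_logs, record)
    sizes = [os.path.getsize(p) for p in node_logs]
```

In a real cluster the nodes are contacted in parallel over the network, so each write costs roughly one network round trip plus one fsync on the slower node of the majority, which is why disk latency and node distance dominate.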
We should keep in mind etcd was meant to literally be the distributed /etc directory for CoreOS, something you would read from often but perform very few writes to. It's a configuration store. Kubernetes deciding to also use it for /var was never a great idea.
RonDB uses a non-blocking 2PC algorithm - commits in memory, and then does a group commit of transactions to disk every 500ms. This means it can handle insane write throughput, as well as read throughput. However, if both your DB nodes fail, you could lose 500ms of data - which is not the end of the world for k8s. Normally, you would locate DB nodes in different AZs, reducing the probability of correlated failures.
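A single-node sketch of the group-commit idea, to show where the durability window comes from. This is not RonDB's implementation (there is no replication or 2PC here); `GroupCommitLog` and its fields are hypothetical, and time is passed in as simulated milliseconds so the example is deterministic.

```python
import os
import tempfile

class GroupCommitLog:
    """Acknowledge writes from memory; persist them to disk in batches."""

    def __init__(self, path, interval_ms=500):
        self.path = path
        self.interval_ms = interval_ms
        self.pending = []        # acknowledged but not yet on disk
        self.last_flush_ms = 0
        self.fsync_calls = 0

    def write(self, record, now_ms):
        self.pending.append(record)
        if now_ms - self.last_flush_ms >= self.interval_ms:
            self.flush(now_ms)
        return True              # ack does not wait for the disk

    def flush(self, now_ms):
        if self.pending:
            with open(self.path, "ab") as f:
                f.write(b"".join(self.pending))
                f.flush()
                os.fsync(f.fileno())   # one fsync covers the whole batch
            self.fsync_calls += 1
            self.pending.clear()
        self.last_flush_ms = now_ms

with tempfile.TemporaryDirectory() as tmp:
    log = GroupCommitLog(os.path.join(tmp, "redo.log"), interval_ms=500)
    for now_ms in range(1000):   # one write per simulated millisecond
        log.write(b"x", now_ms)
    fsyncs = log.fsync_calls     # batches flushed so far
    at_risk = len(log.pending)   # acknowledged writes a crash would lose
    on_disk = os.path.getsize(log.path)
```

A thousand acknowledged writes cost a single fsync here, and whatever is still in `pending` is the window you lose if every copy of the in-memory state disappears at once.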
At that point it is apples to oranges. One of the main reasons why etcd writes are slow is because they are guaranteed to be durably persisted across the quorum.
If you just turned off file system syncs in etcd you could probably get an order of magnitude better performance as well.