jamesblonde · 2025-11-02T08:20:19 1762071619

I have to be contrarian here. The students were right. You didn't need to learn to implement backprop in NumPy. Any leakiness in BackProp is addressed by researchers who introduce new optimizers. As a developer, you just pick the best one and find good hparams for it.

_diyar · 2025-11-02T08:36:22 1762072582

From the perspective of the university, the students are being trained to become researchers, not engineers.

froobius · 2025-11-02T09:01:54 1762074114

> Any leakiness in BackProp is addressed by researchers who introduce new optimizers

> As a developer, you just pick the best one and find good hparams for it

It would be more correct to say: "As a developer, (not researcher), whose main goal is to get a good model working — just pick a proven architecture, hyperparameters, and training loop for it."

Because just picking the best optimizer isn't enough. Some of the issues in the article come from the model design, e.g. sigmoids, relu, RNNs. And some of the issues need to be addressed in the training loop, e.g. gradient clipping isn't enabled by default in most DL frameworks.

And it should be noted that the article is addressing people on the academic / research side, who would benefit from a deeper understanding.
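To make the gradient clipping point concrete, here is a minimal NumPy sketch of clipping by global norm, the kind of step you have to add to the training loop yourself. The helper name `clip_by_global_norm` and the example gradients are made up for illustration; in PyTorch the equivalent call is `torch.nn.utils.clip_grad_norm_`.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient arrays so their combined L2 norm is at most max_norm.

    Hypothetical helper for illustration; returns the (possibly rescaled)
    gradients and the norm measured before clipping.
    """
    total_norm = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads, total_norm

# Toy "exploding" gradients for two parameter tensors (made-up values).
grads = [np.array([3.0, 4.0]), np.array([12.0])]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = np.sqrt(sum(float(np.sum(g ** 2)) for g in clipped))

print(norm_before, norm_after)
```

The direction of the update is preserved; only its magnitude is capped, which is why it has to sit between the backward pass and the optimizer step.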

HarHarVeryFunny · 2025-11-02T15:18:10 1762096690

The problem isn't with backprop itself or the optimizer - it's potentially in (the derivatives of) the functions you are building the neural net out of, such as the Sigmoid and ReLU examples that Karpathy gave.

Just because the framework you are using provides things like ReLU doesn't mean you can assume someone else has done all the work and you can just use these and expect them to work all the time. When things go wrong training a neural net you need to know where to look, and what to look for - things like exploding and vanishing gradients.
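A small NumPy sketch of what "the derivatives of the functions" means here. The inputs are arbitrary illustrative values, not taken from the article: a saturated sigmoid passes back almost no gradient, and a ReLU passes back exactly zero for non-positive inputs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # d/dx sigmoid(x) = s * (1 - s); its maximum is 0.25 at x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    # Gradient is 1 for positive inputs and 0 otherwise.
    return (x > 0).astype(float)

x = np.array([-10.0, 0.0, 10.0])
sig_g = sigmoid_grad(x)   # tiny at |x| = 10, 0.25 at x = 0
relu_g = relu_grad(x)     # zero for x <= 0

# Even in the best case (0.25 per layer), the sigmoid factor alone
# shrinks the gradient geometrically when chained through 10 layers.
best_case_chain = 0.25 ** 10

print(sig_g, relu_g, best_case_chain)
```

Neither effect is a bug in the framework's autograd; both fall straight out of the chain rule, which is why you need to know to look for them.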

gchadwick · 2025-11-02T08:58:06 1762073886

It's for a CS course at Stanford not a PyTorch boot camp. It seems reasonable to expect some level of academic rigour and need to learn and demonstrate understanding of the fundamentals. If researchers aren't learning the fundamentals in courses like these where are they learning them?

You've also missed the point of the article: if you're building novel model architectures, you can't magic away the leakiness. You need to understand the backprop behaviours of the building blocks you use to achieve a good training run. Ignore these and what could be a good model architecture with some tweaks will either entirely fail to train or produce disappointing results.

Perhaps you're working at a level of bolting pre-built models together or training existing architectures on new datasets, but this course operates below that level to teach you how things actually work.

PeterStuer · 2025-11-02T08:48:36 1762073316

The problem with your reasoning is you never tackle your "unknown unknowns". You just assume they are "known unknowns".

Diving through the abstraction reveals some of those.

vrighter · 2025-11-05T14:36:08 1762353368

And where are the researchers supposed to come from, exactly?

jamesblonde · 2025-10-30T18:27:10 1761848830

You forgot using your digital infrastructure for extractive rent from your partners, driving them to build their own sovereign digital infrastructure.

chairmansteve · 2025-10-31T00:45:52 1761871552

"driving them to build their own sovereign digital infrastructure".

Is that happening though?

frameset · 2025-11-02T14:50:10 1762095010

Yes. https://www.bmi.bund.de/SharedDocs/pressemitteilungen/EN/202...

jamesblonde · 2025-10-29T10:42:38 1761734558

The leading European e-commerce company, Zalando, with 50m users, is now using the leading European AI platform, Hopsworks, to power their real-time AI. Zalando are Databricks' largest EU customer, but they are using Hopsworks instead for operational AI.

You would never hear it, though, as European IT press only promotes SV startups

https://www.youtube.com/watch?v=u8QFiLhnuFg&feature=youtu.be

Disclaimer: I work at Hopsworks.

jamesblonde · 2025-10-27T11:43:25 1761565405

I went on this train as well back in the late 1990s. It was a surreal experience getting off the train on a ferry!

jamesblonde · 2025-10-24T20:40:21 1761338421

With 2 modern NVMe disks per host (15 GB/s) and PCIe 5.0, it should only take 15s to read 30 TB into memory on 63 hosts.

You can find those disks on Hetzner. Not AWS, though.
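The back-of-envelope arithmetic, as a quick sketch. It assumes the 15 GB/s figure is per disk, decimal units (1 TB = 10^12 bytes), data spread evenly across hosts, and both disks read in parallel at full speed with no other bottleneck.

```python
total_bytes = 30e12        # 30 TB
hosts = 63
disks_per_host = 2
disk_bw = 15e9             # assumption: 15 GB/s per disk, not per host

per_host_bytes = total_bytes / hosts          # ~476 GB per host
per_host_bw = disks_per_host * disk_bw        # 30 GB/s per host
seconds = per_host_bytes / per_host_bw

# If 15 GB/s were instead the aggregate for both disks, the time doubles.
seconds_if_aggregate = per_host_bytes / disk_bw

print(round(seconds, 1), round(seconds_if_aggregate, 1))
```

Under those assumptions it comes out at roughly 16 seconds, so the 15s figure holds as a ballpark.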

jiggawatts · 2025-10-25T01:36:19 1761356179

I don’t understand why both Azure and AWS have local SSDs that are an order of magnitude slower than what I can get in a laptop. If Hetzner can do it, surely so can they!

Not to mention that Azure now exposes local drives as raw NVMe devices mapped straight through to the guest with no virtualisation overheads.

jamesblonde · 2025-10-25T13:09:11 1761397751

It would undercut all their higher level services - like DynamoDB, CosmosDB, etc.

Databases would suddenly go BRRR in the cloud and show up cloud-native (S3) based databases for the high latency services they are.

jamesblonde · 2025-10-24T20:37:00 1761338220

Ireland is still pretty much free of mosquitoes. I think it is the wind and exposure that keeps them out

jamesblonde · 2025-10-23T20:03:51 1761249831

Do you think they will let the Democrats take control given the risk to them if they take control? I see gerrymandering after the Supreme Court annuls the Voting Rights Act. And then more shenanigans for a third term.

seanmcdirmid · 2025-10-23T20:39:30 1761251970

That's why I premised this with "If America survives at all". There is definitely a possibility that the whole country just falls apart. A constitutional convention is more of a best case scenario.

Gerrymandering is only relevant for congressional House elections; it can't protect the Senate and doesn't influence the presidency. Usually one party will take control of all three branches in a huge swing in power; the House is just the first to flip, usually because it is re-elected every 2 years.

JumpCrisscross · 2025-10-23T22:12:57 1761257577

> constitutional convention is more of a best case scenario

Constitutional Convention is the abort button. It means giving a group of people basically limitless power to amend our Constitution, which in practice, means to do anything to the law. If we called one today, with most states in Republican hands [1], we’d be essentially handing complete control of our government—over and above the Constitution—to the GOP.

[1] https://www.ncsl.org/about-state-legislatures/state-partisan...

dragonwriter · 2025-10-23T22:19:16 1761257956

> Constitutional Convention is the abort button. It means giving a group of people basically limitless power to amend our Constitution

No, it doesn’t.

It gives a group of people basically limitless power to propose Amendments to the Constitution.

Any Amendments so proposed still require 3/4 of states to ratify them, either by votes of their legislature or by ratification conventions called in the states (at the option of Congress when calling the Convention at the request of states.)

Unless by "group of people" you mean not just the people in the national convention, but the people in the state legislatures or conventions, as well. But, at that point, you might as well say that by including an amendment process, the Constitution itself “gives a group of people basically limitless power to amend our Constitution”.

JumpCrisscross · 2025-10-24T02:06:21 1761271581

> It gives a group of people basically limitless power to propose Amendments to the Constitution

Sorry, I actually missed this. Thank you for clarifying. (I mixed it up with the New York State process, where the Convention's proposals go straight to popular ratification.)

jamesblonde · 2025-10-23T20:00:59 1761249659

Whataboutism. Has to be called out.

cjbgkagh · 2025-10-23T20:13:24 1761250404

What about the whataboutism of whataboutism? I.e. meta-whataboutism

The use of whataboutism and the ‘calling out’ of whataboutisms are both mechanisms of narrative control.

actionfromafar · 2025-10-24T06:43:18 1761288198

”Using logic and ethics against me is unfair?”

Friends of ”I thought there would be no fact checking ”

nailer · 2025-10-24T12:24:24 1761308664

The person is making a valid point about the inconsistency in how non compliance was handled in traditional finance and blockchain.

ImPostingOnHN · 2025-10-24T15:47:02 1761320822

Perhaps folks think that is not a "valid point" here because it is off topic, seeking to distract from the topic of whether this particular guilty person should be punished.

Saying "so and so did it too and nothing happened" may be correct, but doesn't address the topic. If you're saying that, how does it apply to the topic (the Binance founder)?

Are you saying that you're ok with the other people getting away with it, and thus you're ok with this guy also getting away with it via this purchased pardon?

Or are you saying those other people should have been punished, and thus this pardon was wrong to sell?

nwienert · 2025-10-25T08:34:08 1761381248

It’s of course on topic to talk about the bigger picture of whether people in general are charged with these specific types of crimes or should be.

I hate the whole fallacy callout stuff in general. God didn’t create them, half barely work, none work in every situation, and they’re just abused to death by people to shut down conversation in a shallow way.

ImPostingOnHN · 2025-10-26T17:16:17 1761498977

Saying "so and so did a thing too and nothing happened" may be correct, but doesn't address the topic. If you're saying that, how does it apply to the topic (the Binance founder)?

In that scenario, are you saying that you're ok with the other people getting away with it, and thus you're ok with this guy also getting away with it via this purchased pardon?

Or are you saying those other people should have been punished, and thus this pardon was wrong to sell?

Without tying it back to the topic like that, the reply is only tangentially related, like replying "I go to a bank" to any topic that mentions or involves banks. Like, ok, great, at least it's not insulting posters, but not super constructive in discussing the topic (the Binance founder's crimes and pardon).

ImPostingOnHN · 2025-10-24T05:54:58 1761285298

Whataboutism is a rhetological fallacy. Making fallacies is bad. Pointing out fallacies (or as you put it, "narrative control") is good.

cjbgkagh · 2025-10-24T13:37:09 1761313029

Controlling the narrative is often the point of using rhetorical fallacies.

Like the intolerance of intolerance there is discretion over what are acceptable intolerances. With whataboutism there is discretion over what are acceptable appeals to hypocrisy.

Whataboutism is merely saying your appeal to hypocrisy is invalid. Which would hold more weight with me if the side saying it never made appeals to hypocrisy of their own. Otherwise they’re being hypocritical about making appeals to hypocrisy.

On the substance, I hate what Trump has done. I would not take the position that what Trump did is ok because of what Biden did.

ImPostingOnHN · 2025-10-24T15:35:47 1761320147

An "appeal to hypocrisy" is always invalid because that term is essentially synonymous with whataboutism and 'tu quoque' fallacies

The underlying failure of all of them, and why they are fallacies, is because the guilt of one side of an issue has nothing to do with the hypocrisy of the other. Thus, trying to pivot from the former to the latter is a distraction, rather than a genuine attempt to discuss the topic (which is the former).

> Which would hold more weight with me if the side saying it never made appeals to hypocrisy of their own.

"The side"? Dude, I'm a person, not a side, and you barely know anything about me, much less what I've done and do. The accusations of whataboutism weren't made by nebulous, ethereal concepts, they were made by people.

You can't pick and choose different behaviors of different people and lump them together as if they are the same person, then claim that the differing behaviors indicate some sort of hypocrisy or other conflict. What you're describing is diversity of thought among different people.

Even if you could, it's not even a good discussion, because then others could respond that your meta-criticism is itself hypocritical in the same way that you're responding to criticism by claiming that it is hypocritical. So you keep adding layers until the actual topic (the original criticism) is long forgotten. In fact, that is why whataboutism is used: its users don't want to focus on the original criticism.

cjbgkagh · 2025-10-24T16:07:07 1761322027

Sounds like what I said, a form of narrative control. A charge I made of both appeals to hypocrisy and appeals to whataboutism.

My personal preference would be logical and factual discussions only but I accept that’s not the world we live in.

The deleterious effect of the additional layers is ameliorated by the nesting of information on HN, you don’t have to keep digging if you don’t want to.

ImPostingOnHN · 2025-10-24T16:27:44 1761323264

> Sounds like what I said, a form of narrative control.

Yes, whataboutism is both a rhetological fallacy and a form of "narrative control".

> A charge I made of both appeals to hypocrisy and appeals to whataboutism.

"Appeal to whataboutism" isn't a thing. It's just called "whataboutism", and since whataboutism and "appeal to hypocrisy" (seems synonymous with whataboutism) are both fallacies, pointing them out is just called "pointing out fallacies". Fallacies don't need any 'appeals' or arguments made against them, because they are already fallacious, that's why we call them fallacies.

And yes, pointing out fallacious arguments could be called "narrative control", too ;) So could be saying anything! After all, anyone saying anything is trying to "control the narrative" to include that thing. What a silly, needlessly conspiratorial neologism for a uselessly vague concept!

cjbgkagh · 2025-10-24T16:53:14 1761324794

The discretion of the validity of pointing out hypocrisy is the core of the issue. I would 100% agree with you if all discussions are purely logical, but they’re not, so I don’t.

The introduction of whataboutism into the lexicon was to counter Russian appeals to hypocrisy. This was linked to Trump in an effort to discredit both. Those of us who have long memories do remember a time when pointing out the hypocrisy of the West was considered a valid thing to do. See the work of Noam Chomsky as an example.

ImPostingOnHN · 2025-10-24T17:10:25 1761325825

A Tu Quoque fallacy, or as you put it, "pointing out hypocrisy", is a fallacy. Some people do not like logical discussions, so they use that fallacy anyways. That is totally ok, I'm not the boss of them.

But, it will likely be pointed out that their fallacious arguments are fallacious, and at that point, they can choose to make valid arguments, or to continue their string of failures by unconvincingly making more fallacious ones.

> The introduction of whataboutism into the lexicon was to counter Russian appeals to hypocrisy

Cool! "Appeal to hypocrisy", or Tu Quoque, is a fallacy and any arguments invoking it are accordingly fallacious. Coining another synonym ("whataboutism") doesn't change things. Those making fallacious arguments can try to make valid arguments (if there are any) for their case next time.

> Those of us who have long memories do remember a time when pointing out the hypocrisy of the West was considered a valid thing to do

I've got a long memory, too, and the Tu Quoque fallacy was never a valid defense, no matter what you called it. That's what makes it a fallacy.

Onavo · 2025-10-23T20:09:48 1761250188

Doesn't make the double standards any less true.

jamesblonde · 2025-10-20T18:47:11 1760986031

It's a market regulation failure. Which results in a failed market, with the cloud infra provider also providing data services. 20 years ago, there were 20+ widely used operational databases. Now, it's like DynamoDB with like half the market.

conductr · 2025-10-20T19:41:09 1760989269

How should this have played out in a regulated market? DynamoDB gets released, then what? Has limits on the market share it's allowed to steal?

Should we similarly cap, say, front-end frameworks on market penetration / growth? Is React too big to fail? Do we need to force some of its users to use something else?

jimbokun · 2025-10-20T20:38:07 1760992687

What would these regulations say, exactly?

jamesblonde · 2025-10-19T08:32:41 1760862761

Is it throughput and latency that are the etcd bottlenecks? Our database, RonDB, is an in-memory open-source database (a fork of MySQL Cluster). We have scaled it to 100m reads/sec on AWS hardware (not even top of the line). Might be an interesting project to implement an open-source etcd shim on top of it?

Reference: https://www.rondb.com/post/100m-key-lookups-sec-with-rest-ap...

davidgl · 2025-10-19T09:56:14 1760867774

See https://github.com/k3s-io/kine, k3s uses this to shim etcd to MySQL, Postgres and sqlite

nonameiguess · 2025-10-19T14:02:55 1760882575

The setting is configurable, but by default, etcd's Raft implementation requires a voting node to write to disk before it makes a vote, as in actually flushing to disk, not just writing to the file cache. Since you need a majority vote before a client can get a response, this is why it's strongly recommended you use the fastest possible disks, keep the nodes geographically close to each other, and etcd's default storage is only 2GB per node.

All in all, it was a poor choice for Kubernetes to use this as its backend in the first place. Apparently, Google uses its own shim, but there is also kine, which was created a long time ago for k3s and allows you to use a RDBMS. k3s used sqlite as its default originally, but any API equivalent database would work.

We should keep in mind etcd was meant to literally be the distributed /etc directory for CoreOS, something you would read from often but perform very few writes to. It's a configuration store. Kubernetes deciding to also use it for /var was never a great idea.

jamesblonde · 2025-10-19T19:10:04 1760901004

RonDB uses a non-blocking 2PC algorithm - commits in memory, and then does a group commit of transactions to disk every 500ms. This means it can handle insane write throughput, as well as read throughput. However, if both your DB nodes fail, you could lose 500ms of data - which is not the end of the world for k8s. Normally, you would locate DB nodes in different AZs, reducing the probability of correlated failures.
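A toy sketch of the group-commit trade-off described above. This is not RonDB's implementation and it models neither 2PC nor replication; the class name `GroupCommitLog`, the file name, and the simulated clock are all hypothetical. Commits are acknowledged from memory, and one fsync per interval covers the whole batch.

```python
import os
import tempfile

class GroupCommitLog:
    """Toy log: commits are acknowledged in memory and flushed to disk in batches."""

    def __init__(self, path, interval_ms=500):
        self.path = path
        self.interval_ms = interval_ms
        self.pending = []       # acknowledged, but not yet durable
        self.last_flush_ms = 0
        self.fsyncs = 0

    def commit(self, record, now_ms):
        self.pending.append(record)  # returns without touching the disk
        if now_ms - self.last_flush_ms >= self.interval_ms:
            self.flush(now_ms)

    def flush(self, now_ms):
        if self.pending:
            with open(self.path, "a") as f:
                f.write("".join(r + "\n" for r in self.pending))
                f.flush()
                os.fsync(f.fileno())  # one fsync covers the whole batch
            self.fsyncs += 1
            self.pending.clear()
        self.last_flush_ms = now_ms

with tempfile.TemporaryDirectory() as d:
    log = GroupCommitLog(os.path.join(d, "redo.log"), interval_ms=500)
    # Simulated clock: 1000 transactions, one every 2 ms.
    for i in range(1000):
        log.commit(f"txn-{i}", now_ms=i * 2)
    at_risk = len(log.pending)   # lost if every copy of memory failed right now
    log.flush(now_ms=2000)
    with open(log.path) as f:
        lines_on_disk = sum(1 for _ in f)
    fsyncs = log.fsyncs

print(at_risk, fsyncs, lines_on_disk)
```

A thousand commits cost a handful of fsyncs instead of a thousand, at the price of a window of acknowledged-but-not-durable transactions. That window is exactly what a flush-before-vote design avoids.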

__turbobrew__ · 2025-10-19T21:17:06 1760908626

At that point it is apples to oranges. One of the main reasons why etcd writes are slow is because they are guaranteed to be durably persisted across the quorum.

If you just turned off file system syncs in etcd you could probably get an order of magnitude better performance as well.