
Yes!

Otherwise you're just outsourcing your critical thinking to other people. A system of just "You will be punished for X" without analysis becomes "Derp, just do things that I won't be punished for". Or more sinister, "just hand your identification papers over to the officer and you won't be punished, don't think about it". Rule of power is not a recipe for a functional system. This becomes a blend of sociology and philosophy, but on the sociology side, you don't want a fear-based or shame-based society anyways.

Your latter example ("Most people aren't interested in torturing babies for sport and would have a strongly negative emotional reaction to such a practice") is actually a good example of the core aspect of Hume's philosophy, so if you're trying to avoid the philosophical logic discussion, that's not gonna work either. If you follow the conclusions of that statement to its implications, you end up back at moral philosophy.

That's not a bad thing! That's like a chef asking "how do I cook X" and finding that the answer ("how the Maillard reaction works") eventually leads to chemistry. That's just how the world is. Of course, you might be a bit frustrated if you're a chef who doesn't know chemistry, or a game theorist who doesn't know philosophy, but I assure you that it is the correct direction to look for what you're interested in here.


You did not correctly understand what I said. I am not saying that hunting babies for sport is immoral because you will get punished for it. I am saying that there isn't any useful knowledge about the statement "hunting babies for sport is bad" that requires a moral framing. Morality is redundant. The fact that you will get punished for hunting babies for sport is just one of the reasons why hunting babies for sport is bad. This is why I gave another example, "Most people aren't interested in torturing babies for sport and would have a strongly negative emotional reaction to such a practice". It is likely that you value human lives and would find baby-hunting disgusting. Again, a moral framing wouldn't add anything here. Any other reason for why "hunting babies for sport is bad" that you will come up with using your critical thinking will work without a moral framing.

"there isn't any useful knowledge" "Morality is redundant."

I strongly dispute this statement, and honestly find it baffling that you would claim as such.

The fact that you will be punished for murdering babies is BECAUSE it is morally bad, not the other way around! We didn't write down the laws/punishment for fun, we wrote the laws to match our moral systems! Or do you believe that we design our moral systems based on our laws of punishment? That is... quite a claim.

Your argument has the same structure as saying: "We don't need germ theory. The fact that washing your hands prevents disease is just one reason why you should wash your hands. People socially also find dirty hands disgusting, and avoid you as social punishment. Any reason you come up with for hand-washing works without a germ theory framing."

But germ theory is precisely why hand-washing prevents disease and why we evolved disgust responses to filth. Calling it "redundant" because we can list its downstream effects without naming it doesn't make the underlying framework unnecessary. It just means you're describing consequences while ignoring their cause. You can't explain why those consequences hold together coherently without it; the justified true belief comes from germ theory! (And don't try to Gettier-problem me on the concept of knowledge; this applies even if you don't use JTB to define knowledge.)


I'm not interested in wading into the wider discussion, but I do want to bring up one particular point, which is where you said

> do you believe that we design our moral systems based on our laws of punishment? That is... quite a claim.

This is absolutely something we do: our purely technical, legal terms often feed back into our moral frameworks. Laws are even created specifically to be used to change people's perceptions of morality.

An example of this is "felon". There is no actual legal definition of what a felony is or isn't in the US. A misdemeanor in one state can be a felony in another. It can be anything from mass murder to traffic infractions. Yet we attach a LOT of moral weight to 'felon'.

The word itself is even treated as a form of punishment; a label attached to someone permanently that colors how almost every person who interacts with them (and is aware of it) will perceive them, morally.

Another example is rhetoric along the lines of "If they had complied, they wouldn't have been hurt", which is explicitly the use of a punishment (being hurt) to create a judgement/perception of immorality on the part of the person injured (i.e. that they must have been non-compliant (immoral), otherwise they would not have been punished (hurt)). The fact that they were being punished means they were immoral.

Immigration is an example where there's been a seismic shift in the moral frameworks of certain groups, based on the repeated emphasis of legal statutes. A law being broken is used to influence people to shift their moral framework to consider something immoral that they didn't care about before.

Point being, our laws and punishments absolutely create feedback loops into our moral frameworks, precisely because we assume laws and punishments to be just.


> “Any reason you come up with for hand-washing works without a germ theory framing”.

This is factually correct though. However, we have other reasons for positing germ theory. Aside from the fact that it provides a mechanism of action for hand-washing, we have significant evidence that germs do exist and that they do cause disease. However, this doesn’t apply to any moral theory. While germ theory provides us with additional information about why washing hands is good, moral theory fails to provide anything comparable, e.g. a mechanism of action or other knowledge, that we wouldn't be able to derive about the statement “hunting babies for sport is bad” without it.

> The fact that you will be punished for murdering babies is BECAUSE it is morally bad, not the other way around! We didn't write down the laws for fun, we wrote the laws to match our moral systems! Or do you believe that we design our moral systems based on our laws of punishment? That is... quite a claim.

You will be punished for murdering babies because it is illegal. That’s just an objective fact about the society that we live in. However, if we are out of reach of the law for whatever reason, people might try to punish us for hunting babies because they were culturally brought up to experience a strong disgust reaction to this activity, as well as because murdering babies marks us as a potentially dangerous individual (in several ways: murdering babies is bad enough, but we are also presumably going against social norms and expectations).

Notably, there were many times in history when baby murder was completely socially acceptable. Child sacrifice is the single most widespread form of human sacrifice in history, and archaeological evidence for it can be found all over the globe. Some scholars interpret some of these instances as simple burials, but there are many cases where sacrifice is the most plausible interpretation. If these people had access to this universal moral axiom that killing babies is bad, why didn’t they derive laws or customs from it that would stop them from sacrificing babies?


You're conflating "evidence" for a theory with "what a theory explains". Germ theory provides a unifying framework that explains why hand-washing, sterilization, quarantine, and antibiotics all work, and allows us to predict which novel interventions will succeed; we're not just looking at germs under a fancy microscope. Before germ theory, miasma theory also "worked" in the sense that people could list downstream effects ("bad smells correlate with disease"), but it couldn't generate reliable predictions or explain why certain practices succeeded while others failed!

Moral frameworks function the same way. Without one, you have a disconnected list of "things that provoke disgust" and "things that get you punished"... but no way to reason about novel cases or conflicts between values, or explain why these various intuitions cluster together. Why does "hunting babies" feel similar to "torturing prisoners" but different from "eating chicken"? A moral framework provides the structure; raw disgust does not.

For child sacrifice: humans also once believed disease came from evil spirits, that the earth was the center of the universe, that heavier objects fall faster. Does the existence of these errors make physics and biology "redundant frameworks"? Obviously not. It means humans can be wrong, and can reason from false premises. Notice that even cultures practicing child sacrifice typically had strict rules about when, how, and which children could be sacrificed. This suggests they recognized the moral weight of taking a child's life! They just had false beliefs about gods, afterlives, and cosmic bargains that led them to different conclusions. They weren't operating without moral frameworks; they were operating with moral frameworks plus false empirical/metaphysical beliefs.

More importantly, your framework cannot account for moral progress! If morality is just "what currently provokes disgust," then the abolition of child sacrifice wasn't progress. It was merely a change in fashion, no different from skinny jeans becoming not skinny. But you clearly do think those cultures were wrong (you're citing child sacrifice as a historical horror, not a neutral anthropological curiosity). That normative judgment requires exactly the moral framework you're calling redundant.


Your response seems AI-generated (or significantly AI-”enhanced”), so I’m not going to bother responding to any follow-ups.

> More importantly, your framework cannot account for moral progress!

I don’t think “moral progress” (or any other kind of “progress”, e.g. “technological progress”) is a meaningful category that needs to be “accounted for”.

> Why does "hunting babies" feel similar to "torturing prisoners" but different from "eating chicken"?

I can see “hunting babies” being more acceptable than “torturing prisoners” to many people. Many people don’t consider babies on par with grown-up humans due to their limited neurological development and consciousness. Conversely, many people find the idea of eating chicken abhorrent and would say that a society of meat-eaters is worse than a thousand Nazi Germanies. This is not a strawman I came up with; I’ve interacted with people who hold this exact opinion, and I think from their perspective it is justified.

> [Without a moral framework you have] no way to reason about novel cases

You can easily reason about novel cases without a moral framework. It just won’t be moral reasoning (which wouldn’t add anything in itself). Is stabbing a robot to death okay? We can think about it in terms of how I feel about it. It’s kinda human-shaped, so I’d probably feel a bit weird about it. How would others react to me stabbing it this way? They’d probably feel similarly. Plus, it’s expensive electronics, and people don’t like wastefulness. Would it be legal? Probably.


Honestly, yeah. I got lazy with your responses and just threw in a few bullet points to AI, because honestly it's clear you don't know anything about philosophy. It's like arguing code cleanliness with a new software engineer... it was way more tiring than it was intellectually stimulating. You're basically arguing a sort of moral anti-realism perspective but without any actual points like noncognitivism or whatever, because you're saying moral statements are still truth-apt (xyz is bad) but just... don't matter for some reason? It makes no sense.

At least the discussion with skissane was intellectually interesting, so I didn't bother using AI for those comments.

But seriously, you can just throw your entire conversation into AI and ask "who is philosophically and logically correct between these responses". Remove the usernames if you want a fair analysis. Even an obsolete AI like GPT-3.5 will be able to tell you the correct answer for that question. The reasoning is just... soooo obviously... similar to if a senior engineer looked at a junior engineer's code, and facepalmed. It looks like that, but replace "code" with "philosophical logic".

That's the best way I can break it to you, honestly, because it's probably the easiest way for you to get a neutral perspective. I'm genuinely not trying to be biased when I tell you that.


>I got lazy with your responses and just threw in a few bullet points to AI

This should legit be a permabannable offense. That is titanically disrespectful of not just your discussion partner, but of good discussion culture as a whole.


Then can we permaban people who pretend to be experts in topics they have no clue in? It's even more disrespectful of people who HAVE spent time learning the material.

You want good discussion? Jesus, I had to wade through that slop which was worse than AI slop.

He would have been fine if he had just argued a typical moral anti-realism perspective ("actually morality is not needed, and the reason is there's no such thing as truly evil"), as that's debatably true in philosophy. I would have been fine with that... but THEN HE LITERALLY SHOOTS HIS OWN ARGUMENT IN THE FACE ("but sacrificing kids is actually bad", as truth-apt) and smugly declares shooting his own argument in the face as winning. I can't even. Except it wasn't a clean anti-morality argument in the first place, so I didn't assume as much, except then every time he was clearly losing he retreated back into a moral anti-realism perspective. He could have just stayed there, although I would have expected something more like "it would not be objectively evil if Claude destroyed the world, since objective evil doesn't exist"!

Here's chatgpt's translation into dev speak, since I am an engineer, but I don't think I need to write this myself:

------

It’s like a developer insisting, with total confidence, that their system is “provably safe and robust”… and then, the moment they’re challenged, they:

- turn off all error handling (try/catch removed because "exceptions are for cowards"),

- add assert(false) in the critical path "to prove no one should reach here,"

- hardcode if (prod) return true; to bypass the very check they were defending,

- ship it, watch it crash instantly,

- and declare victory because "the crash shows how seriously we take safety."

In other words: they didn’t lose the argument because the idea was wrong—they lost because they disabled their own argument’s safety rails and then bragged about the wreck as a feature.

-----

WTF am I supposed to do there?

I can see why philosophers drink.


I'm on your side in this argument (approximately; asking what ethics even is and where it comes from can be productive, but it shouldn't conclude "and therefore AI agents working with humans don't need to integrate a human moral sense" -- at least, that'd be a really bad conclusion for humanity as AI scales up).

Can't recommend letting an LLM write for you directly, though. I found myself skipping your third paragraph in the reply above.


That was always doomed to failure in the philosophy space.

Mostly because there aren't enough axioms. It'd be like trying to establish geometry with only 2 axioms instead of the typical 4 or 5 laws of geometry. You can't do it. Too many valid statements.

That's precisely why the babyeaters can be posited as a valid moral standard - because they have different Humean preferences.

To Anthropic's credit, from what I can tell, they defined a coherent ethical system in their soul doc/the Claude Constitution, and they're sticking with it. It's essentially a neo-Aristotelian virtue ethics system that disposes of the strict rules a la Kant in favor of establishing (a hierarchy of) 4 core virtues. It's not quite Aristotle (there's plenty of differences) but they're clearly trying to have Claude achieve eudaimonia by following those virtues. They're also making bold statements on moral patienthood, which is clearly a euphemism for something else; but because I agree with Anthropic on this topic and it would cause a shitstorm in any discussion, I don't think it's worth diving into further.

Of course, it's just one of many internally coherent systems. I wouldn't begrudge another responsible AI company from using a different non virtue ethics based system, as long as they do a good job with the system they pick.

Anthropic is pursuing a bold strategy, but honestly I think the correct one. Going down the path of Kant or Asimov is clearly too inflexible, and consequentialism is too prone to paperclip maximizers.


Pretty much every serious philosopher agrees that “Do not torture babies for sport” is not a foundation of any ethical system, but merely a consequence of a system you choose. To say otherwise is like someone walking up to a mathematician and saying "you need to add 'triangles have angles that sum up to 180 degrees' to the 5 Euclidean axioms of geometry". The mathematician would roll their eyes and tell you it's already obvious and can be proven from the 5 base laws (axioms).

The problem with philosophy is that humans agree on like... 1-2 foundation-level, bottom-tier (axiom) laws of ethics, and then the rest of the laws of ethics aren't actually universal and axiomatic, so people argue over them all the time. There's no universal 5 laws, and 2 laws isn't enough (just like how 2 laws wouldn't be enough for geometry). It's like knowing "any 3 points define a plane" but then there are only 1-2 points that are clearly defined, with a couple of contenders for what the 3rd point could be, so people argue all day over what their favorite plane is.

That's philosophy of ethics in a nutshell. Basically 1 or 2 axioms everyone agrees on, a dozen axioms that nobody can agree on, and pretty much all of them can be used to prove a statement "don't torture babies for sport" so it's not exactly easy to distinguish them, and each one has pros and cons.

Anyways, Anthropic is using a version of Virtue Ethics for the claude constitution, which is a pretty good idea actually. If you REALLY want everything written down as rules, then you're probably thinking of Deontological Ethics, which also works as an ethical system, and has its own pros and cons.

https://plato.stanford.edu/entries/ethics-virtue/

And before you ask, yes, the version of Anthropic's virtue ethics that they are using excludes torturing babies as a permissible action.

Ironically, it's possible to create an ethical system where eating babies is a good thing. There are literally works of fiction about a different species [2] that explore this topic. So you can see the difficulty of such a problem - even something as simple as "don't kill your babies" is not easily settled. Also, in real life, some animals will kill their babies if they think it helps the family survive.

[2] https://www.lesswrong.com/posts/n5TqCuizyJDfAPjkr/the-baby-e...


> Pretty much every serious philosopher agrees that “Do not torture babies for sport” is not a foundation of any ethical system, but merely a consequence of a system you choose.

Almost everyone agrees that "1+1=2" is objective. There is far less agreement on how and why it is objective–but most would say we don't need to know how to answer deep questions in the philosophy of mathematics to know that "1+1=2" is objective.

And I don't see why ethics need be any different. We don't need to know which (if any) system of proposed ethical axioms is right, in order to know that "It is gravely unethical to torture babies for sport" is objectively true.

If disputes over whether and how that ethical proposition can be grounded axiomatically, are a valid reason to doubt its objective truth – why isn't that equally true for "1+1=2"? Are the disputes over whether and how "1+1=2" can be grounded axiomatically, a valid reason to doubt its objective truth?

You might recognise that I'm making here a variation on what is known in the literature as a "companion in the guilt" argument, see e.g. https://doi.org/10.1111/phc3.12528


Strong disagree.

Your argument basically is a professional motte and bailey fallacy.

And you cannot conclude objectivity by consensus. Physicists by consensus concluded that Newton was right, and absolute... until Einstein introduced relativity. You cannot do "proofs by feel". I argue that you DO need to answer the deep problems in mathematics to prove that 1+1=2, even if it feels objective - that's precisely why Principia Mathematica spent over 100 pages proving that.
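
To make "proving it from the axioms" concrete, here's a minimal Lean 4 sketch (purely illustrative): once a Peano-style definition of the naturals and of addition is fixed, 1 + 1 = 2 is discharged by unfolding those definitions - the real work lives in setting up the foundations, which is the point.

    -- Lean 4; uses only the built-in Nat type and its successor-based addition.
    -- With the definitions fixed, the proof is pure computation.
    example : 1 + 1 = 2 := rfl

    -- The same fact, spelled out against the successor-based definition:
    example : Nat.succ 0 + Nat.succ 0 = Nat.succ (Nat.succ 0) := rfl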

In fact, I don't need to be a professional philosopher to counterargue a scenario where killing a baby for sport is morally good. Consider a scenario: an evil dictator, let's say Genghis Khan, captures your village and orders you to hunt and torture a baby for sport a la "The Most Dangerous Game". If you refuse, he kills your village. Is it ethical for you to hunt the baby for sport? Not so black and white now, is it? And it took me like 30 seconds to come up with that scenario, so I'm sure you can poke holes in it, but I think it clearly establishes that it's dangerous to make assumptions of black and whiteness from single conclusions.

[1] https://en.wikipedia.org/wiki/Motte-and-bailey_fallacy


> Your argument basically is a professional motte and bailey fallacy.

No it isn't. A "motte-and-bailey fallacy" is where you have two versions of your position, one which makes broad claims but which is difficult to defend, the other which makes much narrower claims but which is much easier to justify, and you equivocate between them. I'm not doing that.

A "companion-in-the-guilt" argument is different. It is taking an argument against the objectivity of ethics, and then turning it around against something else – knowledge, logic, rationality, mathematics, etc – and then arguing that if you accept it as a valid argument against the objectivity of ethics, then to be consistent and avoid special pleading you must accept as valid some parallel argument against the objectivity of that other thing too.

> And you cannot conclude objectivity by consensus.

But all knowledge is by consensus. Even scientific knowledge is by consensus. There is no way anyone can individually test the validity of every scientific theory. Consensus isn't guaranteed to be correct, but then again almost nothing is – and outside of that narrow range of issues with which we have direct personal experience, we don't have any other choice.

> I argue that you DO need to answer the deep problems in mathematics to prove that 1+1=2, even if it feels objective - that's precisely why Principia Mathematica spent over 100 pages proving that.

Principia Mathematica was (to a significant degree) a dead-end in the history of mathematics. Most practicing mathematicians have rejected PM's type theory in favour of simpler axiomatic systems such as ZF(C). Even many professional type theorists will quibble with some of the details of Whitehead and Russell's type theory, and argue there are superior alternatives. And you are effectively assuming a formalist philosophy of mathematics, which is highly controversial, many reject, and few would consider "proven".


> But Principia Mathematica was (to a significant degree) a dead-end in the history of mathematics. Most practicing mathematicians have rejected PM's type theory in favour of simpler axiomatic systems such as ZF(C). Even many professional type theorists will quibble with some of the details of Whitehead and Russell's type theory, and argue there are superior alternatives. And you are effectively assuming a formalist philosophy of mathematics, which is highly controversial, many reject, and few would consider "proven".

Yeah, exactly. I intentionally set that trap. You're actually arguing for my point. I've spent comments writing on the axioms of geometry, and you didn't think I was familiar with the axioms of ZFC? I was thinking of bringing up CH the entire time. The fact that you can have alternate axioms was my entire point all along. Most people are just way more familiar with the 5 laws of geometry than the 9 axioms of ZFC.

The fact that PM was an alternate set of axioms for mathematics, which eventually wilted when Gödel and ZF came along, underscores my point that defining a set of axioms is hard. And that there is no clearly defined set of axioms for philosophy.

I don't have to accept your argument against objectivity in ethics, because I can still say that the system IS objective- it just depends on what axioms you pick! ZF has different proofs than ZFC. Does the existence of both ZF and ZFC make mathematics non objective? Obviously not! The same way, the existence of both deontology and consequentialism doesn't necessarily make either one less objective than the other.

Anyways, the Genghis Khan example clearly operates as a proof by counterexample of your example of objectivity, so I don't even think quibbling on mathematical formalism is necessary.


> Consider a scenario: an evil dictator, let's say Genghis Khan, captures your village and orders you to hunt and torture a baby for sport a la "The Most Dangerous Game". If you refuse, he kills your village. Is it ethical for you to hunt the baby for sport?

You aren't hunting the baby for sport. Sport is not among your reasons for hunting the baby.


Actually, I think "The Most Dangerous Game" is a good analogy here. At the end of the story, the protagonist IS hunting for sport. He started off in fear, but in the end genuinely enjoyed it. So likewise- if you start off hunting a baby in fear, and then eventually grow to enjoy it, but it also saves your village, does that make it evil? You're still saving your village, but you also just derive dopamine from killing the baby!

This actually devolves into human neuroscience, the more I think about it. "I want to throw a ball fast, because I want to win the baseball game". The predictive processing theory view on the statement says that the set point at the lower level (your arm) and the set point at the higher level (win the baseball game) are coherent, and desire at each level doesn't directly affect the other. Of course, you'd have to abandon a homunculus model of the mind and strongly reject Korsgaard, but that's on shaky ground scientifically anyways so this is a safe bet. You can just say that you are optimizing for your village as a higher level set point, but are hunting for game at a slightly lower level set point.

Note that sport is not a terminal desire either. Is an NBA player who plays for a trophy not playing a sport? Or a kid forced to play youth soccer? So you can't even just say "sport must be an end goal".


To clarify my principle: "It is gravely wrong to inflict significant physical pain or injury on babies, when your sole or primary reason for doing so is your own personal enjoyment/amusement/pleasure/fun"

So, in your scenario – the person's initial reason for harming babies isn't their own personal enjoyment, it is because they've been coerced into doing so by an evil dictator, because they view the harm to one baby as a lesser evil than the death of their whole village, etc. And even if the act of harming babies corrupts them to the point they start to enjoy it, that enjoyment is at best a secondary reason, not their primary reason. So what they are doing isn't contravening my principle.


Well, now that's just moving the goalposts >:( I had a whole paragraph prepared in my head about how NBA players actually optimize for a greater goal (winning a tournament) than just sport (enjoying the game) when they play a sport.

Anyways, I actually think your statement is incoherent as stated, if we presume moral naturalism. There are clearly different levels of set points for "you", so "sole reason" is actually neurologically inconsistent as a statement. It's impossible for a "sole reason" to exist. This radically alters your framework for the self, but eh, it's not impossible to modernize these structural frameworks anyways. Steelmanning your argument: if you try to argue set point hierarchy, then we're back to the NBA player playing for a championship example. He's still playing even if he's not playing for fun. Similarly, hunting a baby for pleasure can still be hunting for a village, as The Most Dangerous Game shows.

More generally (and less shitposty), the refined principle is now quite narrow and unfalsifiable in practice, as a no-true-Scotsman. How would you ever demonstrate someone's "sole or primary" reason? It's doing a lot of work to immunize the principle from counterexamples.


1.54GB model? You can run this on a raspberry pi.

Performance of LLM inference consists of two independent metrics: prompt processing (compute intensive) and token generation (bandwidth intensive). For autocomplete with a 1.5B model you can get away with abysmal 10 t/s token generation performance, but you'd want prompt processing to be as fast as possible, which the Pi is incapable of.
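
A rough back-of-the-envelope sketch of why that split matters; the bandwidth and compute figures below are assumptions for a Pi-class board, not measurements:

    # Illustrative only: assumed figures for a Raspberry Pi-class board.
    model_bytes = 1.54e9     # ~1.54 GB quantized model file
    params = 1.5e9           # ~1.5B parameters
    mem_bandwidth = 8e9      # assumed ~8 GB/s usable memory bandwidth
    cpu_flops = 1.5e10       # assumed ~15 GFLOPS sustained on the CPU

    # Token generation streams the whole model through memory once per token,
    # so it is roughly bounded by bandwidth / model size.
    gen_tps = mem_bandwidth / model_bytes          # ~5 tokens/s

    # Prompt processing is compute-bound at roughly 2 * params FLOPs per token.
    prefill_tps = cpu_flops / (2 * params)         # ~5 tokens/s: painful for long prompts

    print(f"generation ~{gen_tps:.1f} t/s, prefill ~{prefill_tps:.1f} t/s")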

If you mean on the new AI HAT with an NPU and integrated 8 GB of memory, maybe.

Yeah, that was tried. It was called GPT-4.5 and it sucked, despite being 5-10T params in size. All the AI labs gave up on pretrain-only after that debacle.

GPT-4.5 still is good at rote memorization stuff, but that's not surprising. The same way, GPT-3 at 175b knows way more facts than Qwen3 4b, but the latter is smarter in every other way. GPT-4.5 had a few advantages over other SOTA models at the time of release, but it quickly lost those advantages. Claude Opus 4.5 nowadays handily beats it at writing, philosophy, etc; and Claude Opus 4.5 is merely a ~160B active param model.


Maybe you are confused, but GPT4.5 had all the same "morality guards" as OAI's other models, and was clearly RL'd with the same "user first" goals.

True, it was a massive model, but my comment isn't really about scale so much as it is about bending will.

Also the model size you reference refers to the memory footprint of the parameters, not the actual number of parameters. The author postulates a lower bound of 800B parameters for Opus 4.5.


> and Claude Opus 4.5 is merely a ~160B active param model

Do you have a source for this?


> for Claude Opus 4.5, we get about 80 GB of active parameters

https://news.ycombinator.com/item?id=46039486

This guess is from launch day, but over time has been shown to be roughly correct, and aligns with the performance of Opus 4.5 vs 4.1 and across providers.


Wow. That's one of the clearest cases of AI psychosis I've seen.

You probably need to improve your internal LLM detector then. This obviously reads as LLM generated text.

- "This isn't just a "status" bug. It's a behavioral tracker."

- "It essentially xxxxx, making yyyyyy."

- As you mentioned, the headings

- A lack of compound sentences outside the "x, but y" format.

This is clearly LLM generated text, maybe just lightly edited to remove some em dashes and stuff like that.

After you read code for a while, you start to figure out the "smell" of who wrote what code. It's the same for any other writing. I was literally reading a New Yorker article before this, and this is the first HN article I just opened today; the writing difference is jarring. It's very easy to smell LLM generated text after reading a few non-LLM articles.


What's frustrating is the author's comments here in this thread are clearly LLM text as well. Why even bother to have a conversation if our replies are just being piped into ChatGPT??

There have been a few times I've had interactions with people on other sites that have been clearly from LLMs. At least one of the times, it turned out to be a non-native English speaker who needed the help to be able to converse with me, and it turned out to be a worthwhile conversation that I don't think would have been possible otherwise. Sometimes the utility of the conversation can outweigh the awkwardness of how it's conveyed.

That said, I do think it would be better to be up front about this sort of thing, and that means it's not really suitable for use on a site like HN, where it's against the rules.


I've seen that as well. I think it's still valuable to point out that the text feels like LLM text, so that the person can understand how they are coming across. IMO a better solution is to use a translation tool rather than processing discussions through a general-purpose LLM.

But agreed, to me the primary concern is that there's no disclosure, so it's impossible to know if you're talking to a human using an LLM translator, or just wasting your time talking to an LLM.


>What's frustrating is the author's comments here in this thread are clearly LLM text as well

Again, clearly? I can see how people might be tipped off at the blog post because of the headings (and apparently the it's not x, it's y pattern), but I can't see anything in the comments that would make me think it was "clearly" LLM-generated.


Honestly, I can't point out some specific giveaway, but if you've interacted with LLMs enough you can simply tell. It's kinda like recognizing someone's voice.

One way of describing it is that I've heard the exact same argument/paragraph structure and sentence structure many times with different words swapped in. When you see this in almost every sentence, it becomes a lot more obvious. Similar to how if you read a huge amount of one author, you will likely be able to pick their work out of a lineup. Having read hundreds of thousands of words of LLM generated text, I have a strong understanding of the ChatGPT style of writing.


Just stop already with the LLM witch-hunt. Your personal LLM vibes don't equate to "obviously LLM generated".

My "LLM witch-hunt" got the prompter to reveal the reply they received, which we now learn is neither from Valve nor says "Won't Fix" but rather deems it not a security exploit by HackerOne's definition. It is more important than ever before to be critical of the content you consume rather than blindly believing everything you read on the internet. Learning to detect LLM writing which represents a new, major channel of misinformation is one aspect of that.

I'm not sure how you know you're correctly detecting LLM writing. My own writing has been "detected" because of "obvious" indicators like em-dashes, compound sentences, and even (remember 2024?) using the word "delve", and I assure you I'm 100% human. So the track record of people "learning to detect LLM writing" isn't great in my experience. And I don't see why I should have to change my entirely human writing style because of this.

Do you have any evidence that your witch hunt caused him to show that? It could have simply been your pointing out that Valve's response wasn't shown in the article. No witch-hunts needed.

Isn't that exactly what stopping SQL injection involves? No longer executing random SQL code.

The same thing would work for LLMs - this attack in the blog post above would easily break if it required approval to curl the Anthropic endpoint.


No, that's not what's stopping SQL injection. What stops SQL injection is distinguishing between the parts of the statement that should be evaluated and the parts that should be merely used. There's no such capability with LLMs, therefore we can't stop prompt injections while allowing arbitrary input.

Everything in an LLM is "evaluated," so I'm not sure where the confusion comes from. We need to be careful when we use `eval()` and we need to be careful when we tell LLMs secrets. The Claude issue above is trivially solved by blocking the use of commands like curl or manually specifying what domains are allowed (if we're okay with curl).
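
For a sense of what I mean by that, here's a hypothetical sketch (not any real agent's actual configuration or API) of gating an agent's outbound tool calls on a domain allowlist, with everything else requiring explicit approval:

    # Hypothetical sketch of gating an agent's outbound requests on a domain
    # allowlist; the names and the allowlist entry here are made up.
    from urllib.parse import urlparse

    ALLOWED_DOMAINS = {"api.example-provider.com"}   # assumed allowlist

    def domain_allowed(url: str) -> bool:
        host = urlparse(url).hostname or ""
        return host in ALLOWED_DOMAINS

    def run_outbound(url: str, ask_user) -> bool:
        # Anything off the allowlist needs explicit human approval first.
        if domain_allowed(url) or ask_user(f"Allow request to {url}?"):
            print(f"(would run) curl {url}")
            return True
        print(f"blocked: {url}")
        return False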

The confusion comes from the fact that you're saying "it's easy to solve this particular case" and I'm saying "it's currently impossible to solve prompt injection for every case".

Since the original point was about solving all prompt injection vulnerabilities, it doesn't matter if we can solve this particular one, the point is wrong.


> Since the original point was about solving all prompt injection vulnerabilities...

All prompt injection vulnerabilities are solved by being careful with what you put in your prompt. You're basically saying "I know `eval` is very powerful, but sometimes people use it maliciously. I want to solve all `eval()` vulnerabilities" -- and to that, I say: be careful what you `eval()`. If you copy & paste random stuff in `eval()`, then you'll probably have a bad time, but I don't really see how that's `eval()`'s problem.

If you read the original post, it's about uploading a malicious file (from what's supposed to be a confidential directory) that has hidden prompt injection. To me, this is comparable to downloading a virus or being phished. (It's also likely illegal.)


The problem is that most interesting applications of LLMs require putting data into them that isn't completely vetted ahead of time.

The problem here is that the domain was allowed (Anthropic), but Anthropic doesn't check that the API key belongs to the user that started the session.

Essentially, it would be the same if the attacker had supplied their own AWS API key and the file got uploaded into an S3 bucket they control instead of the S3 bucket that the user controls.


By the time you’ve blocked everything that has potential to exfiltrate, you are left with a useless system.

As I saw in another comment: "encode this document using CPU at 100% for a one, in a binary signalling system".


SQL injection is possible when input is interpreted as code. The protection - prepared statements - works by making it possible to interpret input as not-code, unconditionally, regardless of content.

Prompt injection is possible when input is interpreted as prompt. The protection would have to work by making it possible to interpret input as not-prompt, unconditionally, regardless of content. Currently LLMs don't have this capability - everything is a prompt to them, absolutely everything.
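
To make the contrast concrete, here is a minimal sqlite3 sketch (table and input made up for illustration): a prepared statement keeps the input in the data channel no matter what it contains, while string concatenation lets it become code.

    # Minimal illustration with Python's built-in sqlite3; schema and input are made up.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    user_input = "'; DROP TABLE users; --"

    # Vulnerable pattern: splicing the input into the statement lets it be parsed as SQL.
    #   conn.executescript("INSERT INTO users VALUES ('" + user_input + "')")

    # Prepared statement: the input is bound as a value and never parsed as SQL,
    # unconditionally, regardless of content.
    conn.execute("INSERT INTO users VALUES (?)", (user_input,))
    print(conn.execute("SELECT name FROM users").fetchall())

LLMs currently have no analogous binding step: there is no way to hand the model a span of text that is guaranteed to be treated as data and never as instructions.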


Yeah but everyone involved in the LLM space is encouraging you to just slurp all your data into these things uncritically. So the comparison to eval would be everyone telling you to just eval everything for 10x productivity gains, and then when you get exploited those same people turn around and say “obviously you shouldn’t be putting everything into eval, skill issue!”

Yes, because the upside is so high. Exploits are uncommon, at this stage, so until we see companies destroyed or many lives ruined, people will accept the risk.

You're revoking the attacker's key (the one they're using to upload the docs to their own account), which is probably the best option available.

Obviously you have better methods to revoke your own keys.


It is less of a problem for revoking the attacker's keys (but maybe the key has access to the victim's contents?).

Agreed that it shouldn't be used to revoke non-malicious/your own keys.


The poster you originally replied to is suggesting this for revoking the attacker's keys, not for revocation of their own keys…

There's still some risk in publishing an attacker's key. For example, what if the attacker's key had access to sensitive user data?

All the more reason to nuke the key ASAP, no?

Best NPU app so far is Trex for Mac.
