More

martin-t · 2026-01-11T12:18:08 1768133888

Moving training won't help them if their paying customers are in jurisdictions which do respect copyright as written and intended.

luke5441 · 2026-01-11T12:37:59 1768135079

OPs idea is about having a new GPL like license with a "may not be used for LLM training" clause.

That the LLM itself is not allowed to produce copyrighted work (e.g. just copies of works or too structurally similar) without using a license for that work is something that is probably currently law. They are working around this via content filters. They probably also have checks during/after training that it does not reproduce work that is too similar. There are law suits about this pending if I remember correctly e.g. with the New York Times.

martin-t · 2026-01-11T12:52:04 1768135924

The issue is that everyone is focusing on verbatim (or "too similar") reproduction.

LLMs themselves are compressed models of the training data. The trick is the compression is highly lossy by being able to detect higher-order patterns instead of fucusing on the first-order input tokens (or bytes). If you look at how, for example, any of the Lempel-Ziv algorithms work, they also contain patterns from the input and they also predict the next token (usually byte in their case), except they do it with 100% probability because they are lossless.

So copyright should absolutely apply to the models themselves and if trained on AGPL code, the models have to follow the AGPL license and I have the right to see their "source" by just being their user.

And if you decompress a file from a copyrighted archive, the file is obviously copyrighted. Even if you decompress only a part. What LLMs do is another trick - by being lossy, they decompress probabilistically based on all the training inputs - without seeing the internals, nobody can prove how much their particular work contributed to the particular output.

But it is all mechanical transformation of input data, just like synonym replacement, just more sophisticated, and the same rules regarding plagiarism and copyright infringement should apply.

---

Back to what you said - the LLM companies use fancy language like "artificial intelligence" to distract from this so they can they use more fancy language to claim copyright does not apply. And in that case, no license would help because any such license fundamentally depends on copyright law, which as they claim does not apply.

That's the issue with LLMs - if they get their way, there's no way to opt out. If there was, AGPL would already be sufficient.

luke5441 · 2026-01-11T13:04:48 1768136688

I agree with your view. One just has to go into courts and somehow get the judges to agree as well.

An open question would be if there is some degree of "loss" where copyright no longer applies. There is probably case law about this in different jurisdictions w.r.t. image previews or something.

martin-t · 2026-01-11T19:50:18 1768161018

I don't think copyright should be binary or should work the way it does not. It's just the only tool we have now.

There should be a system which protects all work (intellectual and physical) and makes sure the people doing it get rewarded according to the amount of work and skill level. This is a radical idea and not fully compatible with capitalism as implemented today. I have a lot on my to-read list and I don't think I am the first to come up with this but I haven't found anyone else describing it, yet.

And maybe it's broken by some degenerate case and goes tits up like communism always did. But AFAICT, it's a third option somewhere in between, taking the good parts of each.

For now, I just wanna find ways to stop people already much richer than me from profiting from my work without any kind of compensation for me. I want inequality to stop worsening but OTOH, in the past, large social change usually happened when things got so bad people rejected the status quo and went to the streets, whether with empty hands or not. And that feels like where we're headed and I don't know whether I should be exited or worried.

martin-t · 2026-01-11T12:05:58 1768133158

With LLMs, if you did the first in the past, then no matter what license you chose, your work is now in the second category, except you don't get a dime.

martin-t · 2026-01-11T11:46:16 1768131976

> Not just to the people I agree with, but to anyone who needs to use a computer.

Why not say "... but to the people I disagree with"?

Would you be OK knowing your code is used to cause more harm than good? Would you still continue working on a hypothetical OSS which had no users, other than, say, a totalitarian government in the middle east which executes homosexuals? Would you be OK with your software being a critical directly involved piece of code for example tracking, de-anonymizing and profiling them?

Where is the line for you?

stravant · 2026-01-11T15:30:33 1768145433

As for me that's a risk I'm willing to accept in return for the freedom of the code.

I'm not going to deliberately write code that's LIKELY to do more harm than good, but crippling the potential positive impact just because of some largely hypothetical risk? That feels almost selfish, what would I really be trying to avoid, personally running into a feel-bad outcome?

martin-t · 2026-01-11T17:27:02 1768152422

I think it would be most interesting to find ways to restrict bad usage without crippling the positive impact.

Douglas Crockford[0] tried this with JSON. Now, strictly speaking, this does not satisfy the definition of Open Source (it merely is open source, lowercase). But after 10 years of working on Open Source, I came to the conclusion that Open Source is not the absolute social good we delude ourselves into thinking.

Sure, it's usually better than closed source because the freedoms mean people tend to have more control and it's harder for anyone (including large corporations) to restrict those freedoms. But I think it's a local optimum and we should start looking into better alternatives.

Android, for example, is nominally Open Source but in reality the source is only published by google periodically[1], making any true cooperation between the paid devs and the community difficult. And good luck getting this to actually run on a physical device without giving up things like Google Play or banking apps or your warranty.

There's always ways to fuck people over and there always will be but we should look into further ways to limit and reduce them.

[0]: https://en.wikipedia.org/wiki/Douglas_Crockford

[1]: https://www.androidauthority.com/aosp-source-code-schedule-3...

babarock · 2026-01-14T17:09:11 1768410551

> Open Source is not the absolute social good we delude ourselves into thinking.

Historically the term "Open Source" was specifically developed to divorce the movement from the "social good" ideas that were promoted by Free Software.

That's where I stand. I don't do Open Source to make the world better. I do Open Source because I believe that makes my software better.

I'm not an activist. I'm an engineer. Nothing wrong with activism, all the power to the people doing it, but the licensing I chose for my code doesn't take it into account.

layer8 · 2026-01-11T15:26:51 1768145211

I agree with the GP. While I wouldn’t be happy about such uses, I see the use as detached from the software as-is, given (assuming) that it isn’t purpose-built for the bad uses. If the software is only being used for nefarious purposes, then clearly you have built the wrong thing, not applied the wrong license. The totalitarian government wouldn’t care about your license anyway.

The one thing I do care about is attribution — though maybe actually not in the nefarious cases.

martin-t · 2026-01-11T17:33:10 1768152790

> The totalitarian government wouldn’t care about your license anyway.

I see this a lot and while being technically correct, I think it ignores the costs for them.

In practice such a government doesn't need to have laws and courts either but usually does because the appearance of justice.

Breaking international laws such as copyright also has costs for them. Nobody will probably care about one small project but large scale violations could (or at least should) lead to sanctions.

Similarly, if they want to offer their product in other countries, now they run the risk of having to pay fines.

Finally, see my sibling comment but a lot of people act like Open Source is an absolute good just because it's Open Source. By being explicit about our views about right and wrong, we draw attention to this delusion.

layer8 · 2026-01-11T17:58:02 1768154282

It’s fine to use whatever license you think is right. That includes the choice of using a permissive license. Restrictions are generally an impediment for adoption, due to their legal risk, even for morally immaculate users. I think that not placing usage restrictions on open source is just as natural as not placing usage restrictions on published research papers.

martin-t · 2026-01-11T19:04:13 1768158253

Tragedy of the commons. If all software had (compatible) clauses about permitted usage, then the choice would be to rewrite it inhouse or accept the restrictions. When there are alternatives (copyleft or permissive) which are not significantly worse, those will get used instead, even if taken in isolation, the restricted software was a bigger social good.

martin-t · 2026-01-11T11:42:20 1768131740

During the gold rush, it is said, the only people who made money were the ones selling the pickaxes. A"I" companies are ~selling~ renting the pickaxes of today.

(I didn't come up with this quote but I can't find the source now. If anything good comes out of LLMs, it's making me appreciate other people's more and trying to give credit where it's due.)

netsharc · 2026-01-11T15:29:44 1768145384

Wasn't it shovels?

NVidia is a shovel-maker worth a few trillion dollars...

kapsi · 2026-01-11T12:25:16 1768134316

What about the people who sold gold? Didn't they make money?

martin-t · 2026-01-11T12:30:05 1768134605

To be honest, I haven't looked at any statistics but I imagine a tiny few of those looking for gold found any and got rich, the most either didn't find anything, died of illness or exposure or got robbed. I just like the quote as a comparison. Updated the original comment to reflect I haven't checked if it's correct.

martin-t · 2026-01-11T11:38:27 1768131507

I recall a basics of law class saying that in some countries (e.g. Czech Republic), open source contributors have the right to small compensation if their work is used to a large financial benefit.

At some point, I'll have to look it up because if that's right, the billionaires and wannabe-trillionaires owe me a shitton of money.

martin-t · 2026-01-11T11:35:40 1768131340

If you want, I made a coherent argument about how the mechanics of LLMs mean both their training and inference is plagiarism and should be copyright infringement.[0] TL;DR it's about reproducing higher order patterns instead of word for word.

I haven't seen this argument made elsewhere, it would be interesting to get it into the courtrooms - I am told cases are being fought right now but I don't have the energy to follow them.

Plus as somebody else put it eloquently, it's labor theft - we, working programmers, exchanged out limited lifetime for money (already exploitative) in a world with certain rules. Now the rules changed, our past work has much more value, and we don't get compensated.

[0]: https://news.ycombinator.com/item?id=46187330

williamcotton · 2026-01-11T12:59:47 1768136387

The first thing you need to do is brush up on some IP law around software in the United States. Start here:

https://en.wikipedia.org/wiki/Idea–expression_distinction

https://en.wikipedia.org/wiki/Structure,_sequence_and_organi...

https://en.wikipedia.org/wiki/Abstraction-Filtration-Compari...

In a court of law you're going to have to argue that something is an expression instead of an idea. Most of what LLMs pump out are almost definitionally on the idea side of the spectrum. You'd basically have to show verbatim code or class structure at the expressive level to the courts.

martin-t · 2026-01-11T15:26:01 1768145161

Thanks for the links, I'll read them in more detail later.

There's a couple issues I see:

1) All of the concepts were developed with the idea that only humans are capable of certain kinds of work needed for producing IP. A human would not engage in highly repetitive and menial transformation of other people's material to avoid infringement if he could get the same or better result by working from scratch. This placed, throughout history, an upper limit on how protective copyright had to be.

Say, 100 years ago, synonym replacement and paraphrasing of sentences were SOTA methods to make copies of a book which don't look like copies without putting in more work than the original. Say, 50 years ago, computers could do synonym replacement automatically so it freed up some time for more elaborate restructuring of the original work and the level of protection should have shifted. Say, 10 years ago, one could use automatic replacement of phrases or translation to another language and back, freeing up yet more time.

The law should have adapted with each technological step up and according to your links it has - given the cases cited. It's been 30 years and we have a massive step up in automatic copying capabilities - the law should change again to protect the people who make this advancement possible.

Now with a sufficiently advanced LLM trained on all public and private code, you can prompt them to create a 3D viewer for Quake map files and I am sure it'll most of the time produce a working program which doesn't look like any of the training inputs but does feel vaguely familiar in structure. Then you can prompt it to add a keyboard-controlled character with Quake-like physics and it'll produce something which has the same quirks as Quake movement. Where did bunny hopping, wallrunning, strafing, circlejumps, etc. come from if it did not copy the original and the various forks?

Somebody had to put in creative work to try out various physics systems and figure out what feels good and what leads to interesting gameplay.

Now we have algorithms which can imitate the results but which can only be created by using the product of human work without consent. I think that's an exploitative practice.

2) It's illegal to own humans but legal to own other animals. The USA law uses terms such as "a member of the species Homo sapiens" (e.g. [0]) in these cases.

If the legality of tech in question was not LLMs but remixing of genes (only using a tiny fraction of human DNA) to produce a animals which are as smart as humans with chimpanzee bodies which can be incubated in chimpanzee females but are otherwise as sentient as humans, would (and should) it be legal to own them as slaves and use them for work? It would probably be legal by the current letter of the law but I assure you the law would quickly change because people would not be OK with such overt exploitation.

The difference is the exploitation by LLM companies is not as overt - in fact, mane people refer to LLMs as AIs and use pronouns such as "he" or "she", indicating them believe them to be standalone thinking entities instead of highly compressed lossy archives of other people's work.

3) The goal of copyright is progress, not protection of people who put in work to make that progress possible. I think that's wrong.

I am aware of the "is" vs "should" distinction but since laws are compromises between the monopoly in violence and the people's willingness to revolt instead of being an (attempted) codification of a consistent moral system, the best we can do is try to use the current laws (what is) to achieve what is right (what should be).

[0]: https://en.wikipedia.org/wiki/Unborn_Victims_of_Violence_Act

williamcotton · 2026-01-11T16:47:19 1768150039

But "vaguely familiar in structure" could be argued to be the only reasonable way to do something, depending on the context. This is part of the filtration step in AFC.

The idea of wallrunning should not be protected by copyright.

martin-t · 2026-01-11T19:16:41 1768159001

The thing is a model trained on the same input as current models except Quake and Quake derivatives would not generate such code. (You'd have to prompt it with descriptions of quake physics since it wouldn't know what you mean, depending on whether only code or all mentions were excluded.)

The quake special behaviors are results of essentially bugs which were kept because it led to fun gameplay. The model would almost certainly generate explicit handling for these behaviors because the original quake code is very obviously not the only reasonable way to do it. And in that case the model and its output is derivative work of the training input.

The issue is such an experiment (training a model with specific content excluded) would cost (tens/hundreds of?) millions of dollars and the only companies able to do it are not exactly incentivized to try.

---

And then there's the thing that current LLMs are fundamentally impossible to create without such large amounts of code as training data. I honestly don't care what the letter of the law is, to any reasonable person, that makes them derivative work of the training input and claiming otherwise is a scam and theft.

I always wonder if people arguing otherwise think they're gonna get something out of it when the dust settles or if they genuinely think society should take stuff from a subgroup of people against their will when it can to enrich itself.

williamcotton · 2026-01-11T19:29:06 1768159746

“Exploitative” is not a legal category in copyright. If the concern is labor compensation or market power, that’s a question for labor law, contract law, or antitrust, not idea-expression analysis and questions of derivative works.

ThrowawayR2 · 2026-01-11T17:27:28 1768152448

There was a legal analysis of the copyright implications of Copilot among a set of white papers commissioned by the Free Software Foundation: https://www.fsf.org/licensing/copilot/copyright-implications...

martin-t · 2026-01-11T12:15:52 1768133752

And HN does its thing again - at least 3 downvotes, 0 replies. If you disagree, say why, otherwise I have to assume my argument is correct and nobody has any counterarguments but people who profit from this hate it being seen.

dahart · 2026-01-11T16:22:35 1768148555

I agree that training on copyrighted material is violating the law, but not for the reasons you stated.

That said, this comment is funny to me because I’ve done the same thing too, take some signal of disagreement, and assume the signal means I’m right and there’s a low-key conspiracy to hold me down, when it was far more likely that either I was at least a bit wrong, or said something in an off-putting way. In this case, I tend to agree with the general spirit of the sibling comment by @williamcotton in that it seems like you’re inventing some criteria that are not covered by copyright law. Copyrights cover the “fixation” of a work, meaning they protect only its exact presentation. Copyrights do not cover the Madlibs or Cliff Notes scenarios you proposed. (Do think about Cliff Notes in particular and what it implies about AI - Cliff Notes are explicitly legal.)

Personally, I’ve had a lot of personal forward progress on HN when I assume that downvotes mean I said something wrong, and work through where my own assumptions are bad, and try to update them. This is an important step especially when I think I’m right.

I’m often tempted to ask for downvote explanations too, but FWIW, it never helps, and aside from HN guidelines asking people to avoid complaining about downvotes, I find it also helps to think of downvotes as symmetric to upvotes. We don’t comment on or demand an explanation for an upvote, and an upvote can be given for many reasons - it’s not only used for agreement, it can be given for style, humor, weight, engagement, pity, and many other reasons. Realizing downvotes are similar and don’t only mean disagreement helps me not feel personally attacked, and that can help me stay more open to reflecting on what I did that is earning the downvotes. They don’t always make sense, but over time I can see more places I went wrong.

martin-t · 2026-01-11T19:32:54 1768159974

> or said something in an off-putting way

It shouldn't matter.

Currently, downvote means "I want this to be ranked lower". There really should be 2 options "factually incorrect" and "disagree". For people who think it should matter, there should be a third option, "rude", which others can ignore.

I've actually emailed about this with a mod and it seems he conflated talking about downvotes with having to explain a reason. He also told me (essentially) people should not have the right to defend themselves against incorrect moderator decisions and I honestly didn't know what to say to that, I'll probably message him again to confirm this is what he meant but I don't have high hopes after having similar interactions with mods on several different sites.

> FWIW, it never helps

The way I see it, it helped since I got 2 replies with more stuff to read about. Did you mean it doesn't work for you?

> downvotes as symmetric to upvotes

Yes, and we should have more upvote options too. I am not sure the explanation should be symmetric though.

Imagine a group conversation in which somebody lies (the "factually incorrect" case here). Depending on your social status within the group and group politics, you might call out the lie in public, in private with a subset or not at all. But if you do, you will almost certainly be expected to provide a reasoning or evidence.

Now imagine he says something which is factually correct. If you say you agree, are you expected to provide references why? I don't think so.

---

BTW, on a site which is a more technical alternative to HN, there was recently a post about strange behavior of HN votes. Other people posted their experience with downvotes here and they mirrored mine - organic looking (i.e. gradual) upvotes, then within minutes of each other several downvotes. It could be coincidence but me and others suspect voting rings evading detection.

I also posted a link to my previous comment as an experiment - if people disagree, they are more likely to also downvote that one. But I did not see any change there so I suspect it might be bots (which are unlikely to be instructed to also click through and downvote there). Note sample size is 1 here, for now.

ThrowawayR2 · 2026-01-11T17:30:12 1768152612

Maybe if you constructed your argument in terms of the relevant statutes for your jurisdiction, like an actual copyright attorney does, HN might be more receptive to it?

martin-t · 2026-01-11T19:19:06 1768159146

I argue primarily about morality (right and wrong), not legality. The argument is valid morally, if LLM companies found a loophole ion the law, it should be closed.

ThrowawayR2 · 2026-01-11T19:29:56 1768159796

You literally wrote "it would be interesting to get it into the courtrooms". A court won't give a hoot about your opinions on morality.

martin-t · 2026-01-11T22:52:19 1768171939

1) I appreciate that you differentiate between legality and morality, many people sadly don't.

2) re "hoot": You can say "fuck" here. You've been rudely dismissive twice now, yet you use a veil of politeness. I prefer when people don't hide their displeasure at me.

3) If you think I am wrong, you can say so instead of downvoting, it'll be more productive.

4) If you want me to expend effort on looking up statutes, you can say so instead of downvoting, it'll be more productive.

5) The law can be changed. If a well-reasoned argument is presented publicly, such as in a court room, and the general agreement is that the argument should apply but the court has to reject is because of poorly designed laws, that's a good impetus for changing it.

martin-t · 2026-01-11T11:31:28 1768131088

> programmer who actually do like the actual typing

It's not about the typing, it's about the understanding.

LLM coding is like reading a math textbook without trying to solve any of the problems. You get an overview, you get a sense of what it's about and most importantly you get a false sense of understanding.

But if you try to actually solve the problems, you engage completely different parts of your brain. It's about the self-improvement.

jebarker · 2026-01-11T15:20:16 1768144816

> LLM coding is like reading a math textbook without trying to solve any of the problems.

Most math textbooks provide the solutions too. So you could choose to just read those and move on and you’d have achieved much less. The same is true with coding. Just because LLMs are available doesn’t mean you have to use them for all coding, especially when the goal is to learn foundational knowledge. I still believe there’s a need for humans to learn much of the same foundational knowledge as before LLMs otherwise we’ll end up with a world of technology that is totally inscrutable. Those who choose to just vibe code everything will make themselves irrelevant quickly.

gosub100 · 2026-01-11T15:44:59 1768146299

I haven't used AI yet but I definitely would love a tool that could do the drudgery for me for designs that I already understand. For instance, if I want to store my own structures in an RDBMS, I want to lay the groundwork and say "Hey Jeeves, give me the C++ syntax to commit this structure to a MySQL table using commit/rollback". I believe once I know what I want, futzing over the exact syntax for how to do it is a waste of time. I heard c++ isn't well supported but eventually I'll give it a try.

dehsge · 2026-01-11T15:49:45 1768146585

Most math books do not provide solutions. Outside of calculus, advanced mathematics solutions are left as an exercise for the reader.

jebarker · 2026-01-11T16:09:58 1768147798

The ones I used for the first couple of years of my math PhD had solutions. That's a sufficient level of "advanced" to be applicable in this analogy. It doesn't really matter though - the point still stands that _if_ solutions are available you don't have to use them and doing so will hurt your learning of foundational knowledge.

embedding-shape · 2026-01-11T11:34:55 1768131295

> It's not about the typing, it's about the understanding.

Well, it's both, for different people, seemingly :)

I also like the understanding and solving something difficult, that rewards a really strong part of my brain. But I don't always like to spend 5 hours in doing so, especially when I'm doing that because of some other problem I want to solve. Then I just want it solved ideally.

But then other days I engage in problems that are hard because they are hard, and because I want to spend 5 hours thinking about, designing the perfect solution for it and so on.

Different moments call for different methods, and particularly people seem to widely favor different methods too, which makes sense.

ben_w · 2026-01-11T12:50:40 1768135840

> LLM coding is like reading a math textbook without trying to solve any of the problems. You get an overview, you get a sense of what it's about and most importantly you get a false sense of understanding.

Can be, but… well, the analogy can go wrong both ways.

This is what Brilliant.org and Duolingo sell themselves on: solve problems to learn.

Before I moved to Berlin in 2018, I had turned the whole Duolingo German tree gold more than once, when I arrived I was essentially tourist-level.

Brilliant.org, I did as much as I could before the questions got too hard (latter half of group theory, relativity, vector calculus, that kind of thing); I've looked at it again since then, and get the impressions the new questions they added were the same kind of thing that ultimately turned me off Duolingo, easier questions that teach little, padding out a progressions system that can only be worked through fast enough to learn anything if you pay a lot.

Code… even before LLMs, I've seen and I've worked with confident people with a false sense of understanding about the code they wrote. (Unfortunately for me, one of my weaknesses is the politics of navigating such people).

habinero · 2026-01-11T16:09:46 1768147786

Yeah, there's a big difference between edutainment like Brilliant and Duolingo and actually studying a topic.

I'm not trying to be snobbish here, it's completely fine to enjoy those sorts of products (I consume a lot of pop science, which I put in the same category) but you gotta actually get your hands dirty and do the work.

It's also fine to not want to do that -- I love to doodle and have a reasonable eye for drawing, but to get really good at it, I'd have to practice a lot and develop better technique and skills and make a lot of shitty art and ehhhh. I don't want it badly enough.

williamcotton · 2026-01-11T12:18:18 1768133898

Lately I've been writing DSLs with the help of these LLM assistants. It is definitely not vibe coding as I'm paying a lot of attention to the overall architecture. But most importantly my focus is on the expressiveness and usefulness of the DSLs themselves. I am indeed solving problems and I am very engaged but it is a very different focus. "How can the LSP help orient the developer?" "Do we want to encourage a functional-looking pipeline in this context"? "How should the step debugger operate under these conditions"? etc.

  GET /svg/weather
    |> jq: weatherData
    |> jq: `
      .hourly as $h |
      [$h.time, $h.temperature_2m] | transpose | map({time: .[0], temp: .[1]})
    `
    |> gg({ "type": "svg", "width": 800, "height": 400 }): `
      aes(x: time, y: temp) 
        | line() 
        | point()
    `

I've even started embedding my DSLs inside my other DSLs!

svara · 2026-01-11T11:42:21 1768131741

We've been hearing this a lot, but I don't really get it. A lot of code, most probably, isn't even close to being as challenging as a maths textbook.

It obviously depends a lot on what exactly you're building, but in many projects programming entails a lot of low intellectual effort, repetitive work.

It's the same things over and over with slight variations and little intellectual challenge once you've learnt the basic concepts.

Many projects do have a kernel of non-obvious innovation, some have a lot of it, and by all means, do think deeply about these parts. That's your job.

But if an LLM can do the clerical work for you? What's not to celebrate about that?

To make it concrete with an example: the other day I had Claude make a TUI for a data processing library I made. It's a bunch of rather tedious boilerplate.

I really have no intellectual interest in TUI coding and I would consider doing that myself a terrible use of my time considering all the other things I could be doing.

The alternative wasn't to have a much better TUI, but to not have any.

zahlman · 2026-01-11T11:58:31 1768132711

> It obviously depends a lot on what exactly you're building, but in many projects programming entails a lot of low intellectual effort, repetitive work.

I think I can reasonably describe myself as one of the people telling you the thing you don't really get.

And from my perspective: we hate those projects and only do them if/because they pay well.

> the other day I had Claude make a TUI for a data processing library I made. It's a bunch of rather tedious boilerplate. I really have no intellectual interest in TUI coding...

From my perspective, the core concepts in a TUI event loop are cool, and making one only involves boilerplate insofar as the support libraries you use expect it. And when I encounter that, I naturally add "design a better API for this" to my project list.

Historically, a large part of avoiding the tedium has been making a clearer separation between the expressive code-like things and the repetitive data-like things, to the point where the data-like things can be purely automated or outsourced. AI feels weird because it blurs the line of what can or cannot be automated, at the expense of determinism.

martin-t · 2026-01-11T11:59:49 1768132789

I've also been hearing variations of your comment a lot too and correct me if I am wrong but I think they always implicitly assume that LLMs are more useful for the low-intellectual stuff than solving the high-intellectual core of the problem.

The thing is:

1) A lot of the low-intellectual stuff is not necessarily repetitive, it involved some business logic which is a culmination of knowing the process behind what the uses needs. When you write a prompt, the model makes assumptions which are not necessarily correct for the particular situation. Writing the code yourself forced you to notice the decision points and make more informed choices.

I understand your TUI example and it's better than having none now, but as a result anybody who wants to write "a much better TUI" now faces a higher barrier to entry since a) it's harder to justify an incremental improvement which takes a lot of work b) users will already have processes around the current system c) anybody who wrote a similar library with a better TUI is now competing with you and quality is a much smaller factor than hype/awareness/advertisement.

We'll basically have more but lower quality SW and I am not sure that's an improvement long term.

2) A lot of the high-intellectual stuff ironically can be solved by LLMs because a similar problem is already in the training data, maybe in another language, maybe with slight differences which can be pattern matched by the LLM. It's laundering other people's work and you don't even get to focus on the interesting parts.

svara · 2026-01-11T13:03:47 1768136627

> but I think they always implicitly assume that LLMs are more useful for the low-intellectual stuff than solving the high-intellectual core of the problem.

Yes, this follows from the point the GP was making.

The LLM can produce code for complex problems, but that doesn't save you as much time, because in those cases typing it out isn't the bottleneck, understanding it in detail is.

nkrisc · 2026-01-11T12:20:01 1768134001

And so in the future if you want to add a feature, either the LLM can do it correctly or the feature doesn’t get added? How long will that work as the TUI code base grows?

simonw · 2026-01-11T12:29:15 1768134555

At that point you change your attitude to the project and start treating it like something you care about, take control of the architecture, rewrite bits that don't make sense, etc.

Plus the size of project that an LLM can help maintain keeps growing. I actually think that size may no longer have any realistic limits at all now: the tricks Claude Code uses today with grep and sub-agents mean there's no longer a realistic upper limit to how much code it can help manage, even with Opus's relatively small (by today's standards) 200,000 token limit.

zahlman · 2026-01-11T16:23:38 1768148618

The problem I'm anticipating isn't so much "the codebase grows beyond the agent-system's comprehension" so much as "the agent-system doesn't care about good architecture" (at least unless it's explicitly directed to). So the codebase grows beyond the codebase's natural size when things are redundantly rewritten and stuffed into inappropriate places, or ill-fitting architectural patterns are aped.

svara · 2026-01-11T17:44:13 1768153453

Don't "vibe code". If you don't know what architecture the LLM is producing, you will produce slop.

martin-t · 2026-01-11T11:26:39 1768130799

No, you're absolutely right.

LLMs are labor theft on an industrial scale.

I spent 10 years writing open source, I haven't touched it in the last 2. I wrote for multiple reasons none of which any longer apply:

- I believe every software project should have an open source alternative. But writing open source now means useful patterns can be extracted and incorporated into closed source versions _mechanically_ and with plausible deniability. It's ironically worse if you write useful comments.

- I enjoyed the community aspect of building something bigger than one person can accomplish. But LLMs are trained on the whole history and potentially forum posts / chat logs / emails which went into designing the SW too. With sufficiently advanced models, they effectively use my work to create a simulation of myself and other devs.

- I believe people (not just devs) should own the product they build (an even stronger protection of workers against exploitation than copyright). Now our past work is being used to replace us in the future without any compensation.

- I did it to get credit. Even though it was a small motivation compared to the rest, I enjoyed everyone knowing what I accomplished and I used it during job interviews. If somebody used my work, my name was attached to it. With LLMs, anyone can launder it and nobody knows how useful my work was.

- (not solely LLM related) I believed better technology improves the world and quality of life around me. Now I see it as a tool - neutral - to be used by anyone for both good and bad purposes.

Here's[0] a comment where I described why it's theft based on how LLMs work. I call it higher order plagiarism. I haven't seen this argument made by other people, it might be useful for arguing about those who want to legalize this.

In fact, I wonder if this argument has been made in court and whether the lawyers understand LLMs enough to make it.

[0]: https://news.ycombinator.com/item?id=46187330

martin-t · 2026-01-04T08:49:46 1767516586

I think about better voting systems all the time (one major issue being downvote can mean "I want fewer people to see this", "I disagree", and "This is factually wrong" and you never know which.

But I am not sure if SO's is actually that good, given it led to this toxic behavior.

I think something like slashdot's metamoderation should work best but I never participated there nor have I seen any other website use anything similar.

oofbey · 2026-01-04T18:36:14 1767551774

Arstechnica used to have different kinds of upvotes for "funny" vs "insightful" - I forget exactly all of them. But I found it awesome. I wanted to and could read the insightful comments, not the funny ones. A couple years back they redid the discussion system and got rid of it. Since then the quality of discussion has IMHO completely tanked.

martin-t · 2026-01-04T07:48:42 1767512922

Y'know how "users" of modern tech are the product? And how the developers were completely fine with creating such systems?

Well, turns out developers are now the product too. Good job everyone.