I believe it is the site administrators who have inserted Cloudflare in between their sites and their users.
Usually it is done for rational reasons of establishing a protection against bots. But what is less rational, in my opinion, is when everyone uses the same provider for that.
Because it indirectly turns Cloudflare into a monopoly. And monopolies often converge to a state in which they start to abuse their position.
Please do not use the term communist lightly, i.e. as an umbrella term for people who express ideas that e.g. more government control or regulation is in some circumstances reasonable.
The only forms of communism that have ever actually materialized in society have all been authoritarian regimes or outright dictatorships. Where the only "truth" is dictated by the governing leader or party. Where you cannot express your opinion freely. Where you cannot e.g. go to a university or have a slightly better job unless you are loyal to the party establishment. Where people are afraid to talk to their neighbors about politics because they cannot know who is going to report them for anti-government opinions. Where people are persecuted, imprisoned or even killed for their opinions.
To the best of my knowledge, Bernie Sanders has never expressed such ideas.
One might argue that here we are talking about the purely academic definition of communism. But unfortunately, in the real world, there is no such thing as academic communism. So far it has always come with the dictatorship and with people who abuse it. Always.
I'm not proposing anything. I don't think it's the government's remit to be honest. But government seizing the means of production is literally the definition of communism.
In general, it seems to me that an abstract resource like AI cannot possibly be regulated. Even if the US forced their hand and took ownership of the controlling stakes in the current major AI companies, what stops other AI companies from rising up and doing whatever they want?
Perhaps the assumption is that these large AI companies need large datacenters to operate and that is how they will be regulated. But what about the datacenters outside the US jurisdiction? And what about local AI?
In the old days, the computers were huge and there was one per city. Now, several decades later, we all have plenty of our own computers. I cannot imagine why the trend would not continue with AI. Over time, it is in my opinion plausible that most of our common needs would be satisfied by local AI running on one's home servers or even phones.
How is that going to be regulated by owning a controlling stake in a few US AI companies?
I do not see into the details of what Mr. Bernie Sanders is suggesting. It seems to me though that his idea of somehow regulating the AI needs further development. Because the currently discussed approaches seem to me like a hot take that has not been thought over very well.
It is in my opinion reasonable to call out any violations of any law or any violations of the users' or companies' privacy as they are spotted. And everyone is best suited to spot issues in areas or fields in which they operate.
> time spent on reducing memory footprints is seen as wasteful by the business
I think that there is a way to change that.
If an application runs significantly better on lower end hardware while delivering the same results, the customers should prefer it. It is just a matter of promoting it that way.
> but virtually impossible to block the text itself
Why do you believe so?
As long as there is a clear indication somewhere on the webpage (in the metadata or in the text itself) that a specific portion of a text is an ad, a browser extension will be able to block it.
And I assume that there are laws mandating that the ads must be clearly marked in order to be distinguishable from the genuine content.
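To illustrate what I mean, here is a minimal sketch of the filtering step, assuming the ad is wrapped in an element carrying a marker. The `data-ad="true"` attribute is purely hypothetical (I made it up for this example); a real extension would run in the browser and would have to match whatever marking the site actually uses.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class AdStripper(HTMLParser):
    """Collects page text, skipping everything inside an element marked as an ad.

    The data-ad="true" marker is an assumption made for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while we are inside a marked ad element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth > 0:
            self.skip_depth += 1  # nested element inside an ad
        elif ("data-ad", "true") in attrs:
            self.skip_depth = 1  # entering a marked ad element

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def strip_ads(html):
    """Return the visible text of `html` with marked ad elements removed."""
    parser = AdStripper()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = '<p>Real answer.</p><span data-ad="true">Buy <b>Brand</b> now!</span><p>More text.</p>'
print(strip_ads(page))
```

This only works because the ad is explicitly marked. If the marking is missing, there is nothing for the filter to match on.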
That's only doable if the ads are artificially injected. But what if they are part of the training, system prompt or the search results that are fed to the AI? What if Google Search bumps its paying advertisers up in the internal search results for Gemini (as they are basically already doing)? The AI will be biased towards the advertisers without literally embedding an ad into the output text.
They won't be if the models are "free", which is the case for AI Mode in Google Search. That's why common people still use Google despite it being an ad-ridden slopfest, it's "free"!
It's just gonna say "this whole thing might be a big ad" and they will fight the fines in court for years, lose and book those fines as cost of doing business while laughing all the way to the bank
Enshittification of the AI tools has officially begun.
Maybe we will soon find e.g. AI-generated pictures of ourselves in branded clothes or using branded products appearing among our photos, discreetly disguised as genuine photos, with a little badge in the corner indicating that it is actually a paid "promotion".
And so on. And that would still be, in my opinion, just the beginning.
> hoping the agents will get so good in near future, that there won't be the need for understanding the codebase
Agents might get better. But who will own the code and take responsibility for it? The AI agent? The company who created the AI agent?
If e.g. a car crashes and does not deploy its airbags because the AI agent made a mistake in the airbag code, will the manufacturer be able to shift the blame to OpenAI or Anthropic?
I do not think so.
And therefore I believe that no matter how good the AI agents will ever become, the ultimate responsibility for the code will always remain with the companies that create the code. Regardless of which AI tools they use.
I see no other way to bear that responsibility by the company than to have people internally who will be responsible. And those people, if they actually want to own that responsibility, would need to understand that code themselves, in my opinion. Because relying on a non-deterministic AI agent's vetting is fundamentally unreliable, in my opinion.
The developers signing off on this will be "Human crumple zones" to protect the company from liability. Be very cautious if asked to sign off on anything like this.
This is why nearly all people that write code are not engineers. No "Software Engineer" would be willing to sign off on their code like this, yet this is the level of safety guarantee that real engineering is about.
> And I haven't written a single line of code myself since what - February maybe?
Have you measured the impact of that on your ability to create good code? From my experience, relying on AI tends to degrade that ability.
Also, you seem to be able to do all of what you say and benefit from AI tools because you seem to understand the overall bigger picture well enough to be able to drive the AI agents to do their work properly. In other words, you operate in a familiar territory where you do not need to learn many new things.
But what about the junior people with little experience? Will they be able to manage such AI workflow? And more importantly, if junior people are given such AI tools, how will they learn?
These are all questions which may not matter in the short term and one might ignore them if they just want to see the profits and efficiency gains during the next cycle. But what about the long term?
I understand what you mean, but in my opinion there's a big difference between writing in natural language and actively engaging your brain with writing code, looking up documentation, etc.
It also sort of feels like "you don't know what you don't know", i.e. would you have considered an alternative better solution if you thought about it yourself, went to the documentation, found a tutorial on the web?
Of course, production is arguably a lot faster, but it feels like a trade-off is emerging: the models feel so capable that we stop trying to find the solution to the problem ourselves, and thus perhaps degrade our personal reasoning capabilities. I say this as something I'm afraid is happening, not something I'm certain of.
A compiler is a predictable, testable, deterministic piece of software.
An LLM is not.
Sure, all abstractions leak; so, at some point in time, for some reason, you may need to check its compiled code ( coughcough gcc 2.96 ). But, if today your code compiles properly, it will properly compile tomorrow as well.
LLMs can be deterministic as well - same prompt on the same model produces the same input. On the other hand, compilers can be quite undeterministic - you get a new version of compiler, or change compiler options (turn on optimizations) - you might get a very different binary. And JIT compilers (and GC languages) are even less deterministic: their compilation can depend on the nature of the inputs.
But I think, in the analogy compiler ~ LLM, the issue is more one of trust than of determinism. It took assembler programmers decades to trust compilers enough not to write code in assembler. Something similar will happen with AI - some will embrace it sooner than others.
> LLMs can be deterministic as well - same prompt on the same model produces the same input
> compilers can be quite undeterministic - you get a new version of compiler, or change compiler options (turn on optimizations)
That’s a whole other level of bad faith argument right here. Flags and options are input too.
> It took assembler programmers decades to trust compilers enough not to write code in assembler.
You do realize that Cobol, Algol, and Lisp are very old, and they were not assembly? And that Unix was written in C shortly after the language was created?
> That’s a whole other level of bad faith argument right here.
Not sure where you see the bad faith argument. (Btw I mean "same output", not "same input", it was a typo.)
Take for example JVM. It used to be horribly bad and unpredictable, performance wise, in the 90s. Sun tried to base a desktop environment on it - it didn't work.
> You do realize that Cobol, Algol, and Lisp are very old, and they were not assembly.
Of course! But people have been hand-writing assembler until late 2000s, because compilers were simply not that good.
The same will happen with LLMs - some people will not trust it and won't use it for decades, possibly. Some have already embraced it.
Your proof for your argument that a compiler is undeterministic is to change the whole compiler to another version and say it won’t produce the same output as the old one.
> But people have been hand-writing assembler until late 2000s, because compilers were simply not that good.
And we have software like Unix, emacs, ksh, awk… that’s all written in C. I strongly believe that those people who were writing assembly were optimizing stuff or dealing with constraints (like the 640 KB of DOS). Just like today, you may still have to write assembly for microcontrollers or video codecs. Compilers were expensive, but people were paying for them.
> Your proof for your argument that a compiler is undeterministic is to change the whole compiler to another version and say it won’t produce the same output as the old one.
Fair enough. What I meant though was that compilation as a process is not deterministic, because often when you recompile a couple of years later, you're using a different compiler. (In the modern world it can be a much shorter time, actually.)
> And we have software like Unix, emacs, ksh, awk… that’s all written in C.
So? IIRC, first compiler was FORTRAN, invented in 1958. OpenAI Codex, first coding LLM, came out August 2021. So we are like in a year 1963. For this comparison, we have ten more years to produce (using a coding LLM) a compiler and operating system just from the textual specification, without an intermediate formal programming language. Funny - we have actually already done that (Claude C Compiler, VibexOS).
> So? IIRC, first compiler was FORTRAN, invented in 1958. OpenAI Codex, first coding LLM, came out August 2021. So we are like in a year 1963. For this comparison, we have ten more years to produce (using a coding LLM) a compiler and operating system just from the textual specification, without an intermediate formal programming language.
Nope, the timeframe would have been three years
In 1961, the MCP was the first OS written exclusively in a high-level language (HLL).[0]
So by 2024, we should all have been able to verify that LLMs are reliable to produce a good enough product. Instead, it’s just slop everywhere, where the one producing it does not even care about its creation.
are you saying ai writes code that is semantically wrong? because i dont think humans write deterministic code - they come up with different solutions to the same problem.
This would only be somewhat equivalent if you compiled your code into assembly and committed that output to the repo, and then had to continue development within the assembly codebase using the same method.
How is that relevant to the topic of this discussion?
Compilation from higher order languages to the machine code is deterministic. It is sufficient to review and well-test the tool which does the translation. Given the same input, the output will always be the same.
Transformation of a natural language prompt to code by an AI tool is non-deterministic. The outputs will vary between runs. Therefore, it is always necessary to verify them.
Compilation is not deterministic, see JITs and GCs. What is deterministic is the resulting program output, but not its performance. So with compilers, we traded away the determinism over performance in exchange for ease of programming.
With LLMs, we are trading away the determinism of the program output as well, in exchange for even easier programming. Is it a good or bad thing? There are ways to mitigate the problem, just like there are with compilers.
You could argue the determinism of the program output was never really there, because the specification at the high enough level was always unclear. So we are not really losing that much, just accepting more messy reality.
Then the only question remains, can these computer programs (LLMs) do a better job (and where) than a SW developer, who is supposed to translate unclear specifications into a formal language (source code). It happened with compilers - eventually they got better than all of assembler programmers. Same happened to chess players.
> Compilation is not deterministic, see JITs and GCs. What is deterministic is the resulting program output, but not its performance.
Does the JIT compile some other program's code instead of the one being run? Does it produce bytecode for a different VM? Does it try to compile parts of the program that have not been executed or aren’t going to be?
Does the GC destroy objects that are in use? Does it ignore instances and memory that have been properly released?
JITs and GC are deterministic algorithms, you can predict their behavior by just reading their code. LLM tooling involves an actual random generator for its output.
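To make the "random generator" point concrete, here is a toy sketch of the decoding step only. The tokens and scores in `LOGITS` are made up, and `greedy`, `sample` and `generate` are hypothetical helper names, not anyone's real API. It also ignores other possible sources of variation in real inference stacks (e.g. floating-point or batching effects), so treat it as an illustration under those assumptions.

```python
import math
import random

# Toy next-token scores; tokens and numbers are invented for illustration.
LOGITS = {"return": 2.0, "yield": 1.5, "raise": 0.5}


def softmax(logits, temperature):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - top) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def greedy(logits):
    """Always pick the highest-scoring token: no randomness involved."""
    return max(logits, key=logits.get)


def sample(logits, temperature, rng):
    """Draw one token at random according to the softmax probabilities."""
    probs = softmax(logits, temperature)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


def generate(n, temperature, seed=None):
    """Sample n tokens; seed=None means a fresh, unpredictable RNG state."""
    rng = random.Random(seed)
    return [sample(LOGITS, temperature, rng) for _ in range(n)]


print(greedy(LOGITS))                # same token on every run
print(generate(5, 1.0, seed=42))     # repeatable, because the seed is fixed
print(generate(5, 1.0))              # may differ from run to run
```

So the sampling step is random by design, but it becomes repeatable once the seed (or greedy decoding) is pinned. Whether a given hosted model lets you pin it is a separate question.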
> Does the JIT compile some other program's code instead of the one being run? Does it produce bytecode for a different VM? Does it try to compile parts of the program that have not been executed or aren’t going to be?
Sure, but the same is true for LLMs - the lead models no longer make trivial mistakes like answering "What is the capital of France?" wrong.
> JITs and GC are deterministic algorithms, you can predict their behavior by just reading their code.
On large enough systems, you can't, just like it's difficult to predict weather. Determinism has little to do with it. At work, I have just witnessed a bug in JIT (it seems to have been fixed in OpenJDK 25). It inlined a wrong method. We weren't able to reproduce the error conditions without a private customer dataset.
And the fact is, historically, there have been many bugs in compilers, or they have been bad at their job, writing performant programs. The output (resulting program) of a good compiler is difficult to understand (because it is written to be efficient). LLMs (for the programming use case) are different quantitatively, not qualitatively.
It’s really weird how you shift the goalposts and your own definitions.
No one is saying that a compiler can’t have bugs. What we have been saying is that if we take the compiler as a black box, we’re reasonably certain, given that we know the input, what the output will be. And the output will stay the same if you keep the input the same.
But you can send the LLM the same prompt, and it will give you a different answer each time. And it’s not even about the verbiage used.
LLM doesn't have to be non-deterministic, it can work just like any other deterministic algorithm.
But I am not sure why the insistence on the relevance of (non)determinism, rather than on the chaotic relation of the output to the input (which is true for both compilers and LLMs). In practice, inputs to the LLM, as well as to the compiler, change. And the fact is, the output can change radically due to that.
I think nobody really sends the same prompt twice to the LLM, so nobody cares about it being deterministic. I think what you're looking for is something different, some form of stability (as opposed to chaotic behavior). Although it's hard to define exactly, because in the case of LLMs theory lags behind practice. (And as I said - we already gave up on stability with respect to performance by using compilers. We resolve that issue by doing performance testing.)
(I asked AI what's the opposite of "chaotic", I use "stable", but it seems that people use "deterministic" or "predictable" also in that meaning. So if you're using "deterministic" in that meaning, then you don't really care about sampling and temperature, i.e. determinism in the philosophical sense, but rather whether the output is consistent, albeit expressed differently.)
The whole point of technology is control and consistency. Even with random parameters, we want their values to come from a specific set. When I use a tool, I want it to produce the outcome I want, not any other outcome it wants to produce. If it fails at that, it’s a defective tool.
> Compilation from higher order languages to the machine code is deterministic.
but that's not the analogy. there are problems that you can solve better if you can go deeper in the stack, and they can have different solutions.
The usual response to this is the "but high level languages are deterministic blah blah blah" (which IMO would be a good enough argument but well, we know how this goes now)
I posit a different argument. When you install a compiler on your computer, that compiler is "yours" for as long as you have the binary. You are able to completely forget about assembly because of 1. reliable _enough_ compiler 2. reliable access to said compiler.
Let's rewind decades back and pretend that the very first assembly compiler was behind a monthly subscription*. Do you think we'd be in the same place now?
Now the natural follow up to this "but the open models are close to SotA now". Well why aren't we using them? Do we really think we'd have a GNU moment for """open""" models? And are we willing to bet our industry on that?
But my point is, _these are not the same things_ and positing them as such is frankly insulting. How good are you at writing assembly when your compiler is inevitably taken away?
* I'm not a historian so I wouldn't be surprised some version of them were
This is a great point! And not only a compiler behind a subscription, it's also a compiler whose financial interests are not aligned to be the best compiler but the one that makes the most money, which is unclear what it means at this moment. Will it have ads? Will it give preference to some technology over another? Will it steal your code? It's an unreliable and opaque compiler!
We are though? It just depends on the task and the costs.
> Do we really think we'd have a GNU moment for """open""" models? And are we willing to bet our industry on that?
Yes and yes. We're in the mainframe era. But history this time around is passing us by at a ridiculously fast clip. Local models become "good enough" for new tasks by the day, after which they continue to shrink for a given performance level.
I'm not going to bet against either Moore's law or relentless increases in model efficiency any time soon.
There is an argument I’ve been seeing more recently for why we should expect open models to eventually become good enough that people use them over frontier commercial models.
Basically it boils down to geopolitics. The US economy is currently being propped up by a small subset of companies, and a lot of that is based on proprietary models and speculation in the market around them. China is going to continue to dump better and better free models out to compete, thus pulling the rug out from under all that speculation.
Interactions with agents are conversational, while higher order langs are declarative. Spec driven development has been failing us, because there is no feedback loop from the runtime to the spec.
> you're pushing a behavior-modification scheme onto users
In general I think that your comment is reasonable. I just would like to point out that such "behavior-modification" schemes are sometimes introduced for genuinely good and ethical reasons.
For instance, it is in my opinion desirable to make it more difficult for users to delete all their photos by e.g. having to confirm their decision in a dialog first. Because it prevents them from accidentally doing something they might not want to do and which is potentially impossible to revert.
I am unaware of such a capability of Cloudflare.