I have no idea if it works, but Anthropic in particular spent a lot of time crawling the tar pit[1] I had running on my domain. They were the reason I set up the tar pit in the first place, as they were at one stage averaging 5 requests per second, for days, on a blog site that probably doesn't even have a hundred pages on it. They've retrieved millions of pages from my tar pit, all of it text generated by a Markov chain from the contents of Moby Dick.
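For anyone curious what that kind of generator looks like, here is a minimal sketch of a word-level Markov chain text generator. It is not the code behind the tar pit above; the function names `build_chain` and `generate`, the `order` parameter, and the sample sentence are all made up for illustration.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map every run of `order` consecutive words to the words seen after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return chain


def generate(chain, length=50, seed=None):
    """Walk the chain and return `length` words of plausible-looking nonsense.

    Assumes the chain is non-empty (the source text had more than `order` words).
    """
    rng = random.Random(seed)
    state = rng.choice(list(chain))
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end (this state only occurred at the end of the text):
            # jump to a random state and keep going.
            state = rng.choice(list(chain))
            out.extend(state)
            continue
        out.append(rng.choice(followers))
        state = tuple(out[-len(state):])
    return " ".join(out[:length])


if __name__ == "__main__":
    # Hypothetical stand-in corpus; a real tar pit would load a whole book.
    sample = "the whale swam and the ship sailed and the whale dived and the ship sank"
    print(generate(build_chain(sample), length=30, seed=42))
```

Each page costs a few dictionary lookups to produce, which is why serving millions of them is cheap for the site owner.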
In my experience, Google (among others) plays nice. Just put "User-agent: *" followed by "Disallow: /" in your robots.txt, and they won't bother you again.
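A blanket block is conventionally written as a `User-agent: *` group containing `Disallow: /`. A quick way to check what a well-behaved crawler would conclude from such a file is Python's standard `urllib.robotparser`. This is only a sketch; the helper name `is_allowed` and the example URL are assumptions for illustration, and it says nothing about crawlers that ignore robots.txt altogether.

```python
from urllib.robotparser import RobotFileParser

# The conventional "block everyone from everything" robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if a compliant crawler with this user agent may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain.
    print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/some-post"))
```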
My current problem is OpenAI, which scans massively while ignoring every limit, 426, 444, and whatever else you throw at it, and botnets from East Asia that use one IP per scrape but thousands of IPs.
It also probably won't work if the person actually wants your content and is checking whether the thing they scraped actually makes sense or is just noise. Like, none of these are new things. Site owners have been sending junk/fake data to web scrapers since web scraping was invented.
About two years ago, I made up a reference to a nonexistent Python library and put code "using" it in just 5 GitHub repos. Several months later the free ChatGPT picked it up. So IMO it works.
Even if it did work, I just can't bring myself to care enough. It doesn't feel like anything I could do on my site would make any material difference. I'm tired.
I definitely get this. The thing that gives me hope is that you only need to poison a very small % of content to damage AI models pretty significantly. It helps combat the mass scraping, because a significant chunk of the data they get will be useless, and it's very difficult to filter it by hand.
The asymmetry is what makes this very interesting. The cost to inject poison is basically zero for the site owner, but the cost to detect and filter it at scale is significant for the scraper. That math gets a lot worse for them as more sites adopt it. It doesn't solve the problem, but it changes the economics.
1. Simple, cheap, easy-to-detect bots will scrape the poison, and feed links to expensive-to-run browser-based bots that you can't detect in any other way.
2. Once you see a browser visit a bullshit link, you insta-ban it, as you can now see that it is a bot because it has been poisoned with the bullshit data.
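The two steps above can be sketched roughly like this. It is a toy in-memory version, not how any particular tool implements it; the class name `PoisonTrap`, the `/archive/` prefix, and the status codes are assumptions chosen for illustration. The key property is that trap links appear only in pages served to already-detected bots, so any client requesting one must have received it from poisoned data.

```python
import secrets


class PoisonTrap:
    """Issue unguessable trap links and ban any client that later requests one."""

    def __init__(self, prefix="/archive/"):
        self.prefix = prefix  # hypothetical path prefix for trap links
        self.issued = set()   # trap paths embedded in poisoned pages
        self.banned = set()   # client identifiers (e.g. IP addresses)

    def issue_link(self):
        """Step 1: create a link to embed in garbage served to cheap bots."""
        path = f"{self.prefix}{secrets.token_urlsafe(8)}"
        self.issued.add(path)
        return path

    def handle(self, client_id, path):
        """Step 2: return an HTTP status, banning whoever follows a trap link."""
        if client_id in self.banned:
            return 403
        if path in self.issued:
            self.banned.add(client_id)
            return 403
        return 200


if __name__ == "__main__":
    trap = PoisonTrap()
    link = trap.issue_link()
    print(trap.handle("203.0.113.7", link))           # browser bot follows the bait
    print(trap.handle("203.0.113.7", "/index.html"))  # and stays banned afterwards
```

A real deployment would need persistence and some expiry for bans, since scraper IPs get recycled.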
My personal preference is using iocaine for this purpose though, in order to protect the entire server as opposed to a single site.
The search engine crawlers are sophisticated enough, but Meta's are not. Neither is Anthropic's Claude crawler. Source: personal experience trying garbage generators on Yandex, Blexbot, and Meta's and Anthropic's crawlers.
I'm completely uncertain that the unsophisticated garbage I generated makes any difference, much less "poisons" the LLMs. A fellow can dream, can't he?
Because the internet is noisy and not up to date, all recent LLMs are trained using Reinforcement Learning with Verifiable Rewards. If a model has learned the wrong signature of a function, for example, that becomes apparent when the code is executed.
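To make the "apparent when executing the code" point concrete, here is a toy sketch of an execution-based reward. It is not how any lab actually implements RLVR; the function name `verifiable_reward` and the convention that a candidate stores its answer in a variable called `result` are assumptions for illustration. Running untrusted code like this would need a proper sandbox in practice.

```python
def verifiable_reward(candidate_source, expected):
    """Return 1.0 if the candidate code runs and produces `expected`, else 0.0.

    By (made-up) convention the candidate stores its answer in `result`.
    WARNING: exec() on untrusted code is unsafe outside a sandbox.
    """
    namespace = {}
    try:
        exec(candidate_source, namespace)
    except Exception:
        # A wrong signature typically surfaces here, e.g. as a TypeError.
        return 0.0
    return 1.0 if namespace.get("result") == expected else 0.0


if __name__ == "__main__":
    correct = "result = int('ff', base=16)"
    poisoned = "result = int('ff', radix=16)"  # int() has no 'radix' keyword
    print(verifiable_reward(correct, 255))
    print(verifiable_reward(poisoned, 255))
```

The poisoned call fails at execution time regardless of how often the wrong signature appeared in the training text, which is the commenter's point.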
> That is built with React Native for Windows. No, that is not a full JavaScript framework in your start menu.
This is incorrect. It is a full JavaScript framework in your start menu.
I don't see your read that it's about RAM-hungry web views either. To me, "Start menu uses React" is a dig that Microsoft is so uncommitted to its native development platform that they (partially) don't use it in one of the most 'core' parts of the operating system.
Shouldn't devs be allowed to select what they feel is the "best" choice for a given component? While I wouldn't expect to see SwiftUI in Windows from Microsoft, Microsoft hasn't been averse to various NIH web frameworks for quite some time now.
If it fits and meets the goals of the project, why not?
If Microsoft developers' "best" choice for a tiny UI component like this is not its flagship native UI framework, then that's a problem for Microsoft. That is the criticism.
> Shouldn't devs be allowed to select what they feel is the "best" choice for a given component?
To some extent, yes. But if they choose React Native, something's probably wrong, because (despite what the article says) that requires throwing in a JavaScript engine, significantly bloating a core Windows component. If they only use it for a small section ("that can be disabled", or in other words is on by default), it seems like an even poorer trade-off, as most users suffer the pain but the devs are taking minimal advantage of whatever benefits it provides.
If the developers are correct that this is the best choice, that reflects poorly on the quality of Microsoft's core native development platforms, as madeofpalk said.
If the developers of a core Windows component are incorrect about the best choice, that reflects poorly on this team, and I might be inclined to say no, someone more senior should be making the choice.
There are two possibilities: Either it’s really the best choice among the available frameworks (very questionable), or they picked it regardless. Both reflect badly on Microsoft, given what React Native is, and given how central the Start menu is to the Windows experience.
Here's one: Microsoft management heavily incentivizes their developers to use LLMs for virtually everything (to the "do it or you're fired" level) and the LLM (due to its training data or whatever) is far more able to pump out code with React Native than their own frameworks. This makes it the right choice for them. Not for the user, but you can't have everything.
I don't have any inside information; I'm running with the hypothetical.
I guess the ship sailed a long time ago, but while no one is going to turn off their ad blocker, they could make people not use one in the first place.
In the Apple ecosystem 'just a spec bump' is pretty significant IMHO. So often they will completely disregard products and just let them languish. The Mac Pro still only comes with the M2 chip.
I agree with you... but was it actually a failure? I feel like that would require some kind of negative consequences, which I don't think Meta has faced over this. They've still been rewarded handsomely.
The hardware is actually pretty decent, and some VR games work really well. For example, this table tennis simulator honestly feels like the real thing, even down to the little "tap" in the bat when you hit the ball:
It’s not wasm?