
> And every second I spend trying to do fun free things for the community like this is a second I'm not spending trying to turn the business around and make sure the people who are still here are getting their paychecks every month.

Then step aside as the maintainer of the project, or better yet, create something like a Tailwind foundation that is truly open source. Go spend your time building your business, but you can't become the bottleneck and do nothing for something that has become so foundational for web dev.


I urge you to understand what he is going through: he started the project, made it available freely, and as more effort was required he added a premium offering to keep the whole thing running and hire more help. Please pause to think before coming to a rushed judgement. How would you react if you had done exactly what he has done and had just had to lay off most of your team yesterday? We are humans and not robots; for all he has done, he has certainly earned the right to sometimes focus on what's affecting him first before he can focus on OSS.

Be kind. We are all born billionaires with billions of "kindness tokens" in the bank; don't use them sparingly.


He gives a gift to the world and you're telling him to just give it up because somebody did work nobody asked for and he doesn't want it in his project.

Get a grip.


There are videos by Welch Labs that go into detail about what he did:

https://www.youtube.com/watch?v=Phscjl0u6TI https://www.youtube.com/watch?v=MprJN5teQxc


What's the fundamental limitation on context size here? Why can't a model be fine-tuned per codebase, taking the entire codebase into context (and be continuously trained as it's updated)?

Forgive my naivety; I don't know anything about LLMs.


Igalia is a consultancy that specializes in fixing bugs and building features in browsers. My company uses them regularly.


This is the real answer.


I'd remove the else if blocks since we're returning.

    private String generateGuessSentence(char candidate, int count) {
      if (count == 0) return "There are no " + candidate + "s";
      if (count == 1) return "There is 1 " + candidate;

      return "There are " + count + " " + candidate + "s";
    }


I feel like single-line conditionals are harder to read in bigger projects. I remove the braces in single-statement conditionals and loops, but I still put the body on its own line.


Braces _always_.

It just takes one tired / inexperienced / something-else coder to "quickly add" an extra call under the topmost if, and then you'll have interesting issues.

I'll take the "ugly" braces every day compared to having risky structures like that in the code at all.
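
A minimal, hypothetical sketch of the failure mode (all names made up): the indentation suggests both lines are guarded, but only the first one is.

    public class BracelessBug {
        static boolean isAdmin = false;

        public static void main(String[] args) {
            // A hurried edit adds a "quick" log line to the braceless if:
            if (isAdmin)
                System.out.println("granting access");
                grantAccess(); // runs unconditionally -- prints despite isAdmin == false
        }

        static void grantAccess() {
            System.out.println("access granted");
        }
    }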


Thorough testing should catch all those "interesting" issues.

But for easier maintainability I also favor braces everywhere.


Google having so many private photos in Google Photos must be a goldmine for them.


> Google having so many private photos in Google Photos must be a goldmine for them.

While true, it's META who won that arms race long ago in my view; hell, they just disclosed in a lawsuit that they gave Netflix private access to DMs [0].

If you don't think they are training their own models on data from all their platforms (Facebook, Instagram, WhatsApp), you have to be a complete idiot.

That is a much larger treasure trove given the sheer scale of people on those platforms. Google is limited to mainly Android users and those who use its suite on PC (relatively small compared to social media users), which excludes most Mac users.

The thing they don't tell you about this dark underbelly of AI is that, just like the (meta)data that is for sale to third parties, it has a tiered price structure wherein Mac users are often the premium tier due to their more 'affluent' status and likelihood of impulsive in-app purchases.

This is why I think META already won the AI race: they open-source Llama and have a massive treasure trove of data to refine and train on once they see what the OSS community creates that is of actual value. ChatGPT/DALL-E runs at a loss for MS/OpenAI, but if anyone can monetize this gold rush it will be META.

And perhaps more critically from an infrastructure POV, Llama now runs better on CPU [1] rather than GPU, which means they won't have to be constrained or price-pinched on GPUs like Microsoft, Google, and Amazon likely will due to demand constraints from Nvidia (see the ETH mining craze during COVID). They can focus on optimizing their data centers with more free cash flow, which means they can have a bigger footprint for when they finally figure out how to properly monetize this AI bubble, because it is a bubble, from now until then.

I think Zuck learned from Libra that staying out of the limelight during a bubble is critical if he wants to undo the Metaverse money-pit losses.

0: https://www.movieguide.org/news-articles/facebook-allowed-ne...

1: https://news.ycombinator.com/item?id=39890262


> Google is limited to mainly Android users

https://www.appmysite.com/blog/android-vs-ios-mobile-operati...

Random link. Can't vouch for it. But US and RoW have quite different patterns.


> Random link. Can't vouch for it

Seems about right to me; Android dominates the mobile world by sheer numbers.

But what is the value that they can derive from user data? A million Bangladeshis' food-delivery texts are probably a lot less valuable than, say, a Singaporean using Numbers on macOS to lay out the next lucrative investment, or the data they'd get from the correspondence of, say, 100 high-net-worth individuals hidden behind iOS (Pegasus MITM attacks notwithstanding).

Again, the name of the game is to derive signal from noise: bulk collection is primitive when training models and often incredibly difficult to work around once it is in. I seriously think Gemini had this problem, along with QA/QC issues, hence it going from so-so Bard to total 'woke' Gemini. I may be wrong, but I think this is what happens when you go down the bulk-collection, unfiltered/un-curated data route.


> But what is the value that they can derive from user data?

What, are the pictures and videos of people from the global south somehow not good enough to train AI due to their economic situation?


> What, are the pictures and videos of people from the global south somehow not good enough to train AI due to their economic situation?

I don't make the rules. In fact, if you are seriously wondering what use 'darker' people's data has had in AI training, look no further than the surveillance-based platforms that are responsible for tons of false incarcerations of mainly Black US citizens [0].

I'm not sure the plight of the 'Global South's' data is going to change either. It's not that I think the system is inherently prejudiced; it's more that it's optimized to be greedy, to extract as much value as it possibly can from the current system at all costs.

People need to stop smoking hopium and thinking that this is going to usher in some sort of egalitarian renaissance; this is business as usual by the mega corps that bring you this tech.

0: https://innocenceproject.org/artificial-intelligence-is-putt...


WhatsApp chats are encrypted, so how can they be used to train the models? Also, what kind of training can be done on Instagram data? Is there anything of value there?


> WhatsApp chats are encrypted

While they claim E2E encryption, I seriously doubt they would offer this service entirely for free without having some backdoor or potential MITM breach that they likely tucked away in the ToS, given its wide use in most of the world where people pay for SMS/text messages: it just seems so incredibly unlikely to be entirely encrypted coming from a company that willingly gave DMs to Netflix, used Cambridge Analytica, etc. But even if it is encrypted, the metadata generated can tell you a lot too--as was the case with Pokemon GO--which may not directly benefit LLMs, but could help with creating dark patterns that make your AI companion (under the guise of an LLM) the 'must own' when deciding who to buy tokens/compute from.

Speculative for sure, but just look at the Twitter Files leaks revealing how social media platforms willingly work alongside intelligence agencies.


> While they claim E2E encryption, I seriously doubt they would offer this service entirely for free without having some backdoor or potential MITM breach that they likely tucked away in the ToS, given its wide use in most of the world where people pay for SMS/text messages: it just seems so incredibly unlikely

You don't have to trust Meta's self-regulation, but you best believe the EU does not fuck around on such issues. Self-preservation is a hell of a motivator.


> Also, what kind of training can be done on Instagram data? Is there anything of value there?

Billions of comments and private messages; billions of data points on user behavior and (more importantly) how they respond to manipulative UI/UX/content... Nothing useful there??


I'm genuinely curious how that data helps. What would the prompts be like? "Help me design an addictive UX"? How do comments like birthday wishes, or people posting their beach pictures and others replying with how good they look, add any kind of value to ML model training? Those conversations would vastly outnumber any that discuss something meaningful.


As well as emails, documents, reviews…


One advantage of GET is I can just copy the URL and share it. The article makes no mention of that.

While I love the proposal (apart from the name; I can see the SEARCH verb being used for something that's not search), they should also address the URL share-ability aspect.

Something like https://google.com/search<some-special-character><query>, where the query can be arbitrarily large (beyond the ~2000-character URL length restriction) and the browser is smart enough to treat only https://google.com/search as the URL and anything after that as the body. The complete "URL" can be big and shared anywhere else.
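
Sketching that idea (all of this is hypothetical, picking "!!" arbitrarily as the special character):

    public class ShareableSearch {
        public static void main(String[] args) {
            String shared = "https://google.com/search!!{\"q\":\"a very long query...\"}";

            // Everything before "!!" is the real URL; everything after is the body.
            int sep = shared.indexOf("!!");
            String url = shared.substring(0, sep);
            String body = shared.substring(sep + 2);

            System.out.println(url);  // https://google.com/search
            System.out.println(body); // {"q":"a very long query..."}
        }
    }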


Yes, that's also lost when you do POST, which is by design though. An HTTP SEARCH seems to have only drawbacks.


SEARCH can have a request body.

Many systems limit the length of the URL, so this is significant.
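
For illustration, a minimal sketch of issuing such a request with Java's built-in HttpClient (the example.com endpoint and JSON body are made up, and the server has to understand SEARCH):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class SearchRequest {
        public static void main(String[] args) throws Exception {
            // The arbitrarily large query lives in the body, not the URL.
            String query = "{\"q\": \"title:http AND verb:search\", \"limit\": 50}";

            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("https://example.com/items"))
                    .method("SEARCH", HttpRequest.BodyPublishers.ofString(query))
                    .header("Content-Type", "application/json")
                    .build();

            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        }
    }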


If I send a request body with GET, what modern systems would even have a problem with that? Is there some caching middleware somewhere that I've never heard of, or just ignored, that will screw it up?

If there were a GET-with-body HTTP verb I'd probably use it at one point or another, but I often wonder where plain GET would start blowing up if I just used it for that.

Honestly, I think REST is a mess, and that everything should just be POST with no values in the URL at all.


GET with a body is pretty useless because the standard doesn’t allow the body to affect the results. So, proxies, browsers, etc. are free to ignore the body when caching results.


Quick workaround: GET /search?key=$hash(body)
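
Something like this, perhaps (hypothetical endpoint; HexFormat needs Java 17+):

    import java.net.URI;
    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.util.HexFormat;

    public class HashedGet {
        public static void main(String[] args) throws Exception {
            String body = "{\"q\": \"title:http AND verb:search\"}";

            // Key the cache on a digest of the body, so different queries get
            // different URLs even if intermediaries ignore the body itself.
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(body.getBytes(StandardCharsets.UTF_8));
            String key = HexFormat.of().formatHex(digest);

            System.out.println(URI.create("https://example.com/search?key=" + key));
        }
    }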


Reminds me of Google Maps: GET /search?pb=$protobuf_to_string.


Even though most servers support it AFAIK, many clients don't. Many will even silently discard the body when sending a GET request.


So can POST. It's entirely redundant


Verbs specify safety/idempotency guarantees for API-blind middleware, as well as whether (if either applies) a body needs to be taken into account; POST is not idempotent, while SEARCH/QUERY is safe, and therefore also idempotent, but differs from GET in that the guarantee is body-specific.
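
As a sketch of what API-blind middleware does with those guarantees (a subset of the method sets from RFC 9110, plus the proposed verb; the rest is made up):

    import java.util.Set;

    public class MiddlewareRetry {
        // Safe methods are read-only; idempotent ones can be repeated safely.
        static final Set<String> SAFE = Set.of("GET", "HEAD", "SEARCH");
        static final Set<String> IDEMPOTENT =
                Set.of("GET", "HEAD", "SEARCH", "PUT", "DELETE");

        // A proxy that knows nothing about the API can still retry on timeout.
        static boolean canRetry(String method) {
            return IDEMPOTENT.contains(method);
        }

        public static void main(String[] args) {
            System.out.println(canRetry("SEARCH")); // true
            System.out.println(canRetry("POST"));   // false
        }
    }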


Might as well just say "OK, now GET can have a body" instead of all that noise.


Won’t retroactively change things that support GET to support GET-with-body. A new verb makes it much more likely that anything that supports the verb supports the desired semantics.

(Of course, you could instead bump the HTTP version for support of GET-with-body, but given how HTTP/2 and HTTP/3 are defined in terms of HTTP/1.1, you’d need three new versions of HTTP for a change to core verb semantics. A new verb, again, is far simpler.)


From reading the article, one of the key items addressed by HTTP SEARCH is caching. POST requests are usually not safe to cache.


Is search safe to cache, though?

It would definitely depend heavily on the dataset you're querying... There's very little value in cached search results if you're searching through time-sensitive data such as logs or other live datasets.

Most datasets I've searched also have a concept of permissions, so person A couldn't be served the same cached result as person B... I don't think search can be cached at the HTTP level either; it's too heavily dependent on the data you're searching through, so you'd have to implement it in the application anyway.

The article does make a good point though: a `get` request that supports a `body` would be nice, and that's pretty much all they're arguing for with the `search` verb.


The normal cache controls apply, so you can make it as safe as you need.
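
For instance, a sketch of scoping a cached search result to a single user with the JDK's built-in HTTP server (endpoint and payload are made up; "private" keeps shared caches from serving one user's results to another):

    import com.sun.net.httpserver.HttpServer;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;

    public class CachedSearch {
        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
            server.createContext("/search", exchange -> {
                byte[] body = "results...".getBytes(StandardCharsets.UTF_8);
                // Cacheable only by this user's own client, and only briefly.
                exchange.getResponseHeaders().set("Cache-Control", "private, max-age=60");
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
                exchange.close();
            });
            server.start();
        }
    }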


The limit of a GET request's length is at least several K on most systems I've used, so it's rarely an issue.


That's not as much as you might think.

There's a reason Elasticsearch accepts POST.


But what exactly is the difference between POST and SEARCH? Both include the request parameters in the body, so they would be obscured from the user. Unless they aren't, in which case it is a matter of the choices made in the implementation.

Is it implied idempotency and the lack of a confirmation dialog when the user reloads the page?


Cacheability, perhaps. You cannot cache a POST, but you may be able to cache a SEARCH or QUERY.


The primary use case of SEARCH is programmatic, e.g., making complex requests to a search API in order to render results. Those are API requests; they're not being shared around.


The proposal changed the name to QUERY just yesterday.


> One advantage of GET is I can just copy the URL and share it.

Often but not always.

The article is wrong when it says message bodies for GET are defined to be meaningless. They are in fact just not defined to be meaningful, which is very much not the same thing.

Nothing in the spec for GET blocks using message bodies with it. Elasticsearch famously uses (used?) bodies with GET requests.


Unless you control every hop between client and server, GET bodies can be arbitrarily dropped and can't be relied upon.


Guess that needs to be different from the ? query params. I wonder if you could use ?key=value as your signal for GET, and ?value as your signal for SEARCH. I think it's up to the service to parse the string that comes after the ? in a URI, so maybe that's a way to go.


That would unfortunately break the internet, as a GET query with just ?value and no ?key=value is perfectly reasonable.


This is why the RFC suggests you redirect to a GET like /resource/queryABCD
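
A rough sketch of that pattern with the JDK's built-in server (all names hypothetical): the POSTed query body is stored under an ID, and the client is redirected to a plain, cacheable GET URL.

    import com.sun.net.httpserver.HttpServer;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class QueryRedirect {
        static final Map<String, String> QUERIES = new ConcurrentHashMap<>();

        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
            server.createContext("/search", exchange -> {
                String query = new String(exchange.getRequestBody().readAllBytes(),
                        StandardCharsets.UTF_8);
                String id = Integer.toHexString(query.hashCode());
                QUERIES.put(id, query);
                // 303 See Other: the client follows up with a GET to the new URL.
                exchange.getResponseHeaders().set("Location", "/resource/query" + id);
                exchange.sendResponseHeaders(303, -1);
                exchange.close();
            });
            server.start();
        }
    }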


Related: I wish there were a more standard way to include things like the verb and headers in a URI. I hacked together an implementation that parses /something#a:b&c:d to set the headers a and c; for the verb I was thinking I could do https+get.
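
Roughly, the parsing looks like this (a sketch of the scheme just described, not the full implementation):

    import java.util.LinkedHashMap;
    import java.util.Map;

    public class FragmentHeaders {
        // Parse "#a:b&c:d" into header name/value pairs.
        static Map<String, String> parse(String uri) {
            Map<String, String> headers = new LinkedHashMap<>();
            int hash = uri.indexOf('#');
            if (hash < 0) return headers;
            for (String pair : uri.substring(hash + 1).split("&")) {
                String[] kv = pair.split(":", 2);
                if (kv.length == 2) headers.put(kv[0], kv[1]);
            }
            return headers;
        }

        public static void main(String[] args) {
            System.out.println(parse("/something#a:b&c:d")); // {a=b, c=d}
        }
    }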


Verbs don't identify a resource, so why would they be in a URI?

A separate format to serialize a request spec is a good idea, sure, but it is a distinctly different thing than the URI of the resource referenced by the request.


Is that not just the HTTP 1.1 format? It's a perfectly readable plain text format.


I'm pretty sure you know what the parent means.

Forget about what acronyms stand for. The thing that hyperlinks point to and you can type in your browser bar is called the URL or the link.

And the point is that being able to specify a verb and headers in a link would be super useful in certain situations.

Continue to call it a URL or URI and just change the "R" in those from Resource to Request, semantics problem solved. Or invent a new URA where "A" stands for "action" and it's a valid hyperlink. The naming of it is the least important part here.


We already have a format to identify an entire request including verb and headers. It’s called HTTP.


Yes, but it can't be used as a hyperlink or typed into the browser bar.

HTTP is a two-way messaging protocol. What's being talked about here is the capabilities in hyperlinks. Totally different.


That sounds like a deficiency of browsers, not of URLs. Browsers can already make arbitrary HTTP requests, and there are some rudimentary ways to expose this in hypertext (such as forms or XHR), but there’s nothing stopping a browser from letting you dump a full HTTP request into a text field and sending it.


You're ignoring such a huge part of the HTTP specification there, like request idempotency, caching, the security model, and surely stuff I'm forgetting right now.

HTTP, at its heart, is a way to compose an action from a verb and a noun - such as "get this", or "update that". The request method, or verb, is intertwined with the URI, or resource, the noun - together, they form the action the user agent intends to carry out. "GET /foo" is entirely distinct from "POST /foo", and there are lots of considerations why it has been implemented like that. I cannot recommend reading the spec (or letting ChatGPT summarise it for you) enough, it will really make more sense.

Having said all that, I know what situations you are referring to - say, issuing a PATCH request with an HTML form, or circumventing some redirect bug with a POST request. Still, all of those problems hint at some other, more general issue, and solving such inconvenience would come at the price of a completely broken HTTP specification. Protocols like email, or HTTP, have only been around for so long because they were designed elegantly and carefully. Let's not break that for convenience' sake :)


Believe me, I've read the spec, as I wound up implementing HTTP servers from scratch multiple times a couple of decades ago.

This suggestion isn't about incorporating all of HTTP's functionality. It's just about the situations you say you know I'm referring to -- verbs, things like authorization headers, a POST payload.

Expanding the functionality of hyperlinks wouldn't break anything about HTTP. It would just allow more requests to be defined in a single line of text (a hyperlink), rather than requiring lines of JavaScript to define. The browser (or cURL or whatnot) would convert the link to the actual HTTP request. Zero changes to HTTP.


Well, hyperlinks are part of the spec though, and if you modify them and expect clients to know how to deal with that, you'll need to modify the specification, and that implies you need to define how those changes affect caching, proxies, and the security model. There's a pretty good reason you send credentials in a header, not the URL. What's my browser supposed to show in the history?

What you’re looking for is a browser extension, not a haphazard URI change.


You could call it https+rpc and format it like

https+rpc:||news.ycombinator.com|reply?id=36096485&goto=item%3Fid%3D36095032%2336096485#method=PUT#H-Accept=text/html

with | instead of / to work around this site's encoding.


This is an interesting idea for a browser extension. Maybe needs a change in name from URL/URI. Could be a DURI, Discrete Universal Record Interaction. Just spitballing. You could share one-liners similar to how one might share curl command-lines but expect them to work in multiple environments.


Sounds like what curl does. Maybe curl:{curl commandline}


> One advantage of GET is I can just copy the URL and share it.

No, you can't. If the server requires any headers such as Authorization or Cookie, this method will fail.


It's not a "no you can't" just because you know of some exceptions.

And even then, they are still correct while you are not. You can copy a GET URL even if it ultimately requires authentication, in a way that you can't at all for a POST request.


The typical scenario is to redirect to login and, after successful authentication, return to the requested URL, so the method doesn't fail if the server is implemented correctly.


They are obviously talking about public urls.


It's extremely common to share URLs needing auth with people who have the same access levels as you, such as within your company.


Eh, don't worry about it. As long as the job you're working at is paying you enough to live and save enough, change as many hobbies as you want and change as many jobs as you want.


As someone who has just heard the name Web3 over and over again, is there anything tangible that I can try? I have tried IPFS, but I don't know if it's "web3".


Just heavily invest into badly drawn monkey NFTs and you'll soon become a devout web3 enthusiast.


There are a few clones of social media sites like https://orbis.club/; there are others that are like Reddit or similar, but I don't remember their URLs.

As another commenter mentioned below, "web3" is mostly storing transactional info in a blockchain and then linking stuff to IPFS, plus some layer on top to display this. For example, that Orbis site indexes/caches the data in the blockchain to be able to display it per their use case. So the only real difference is that the "tweets" are not owned by them but by each user. For anything else it's still a normal web app.

I don't know how this works legally, but there was a link from Vitalik's blog a few days ago explaining this. Say a user uploads CP or some other illegal/offensive content: the site can hide it but can't delete it. I'm not sure where liability would fall here.

Similarly, in the case of Twitter blocking Trump a few years ago, another site could still allow him to continue if they wished, by displaying the same content.

I guess there are more use cases that distill from here; I'm not sure how beneficial they are or might be, and as with many things in technology, it is not 100% necessary that the solution be better than something that already exists. It just needs to get traction and attract people/investment. Whether that happens with web3 remains to be seen, I suppose.

Decentralised exchanges are also coming up, but this gets even muddier with all the compliance requirements, so I'm not sure where those will end up.


DeFi is pretty cool. Check out some of the strategies on yearn.finance or the simpler ability to be a market maker on uniswap.org.


I believe so as well. Compose is very similar to Electron. JetBrains Toolbox (which uses Compose) uses 500 MB of memory, which is on par with what Electron would use for such a simple app.


Why would it be similar? It doesn’t need all the unnecessary abstraction of a browser/DOM. Also, initial memory usage doesn’t mean it scales the same way from then on.


It uses the JVM, and the JVM tends to hold on to memory for object pools so it can allocate fast when needed.


I just checked. On my Windows laptop from work it uses 18.5 MB in the background. It uses 220-240 MB in the foreground though. Which seems totally unnecessary.

I already ditched the Toolbox app on macOS, because it marks the binaries as Intel only (while they're universal). The IDE's built in updater seems fine if you're only using the stable versions. So I think I'm just going to remove the Toolbox app from all my machines.


I really hope that's not the case - at least the browser has an excuse of being a scripting platform, a 3D and 2D rendering and layout engine, a video streaming platform, a database, a server and who knows what rolled into one - that Toolbox thing hopefully has less than 5% of that in its codebase.


Except Electron is 8 years old, and Compose for Desktop started active development half a year ago.

