Hacker News | karencarits's comments

One possible solution could be to give you the option to send the affected passwords as a list to an email address you specify; then only people with access to that address would see them

Hash of the affected password? People share these things and don't always run their own mail servers.

That would be a great idea!
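The hash suggestion can be sketched in a few lines. This is only an illustration, modeled on the Pwned-Passwords-style approach of sharing a SHA-1 digest (or just its 5-character prefix) instead of the plaintext:

```python
import hashlib

def password_fingerprint(password: str) -> str:
    """Return an uppercase SHA-1 hex digest of the password.

    Sharing the digest (or only its 5-character prefix, as the
    Pwned Passwords k-anonymity API does) lets a recipient check
    whether a password was affected without ever seeing the
    plaintext.
    """
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()

# The well-known digest of "password"; the prefix is what a
# k-anonymity lookup would send over the wire.
print(password_fingerprint("password")[:5])  # → 5BAA6
```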


Oh wow, this is exactly what I want, but with a server component so it works on mobile too (where I do most of my reading) and gets data from all of my workstations (I have 4-6 at any given time).

Maybe I can hack it into this one.


Getting backups right _is_ difficult and can easily be quite stressful. Yes, having some external drives here and there with files would of course be helpful. But then, should you encrypt them in case of theft? Where to keep them in case of fire? What to do with "old" backups (can I trust the drive to live more than 2 years? 5 years?), copy them over to new drives? But what then with duplicated files? I think having backups in the cloud is currently the best "backup and forget" strategy


> Getting backups right _is_ difficult and can easily be quite stressful.

My point was that something is much better than nothing, and you don't need 99.999% reliability in your setup to greatly reduce the risk you're exposing yourself to by keeping 30 years of data in only one place.

> But then, should you encrypt them in case of theft?

Depends on the nature of the data. I'd guess that most of that 30 years' worth of data didn't need encryption, and copying only insensitive data is an option. On the other hand, a cloud account, or a device logged in to the cloud account, could be stolen too.

> Where to keep them in case of fire?

That's irrelevant if we're talking about backing up data stored on cloud service.

> What to do with "old" backups (can I trust the drive to live more than 2 years? 5 years?), copy them over to new drives? But what then with duplicated files?

Aside from some unlikely issues, yes, drives should last at least a couple of years. On a 5+ year timeframe I think you could just buy a new drive (bigger/cheaper/more reliable than the last, as the technology improves). If we're talking about a lazy strategy of backing up the data once a year, even deleting everything on a drive and copying everything again isn't that bad. Better than nothing.
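That lazy strategy (wipe the drive, copy everything again) is trivial to script. A minimal sketch, assuming a plain directory-to-directory copy with no incremental logic or dedup:

```python
import shutil
from pathlib import Path

def lazy_backup(source: Path, dest: Path) -> int:
    """Delete the old backup and copy everything again.

    Crude, but matches the "once a year, better than nothing"
    strategy: no incremental logic, no dedup, just a full fresh
    copy. Returns the number of files in the new backup.
    """
    if dest.exists():
        shutil.rmtree(dest)          # drop the stale copy entirely
    shutil.copytree(source, dest)    # full fresh copy
    return sum(1 for p in dest.rglob("*") if p.is_file())
```

Running this once a year against an external drive is about as "backup and forget" as a local setup gets.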

> I think having backups in the cloud is currently the best "backup and forget" strategy

But we're not talking about having the cloud as a backup. The issue here is having the files only in the cloud, with no backup. For a non-technical person, cloud as a backup is great, but here we have a case where a person had all their data only on the cloud, and then lost access to the cloud. If the cloud was only a backup (or a way to sync/access the data on other devices), but the data would still be present on some private device, there would be no problem.


I guess the paper would be complete enough to publish as a preprint at the stage where this specific service is most useful


Good point; where I used to work we didn't really do preprints though. And PIs were all incredibly paranoid, making any uploads to third parties a real no-go


I'll hopefully get to test it soon. To me, LLMs have so far been great for proofreading and getting suggestions for alternative - perhaps more fluent - phrasings. One thing that immediately struck me, though: having 'company' in the URL makes me think corporate and made me much more skeptical than a more generic name would.


IMO that's what this should focus on. Language. That's what LLMs excel at. Perhaps branch out to providing localized papers for markets like China or France (hah, sorry).

Judging the actual contents may feel like the holy grail but is unlikely to be taken well by the actual academic research community. At least the part that cares about progressing human knowledge instead of performative paper milling.


Haha fair point, domain name was a 5-second, “what’s available for $6” kind of decision. Definitely not trying to go full corporate just yet


Great! Also, checking journal author guidelines is usually very boring and time consuming, so that would be a nice addition! Like, pasting the guidelines in full and getting notified if I am not following some specs


We are already looking into that: https://github.com/robertjakob/rigorous/tree/main/Agent2_Out...

Would be great to see contributions from the community!


What use cases are people using local LLMs for? Have you created any practical tools that actually increase your efficiency? I've been experimenting a bit but find it hard to get inspiration for useful applications


I have a signal tracer that evaluates unusual trading volumes. Given those signals, my local agent receives news items through an API to assess what is happening. This helps me tremendously. If I did this through a remote app, I'd have to spend several dollars per day. So I run this on existing hardware.
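The commenter doesn't share their implementation; as an illustration, an unusual-volume signal of this kind is often just a z-score test against a trailing window, something like:

```python
from statistics import mean, stdev

def unusual_volume(volumes: list[float], threshold: float = 3.0) -> bool:
    """Flag the latest volume if it sits `threshold` standard
    deviations above the trailing mean (a simple z-score test).

    `volumes` is assumed ordered oldest-to-newest; the last
    element is the bar being evaluated against the rest.
    """
    *history, latest = volumes
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest > mu
    return (latest - mu) / sigma > threshold

# A 10x spike against a flat baseline trips the signal.
print(unusual_volume([100, 110, 95, 105, 100, 1000]))  # → True
```

A real tracer would obviously need per-symbol windows and intraday seasonality handling; this is just the core test.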


Thank you, this is a great example!


Do you want to share it?


Anyone who does not want to leak their data? I am actually surprised that people are ok with trusting their secrets to a random foreign company.


But what do you do with these secrets? Like tagging emails, summarizing documents?


a document management system is an easy example. Let’s say medical, legal, and tax documents.


Thank you, but what do you use the llm for? Writing new documents based on previous ones? Tagging/categorization/summarization/lookup? RAG? Extracting structured data from them?


Me personally, i’m using paperless-ngx to manage documents.

i use ollama to generate a document title, with 8 words or less. I then go through and make any manual edits at my leisure. Saves me time which i appreciate!

Paperless-ngx already does a pretty good job auto-tagging, i think it uses some built in classifiers? not 100% sure.
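For the title-generation workflow above, one practical detail is that small local models don't always respect a "with 8 words or less" instruction, so a post-processing guard saves manual edits. A hypothetical sketch:

```python
def clamp_title(raw: str, max_words: int = 8) -> str:
    """Post-process a model-suggested document title: strip
    surrounding quotes and a trailing period, then enforce the
    word limit, since small models don't always honor it."""
    words = raw.strip().strip('"\'').rstrip(".").split()
    return " ".join(words[:max_words])

print(clamp_title('  "Tax Statement 2023."  '))  # → Tax Statement 2023
```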


A random foreign company is far better than a big Five Eyes country, which siphons everything to the NSA and can use it against you.

The Chinese intelligence agencies, by contrast, have not much power over you.


No one cares about your 'secrets' as much as you think. They're only potentially valuable if you're doing unpatented research or they can tie them back to you as an individual. The rest is paranoia.

Having said that, I'm paranoid too. But if I wasn't they'd have got me by now.


Step back for a bit. Some people actually work with sensitive documents as part of their JOB. Like accountants, lawyers, people in the medical industry, etc.

Sending a document with a social security number to OpenAI is just a dumb idea. As an example.


I do a lot of data cleaning as part of my job, and I've found that small models could be very useful for that, particularly in the face of somewhat messy data.

You can for instance use them to extract some information such as postal codes from strings, or to translate and standardize country names written in various languages (e.g. Spanish, Italian and French to English), etc.

I'm sure people will have more advanced use cases, but I've found them useful for that.
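A minimal sketch of that kind of cleaning step, assuming a locally running Ollama server and a hypothetical model name; the tightly constrained prompt is what makes small models usable for this:

```python
import json
import urllib.request

def standardize_country_prompt(raw: str) -> str:
    """Build a tightly constrained prompt asking a small model to
    map a country name written in any language to its English
    short name (e.g. "Allemagne" -> "Germany")."""
    return (
        "Translate this country name to its English short name. "
        "Reply with the name only, nothing else.\n"
        f"Country: {raw}\nEnglish name:"
    )

def ask_ollama(prompt: str, model: str = "llama3.2") -> str:
    """Send the prompt to a locally running Ollama server on its
    default port. Assumes `ollama serve` is up; the model name is
    an assumption, use whatever you have pulled."""
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({"model": model, "prompt": prompt,
                         "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"].strip()

# Usage (requires a running Ollama server):
# ask_ollama(standardize_country_prompt("Allemagne"))
```

For batch cleaning you'd loop this over a column of messy values and cache the answers, since the same raw string tends to recur.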


Also worth it for the speed of AI autocomplete in coding tools, the round trip to my graphics card is much faster than going out over the network.


Anyone actually doing this? DeepSeek-R1 32b on ollama can't run on an RTX 4090, and the 17b is nowhere near as good at coding as the OpenAI or Claude models.


I specified autocomplete; I'm not running a whole model, asking it to build something, and awaiting an output.

DeepSeek-coder-v2 is fine for this. I occasionally use a smaller Qwen3 (I forget exactly which at the moment... set and forget) for some larger queries about code; given my fairly light use cases and pretty small contexts it works well enough for me


Any company with any type of sensitive data will love to have anything LLM-related done locally.


A recent example: a law firm hired this person [0] to build a private AI system for document summarization and Q&A.

[0] https://xcancel.com/glitchphoton/status/1927682018772672950


I use the local LLM-based autocomplete built into PyCharm and I'm pretty happy with it


It's difficult to assess how typical your experience is; I tried your initial prompt (`Write me a simple todo app on cloudflare with auth0 authentication.`) on gemini-2.5-pro-preview-05-06 and didn't get any mentions of @auth0-cloudfare, although I cannot verify whether the answer works as-is

https://pastebin.com/yfg0Zn0u


Shocked you got a different output from the stochastic token generator.


That's not the point. While there is a temperature setting and randomness involved, you can still benchmark and experience significant differences in output between models and generations. I therefore provided more details and the full output to make it easier for people to assess the context of the comment I replied to

When someone uses the same tools as I do but seems to experience problems I do not have (these kinds of posts often describe how bad LLMs are or how bad Google search is), I get a bit confused. Is there A/B testing going on? Am I just lucky? Am I inattentive to these weaknesses? Is it about prompting? Or the areas we work in? Do we actually use the same tools (i.e., the same models)?


Sorry, I wasn't aware of the page parameter when pasting the link, so don't read anything into that; I have not personally posted in that thread. But I strongly support H_Express in their case that dated models should not suddenly be redirected to another endpoint


It's a strange thing Disqus does, loading only a portion of the discussion, as if it's prohibitively expensive to pull a few extra kB and load the entire text of the thread at once.

I do enjoy seeing a Google page load about 30 scripts from AWS CloudFront, though :)


I think it's also important to recognize that while the Catholic church has values and principles it adheres to and is unlikely to change, because they are so deeply founded in tradition and scripture (for example, that marriage, as in the sacrament, is between a man and a woman), the "men of the cloth" are expected to take care of their ministry as caring and loving shepherds. But that process is often based on personal and individual relationships, and it will not reach headlines in the media.


The elephant in the room is the AIDS crisis. They already had a chance to demonstrate that they were capable of disagreeing with homosexuality while still treating people with love. Instead they left them to die.

What we have now is just saying "we super duper pinky promise that we've learned our lesson and won't do the exact same thing next time even though we totally are with MAGA."


> For latex, you choose your target at the start

Yes, sometimes, but I would say that one of the benefits of LaTeX is how easily you can switch to another layout. But I guess the point is that you typically render to a set of outputs with fixed dimensions (PDF)
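As an illustration of that flexibility, switching layouts in LaTeX is typically a one-line change in the preamble while the body stays untouched (the target class below is just an example):

```latex
% From a one-column article ...
\documentclass[11pt]{article}
% ... to, say, a two-column ACM conference layout, by swapping
% only the class line:
% \documentclass[sigconf]{acmart}
\begin{document}
Body text is independent of the target layout.
\end{document}
```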

