Hacker News

Ok, I understand what you're saying: you'd like full transparency into how the limitations are configured. However, I have to reiterate that I wouldn't normally publish this information, because it increases the chances of workarounds being discovered in case my solution isn't bulletproof. I'd say the same goes for OpenAI.


> [...] I wouldn't normally publish this information because it increases the chances of workarounds being discovered in case my solution isn't bullet proof. I'd say the same goes for OpenAI.

This is commonly known as "security through obscurity"[1] and has been shown to be ineffective most of the time.

[1]: https://en.wikipedia.org/wiki/Security_through_obscurity


Thanks for the link, though I'm already very familiar with security through obscurity.

I don't rely on obscurity for security; I just don't think implementation details are useful to most users, so I don't publish them.

Ultimately I like to think the systems I build are secure, but I can't always be sure, so why give people a head start? Not publishing details buys me time to improve security.

Security through obscurity might not be the best approach, but you should know it's fairly common. For example, when I generate a link to a Google Doc and "only those with the link" can access the document, I think that's a form of security through obscurity. No one is going to guess the link in any practical time frame...
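To put rough numbers on that intuition, here is a sketch of why a long random link token resists guessing. The token length and alphabet size below are illustrative assumptions, not Google's documented format; the point is only the size of the keyspace.

```python
import secrets

# Generate an unguessable URL-safe token, as a sharing link might.
# 33 random bytes encode to 44 URL-safe base64 characters.
token = secrets.token_urlsafe(33)

# Assumed parameters for the back-of-envelope estimate:
alphabet_size = 64   # URL-safe base64 alphabet
length = 44          # characters in the token

keyspace = alphabet_size ** length  # 64**44 == 2**264 possible tokens

# Even guessing a billion links per second, the expected time to hit
# one valid token is astronomically long.
guesses_per_second = 10**9
expected_seconds = keyspace // (2 * guesses_per_second)

print(len(token))            # 44
print(keyspace > 2**256)     # True
```

So while "anyone with the link" access is obscurity in spirit, a sufficiently long random token behaves much like a bearer credential: the link itself is the secret, and brute-forcing it is computationally infeasible.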


At the same time, you don't post a list of your valuables and the means you use to lock them up, either.

Obscurity is a layer, but cannot be the only one.


I totally get this, since we (collectively) are still trying to figure out how to "program" LLMs. There is definitely a risk that too much transparency leads to attacks.

At the same time, security by obscurity does not work in the long run. In fact, the existence of this repo of reverse-engineered prompts may mean that secrecy is impossible.

Even worse, we won't necessarily know when the information leaks out, so we don't even know what compromises are out in the wild.



