OK, I understand what you're saying: you'd like full transparency into how the limitations are configured. However, I have to reiterate that I wouldn't normally publish this information, because it increases the chances of workarounds being discovered in case my solution isn't bulletproof. I'd say the same goes for OpenAI.
> [...] I wouldn't normally publish this information, because it increases the chances of workarounds being discovered in case my solution isn't bulletproof. I'd say the same goes for OpenAI.
This is commonly known as "security through obscurity"[1] and has been shown to be ineffective most of the time.
Thanks for the link. I'm very familiar with this already though.
I don't rely on obscurity for 'security'; I just don't think implementation details are needed by most users, so I don't publish them.
I'm very familiar with security through obscurity. Ultimately, I like to think the systems I build are secure, but I can't always be sure, so why give people a head start? Not publishing details buys me time to improve security.
Security through obscurity might not be the best approach, but you should know it's fairly common. For example, when I generate a link to a Google Doc and "only those with the link" can access the document, I think that's a form of security through obscurity. No one is going to guess the link in any practical time frame...
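For what it's worth, the reason nobody will guess such a link in practice is just the size of the search space. Here's a minimal back-of-the-envelope sketch in Python, assuming a 40-character URL-safe token and an absurdly fast attacker; both numbers are my own assumptions, not anything Google publishes:

```python
import math

# Rough estimate: how long would it take to brute-force a random share token?
token_length = 40          # assumed token length in characters
alphabet_size = 64         # assumed URL-safe alphabet (a-z, A-Z, 0-9, -, _)
guesses_per_second = 1e9   # wildly optimistic attacker throughput, no rate limiting

search_space = alphabet_size ** token_length
expected_guesses = search_space / 2          # on average you find it halfway through
seconds = expected_guesses / guesses_per_second
years = seconds / (60 * 60 * 24 * 365)

print(f"Search space: ~2^{math.log2(search_space):.0f} tokens")
print(f"Expected time to guess: ~{years:.2e} years")
```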
I totally get this, since we (collectively) are still trying to figure out how to "program" LLMs. There is definitely a risk that too much transparency leads to attacks.
At the same time, security by obscurity does not work in the long run. In fact, the existence of this repo of reverse-engineered prompts suggests that secrecy may be impossible.
Even worse, we won't necessarily know when the information leaks out, so we don't even know what compromises are out in the wild.