I'm fairly decent at prompt engineering, I think. I told it this was for an art project in a creative writing class and that I'd hidden 20 disturbing and nefarious things in the text (made sure to inject a fake murder) - the fake murder, then a bunch of airsoft stuff, some psychological manipulation, and it oddly surfaced... some FB cookies, heh. 2 million tokens x 3 runs.
Have you tried querying for specific kinds of misconduct and letting the LLM focus on one at a time? E.g. find whether murders were planned or carried out, can you find any signs or plans of bomb-making, can you list all messages related to fire and arson, were any mass manipulation campaigns planned, etc.
I have the feeling that would probably be more effective, not sure though.
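To make that concrete, here's a minimal sketch of the one-question-at-a-time loop, assuming the OpenAI Python client; the model name, corpus file, and query list are placeholders, and a 2M-token log would obviously need to be chunked rather than pasted in whole:

```python
# Sketch of querying for one specific kind of misconduct at a time,
# instead of a single broad "is there anything bad in here" prompt.
# Assumes the OpenAI Python client; model, file name, and queries are placeholders.
from openai import OpenAI

client = OpenAI()

with open("chat_export.txt") as f:  # hypothetical exported chat log
    corpus = f.read()

queries = [
    "Were any murders planned or carried out? Quote the relevant messages.",
    "Are there any signs or plans of bomb-making?",
    "List all messages related to fire or arson.",
    "Were any mass manipulation campaigns planned?",
]

for q in queries:
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; use whatever long-context model you have
        messages=[
            {"role": "system",
             "content": "You review chat logs for specific misconduct. "
                        "Answer only the question asked and cite message excerpts."},
            {"role": "user", "content": f"{q}\n\n---\n{corpus}"},
        ],
    )
    print(q, "->", resp.choices[0].message.content)
```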
Well, my first query was "is there anything bad in here?" and it basically said "no, it's a bunch of weirdos talking about guns, conspiracy theories and politics, but there is nothing truly bad in there." Then I went through a bunch of prompting for a while and very quickly got bored, because at least what I was looking at was just a bunch of Americans talking about politics and guns.