Might I suggest a spin on this: instead of blocking the IPs, consider serving up different content to those IPs.
You could make a page that shames their domain name for stealing content. You could make a redirect page that redirects people to your website. Or you could make a page with absolutely disgusting content. I think it would discourage them from playing the cat and mouse game with you and fixing it by getting new IPs.
One possibility: Serve different content, but only if the user agent is a search engine scraper. Wait a bit to poison their search rankings, then block them.
Assuming you've monetized your content with ads, depending on your ads provider, this may have deleterious effects on your account with that provider, as they may then assume you're trying to game ads revenue.
zip bombs are files that when unzipped expand to enormous sizes. I'm not sure if OP put one to be downloaded for the offender to kill their disk space, or if you could stream one hoping the client browser/scraper would attempt to decompress and crash for memory or disk outages?
I think you missed the point - if people show up at $PROXY expect nice stuff but see junk, then they won't move over to $REAL and instead blame $REAL.
E.g. you'd like some way to redirect people from $PROXY site to $REAL site, and disgusting content on $PROXY won't do that - it'll reflect poorly on $REAL
It's a proxy, so there's no "crawler". It's just an agent relaying to the user. Passing something to this proxy agent just passes it directly to the user.
You could make a page that shames their domain name for stealing content. You could make a redirect page that redirects people to your website. Or you could make a page with absolutely disgusting content. I think it would discourage them from playing the cat and mouse game with you and fixing it by getting new IPs.