
> Any more good archiving tools?

For backing up websites, I released https://github.com/ludios/grab-site. Compared to HTTrack, it makes it easy to add ignores that skip unwanted URLs after a crawl has already started. It also saves to WARC instead of trying to fit the site into an on-disk directory structure, which is not always possible or useful (e.g. a directory with more than 100K files).
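
To illustrate why adding ignores mid-crawl is useful, here is a minimal sketch of the general idea, not grab-site's actual code: the queue checks ignore patterns when a URL is dequeued rather than when it is enqueued, so a pattern added later still applies to URLs that were already waiting. The `CrawlQueue` class, its methods, and the example.com URLs are all hypothetical.

```python
import re
from collections import deque


class CrawlQueue:
    """Hypothetical URL queue that applies ignore patterns at dequeue time."""

    def __init__(self, seed_urls):
        self.pending = deque(seed_urls)
        self.ignores = []  # compiled regexes; may grow while the crawl runs

    def add_ignore(self, pattern):
        # Can be called at any point, including after the crawl has started.
        self.ignores.append(re.compile(pattern))

    def next_url(self):
        # Skip queued URLs that match any ignore pattern known right now.
        while self.pending:
            url = self.pending.popleft()
            if not any(p.search(url) for p in self.ignores):
                return url
        return None


queue = CrawlQueue([
    "https://example.com/",
    "https://example.com/calendar?month=1",
    "https://example.com/about",
    "https://example.com/calendar?month=2",
])

fetched = [queue.next_url()]      # the crawl starts with no ignores
queue.add_ignore(r"/calendar\?")  # ignore added after the crawl has started
while (url := queue.next_url()) is not None:
    fetched.append(url)
```

Both calendar URLs were queued before the ignore existed, yet neither is fetched, because the check happens on the way out of the queue.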


