
I'm planning on burning time on building a full failure-recovery solution: records snapshotted at least daily, and any single node/service can die entirely while there is an exact, tested manual recovery checklist or an automatic failover option in place for each permutation.

This runs counter to the more cavalier "release early, polish later" advice I keep seeing. Maybe I am doubly freaked out because the things I'm storing cannot easily be recovered or re-imported by the users themselves, or regenerated by any kind of algorithm.



It also runs counter to "do the simplest thing that could possibly work" and "KISS" and "YAGNI".

Doesn't mean it's a bad idea though. But if it's that good, make sure you announce it and market it as a significant feature...


It'll be a time sink. Why not do it quick and dirty now and go back and tune it as time goes on?


See the original submission, that's why not :-)

I want not only to have a backup scheme but also to make sure it's restore-tested. Maybe I wasn't totally clear: I'm not planning on a beautiful failover in each place in the beginning (though I am planning failover for the DB at least). Just a tested (even if manual) restore procedure for each situation.
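
To make "restore-tested" concrete, here is a minimal sketch of the idea, not the actual setup. It assumes a single SQLite file as a stand-in for the real database, and the function names (snapshot, table_counts, restore_test) are made up for illustration. The point is that a snapshot only counts once it has been restored into a scratch location and checked.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def snapshot(live_path: Path, snapshot_path: Path) -> None:
    """Copy the live database into a snapshot file via SQLite's online backup API."""
    src = sqlite3.connect(live_path)
    dst = sqlite3.connect(snapshot_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()


def table_counts(conn: sqlite3.Connection) -> dict:
    """Return {table name: row count} for every table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


def restore_test(live_path: Path, snapshot_path: Path) -> bool:
    """Restore the snapshot into a scratch directory and compare it with live.

    Assumption: matching row counts plus a passing integrity check is a good
    enough signal here; a real check would probably compare content as well.
    """
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / "restored.db"
        shutil.copyfile(snapshot_path, restored)
        live = sqlite3.connect(live_path)
        copy = sqlite3.connect(restored)
        try:
            intact = copy.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
            return intact and table_counts(copy) == table_counts(live)
        finally:
            copy.close()
            live.close()
```

Run right after the daily snapshot, restore_test should come back True; a False means the backup needs attention before anyone actually depends on it. Comparing against the live database only makes sense if nothing writes in between, so on a busy system the comparison would have to be against counts recorded at snapshot time.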



