
That number sounds like really bad advice to me. It should be more like 99.99% in my experience.

Internal services have extremely low response times during normal operation (p99 around a second), but then the database starts a snapshot or a large analytics query hits on the weekend (high IO) and latency goes through the roof for a short while. Too bad if services have short timeouts: they're now all failing every request for no reason.

p99 is normal operation. Services shouldn't be configured to systematically fail for 1% of operations.
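
To make the arithmetic concrete, here is a rough sketch of what a p99 timeout versus a p99.99 timeout does to the failure rate. The latency samples are invented for illustration (not measurements from any real service), and `percentile` and `failure_fraction` are hypothetical helper names using the simple nearest-rank method.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    rank = min(max(rank, 1), len(ordered))  # clamp to a valid 1-based rank
    return ordered[rank - 1]

def failure_fraction(samples, timeout):
    """Fraction of requests that would exceed the given timeout."""
    return sum(1 for s in samples if s > timeout) / len(samples)

# Assumed, synthetic latencies in seconds: mostly fast requests, a slow
# tail, and a couple of outliers (say, during a database snapshot).
samples = [0.05] * 9950 + [1.0] * 48 + [5.0] * 2

p99 = percentile(samples, 99)
p9999 = percentile(samples, 99.99)

print("p99 timeout:", p99, "fails", failure_fraction(samples, p99))
print("p99.99 timeout:", p9999, "fails", failure_fraction(samples, p9999))
```

With these made-up numbers, the p99 timeout cuts off the whole slow tail on every run, while the p99.99 timeout lets it through. Real traffic is not this tidy, so the numbers still need to come from load testing.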



Fair enough. That's why they call out that you need to load test it and confirm that the value you set actually meets expectations. Agreed that blindly setting a value is problematic.



