I really enjoyed one of this paper's citations, "Why Do Computers Stop and What Can Be Done About It?", written by Jim Gray in 1985. It's a really nice, early look at how to build reliable systems from unreliable but independent parts. http://www.hpl.hp.com/techreports/tandem/TR-85.7.html