This book is great, but if your stats background isn't quite up to snuff, it can be an intimidating first read.
Personally, I studied Duda & Hart's pattern recognition [1] and Casella & Berger's statistics text [2] simultaneously. This took roughly the equivalent of two semesters. Duda & Hart's text gets the main ideas across without being as heavy on the probability theory / stats.
Afterwards, I studied "Elements ..." by Hastie et al., which was far more readable after going through Casella & Berger's text. Now Hastie et al. is my go-to reference. I should also note that all of this assumes you have the requisite math background: up to calc 3, linear algebra, and maybe some exposure to numerical methods (in particular, optimization).
Everyone keeps linking ESL, but really ISLr is much easier to understand, provides more important clarifying context, and covers more or less the same information.
ESL is more like a reference and prototype for ISLr.
If you don't understand something in the book, back up and learn the pre-reqs as needed.
http://web.stanford.edu/~hastie/ElemStatLearn/printings/ESLI...