Introduction to Parallel Programming
Parallel programming is easily misunderstood as "just run the program on more cores and it will be faster." Once you actually start writing code, you find that deciding how to split tasks, how to partition data, how to arrange communication, and how to control synchronization often matters more than any single kernel or single command.
This book sits alongside the server white paper because the two correspond to the two sides of learning supercomputing: one covers the hardware of the machine itself, the other covers how programs exploit that hardware. Reading it will not immediately let you write high-performance programs, but it will make the later discussions of MPI, OpenMP, GPU parallelism, and performance bottlenecks easier to follow.