For my series on System Engineering books, I will proceed with a short review of PostgreSQL Replication by Packt. The reason this book became part of my collection is that, while there is plenty of information about PostgreSQL replication out there, much of it is out of date given the overhaul of the replication system in PostgreSQL 9.X. Without further ado, here is the table of contents of the book.
- Understanding Replication Concepts
- Understanding the PostgreSQL Transaction Log
- Understanding Point-In-Time Recovery
- Setting up asynchronous replication
- Setting up synchronous replication
- Monitoring your setup
- Understanding Linux High-Availability
- Working with pgbouncer
- Working with PgPool
- Configuring Slony
- Using Skytools
- Working with Postgres-XC
- Scaling with PL/Proxy
The book gets straight down to business with an introduction to replication concepts and an explanation of why this is a hard problem that cannot have a one-size-fits-all solution. Topics such as master-master replication and sharding are addressed as well. After this short introduction, the specifics of PostgreSQL are examined, with a heavy focus on XLOG and related internals. The book strikes a nice balance in its level of detail: enough to go beyond the trivial without being overwhelming, providing a healthy amount of background information (and thank $DEITY, we are spared source code excerpts, although a few references would be nice for those willing to dig further into implementation details). With that out of the way, a whole chapter is devoted to Point-In-Time Recovery (PITR from now on). PITR is an invaluable weapon in the arsenal of any DBA and gets a fair and actionable treatment, actionable meaning that you will walk away from this chapter with techniques you can start implementing right away.
With the theory and basic vocabulary defined, the book then dives into replication. Concepts are explained, along with the drawbacks of each technique and specific technical instructions on how to set each one up, including a Q&A on common issues that you may encounter in the field.
PostgreSQL has a complex ecosystem, and once the actual built-in replication mechanisms are explained, common tools are presented (with the glaring omission of Bucardo, unfortunately). This is where the book falters a bit, given the excellent quality of the replication-related chapters. The presentation of the tools is neither even nor deep in all cases; my gripe is that the Linux-HA chapter stops just when it starts to get interesting. That said, these chapters are still better written and more concise than the information scattered around the web. I paid particular attention to the PgPool chapter, which does not cover PgPool-HA (hint: there is more than one way to do it). These chapters assume no previous exposure to the ecosystem, so they serve as a gentle (and again, actionable) introduction to the specific tools, but I would have preferred them to be 10-15 pages longer each, providing some additional information, especially on the topic of high availability. Even as they are, these chapters will save you a lot of time searching for and compiling information, filling in a few blanks along the way, so make no mistake, they are still useful. Bonus points for covering Postgres-XC, which is somewhat of an underdog.
A small detail is that the examples in the book tend to focus on Debian-based systems, so if you are administering a Red Hat derivative you will need to adapt them slightly, taking into consideration the differences in the packaging of PostgreSQL. Overall, the book goes for a broad rather than deep approach and can serve as a more than solid introductory volume. Inevitably, there is some overlap with the official PostgreSQL manuals, which is to be expected given that they are great. The quality of the book is on par with other Packt Publishing titles, making this an easy-to-read book that will save you a lot of time for certain use cases.