Book Review: PostgreSQL Replication

As part of my series on systems engineering books, here is a short review of PostgreSQL Replication from Packt. This book joined my collection because, while there is plenty of information about PostgreSQL replication out there, much of it is out of date given the overhaul of the replication system in PostgreSQL 9.x. Without further ado, here is the table of contents.

  • Understanding Replication Concepts
  • Understanding the PostgreSQL Transaction Log
  • Understanding Point-In-Time Recovery
  • Setting up asynchronous replication
  • Setting up synchronous replication
  • Monitoring your setup
  • Understanding Linux High-Availability
  • Working with pgbouncer
  • Working with PgPool
  • Configuring Slony
  • Using Skytools
  • Working with Postgres-XC
  • Scaling with PL/Proxy
    The book gets straight down to business with an introduction to replication concepts and an explanation of why replication is a hard problem with no one-size-fits-all solution. Topics such as master-master replication and sharding are addressed as well. After this short introduction, the specifics of PostgreSQL are examined, with a heavy focus on XLOG and related internals. The level of detail is nicely balanced: enough to get past the trivial without being overwhelming, providing a healthy amount of background information (and thank $DEITY, we are spared source code excerpts, although a few references would be nice for those willing to dig further into implementation details). With that out of the way, a whole chapter is devoted to Point-In-Time Recovery (PITR from now on). PITR is an invaluable weapon in the arsenal of any DBA and gets a fair and actionable treatment, actionable meaning that you will walk away from this chapter with techniques you can start implementing right away.
    With the theory and basic vocabulary defined, the book then dives into replication. The concepts and the drawbacks of each technique are explained, along with specific technical instructions on how to set each one up, including a Q&A on common issues that you may encounter in the field.
    PostgreSQL has a complex ecosystem, and once the built-in replication mechanisms are explained, common tools are presented (with the glaring omission of Bucardo, unfortunately). This is where the book falters a bit, given the excellent quality of the replication-related chapters. The presentation of the tools is neither even nor deep in all cases; my main gripe is that the Linux-HA chapter stops just when it starts to get interesting. That said, these chapters are still better written and more concise than the information scattered around the web. I paid particular attention to the PgPool chapter, which does not cover PgPool-HA (hint: there is more than one way to do it). These chapters assume no previous exposure to the ecosystem, so they serve as a gentle (and again, actionable) introduction to the specific tools, but I would have preferred them to be 10-15 pages longer each, with some additional information, especially on the topic of high availability. Even as they stand, these chapters will save you a lot of time searching and compiling information, and will fill in a few blanks along the way, so make no mistake, they are still useful. Bonus points for covering Postgres-XC, which is somewhat of an underdog.
    A small detail is that the examples in the book tend to focus on Debian-based systems, so if you are administering a Red Hat derivative you will need to adapt them slightly to account for the differences in how PostgreSQL is packaged. Overall, the book goes for a broad rather than deep approach and can serve as a more than solid introductory volume. Inevitably, there is some overlap with the official PostgreSQL manuals, which is to be expected given how good they are. The quality of the book is on par with other Packt Publishing titles, making this an easy-to-read book that will save you a lot of time for certain use cases.

Book Review: Web Operations: Keeping the Data on Time

For the kickoff of my systems engineering book reviews I have chosen this book. While it is not technical in the strict sense of the term (if you are looking for code snippets or ready-to-use architecture ideas, look elsewhere), this collection of 17 essays provides a bird's-eye view of the relatively new discipline of Web Operations. As you will see from the short TOC below, no stone is left unturned, and broad coverage is given to subjects ranging from NoSQL databases to community management (and all the points in between). This is what you will be getting:

  1. Web Operations: The career
  2. How Picnik Uses Cloud Computing: Lessons Learned
  3. Infrastructure and Application Metrics
  4. Continuous Deployment
  5. Infrastructure As Code
  6. Monitoring
  7. How Complex Systems Fail
  8. Community Management and Web Operations
  9. Dealing with Unexpected Traffic Spikes
  10. Dev and Ops Collaboration and Cooperation
  11. How Your Visitors Feel: User-Facing Metrics
  12. Relational Database Strategy and Tactics for the Web
  13. How to Make Failure Beautiful: The Art and Science of Postmortems
  14. Storage
  15. Nonrelational Databases
  16. Agile Infrastructure
  17. Things That Go Bump in the Night (and How to Sleep Through Them)

Where does one start? A chapter-by-chapter play-by-play is not the best way to go: the chapters are short and to the point and use a variety of formats (one of them is a long interview, for example), so I am going to talk about the overall feel of the book.
The roll call of the book is impressive. I am sure that if you have worked in the field for a little while, names like Theo Schlossnagle, Baron Schwartz, Adam Jacob and Paul Hammond speak for themselves. Every chapter serves as a gentle introduction to its subject matter, which is to be expected, as the topics are quite deep and each one comes with a huge bibliography of its own. What I particularly like about this book is not only the gentle introduction but also that it is written in a way that makes it approachable to technical managers, team leaders and CTOs; the chapter on postmortems and the ones on metrics are prime examples of this. What is awesome is that the book helps you identify problem areas in your current business (for example, not using a configuration management tool such as Puppet or Chef) and provides you with actionable information. Extra points for openly acknowledging failure: there are more than two chapters related to it (as the saying goes, if you operate at internet scale, something is always on fire, someplace), including a chapter on how to conduct efficient postmortems. Even non-technical areas such as community management are covered, illustrating that running an internet business today is not only about technology.
Your experience with this book will vary greatly. If you are new to the topics at hand, you will probably benefit from reading each and every chapter and then revisiting the book from time to time; as your experience grows, so will the number of useful ideas you can get out of it. If you are an experienced professional, this book might not be an epiphany, but there is still useful content to apply, and a few additional viewpoints might present themselves.
Overall? An excellent book for everyone involved in running an internet business with a lot of value and a long shelf life.
A final nice touch is that the proceeds from this book go to charity.