Adventures with Linux Outline Client and aws-iam-authenticator

Hi all,

below is a small engineering puzzle that I had to solve recently. The essential components:

  • a Linux Laptop (in my case, running the excellent ClearLinux distribution)
  • aws-iam-authenticator
  • Outline client (A shadowSocks client)

The setup was the following. A Kubernetes cluster, a bastion host using Outline as the means to connect and access the cluster. In the ~/.kube/config you can see the following stanza:



    exec:
      apiVersion:
      args: ["token", "--cache", "-i", ""]
      command: aws-iam-authenticator


Issuing commands such as kubectl get pods would fail with a DNS resolution error whenever the Outline Client was enabled. The root cause was that, in our setup, UDP traffic was disabled over Outline. Outline, for its part, would take over /etc/resolv.conf and add an options use-vc line, indicating that ALL DNS resolutions should happen over TCP.
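
If you want to confirm that you are hitting the same condition, a quick check along these lines may help. This is only a sketch: the function names and the sample resolv.conf contents (including the 192.0.2.53 address) are made up for illustration, and it merely looks for the use-vc flag on options lines; it does not perform any DNS resolution itself.

```python
def resolv_conf_options(text: str) -> set[str]:
    """Collect every flag found on the `options` lines of a resolv.conf body."""
    options: set[str] = set()
    for raw_line in text.splitlines():
        # Lines starting with '#' or ';' are comments in resolv.conf.
        if raw_line.startswith(("#", ";")):
            continue
        fields = raw_line.split()
        if fields and fields[0] == "options":
            options.update(fields[1:])
    return options


def forces_tcp(text: str) -> bool:
    """True when the resolver is told to use TCP for every query (use-vc)."""
    return "use-vc" in resolv_conf_options(text)


# Hypothetical file contents, similar to what a VPN client might write.
sample = """nameserver 192.0.2.53
options use-vc
"""
print("DNS forced over TCP:", forces_tcp(sample))
```

To run it against the real file, read /etc/resolv.conf and pass its contents to forces_tcp().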

aws-iam-authenticator communicates under the hood with and attempts to resolve this hostname using UDP. This does not play well with the existing Outline Client setup and eventually will fail with an i/o timeout along the lines of ->

The easiest way I have found to fix this was the following: modify the routing table AFTER Outline client takes over. For my home network this can look along the lines of:

sudo route add -host gw wlp2s0

and Presto! DNS resolution works again for aws-iam-authenticator and kubectl workflow can proceed as normal. I tried experimenting with

export GODEBUG=netdns=cgo
export GODEBUG=netdns=go

but neither flavor of the resolver honored options use-vc.

Hope this is helpful to other people! Until next time!


Book Review: Managing Kubernetes

So as 2018 comes to a close, one fact can be pointed out: Kubernetes is the winner of the container orchestration framework “war”, short-lived as it was. The popularity of the project is growing steadily and it is being adopted in a variety of businesses, from the small but technologically adept startup to large, multi-department enterprises. The “big three” cloud providers offer managed Kubernetes versions and there is a steadily growing ecosystem taking care of the various needs that end users might have. This is reflected in the relevant technical literature, where most major publishing houses have released Kubernetes-related books. The focus of these books is usually on migrating an application stack to Kubernetes, or on the basic Kubernetes building blocks. The book chosen for this review is different: it is aimed at system engineers (or “DevOps” or “SREs” or whatever term/methodology your organization is using) and strives to provide actionable information on managing Kubernetes clusters. Let’s start with the ToC:

  • Introduction
  • An Overview of Kubernetes
  • Kubernetes Architecture
  • The Kubernetes API server
  • Scheduler
  • Installing Kubernetes
  • Authentication and User Management
  • Authorization
  • Admission Control
  • Networking
  • Monitoring Kubernetes
  • Disaster Recovery
  • Extending Kubernetes
  • Conclusions

As can already be seen from the ToC, this book is a departure from the usual Kubernetes literature, most of which concerns itself with application deployment and application lifecycle on the cluster. Another interesting fact about this book is that it assumes you run a self-managed Kubernetes cluster. The peculiarities of running Kubernetes-as-a-service are not covered; however, the knowledge contained in this book will be useful even when troubleshooting managed installations. From the introduction, it becomes evident that the focus of the book is to prepare the reader to respond when things do not work as expected (or plainly go wrong), to fine-tune and optimize clusters and, finally, to extend the system with custom or new functionality.

Once the objectives are set, the book gives an essential introduction to Kubernetes application building blocks, starting from the bare essentials, like Pods and ReplicaSets and progressing to more advanced topics. However, as discussed, this is not the focus of the book so the section is quite short.

In the next chapter, the basics of Kubernetes architecture (which in itself is a distributed system) are laid down. The chapter is broken down into Concepts, Structure and Components, each of which gets a concise, yet information-packed, treatment. With that out of the way, the book moves on to the Kubernetes API Server, describes its structure in detail (which should not be that unfamiliar to those who have worked with well-designed REST APIs before) and ends with some debugging tips. This is followed by a short section on scheduling. While only a high-level overview of the algorithm itself is given, we at least get a treatment of affinities, taints and the rest.

As stated before, Kubernetes is itself a distributed system. There is more than one way to install it; the book chooses to focus on kubeadm, a sane choice given that quite a few installation tools use kubeadm under the hood.

The next two chapters deal with Authentication and Authorization (A1 and A2 for all you infosec acronym fans). Kubernetes itself supports a few authentication mechanisms, with client certificates likely to be the most common, and the most important ones are examined. With that out of the way, the book discusses, again in a concise way, Kubernetes authorization and admission control. While authorization has more or less settled down by now, admission control is a rapidly moving target. This is properly pointed out by the authors, who pass along enough knowledge for the reader to keep following the topic.

The next chapter dives into one of the most important, yet often overlooked, aspects of Kubernetes: networking. As stated already, Kubernetes is a distributed system and networking between components is a crucial aspect. Again, we get a concise treatment of the Container Network Interface and Service Discovery, and even service meshes get a mention at the end of the chapter.

Running a cluster means you should have an x-ray into the cluster, thus the next chapter is concerned with monitoring a Kubernetes cluster, as well as with an introduction to monitoring applications inside the cluster. This should give the operator enough insight into the cluster health at any given moment. What to do when the cluster health is not nominal is the topic of the penultimate chapter, namely disaster recovery. When running a distributed system, a lot of different aspects of the system can fail and, as an operator, you should be prepared to respond and restore the functionality of the cluster to a well-defined state. Certain failure points and tools for recovery are discussed. The book closes off with a chapter on extending Kubernetes, somewhat surprisingly using JavaScript as the language for the hands-on example.

Overall, this is a must-have purchase if you operate Kubernetes clusters or if you are interested in how Kubernetes operates under the hood. However, the book tries to cover a lot of material in a relatively short length of 170 pages, so it remains concise at all times; the material covered within could easily expand to twice or even thrice the size of the book. Still, given that this is a pioneering book and that it is written at a higher level of abstraction that extends its lifetime (something really important when dealing with a rapidly moving target such as Kubernetes), I would recommend it, and not just on the basis of the impressive credentials of the authors. I really look forward to this book being used as a seed for a series of further literature (printed or electronic) dealing more with the operational dimensions of Kubernetes.


Hello world! (for the 3rd time)

Hope you enjoyed Commodity, and Cyberpunk as a Commodity before that, plus a number of personal blogs that thankfully are now presumed lost. Welcome back!


Book Review: Systems Performance: Enterprise and the Cloud

Welcome back for another book review. This time, I am going to review a book that I bought when it came out, in late 2013. I have always wanted to do a review of this one, but it seemed I had two options:

  1. Write a short review that probably does not do the book justice.
  2. Postpone the review for a more suitable time, when $IRL and $DAYJOB allow …

I opted for the second option, as I consider this book to be indispensable (yes, this is going to be a positive review). So, here is the table of contents:

  1. Introduction
  2. Methodology
  3. Operating Systems
  4. Observability Tools
  5. Applications
  6. CPUs
  7. Memory
  8. File Systems
  9. Disks
  10. Network
  11. Cloud Computing
  12. Benchmarking
  13. Case Study
  14. Appendices (which you SHOULD read)

Wow, a lot of content, huh? (Something to be expected, given that the book is 700+ pages long.) Do not let the size daunt you, however. Chapters are self-contained, as the author understands that the book might be read under pressure, and contain useful exercises at the end.
What really makes this book stand out is not the top-notch technical writing or the abundance of useful one-liners; it is the fact that the author moves forward and suggests a methodology for troubleshooting and performance analysis, as opposed to the ad-hoc methods of the past (best case scenario a checklist, and $DEITY forbid the use of the “blame someone else” methodology). In particular, the author suggests the USE methodology, USE standing for Utilization – Saturation – Errors, to methodically and accurately analyze and diagnose problems. This methodology (which can be adapted/expanded at will; last time I checked the book was not written in stone) is worth the price of the book alone.
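
To make the idea a bit more concrete, here is a minimal sketch of what a USE-style check could look like. It is an illustration only and is not taken from the book: the ResourceSample and use_findings names are hypothetical, and the 70% utilization threshold is an arbitrary assumption. The point is simply that, for every resource, all three questions (utilization, saturation, errors) get asked.

```python
from dataclasses import dataclass


@dataclass
class ResourceSample:
    """One observation of a resource (hypothetical structure)."""
    name: str
    utilization: float  # fraction of the interval the resource was busy, 0.0 to 1.0
    saturation: int     # amount of queued or waiting work, e.g. run-queue length
    errors: int         # error events counted during the interval


def use_findings(sample: ResourceSample, busy_threshold: float = 0.7) -> list[str]:
    """Return one finding per USE dimension that looks suspicious."""
    findings = []
    # Utilization: flag resources that are busy above the assumed threshold.
    if sample.utilization >= busy_threshold:
        findings.append(f"{sample.name}: utilization {sample.utilization:.0%}")
    # Saturation: any queued work means demand exceeded capacity.
    if sample.saturation > 0:
        findings.append(f"{sample.name}: saturation {sample.saturation}")
    # Errors: any error count is worth a look.
    if sample.errors > 0:
        findings.append(f"{sample.name}: errors {sample.errors}")
    return findings


# Made-up numbers for a CPU that is busy and has runnable threads waiting.
for finding in use_findings(ResourceSample("cpu", 0.93, 4, 0)):
    print(finding)
```

In practice the numbers would come from whatever observability tools you have at hand; the sketch only captures the shape of the checklist.
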
The author correctly maintains that you must have an X-ray (so to speak) of the system at all times. By utilizing tools such as DTrace (available for Solaris and BSD) or the Linux equivalent SystemTap, much insight can be gained into the internals of a system.
Chapters 5-10 are self-explanatory: the author presents what the chapter is about, common errors and common one-liners used to diagnose possible problems. As said before, chapters aim to be self-contained and can be read while actually troubleshooting a live system, so no lengthy explanations there. At the end of each chapter, the bibliography section provides useful pointers towards resources for further study, something that is greatly appreciated. Finally, the exercises can easily be transformed into interview questions, which is another bonus.
Cloud computing and the special considerations it presents get their own chapter, and the author tries to keep it platform agnostic (even if employed by a “Cloud Computing” company), which is a nice touch. This is followed by a chapter of useful advice on how to actually benchmark systems, and the book ends with a, sadly too short, case study.
The appendices that follow should be read, as they contain a lot of useful one-liners (as if the ones in the book were not enough), concrete examples of the USE method, a guide to porting DTrace to SystemTap and a who-is-who in the world of systems performance.
So how to sum up the book? “Incredible value” is one thought that comes to mind, “timeless classic” is another. If you are a systems {operator|engineer|administrator|architect}, this book is a must-have and should be kept within reach at all times. Even if your $DAYJOB does not have systems in the title, the book is going to be useful if you have to interact with Unix-like systems on a frequent basis.
PS. Some reviews of this book complain about the binding of the book. In the three physical copies that I have seen with my own eyes, the binding was of the highest quality, so I do not know if this complaint is still valid.


Conference review: Distributed Matters Berlin 2015

“Kept you waiting, huh?” – to start the post with a pop culture reference.
Yesterday, I was privileged enough to attend Distributed Matters Berlin 2015. The focus of the conference is, you guessed it, distributed systems, often within a NoSQL context. It was hosted at the awesome KulturBrauerei, a refurbished brewery. The format of the conference was 45-minute presentations, including Q&A, thankfully followed by a 15-minute break between talks, in two tracks. The overall level of the presentations was above average and, given that you could only attend one at a time, it made for some hard choices.
Owing to the greatness of Berlin taxi drivers (you know what I am talking about if you used a taxi in Berlin recently), I managed to attend only half of the keynote by @aphyr, so I am not going to comment on this one. My main takeaway is “always, always read the documentation carefully”.
The next presentation I attended was NoSQL meets Microservices, by Michael Hackstein. This one was labelled as beginner. It presented the main paradigms of the NoSQL landscape (KV/Graph/Document) and certain topologies, followed by a presentation of the new-ish ArangoDB, a NoSQL database based on V8 JavaScript that claims to support all three paradigms at once, eliminating the need for multiple network hops. Overall, it was well presented, if a tad on the product side, and it served nicely to kick off my conference experience.
After the coffee break, where I was lucky enough to meet some old colleagues from $DAYJOB-1, I attended A tale of queues, from ActiveMQ over Hazelcast to Disque. @xeraa presented his journey with various queueing solutions. He kicked off by stating that the hard problem in distributed systems is exactly once delivery and guaranteed delivery. He then presented the landscape of existing message queues, giving the rationale behind deciding what to use and, more importantly, what not to use. The talk was quite technical, giving me a lot of pointers for future research, overall a solid talk, well done!
It was followed by @pcalcado and No Free Lunch, Indeed: Three Years of Microservices at Soundcloud. Phil has amazing presentation skills and described the journey of Soundcloud from a monolithic Ruby on Rails app towards a microservices-oriented architecture. What I liked most about this presentation was not just the great technical content but also the honesty. Evolving your architecture is no trivial task and the road to it is full of potential pitfalls. Phil was kind enough to share some of his hard-gained experience with us, greatly appreciated.
The lunch break was BAD, ’nuff said. The queue was too long and, by the time I got to the food, the good stuff was gone.
After lunch, I attended Scalable and Cost Efficient Server Architecture by Matti Palosuo. One of the more solid talks, this no-frills presentation did what it said on the tin: it presented the service infrastructure behind EA’s Sim City Build It mobile title. Dealing with mobile, casual games presents a unique challenge service-wise and Matti covered all angles in his presentation, diving deep into the specifics of their implementation.
The next presentation was Containers! Containers! Containers! And now? by Michael Hausenblas. I am not going to comment a lot on this one, since it had no slides and it was more like a tech demo. Mesos is an AMAZING product and I would have preferred some technical discussion, as opposed to a hands-on demo, but hey! this is just me.
Microservices with Netflix OSS and Spring Cloud by Arnaud Cogoluegnes was the next presentation that I attended. It focused on FOSS software by Netflix and how it can be utilized in the form of Java decorators within an application context. Useful and well presented; the only thing I personally did not like was certain slides full of code, but this does not take away from the value of the presentation. A bonus point is that, for a Java engineer, this presentation was immediately actionable, with some nice coding takeaways.
Before proceeding with the next presentation, the astute reader of this blog should have noticed by now a pattern forming: microservices. The topic of the next talk was no exception: Microservices – stress-free and without increased heart-attack risk by Uwe Friedrichsen. I really loved this talk. Uwe has a strong opinion regarding microservices (and the experience to back it up). In a nutshell, while microservices can be viable, one should keep a clear head and not fall into the trap of hype-driven architecture. This was my favorite talk of the conference and without further ado, here are the slides. I cannot speak more highly about this presentation so please, have a look at the slides. It was extremely nice to see the microservices hype deconstructed and a realistic case presented.
It was time for the last talk. The choice was between Antirez’s disque implementation talk and Just Queue it! by Marcos Placona. I decided to give the underdog a chance, given that almost everyone went to Antirez’s presentation (which I am sure was excellent), and went to Marcos’ presentation instead. I was not disappointed: Marcos described his experience with using MQ while migrating a project and gave another overview of the MQ landscape.
After that, I had some food and some orange juice and decided to call it a day. Overall, it was quite a nice conference, good talks, not a lot of marketing and I will definitely visit the next one, if I am able. Met some interesting people as well and grabbed a lot of pointers for future research. Kudos to the organizers.
See you at DevOps Days Berlin 2015.


Book Review: DevOps Troubleshooting

Hello everyone and welcome back for another book review at woktime. Today’s edition is a short review of a short book called “DevOps Troubleshooting: Linux Server Best Practices”. Without further ado, below is the Table Of Contents

  1. Troubleshooting best practices
  2. Why is the server so slow? Running out of CPU, RAM and Disk I/O
  3. Why won’t the system boot? Solving boot problems
  4. Why can’t I write to the disk? Solving full or corrupt disk issues
  5. Is the server down? Tracking down the source of network problems
  6. Why won’t the hostnames resolve? Solving DNS server issues
  7. Why didn’t my email go through? Tracing email problems
  8. Is the website down? Tracking down web server problems
  9. Why is the database slow? Tracking down database problems
  10. It’s the hardware’s fault? Diagnosing common hardware problems

So let’s start with the title. “DevOps” can be an overloaded term: it means different things to different people and unfortunately an “according-to-Hoyle” definition does not exist. I belong to the school of thought that DevOps is more of a cultural movement within an organization than, say, a specific job title, so the title of the book, “DevOps Troubleshooting”, is meaningless (I would have strongly preferred the term “Linux Systems Troubleshooting”, as it would have been more accurate for reasons that I am going to explain below).
The author is clearly experienced within the realm of Linux administration and he attempts to cover a broad range of topics. The book is approximately 205 pages long, which means that it never gets too deep into a subject, opting instead to cover as many topics as possible. The writing style of the author is quite readable and he goes out of his way to explain things in relative detail. On the really plus side of the book, there are no glaring errors: the proofreaders and the author really did go the extra mile to ensure that the content is accurate in the vast number of examples this book provides.
However, my gripe with the book is that the material covered is really basic. Granted, the intended audience is not a veteran system administrator or engineer; this book, by its own admission, is aimed at developers or QA personnel who, owing to some definition of DevOps, are thrown into operational duties. The author makes an effort NOT to use random-based troubleshooting; however, a complete methodology is never introduced.
Overall, this is a well-written book that provides value to a non-operations member of a team doing operations or to a novice system administrator. Its small size makes it portable enough to be carried around as a level-1 reference; however, for system-level debugging there are better options out there (keep watching this space for the definitive follow-up on this sentence).


Book Review: PostgreSQL Replication

So for my series of System Engineering books, I will proceed with a short review of PostgreSQL Replication by Packt. The reason this book came to be a part of my collection is that while there is a lot of information regarding PostgreSQL replication out there, a lot of it is out of date, given the overhaul of the replication system in PostgreSQL 9.X. Without further ado, here is the list of contents of the book.

  • Understanding Replication Concepts
  • Understanding the PostgreSQL Transaction Log
  • Understanding Point-In-Time Recovery
  • Setting up asynchronous replication
  • Setting up synchronous replication
  • Monitoring your setup
  • Understanding Linux High-Availability
  • Working with pgbouncer
  • Working with PgPool
  • Configuring Slony
  • Using Skytools
  • Working with Postgres-XC
  • Scaling with PL/Proxy

The book gets straight into business with an introduction to replication concepts, and why this is a hard problem that cannot have a one-size-fits-all solution. Topics such as master-master replication and sharding are addressed as well. After this short introduction, specifics of PostgreSQL are examined, with a heavy focus on XLOG and related internals. The book goes into a nicely balanced amount of detail, detailed enough to surpass the trivial level but not overwhelming (and thank $DEITY, we are spared source code excerpts, although a few references would be nice for those who are willing to dig further into implementation details), providing a healthy amount of background information. With that out of the way, a whole chapter is devoted to the topic of Point-In-Time Recovery (PITR from now on). PITR is an invaluable weapon in the arsenal of any DBA and gets a fair and actionable treatment, actionable meaning that you will walk away from this chapter with techniques you can start implementing right away.

With the theory and basic vocabulary defined, the book then dives into replication. Concepts are explained, as well as the drawbacks of each technique, along with specific technical instructions on how to get there, including a Q&A on common issues that you may encounter in the field.

PostgreSQL has a complex ecosystem and once the actual built-in replication mechanisms are explained, common tools are presented (with the glaring omission of Bucardo, unfortunately). This is where the book falters a bit, given the excellent quality of the replication-related chapters. The presentation of the tools is neither even nor deep in all cases; my gripe is that the Linux-HA chapter stops when it starts to get interesting. Having pointed this out, these chapters are still better written and more concise than the information scattered around the web. I have paid particular attention to the PgPool chapter, which does not cover PgPool-HA (hint: there is more than one way to do it). These chapters assume no previous exposure to the ecosystem, so they serve as a gentle (and again, actionable) introduction to the specific tools, but I would have preferred them to be 10-15 pages longer each, providing some additional information, especially on the topic of high availability. Even as-is, these chapters will save you a lot of time searching and compiling information, filling in a few blanks along the way, so, make no mistake, they are still useful. Bonus points for covering Postgres-XC, which is somewhat of an underdog.

A small detail is that examples in the book tend to focus on Debian-based systems, so if you are administering a Red Hat derivative you should adapt the examples slightly, taking into consideration the differences in the packaging of PostgreSQL. Overall, the book goes for a broad as opposed to deep approach and can serve as a more than solid introductory volume. Inevitably, there is an overlap with the official PostgreSQL manuals, which is to be expected given that they are great. The quality of the book is on par with other Packt Publishing titles, making this an easy-to-read book that will save you a lot of time for certain use cases.

Book Review: Web Operations: Keeping the Data on Time

For my kickoff of systems engineering book reviews I have chosen this book. While not technical in the strict sense of the term (if you are looking for code snippets or ready-to-use architecture ideas, look elsewhere), this collection of 17 essays provides a bird’s-eye view of the relatively new discipline of Web Operations. As you will see from the short TOC below, no stone is left unturned and broad coverage is given to a range of subjects, from NoSQL databases to community management (and all the points in between). This is what you will be getting:

  1. Web Operations: The career
  2. How Picnik Uses Cloud Computing: Lessons Learned
  3. Infrastructure and Application Metrics
  4. Continuous Deployment
  5. Infrastructure As Code
  6. Monitoring
  7. How Complex Systems Fail
  8. Community Management and Web Operations
  9. Dealing with Unexpected Traffic Spikes
  10. Dev and Ops Collaboration and Cooperation
  11. How Your Visitors Feel: User-Facing Metrics
  12. Relational Database Strategy and Tactics for the Web
  13. How to Make Failure Beautiful: The Art and Science of Postmortems
  14. Storage
  15. Nonrelational Databases
  16. Agile Infrastructure
  17. Things That Go Bump in the Night (and How to Sleep Through Them)

Where can one start? Giving a chapter-by-chapter play is not the preferred way: chapters are short and to the point and use a variety of formats (one of them is a long interview, for example), so I am going to talk about the overall feel of the book.
The roll-call of the book is impressive. I am sure that if you have worked in the field for a little while, names like Theo Schlossnagle, Baron Schwartz, Adam Jacob, Paul Hammond et al. speak for themselves. Every chapter serves as a gentle introduction to the relevant subject matter; this is to be expected, as the topics are quite deep and each one carries a huge bibliography of its own. What I particularly like about this book is not only the gentle introduction: it is also written in a way that makes it approachable to technical managers, team leaders and CTOs. Chapters such as the one on postmortems and the ones on metrics are prime examples of this. What is awesome is that the book helps you identify problem areas in your current business (for example, the lack of configuration management such as Puppet or Chef) and provides you with actionable information. Extra points for openly acknowledging failure: there are more than two chapters related to it (as the saying goes, if you operate at internet scale, something is always on fire, someplace), including a chapter on how to conduct efficient postmortems. Even non-technical areas such as community management are covered, illustrating that not everything in running an internet business today is about technology.
Your experience with this book will vary greatly. If you are new to the topics at hand, then you might benefit from reading each and every chapter of the book and then revisiting it from time to time; as your experience grows, so will the number of useful ideas you can get out of this book. If you are an experienced professional, while this book might not be an epiphany, there is still useful content to apply and perhaps a few additional viewpoints might present themselves.
Overall? An excellent book for everyone involved in running an internet business, with a lot of value and a long shelf life.
A final nice point is that proceeds from this book go to a charity, which is a nice touch.


Coming up on Commodity

For the past few months I have been silent, with the last entry being a re-blog from xorl’s (defunct?) blog. That is quite a long time for a writer’s block, eh? Well, here is some insight: professionally I have somewhat moved away from security towards a systems engineering paradigm. While security still plays an important part both professionally and in my personal time, it is not the dominant focus. Building systems engineering skills is hard work, especially if you focus on the engineering part as opposed to the systems part (e.g. systems administrator and systems engineer should not be interchangeable terms). My plan is to publish reviews of books and other resources that I found helpful during my journey, as well as some original hacks that I have made. I have a strict policy of not posting stuff related to $DAYJOB but I am more than willing to share some nuggets of experience. So stay tuned and say hi to the revitalized Commodity blog!


Introduction to Sensu


Slide deck for an internal presentation I gave on Sensu a few months ago.