Debugging distributed systems
Nowadays, most software projects are distributed systems: components run on different networked computers, which communicate and coordinate their actions by passing messages. Debugging distributed systems is not easy. When two components don’t play nicely together, the cause could be virtually anything: software, DNS, routing, firewalls, proxies, load balancers, TLS, and more! In this talk, I’ll share my experience with debugging distributed systems. We’ll look at typical issues, and I’ll introduce a structured way to debug them and find their root causes. We’ll dive into networking, infrastructure, logging/tracing/metrics, testing, remote debugging, and more. I’ll share lots of examples and war stories along the way. After this talk, you’ll have practical knowledge of how and where to get started with debugging distributed systems yourself!