- Brendan Gregg on observability: “Tools, data sources, methods of understanding how a system is operating.” All without having to change code. He talks about observing vs. experimenting, which comes down to tools that only read system state vs. tools that perturb the system.
- Dropbox disaster recovery planning: The story of how Dropbox arrived at an active-passive model for architecting services, and how they test their system to make sure it can survive a full datacenter loss. Good stuff! It took years of planning, engineering, and repeated attempts to get to where they are …
- Solving hard problems: Sounds like the pseudocode-driven development described in Code Complete. Nice article about his process. I probably do something similar for harder things … :)