• The Complexity of Simplicity: Bryan Cantrill talks about the different kinds of complexity: some is intrinsic to a problem, and some is introduced by well-intentioned developers and operators. Great talk.
  • P99 CONF 2025 Finding Performance Needles in Haystacks with APerf by Geoffrey Blake: APerf sounds like a neat little tool. It seems to grab metrics from most of the places mentioned in Systems Performance by Brendan Gregg, all at once.
  • Strange Loop Phenomena: System component A depends on system component B that also depends on A. Most of the time this can be ok but during an incident weird behaviour may emerge!
  • Cloudflare outage on November 18, 2025: Cloudflare went down hard last week and took us with them. We’ll have to think about how to survive this kind of major service provider outage in the future. Sounds like they auto-regen configuration for their proxy layer and that was buggered up.
  • Dependency cooldowns: We’ve talked about doing something like this at work. For the kind of 0-day vulnerability that we see most, it turns out that just waiting a bit before picking up a new version will get you a long way to safety.
  • What Now? Handling Errors in Large Systems: When to crash a process on error vs. try to carry on is an interesting question. Mostly, carrying on is a good thing, but you should be able to get to a stable state if that’s what you decide to do …
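
For the dependency cooldown idea in the list above, here is a rough sketch of what "waiting a bit" could look like. It assumes you already have a list of (version, publish time) pairs from your package index. The function name `pick_version` and the 7-day default are made up for illustration and are not taken from the linked article or any particular tool.

```python
from datetime import datetime, timedelta, timezone


def pick_version(releases, now, cooldown=timedelta(days=7)):
    """Return the most recently published version older than the cooldown.

    `releases` is a hypothetical list of (version, published_at) pairs.
    Returns None when no release has aged past the cooldown yet.
    """
    eligible = [
        (published_at, version)
        for version, published_at in releases
        if now - published_at >= cooldown
    ]
    if not eligible:
        return None
    # Newest publish time among the releases that have cooled down.
    return max(eligible)[1]


# Illustrative data only, not real release dates.
now = datetime(2025, 11, 25, tzinfo=timezone.utc)
releases = [
    ("1.0.0", datetime(2025, 10, 1, tzinfo=timezone.utc)),
    ("1.1.0", datetime(2025, 11, 10, tzinfo=timezone.utc)),
    ("1.2.0", datetime(2025, 11, 23, tzinfo=timezone.utc)),
]
print(pick_version(releases, now))
```

With this data the two-day-old 1.2.0 is skipped in favour of 1.1.0. The trade-off is that genuine security fixes are delayed by the same window, so a real setup would probably want a way to override the cooldown.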