- Postmortem writeups: The netflix postmortem template is interesting. :) Reads like a novel or play. Story telling is a theme that came up in the devops / production support / incident response community lately as a way to make this info more approachable and fun to read by others
- Scaling Kafka @ Honeycomb: Honeycomb’s journey of reliability and growth with kafka which by the sound of it is a critical part of their system. (Allows them to buffer events coming from clients before being accepted by long term storage). They use aws graviton2 servers for this and own kafka uptime. ie They’re not using a managed service for it. Storage tiering is a neat idea - hot data on near, fast disks (nvme) -> ebs -> s3
- Data consistency
- Data reconciliation