• Prompt caching: 10x cheaper LLM tokens, but how?: A lot of compute is wasted when generating output token by token. LLMs work by feeding the same prompt into the model over and over, each time one token longer than the last, redoing the same work plus a bit more for the next token; caching avoids repeating that work. (A toy sketch of the idea follows after this list.)
  • How to think about durable execution: A transaction is automated as a series of idempotent steps (sometimes the steps must run in a specific order) with state stored along the way, so an orchestrator can ensure every step completes successfully and retry any step if needed. Nice programming model, this. (Temporal is similar.) See the second sketch after this list.
  • Cooking with Claude: Nice use of an LLM to help cook two meal-kit dinners at once
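
Not from the article, just a toy cost model of the waste described in the prompt-caching item: it counts how many token positions get processed with and without reusing the prompt's intermediate state. The numbers and function are made up for illustration; real engines cache per-layer key/value tensors rather than a simple counter.

```python
def generation_cost(prompt_len: int, output_len: int, cached: bool) -> int:
    """Total token positions processed to generate `output_len` tokens."""
    cost = 0
    for step in range(output_len):
        if cached:
            # Prompt is processed once; afterwards only the newest token costs work.
            cost += prompt_len if step == 0 else 1
        else:
            # Re-feed the entire sequence, one token longer each time.
            cost += prompt_len + step
    return cost

if __name__ == "__main__":
    prompt, output = 2000, 100   # e.g. a long system prompt, a short answer
    naive = generation_cost(prompt, output, cached=False)
    cached = generation_cost(prompt, output, cached=True)
    print(f"naive: {naive:,} positions, cached: {cached:,} positions "
          f"(~{naive / cached:.0f}x less work with caching)")
```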
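
And a bare-bones sketch of the durable-execution model from the second item: hypothetical steps (charge_card, reserve_stock, send_email) run in order, with a JSON file standing in for the durable state store, so a crash or restart resumes from the first incomplete step. Real systems like Temporal persist event histories and handle retries far more robustly.

```python
import json
import os

STATE_FILE = "workflow_state.json"  # stand-in for a durable store (database, event log, ...)

# Hypothetical, idempotent steps of one "transaction", to run in order.
def charge_card(order):   print(f"charging card for {order}")
def reserve_stock(order): print(f"reserving stock for {order}")
def send_email(order):    print(f"emailing confirmation for {order}")

STEPS = [("charge_card", charge_card),
         ("reserve_stock", reserve_stock),
         ("send_email", send_email)]

def load_state() -> dict:
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)
    return {"completed": []}

def save_state(state: dict) -> None:
    with open(STATE_FILE, "w") as f:
        json.dump(state, f)

def run_workflow(order: str, max_retries: int = 3) -> None:
    """Orchestrator: run each step, persisting progress after each success."""
    state = load_state()
    for name, step in STEPS:
        if name in state["completed"]:
            continue  # done on a previous run; idempotency makes skipping safe
        for attempt in range(1, max_retries + 1):
            try:
                step(order)
                break
            except Exception as exc:
                print(f"{name} failed (attempt {attempt}): {exc}")
                if attempt == max_retries:
                    raise
        state["completed"].append(name)
        save_state(state)  # durable checkpoint after every successful step

if __name__ == "__main__":
    run_workflow("order-123")
```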