- Prompt caching: 10x cheaper LLM tokens, but how?: A lot of compute is wasted
when generating output token by token. LLMs work by feeding the same prompt into the model over and over again, each
time one token longer than the last, and each time doing the same amount of work plus a bit more for the next token. (A toy cost sketch follows the list.)
- How to think about durable execution: A transaction is automated as a
series of idempotent steps (sometimes the steps must run in a specific order), with state stored along the way so an
orchestrator can ensure every step completes successfully and is retried if needed. Nice programming model, this. (
Temporal is similar; a rough sketch follows the list.)
- Cooking with Claude: Nice use of an LLM to
help cook two meal-kit dinners at once.
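
A back-of-the-envelope way to see the prompt-caching win (a toy cost model and made-up helper names, not any provider's actual API): naive decoding re-processes the ever-growing prompt on every step, while a cache keeps the prompt's per-token state around so each step only pays for the one new token.

```python
# Toy cost model (hypothetical, for illustration only): one "work unit" per token processed.

def naive_decode_cost(prompt_len: int, new_tokens: int) -> int:
    """Work units when the full, ever-growing prompt is re-fed each step."""
    total = 0
    for step in range(new_tokens):
        total += prompt_len + step + 1   # re-process everything seen so far, plus the new token
    return total

def cached_decode_cost(prompt_len: int, new_tokens: int, cache: dict, prompt_key: str) -> int:
    """Work units when the prompt's per-token state is stored and reused across steps/requests."""
    total = 0
    if prompt_key not in cache:
        cache[prompt_key] = prompt_len   # pay for the prompt once, then keep its state
        total += prompt_len
    total += new_tokens                  # each new token costs one unit on top of the cached prefix
    return total

if __name__ == "__main__":
    cache: dict = {}
    print(naive_decode_cost(1000, 200))                       # 220100 work units (roughly quadratic)
    print(cached_decode_cost(1000, 200, cache, "prompt-v1"))  # 1200: prompt paid once, then linear
    print(cached_decode_cost(1000, 200, cache, "prompt-v1"))  # 200: cached prompt state reused
```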
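And a rough sketch of the durable-execution model described above (hypothetical names and a JSON file as the durable store, not Temporal's actual API): each step's result is persisted, completed steps are skipped on replay, and failed steps are retried.

```python
# Minimal durable-execution sketch: idempotent steps, persisted state, orchestrated retries.
import json
import time
from pathlib import Path
from typing import Callable

STATE_FILE = Path("workflow_state.json")  # durable store; a real system would use a database


def load_state() -> dict:
    return json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}


def save_state(state: dict) -> None:
    STATE_FILE.write_text(json.dumps(state))


def run_step(name: str, fn: Callable[[], object], state: dict, retries: int = 3) -> object:
    """Run a step at most once; on restart, completed steps return their stored result."""
    if name in state:                      # idempotence: already done, reuse the stored result
        return state[name]
    for attempt in range(retries):
        try:
            result = fn()
            state[name] = result
            save_state(state)              # persist progress before moving to the next step
            return result
        except Exception:
            time.sleep(2 ** attempt)       # back off, then retry the failed step
    raise RuntimeError(f"step {name!r} failed after {retries} attempts")


def transfer_workflow() -> None:
    """Steps run in a fixed order; a crash mid-way resumes where it left off on the next run."""
    state = load_state()
    run_step("debit_account", lambda: {"debited": 100}, state)
    run_step("credit_account", lambda: {"credited": 100}, state)
    run_step("send_receipt", lambda: {"emailed": True}, state)


if __name__ == "__main__":
    transfer_workflow()
```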