Async

How do you decide if an operation should be sync or async?

Backend requests should return in under 30 seconds. Server-side logic should wind down relatively quickly; if it can’t, the op is a candidate for async.
Might the op be interrupted? If you have to stop something and then start it again, you’ll have to write code to figure out where to resume.
CRUD should not be async. Reach into the DB, return. Usually quick. Not always true, but most of the time this is a good rule of thumb.
Switching an API from sync to async is expensive for clients. They have to rewrite for a different messaging pattern. An async op usually returns an ID that the client uses to poll for the status of the requested work.
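
A minimal sketch of that submit-then-poll pattern, assuming an in-memory job table and a thread pool. JobStore, submit, and status are hypothetical names for illustration, not part of any real service; a production version would persist jobs somewhere durable so that interrupted work can be resumed.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict


class JobStore:
    """Hypothetical in-memory tracker for async operations."""

    def __init__(self, max_workers: int = 4) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: Dict[str, Future] = {}

    def submit(self, work: Callable[[], Any]) -> str:
        """Start the work in the background and return an id right away."""
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = self._pool.submit(work)
        return job_id

    def status(self, job_id: str) -> Dict[str, Any]:
        """What a client would get back from one status poll."""
        future = self._jobs.get(job_id)
        if future is None:
            return {"state": "unknown"}
        if not future.done():
            return {"state": "pending"}
        error = future.exception()
        if error is not None:
            return {"state": "failed", "error": str(error)}
        return {"state": "done", "result": future.result()}
```

The client keeps the id from submit and calls status until the state is no longer "pending". This is the messaging pattern a sync client would have to be rewritten for: one request becomes a submit plus a polling loop.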

Distinguish between client / server errors

If a client asks for a resource that doesn’t exist, this should be a 4xx. (We don’t want client mistakes to count against our SLO.)
Insufficient resources to process op is a server error.
If you’re not sure, default to treating it as an infra (server) error. You’ll be notified and can decide later whether it should be a client error.
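
A rough sketch of these rules as code. The exception names, the choice of 404 and 503 as the specific status codes, and the counts_against_slo helper are all assumptions for illustration; the notes above only say 4xx for client errors and server error for everything else.

```python
class ResourceNotFound(Exception):
    """Hypothetical: the client asked for something that does not exist."""


class InsufficientResources(Exception):
    """Hypothetical: the server lacked the capacity to process the op."""


def classify(error: Exception) -> int:
    """Map an error to an HTTP status code."""
    if isinstance(error, ResourceNotFound):
        return 404  # client error
    if isinstance(error, InsufficientResources):
        return 503  # server error (assumed code)
    # Not sure what it is: default to a server error so someone is notified.
    return 500


def counts_against_slo(status: int) -> bool:
    """Only server-side (5xx) failures count against the SLO."""
    return status >= 500
```

With this shape, any new exception type lands in the 500 bucket until someone deliberately reclassifies it as a client error.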