• Fixing UUIDv7 (for database use-cases): Sometimes it makes sense to insert db records with primary keys that keep increasing, for quicker index inserts and better spatial locality (related rows end up clustered close together on a disk page). UUIDv4 is randomly generated, so its values are scattered like the output of a hash function. When we save data keyed like that, range queries can look like random access to disk. UUIDv7 starts with a prefix based on a timestamp. There are security concerns that the timestamp reduces entropy (a measure of how unpredictable something like an id or a password is).
  • OAuth 2.0 Threat Model and Security Considerations: A threat model of the OAuth 2.0 protocol for secure authentication / authorization between a client and server. Included in this threat model is a list of system components that are in scope, assumptions about the system setup, and a list of brainstormed potential attacks per component, along with suggested remediations and the criticality of each concern. This sort of thing takes time to produce and you get better results by having a group involved.
  • Threat model examples: This is a nice list of different models that can be used as examples.
  • OWASP Threat Modeling Cheat Sheet: Concise guide to how to think about this exercise, when to do it, and when to stop, with pointers to deeper material.
  • How to approach threat modeling: AWS article with a high-level overview of the phases of threat modelling and how to think about the exercise.
  • Simplify Your Code: Functional Core, Imperative Shell: Short bit of advice about how to structure logic in the small, which resonates a bit. Separating the decision logic of a business transaction from its side effects like this makes testing easier.
// Bad: Logic and side effects are mixed
function sendUserExpiryEmail(): void {
  for (const user of db.getUsers()) {
    if (user.subscriptionEndDate > Date.now()) continue;
    if (user.isFreeTrial) continue;
    email.send(user.email, "Your account has expired " + user.name + ".");
  }
}

// Better
function getExpiredUsers(users: User[], cutoff: Date): User[] {
    return users.filter(user => user.subscriptionEndDate <= cutoff && !user.isFreeTrial);
}

function generateExpiryEmails(users: User[]): Array<[string, string]> {
    return users.map(user =>
        ([user.email, "Your account has expired " + user.name + "."])
    );
}

// and the side effects stay at the edge (cutoff is typed as Date, so pass a Date)
email.bulkSend(generateExpiryEmails(getExpiredUsers(db.getUsers(), new Date())));
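A minimal sketch of why this split makes testing easier: the two core functions can be exercised with plain values, no database or mail client needed. The `User` interface and the sample data here are my assumptions, inferred from the fields the snippet touches, not something the article defines.

```typescript
// Hypothetical User shape, inferred from the fields the snippet uses.
interface User {
  email: string;
  name: string;
  subscriptionEndDate: Date;
  isFreeTrial: boolean;
}

// Functional core: pure functions with the same logic as the "Better" version.
function getExpiredUsers(users: User[], cutoff: Date): User[] {
  return users.filter(
    (user) =>
      user.subscriptionEndDate.getTime() <= cutoff.getTime() && !user.isFreeTrial
  );
}

function generateExpiryEmails(users: User[]): Array<[string, string]> {
  return users.map((user): [string, string] => [
    user.email,
    "Your account has expired " + user.name + ".",
  ]);
}

// Made-up sample data: one expired paid user, one active user, one expired trial.
const cutoff = new Date("2024-06-01T00:00:00Z");
const sampleUsers: User[] = [
  { email: "a@example.com", name: "Ana", subscriptionEndDate: new Date("2024-05-01T00:00:00Z"), isFreeTrial: false },
  { email: "b@example.com", name: "Ben", subscriptionEndDate: new Date("2024-07-01T00:00:00Z"), isFreeTrial: false },
  { email: "c@example.com", name: "Cy", subscriptionEndDate: new Date("2024-05-01T00:00:00Z"), isFreeTrial: true },
];

// Only the expired, non-trial user should produce an email.
const expiryEmails = generateExpiryEmails(getExpiredUsers(sampleUsers, cutoff));
console.log(expiryEmails);
```

The cutoff is passed in rather than read from the clock inside the function, so the same inputs always give the same result.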
  • How Discord Indexes Trillions of Messages: A large multitenant Elasticsearch cluster is split up into smaller ones based on how much data should be indexed in a single cluster / primary shard. There is a consideration for very popular / busy Discord communities that need special handling (those with more than 2 billion messages to index get their very own Elasticsearch cluster). Painful learnings in here but sound understanding of system and usage that helped the team muddle through this …
  • The Journey Before main(): Talks about the entry point in the kernel for starting a program, including how an application binary is loaded, interpreted, and initially laid out in memory before control is passed to main() in user space.
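Going back to the UUIDv7 link at the top of this list, here is a minimal sketch of the layout, assuming the RFC 9562 field order (48-bit millisecond timestamp, version nibble, variant bits, random fill) and Node's `node:crypto` module. The `uuidv7` helper name is mine, not from the article, and the sketch skips the counter tricks some libraries use to stay ordered within the same millisecond.

```typescript
import { randomBytes } from "node:crypto";

// Sketch of a UUIDv7 generator: timestamp prefix first, random bits after.
function uuidv7(nowMs: number = Date.now()): string {
  // Start with 16 random bytes, then overwrite the structured fields.
  const bytes = randomBytes(16);

  // Bytes 0..5: 48-bit Unix timestamp in milliseconds, big-endian.
  for (let i = 0; i < 6; i++) {
    bytes[i] = Math.floor(nowMs / Math.pow(2, 8 * (5 - i))) % 256;
  }

  // High nibble of byte 6 is the version (7).
  bytes[6] = (bytes[6] & 0x0f) | 0x70;
  // Top two bits of byte 8 are the variant (binary 10).
  bytes[8] = (bytes[8] & 0x3f) | 0x80;

  const hex = bytes.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}

// Ids made at different milliseconds compare in creation order as strings.
const earlier = uuidv7(1000);
const later = uuidv7(2000);
console.log(earlier < later);
console.log(uuidv7());
```

Because the timestamp comes first, new keys land next to each other in an index instead of being scattered. The cost is that only 74 of the 128 bits are random, which is where the entropy concern comes from.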

Goal of platform engineering

When I’m thinking about a large, fundamental change, “What is the behavioural change we’re looking for …”, “It’s not a tools thing”, “How do you empower a group of people to be better”, “How do you preserve trust in a group”, “Not taking something away”, “How to create confidence in making changes and taking risks by creating safety”

  • Platform engineering “creates a space for people to live and work in”
  • Want to preserve sense of ownership of an application / product in the product + development team

More links :)

  • uv is the best thing to happen to the Python ecosystem in a decade: I have used uv a little bit but wow doesn’t it indeed ever feel amazing.
  • Oxide’s hiring process: As we’re starting to get into a hiring frame of mind, I’m on the lookout for how other teams think about this. Some interesting ideas in here:
    • A formal candidate submission package is asked for, including past work and (for more senior candidates) past thinking about how work is done
      • How does this work in an era of AI slop?
    • There is a series of interviews (the candidate makes nine 1h slots available that interviewers book into) that don’t have to be in the same style - the interviewer should do whatever they need to do to learn about a candidate
    • And a review with the interviewer group to determine fit (includes the chance to bring up lingering questions that must be answered somehow before a final go/no go decision can be made)
    • Follow up with the candidate is delivered in a respectful, timely manner
  • Bluesky thread about centralization of power in hyperscaler public clouds (eg Amazon) by Meredith Whittaker, Signal president: Very important read. She’s speaking to the surprise among customers and other 3rd parties that a large-scale impact to AWS infrastructure could bring down Signal. Kinda silly that this has to be said but wow does she say it well. I’ll pull out a few choice quotes:

📣THREAD: It’s surprising to me that so many people were surprised to learn that Signal runs partly on AWS (something we can do because we use encryption to make sure no one but you–not AWS, not Signal, not anyone–can access your comms).

Concerning, bc it indicates that the extent of the concentration of power in the hands of a few hyperscalers is way less widely understood than I’d assumed. Which bodes poorly for our ability to craft reality-based strategies capable of contesting this concentration & solving the real problem. 2/

Running a low-latency platform for instant comms capable of carrying millions of concurrent audio/video calls requires a pre-built, planet-spanning network of compute, storage and edge presence that requires constant maintenance, significant electricity and persistent attention and monitoring. 4/

Instant messaging demands near-zero latency. Voice and video in particular require complex global signaling & regional relays to manage jitter and packet loss. These are things that AWS, Azure, and GCP provide at global scale that, practically speaking, others (in the western context) don’t. 5/

This isn’t ‘renting a server.’ It’s leasing access to a whole sprawling, capital-intensive, technically-capable system that must be just as available in Cairo as in Capetown, just as functional in Bangkok as Berlin. Particularly given the high stakes use cases of many who rely on Signal. 6/

Such infrastructure costs billions and billions of dollars to provision and maintain, and it’s highly depreciable. In the case of the hyperscalers, the staggering cost is cross-subsidized by other businesses–themselves also massive platforms with significant lockin. 7/

But even if Signal had the billions needed to recreate AWS, it’s not just about money. The talent to run these systems is rare & concentrated. The expertise, the tooling, the playbooks, the very language of modern SRE came out of these hyperscalers, and is now synonymous with ‘the cloud.’ 9/

In short, the problem here is not that Signal ‘chose’ to run on AWS. The problem is the concentration of power in the infrastructure space that means there isn’t really another choice: the entire stack, practically speaking, is owned by 3-4 players. 11/