- AWS re:Invent 2021 - DynamoDB deep dive: Advanced design patterns: Talk by Rick Houlihan about the evolution of databases from the general ledger (written by hand hundreds of years ago) to the relation and nosql systems we have today
- Note nosql is not non-relational. All data is relational
- oltp systems can be a good fit for nosql
- tradeoff in cpu time vs storage. When cpu was cheaper than disk we’d de-dupe data with 3rd normal form (Codd) by stuffing it in homogeneous buckets (1 table per entity eg order_items) and joining to recover a complete entity which would burn a ton of cpu
- cpu was cheaper than disk and getting cheaper quicker (moore’s law)
- These days storage is getting cheaper every year. Processors have stopped scaling in the same way (as of ~2014)
- SQL stores are good for olap type applications where you don’t understand the access patterns
- When designing a nosql system you want your data model to line up well with how your app will be using it (use cases, access patterns, etc)
- Reads start with indexes
- Create as many indexes as you need
- Indexes represent a novel partitional / grouping of your underlying data
- Indexes will duplicate data
- This is fine
- Operational characteristics are pretty cool
- Scale by partitioning data across storage nodes
- Sharding baked in by use of partition key (primary + secondary key for an index?)
- burst read / write credits for a table (any provisioned capacity not used in the previous 5m becomes available now for surge handling)
- ER Diagraming Crow’s Foot notation: I’ve bumped into er diagrams a few times recently. Here’s a note to remind myself how to read them