Databases Memory Tricks

ACID Properties

ACID: Atomicity Consistency Isolation Durability

ACID Properties

Four properties that guarantee reliable database transactions

Atomicity: all or nothing. Consistency: valid state to valid state. Isolation: concurrent transactions don't interfere. Durability: committed transactions survive crashes.

A

Atomicity — all or nothing

C

Consistency — valid states only

I

Isolation — concurrent transactions safe

D

Durability — survives crashes

SQL Execution Order

SQL clause order: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY

SQL Execution Order

The order clauses are executed — not the order they're written

Written: SELECT...FROM...WHERE...GROUP BY...ORDER BY. Executed: FROM first, WHERE filters rows, GROUP BY aggregates, HAVING filters groups, SELECT outputs columns, ORDER BY sorts.

Database Normalization

1NF: atomic values. 2NF: full key dependency. 3NF: no transitive dependency.

Database Normalization

Organize data to eliminate redundancy and anomalies

1NF: eliminate repeating groups, each cell one value. 2NF: non-key attributes depend on whole primary key. 3NF: non-key attributes depend only on the primary key — no A→B→C chains.

1NF

Atomic values, no repeating groups

2NF

Full functional dependency on PK

3NF

No transitive dependencies

Database Indexes

Index = fast lookup at cost of slower writes

Database Indexes

Indexes speed reads but add overhead to writes

Like a book's index — precomputed pointers to data. B-tree indexes: good for range queries. Hash indexes: good for exact matches. Add indexes on columns you frequently search, sort, or join on.

CAP Theorem

CAP theorem: Consistency, Availability, Partition tolerance — distributed systems can only guarantee 2 of 3

CAP Theorem

A fundamental trade-off in distributed database design

Consistency: all nodes see same data simultaneously. Availability: every request gets a response. Partition tolerance: system works despite network failures. In a distributed system, partition tolerance is usually required — choose C or A.

NoSQL Database Types

NoSQL types: document (MongoDB), key-value (Redis), column-family (Cassandra), graph (Neo4j)

NoSQL Database Types

Four categories of non-relational databases

Document: stores JSON-like documents, flexible schema — MongoDB. Key-value: simple lookup by key, extremely fast — Redis (caching), DynamoDB. Column-family: stores data in column groups, good for time-series — Cassandra. Graph: nodes and edges, optimized for relationship queries — Neo4j (social networks, fraud detection).

Document

JSON-like, flexible — MongoDB

Key-value

Fast lookups — Redis

Column-family

Time-series — Cassandra

Graph

Relationships — Neo4j

SQL vs NoSQL

SQL vs NoSQL: SQL = structured, ACID, relational. NoSQL = flexible schema, horizontal scaling, eventual consistency.

SQL vs NoSQL

Choosing the right database for the job

SQL: fixed schema, powerful querying (JOINs), ACID transactions, vertical scaling. Best for: financial systems, complex relationships, strong consistency. NoSQL: flexible/no schema, horizontal scaling (sharding), eventual consistency, optimized for specific access patterns. Best for: big data, high write volume, unstructured data.

Database Transactions

Transactions: BEGIN → statements → COMMIT (or ROLLBACK). All or nothing.

Database Transactions

Ensuring data integrity through atomic operations

Transaction: group of SQL statements treated as a single unit. BEGIN marks start. COMMIT saves all changes permanently. ROLLBACK undoes all changes if anything fails. Example: bank transfer — debit one account AND credit another must both succeed or both fail. Isolation levels control how concurrent transactions interact.

Stored Procedures and Triggers

Stored procedure: precompiled SQL code stored in database. Trigger: automatically executes on data changes.

Stored Procedures and Triggers

Database-side logic for performance and automation

Stored procedure: named, precompiled SQL code — called by applications. Faster than sending raw SQL (precompiled, less network traffic). Triggers: automatically execute when specified database event occurs (INSERT, UPDATE, DELETE). Use cases: audit logging, enforcing business rules, derived value calculation.

Query Optimization

Query optimization: use EXPLAIN to see execution plan. Index on WHERE and JOIN columns.

Query Optimization

How to make slow database queries fast

EXPLAIN (MySQL) or EXPLAIN ANALYZE (PostgreSQL): shows query execution plan. Look for: full table scans (bad), index seeks (good). Optimization: add indexes on columns in WHERE, JOIN, and ORDER BY clauses. Avoid SELECT * — specify columns. Avoid functions on indexed columns in WHERE (prevents index use). Batch operations.

Data Warehouses

Data warehouse vs OLTP: warehouse for analytics (read-heavy). OLTP for transactions (write-heavy).

Data Warehouses

Databases optimized for analytics rather than transactions

OLTP (Online Transaction Processing): optimized for many small, fast transactions — normalized, row-oriented storage. Data warehouse (OLAP): optimized for complex analytical queries across large datasets — denormalized (star schema), column-oriented storage. ETL: Extract, Transform, Load data from OLTP to warehouse. Examples: Snowflake, Redshift.

Database Sharding

Sharding: split database horizontally across multiple servers. Each shard holds a subset of rows.

Database Sharding

Scaling databases horizontally for massive data volumes

Vertical scaling (scale up): more powerful server — has limits. Horizontal scaling (scale out): multiple servers. Sharding: partition data across servers by shard key (user ID range, geography). Challenges: joins across shards are expensive, resharding is painful, transactions across shards are complex. Used by Facebook, Twitter, Uber.

Database tricks that make SQL stick

Memory tricks