💻 Computer Science · Databases

Database tricks that make SQL stick

SQL, normalization, ACID, and indexing — memorized.

🗄️ Databases

Memory tricks

Proven mnemonics — fast to learn, hard to forget.

SQL Execution Order
SQL clause order: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY
SQL Execution Order
The order clauses are executed — not the order they're written
Written: SELECT...FROM...WHERE...GROUP BY...ORDER BY. Executed: FROM first, WHERE filters rows, GROUP BY aggregates, HAVING filters groups, SELECT outputs columns, ORDER BY sorts.
Database Normalization
1NF: atomic values. 2NF: full key dependency. 3NF: no transitive dependency.
Database Normalization
Organize data to eliminate redundancy and anomalies
1NF: eliminate repeating groups, each cell one value. 2NF: non-key attributes depend on whole primary key. 3NF: non-key attributes depend only on the primary key — no A→B→C chains.
1NF
Atomic values, no repeating groups
2NF
Full functional dependency on PK
3NF
No transitive dependencies
Database Indexes
Index = fast lookup at cost of slower writes
Database Indexes
Indexes speed reads but add overhead to writes
Like a book's index — precomputed pointers to data. B-tree indexes: good for range queries. Hash indexes: good for exact matches. Add indexes on columns you frequently search, sort, or join on.
CAP Theorem
CAP theorem: Consistency, Availability, Partition tolerance — distributed systems can only guarantee 2 of 3
CAP Theorem
A fundamental trade-off in distributed database design
Consistency: all nodes see same data simultaneously. Availability: every request gets a response. Partition tolerance: system works despite network failures. In a distributed system, partition tolerance is usually required — choose C or A.
NoSQL Database Types
NoSQL types: document (MongoDB), key-value (Redis), column-family (Cassandra), graph (Neo4j)
NoSQL Database Types
Four categories of non-relational databases
Document: stores JSON-like documents, flexible schema — MongoDB. Key-value: simple lookup by key, extremely fast — Redis (caching), DynamoDB. Column-family: stores data in column groups, good for time-series — Cassandra. Graph: nodes and edges, optimized for relationship queries — Neo4j (social networks, fraud detection).
Document
JSON-like, flexible — MongoDB
Key-value
Fast lookups — Redis
Column-family
Time-series — Cassandra
Graph
Relationships — Neo4j
SQL vs NoSQL
SQL vs NoSQL: SQL = structured, ACID, relational. NoSQL = flexible schema, horizontal scaling, eventual consistency.
SQL vs NoSQL
Choosing the right database for the job
SQL: fixed schema, powerful querying (JOINs), ACID transactions, vertical scaling. Best for: financial systems, complex relationships, strong consistency. NoSQL: flexible/no schema, horizontal scaling (sharding), eventual consistency, optimized for specific access patterns. Best for: big data, high write volume, unstructured data.
Database Transactions
Transactions: BEGIN → statements → COMMIT (or ROLLBACK). All or nothing.
Database Transactions
Ensuring data integrity through atomic operations
Transaction: group of SQL statements treated as a single unit. BEGIN marks start. COMMIT saves all changes permanently. ROLLBACK undoes all changes if anything fails. Example: bank transfer — debit one account AND credit another must both succeed or both fail. Isolation levels control how concurrent transactions interact.
Stored Procedures and Triggers
Stored procedure: precompiled SQL code stored in database. Trigger: automatically executes on data changes.
Stored Procedures and Triggers
Database-side logic for performance and automation
Stored procedure: named, precompiled SQL code — called by applications. Faster than sending raw SQL (precompiled, less network traffic). Triggers: automatically execute when specified database event occurs (INSERT, UPDATE, DELETE). Use cases: audit logging, enforcing business rules, derived value calculation.
Query Optimization
Query optimization: use EXPLAIN to see execution plan. Index on WHERE and JOIN columns.
Query Optimization
How to make slow database queries fast
EXPLAIN (MySQL) or EXPLAIN ANALYZE (PostgreSQL): shows query execution plan. Look for: full table scans (bad), index seeks (good). Optimization: add indexes on columns in WHERE, JOIN, and ORDER BY clauses. Avoid SELECT * — specify columns. Avoid functions on indexed columns in WHERE (prevents index use). Batch operations.
Data Warehouses
Data warehouse vs OLTP: warehouse for analytics (read-heavy). OLTP for transactions (write-heavy).
Data Warehouses
Databases optimized for analytics rather than transactions
OLTP (Online Transaction Processing): optimized for many small, fast transactions — normalized, row-oriented storage. Data warehouse (OLAP): optimized for complex analytical queries across large datasets — denormalized (star schema), column-oriented storage. ETL: Extract, Transform, Load data from OLTP to warehouse. Examples: Snowflake, Redshift.
Database Sharding
Sharding: split database horizontally across multiple servers. Each shard holds a subset of rows.
Database Sharding
Scaling databases horizontally for massive data volumes
Vertical scaling (scale up): more powerful server — has limits. Horizontal scaling (scale out): multiple servers. Sharding: partition data across servers by shard key (user ID range, geography). Challenges: joins across shards are expensive, resharding is painful, transactions across shards are complex. Used by Facebook, Twitter, Uber.