awesome-database-learning

Database learning resources

A curated list of learning materials for understanding database internals

awesome, awesome-list, blogs, books, course, database, papers

Database Systems (15-445/645), CMU
Advanced Database Systems (15-721), CMU
Introduction to Database Systems, UC Berkeley
Database System Implementation, Stanford
Introduction to Database Systems, Cornell, by Prof. Trummer
Let's Build a Simple Database
Database Systems: The Complete Book, Stanford
Designing Data-Intensive Applications
Database Internals
Foundations of Databases
Readings in Database Systems, 5th Edition
Database Design and Implementation: Second Edition (Data-Centric Systems and Applications)
Principles of Distributed Database Systems, 4th ed
Inside SQLite
Architecture of a Database System
Relational Database Index Design and the Optimizers
Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control
Data Structures and Algorithms for Big Databases
A Journey From A Quick Hack To A High-Reliability Database Engine
How does a relational database work
The Internals of PostgreSQL
Books propose
what is database and its types

Awesome Database Learning / SQL & Relational Algebra

Course Introduction and the Relational Model
Advanced SQL

Awesome Database Learning / Query Optimizer

Database Systems (15-445/645), CMU

Awesome Database Learning / Query Optimizer / Database Systems (15-445/645)

Query Planning & Optimization I
Query Planning & Optimization II

Awesome Database Learning / Query Optimizer

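数据库内核杂谈 (Miscellaneous Notes on Database Internals)

Awesome Database Learning / Query Optimizer / 数据库内核杂谈 (Miscellaneous Notes on Database Internals)

数据库内核杂谈(七):数据库优化器(上) (Part 7: The Database Optimizer, Part 1)
数据库内核杂谈(八):数据库优化器(下) (Part 8: The Database Optimizer, Part 2)

Awesome Database Learning / Query Optimizer

SQL优化器原理 - 查询优化器综述 (SQL Optimizer Principles: A Survey of Query Optimizers)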

Awesome Database Learning / Query Optimizer / Planner Models

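数据库内核杂谈 (Miscellaneous Notes on Database Internals)

Awesome Database Learning / Query Optimizer / Planner Models / 数据库内核杂谈 (Miscellaneous Notes on Database Internals)

数据库内核杂谈(九):开源优化器 ORCA (Part 9: The Open-Source Optimizer ORCA)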

Awesome Database Learning / Query Optimizer / Planner Models

SQL 查询优化原理与 Volcano Optimizer 介绍 (SQL Query Optimization Principles and an Introduction to the Volcano Optimizer)
Cascades Optimizer
Access Path Selection in a Relational Database Management System (1979, SIGMOD)
Query Processing in Main Memory Database Management Systems (1986, SIGMOD)
Query Optimization by Simulated Annealing (1987, SIGMOD)
Grammar-like Functional Rules for Representing Query Optimization Alternatives (1988, SIGMOD)
The Volcano Optimizer Generator: Extensibility and Efficient Search (1993, ICDE)
The Cascades Framework for Query Optimization (1995, IEEE Data Engineering Bulletin)
An Overview of Query Optimization in Relational Systems (1998, PODS)
LEO – DB2’s LEarning Optimizer (2001, VLDB)
Robust Query Processing through Progressive Optimization (2004, SIGMOD)
Orca: A Modular Query Optimizer Architecture for Big Data (2014, SIGMOD)
Parallelizing Query Optimization on Shared-Nothing Architectures (2016, VLDB)
The MemSQL Query Optimizer: A modern optimizer for real-time analytics in a distributed database (2016, VLDB)

Awesome Database Learning / Query Optimizer / Subquery Optimization

SQL 子查询的优化 (Optimizing SQL Subqueries)
Calcite 子查询处理 - I (RemoveSubQuery) (Calcite Subquery Processing - I)
Calcite 子查询处理 - II (Decorrelate) (Calcite Subquery Processing - II)
Orthogonal Optimization of Subqueries and Aggregation (2001, SIGMOD)
Enhanced subquery optimizations in Oracle (2009, VLDB)
Unnesting Arbitrary Queries (2015, BTW)

Awesome Database Learning / Query Optimizer / Join Order Optimization

Analysis of Two Existing and One New Dynamic Programming Algorithm for the Generation of Optimal Bushy Join Trees without Cross Products (2006, VLDB)
How Good Are Query Optimizers, Really? (2015, VLDB)
Adaptive Optimization of Very Large Join Queries (2018, SIGMOD)

Awesome Database Learning / Query Optimizer / Functional Dependency & Physical Properties

Exploiting Functional Dependence in Query Optimization (2000)
Fundamental Techniques for Order Optimization (1996, SIGMOD)
An Efficient Framework for Order Optimization (2004, ICDE)
Incorporating Partitioning and Parallel Plans into the SCOPE Optimizer (2010, ICDE)

Awesome Database Learning / Query Optimizer / Cost Model

Modelling Costs for a MM-DBMS (1996, in Real-Time Databases)
Approximation Schemes for Many-Objective Query Optimization (2014, SIGMOD)
Multi-Objective Parametric Query Optimization (2015, VLDB)

Awesome Database Learning / Query Optimizer / Statistics

Accurate Estimation of the Number of Tuples Satisfying a Condition (1984, SIGMOD)
Optimal Histograms for Limiting Worst-Case Error Propagation in the Size of Join Results (1993, ACM Trans. on Database Systems)
Universality of Serial Histograms (1993, VLDB)
Balancing Histogram Optimality and Practicality for Query Result Size Estimation (1995, SIGMOD)
Improved Histograms for Selectivity Estimation of Range Predicates (1996, SIGMOD)
SEEKing the truth about ad hoc join costs (1997, VLDB)
Towards Estimation Error Guarantees for Distinct Values (2000, SIGMOD/PODS)
Distinct Sampling for Highly-Accurate Answers to Distinct Values Queries and Event Reports (2001, VLDB)
The History of Histograms (2003, VLDB)
An Improved Data Stream Summary: The Count-Min Sketch and its Applications (2005, Journal of Algorithms)
New Estimation Algorithms for Streaming Data: Count-min Can Do More (2007)
Preventing Bad Plans by Bounding the Impact of Cardinality Estimation Errors (2009, VLDB)
Histograms Reloaded: The Merits of Bucket Diversity (2010, SIGMOD)
Exploiting Ordered Dictionaries to Efficiently Construct Histograms with Q-Error Guarantees in SAP HANA (2014, SIGMOD)
Adaptive Statistics in Oracle 12c (2017, VLDB)
Pessimistic Cardinality Estimation: Tighter Upper Bounds for Intermediate Join Cardinalities (2019, SIGMOD)
Deep Unsupervised Cardinality Estimation (2019, VLDB)
NeuroCard: One Cardinality Estimator for All Tables (2020, VLDB)
Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches

Awesome Database Learning / Query Execution

Database Systems (15-445/645), CMU

Awesome Database Learning / Query Execution / Database Systems (15-445/645)

Query Execution I
Query Execution II

Awesome Database Learning / Query Execution / Execution Framework

Volcano: An Extensible and Parallel Query Evaluation System (1994, IEEE Transactions on Knowledge and Data Engineering, February)
Morsel-Driven Parallelism: A NUMA-Aware Query Evaluation Framework for the Many-Core Age (2014, SIGMOD)

Awesome Database Learning / Query Execution / Vectorization vs Compilation

Overhead of a Generalized Query Execution Engine, thanks to the Pivotal Engineering team
MonetDB/X100: Hyper-Pipelining Query Execution (2005, CIDR)
Efficiently Compiling Efficient Query Plans for Modern Hardware (2011, VLDB)
Relaxed Operator Fusion for In-Memory Databases: Making Compilation, Vectorization, and Prefetching Work Together At Last (2017, VLDB)
Everything You Always Wanted to Know About Compiled and Vectorized Queries But Were Afraid to Ask (2018, VLDB)
Adaptive Execution of Compiled Queries (2018, ICDE)

Awesome Database Learning / Query Execution / Join

Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited (2013, VLDB)
Looking Ahead Makes Query Plans Robust (2017, VLDB)

Awesome Database Learning / Query Execution / Hash Table

Database Systems (15-445/645), CMU

Awesome Database Learning / Query Execution / Hash Table / Database Systems (15-445/645)

Hash Tables

Awesome Database Learning / Query Execution / Hash Table

Fibonacci Hashing: The Optimization that the World Forgot (or: a Better Alternative to Integer Modulo)
All hash table sizes you will ever need

Awesome Database Learning / Query Execution / Bloom Filter

SuRF: Practical Range Query Filtering with Fast Succinct Tries (2018, SIGMOD)

Awesome Database Learning / DDL

Online, Asynchronous Schema Change in F1 (2013, VLDB)

Awesome Database Learning / Relational Model

What is a Relational Database?
What is a Relational Database?

Awesome Database Learning / Relational Model / Codd's Rules

Codd’s Rules for Relational Database Systems

Awesome Database Learning / Relational Model / Relational Data Model

Relational model

Awesome Database Learning / Relational Model / Relational Algebra

Introduction of Relational Algebra in DBMS

Awesome Database Learning / Relational Model / ER to Relational Model

ER Model to Relational Model

Awesome Database Learning / Relational Model / SQL - Overview

An Overview of SQL Text Functions

Awesome Database Learning / Transaction / Isolation Levels

一致性模型 (Consistency Models)
A Critique of ANSI SQL Isolation Levels (1995, SIGMOD)
Generalized Isolation Level Definitions (2000, Proceedings of 16th International Conference on Data Engineering)

Awesome Database Learning / Transaction / Concurrency Control

Concurrency Control Theory
Two-Phase Locking Concurrency Control
Timestamp Ordering Concurrency Control
Multi-Version Concurrency Control
Multi-Version Concurrency Control (Design Decisions)
Multi-Version Concurrency Control (Protocols)
Multi-Version Concurrency Control (Garbage Collection)
The Notions of Consistency and Predicate Locks in a Database System (1976, Communications of the ACM)
Concurrency Control in Distributed Database Systems (1981, ACM Computing Surveys)
On Optimistic Methods for Concurrency Control (1981, ACM Transactions on Database Systems)
Multiversion Concurrency Control - Theory and Algorithms (1983, ACM Transactions on Database Systems)
Serializable Snapshot Isolation in PostgreSQL (2012, VLDB)
Calvin: Fast Distributed Transactions for Partitioned Database Systems (2012, SIGMOD)
MaaT: effective and scalable coordination of distributed transactions in the cloud (2014, VLDB)
Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores (2014, VLDB)
An Evaluation of the Advantages and Disadvantages of Deterministic Database Systems (2014, VLDB)
Fast Serializable Multi-Version Concurrency Control for Main-Memory Database Systems (2015, SIGMOD)
An Empirical Evaluation of In-Memory Multi-Version Concurrency Control (2017, VLDB)
An Evaluation of Distributed Concurrency Control (2017, VLDB)
Scalable Garbage Collection for In-Memory MVCC Systems (2019, VLDB)

Awesome Database Learning / Network

Advanced Database Systems (15-721), CMU

Awesome Database Learning / Network / Advanced Database Systems (15-721)

Networking Protocols

Awesome Database Learning / Network

The End of Slow Networks: It's Time for a Redesign (2016, VLDB)
Accelerating Relational Databases by Leveraging Remote Memory and RDMA (2016, SIGMOD)
Don't Hold My Data Hostage: A Case for Client Protocol Redesign (2017, VLDB)

Awesome Database Learning / Storage / NoSQL Systems

Bigtable: A Distributed Storage System for Structured Data (2006, OSDI)
Dynamo: Amazon’s Highly Available Key-value Store (2007, SOSP)
PNUTS: Yahoo!’s Hosted Data Serving Platform (2008, VLDB)
Cassandra - A Decentralized Structured Storage System (2010, SOSP)
PNUTS to Sherpa: Lessons from Yahoo!’s Cloud Database (2019, VLDB)

Awesome Database Learning / Storage / Buffer Management

Database Systems (15-445/645), CMU

Awesome Database Learning / Storage / Buffer Management / Database Systems (15-445/645)

Buffer Pools

Awesome Database Learning / Storage / Buffer Management

The 5 Minute Rule for Trading Memory for Disc Accesses and the 5 Byte Rule for Trading Memory for CPU Time (1987, SIGMOD)
The Five Minute Rule 20 Years Later and How Flash Memory Changes the Rules (2008, ACM Queue)
Managing Non-Volatile Memory in Database Systems (2018, SIGMOD)
LeanStore: In-Memory Data Management Beyond Main Memory (2018, ICDE)
Umbra: A Disk-Based System with In-Memory Performance (2020, CIDR)

Awesome Database Learning / Storage / Disk IO

On Disk IO, Part 1: Flavors of IO
On Disk IO, Part 2: More Flavours of IO
On Disk IO, Part 3: LSM Trees
On Disk IO, Part 4: B-Trees and RUM Conjecture
On Disk IO, Part 5: Access Patterns in LSM Trees
Ensuring data reaches disk (LWN)
Read, write & space amplification - pick 2
Design Tradeoffs of Data Access Methods (2016, SIGMOD)
Designing Access Methods: The RUM Conjecture (2016, EDBT)

Awesome Database Learning / Storage / B-Tree

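B树、B+树索引算法原理(上) (Principles of B-Tree and B+Tree Index Algorithms, Part 1)
B树、B+树索引算法原理(下) (Principles of B-Tree and B+Tree Index Algorithms, Part 2)
Tree Indexes I
Tree Indexes II
OLTP Indexes (B+Tree Data Structures)
The Ubiquitous B-Tree (1979)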

Awesome Database Learning / Storage / LSM-Tree

The Log-Structured Merge-Tree (LSM-Tree) (1996)
A Comparison of Fractal Trees to Log-Structured Merge (LSM) Trees (2014)
WiscKey: Separating Keys from Values in SSD-conscious Storage (2017, TOS)
LSM-based Storage Techniques: A Survey (2019)

Awesome Database Learning / Storage / Learned Indexes Structures

The Case for Learned Index Structures (2018)
Learning Multi-dimensional Indexes (2019)
XIndex: A Scalable Learned Index for Multicore Data Storage (2020)
RadixSpline: A Single-Pass Learned Index (2020, aiDM@SIGMOD)
The PGM-index: a fully-dynamic compressed learned index with provable worst-case bounds (2020, VLDB)
From WiscKey to Bourbon: A Learned Index for Log-Structured Merge Trees (2020)

Awesome Database Learning / Serializing & RPC

Protocol Buffers Developer Guide
gRPC Documentation

Awesome Database Learning / Data Partitioning

TiDB Internal (I) - Data Storage
Partitioning Behavior of DynamoDB
Dynamo: Amazon’s Highly Available Key-value Store (2007, SOSP)

Awesome Database Learning / Replication & Consistency

Tick or Tock? Keeping Time and Order in Distributed Databases
Consistency Tradeoffs in Modern Distributed Database System Design (2012)
Strong and Efficient Consistency with Consistency-Aware Durability (2020, FAST)

Awesome Database Learning / Consensus

Distributed consensus revised, University of Cambridge: a great paper about consensus, especially Paxos and Paxos-related algorithms, by Heidi Howard
Ark: A Real-World Consensus Implementation (2014, CoRR)

Awesome Database Learning / Scheduling

Building a Large-scale Distributed Storage System Based on Raft, by Ed Huang
Automated Demand-driven Resource Scaling in Relational Database-as-a-Service (2016, SIGMOD)
Autoscaling Tiered Cloud Storage in Anna (2019, VLDB)
Adaptive HTAP through Elastic Resource Scheduling (2020, SIGMOD)
MorphoSys: Automatic Physical Design Metamorphosis for Distributed Database Systems (2020, VLDB)

Awesome Database Learning / Benchmark & Testing

Use go-ycsb to benchmark different databases (1)
Chaos Tools and Techniques for Testing the TiDB Distributed NewSQL Database
Creating Custom Sysbench Scripts
Benchmarking Cloud Serving Systems with YCSB (2010, SOCC)

Awesome Database Learning / HTAP

TiDB: A Raft-based HTAP Database (2020, VLDB)
F1 Lightning: HTAP as a Service (2020, VLDB)

Awesome Database Learning / TLA+

The TLA+ Video Course
