awesome-system-design

System design resource

A curated collection of resources and articles on designing distributed systems, microservices, and related topics for software development.

A curated list of awesome System Design (A.K.A. Distributed Systems) resources.


distributed-systems, hadoop-ecosystem, interview, message-broker, microservices, microservices-architecture, nosql, relational-database, stream-processing

Videos / Advanced

The Evolution of Reddit Architecture Overview of how Reddit's system design scaled
6.824 Distributed Systems by MIT Graduate-level course on distributed systems from MIT (2020)
CSE138 Distributed Systems by UCSC Undergraduate course on distributed systems from UCSC (2020)

Tools / Resource Management

Kubernetes Highly popular way to deploy, manage and automatically scale a cluster of containers on bare-metal or virtual servers

Tools / Hadoop Ecosystem / Dashboard

Ambari Dashboard that integrates most Hadoop-related technologies for easy management and execution

Tools / Hadoop Ecosystem / Workflow Scheduler

Oozie Create workflows in XML to execute jobs (from other Hadoop ecosystem applications) in steps; also supports parallel execution
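
To make "in steps, with parallel execution" concrete, here is a minimal conceptual sketch of a fork/join workflow runner. It does not use Oozie or its XML format; `run_workflow` and the step names (`ingest`, `hive_report`, `pig_cleanup`, `notify`) are hypothetical and exist only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_workflow(steps):
    """Run steps in order; a list-valued step is a fork whose actions run in parallel."""
    log = []
    with ThreadPoolExecutor() as pool:
        for step in steps:
            if isinstance(step, list):
                # Fork: submit every action; join: wait until all have finished.
                # pool.map returns results in the order the actions were given.
                log.append(list(pool.map(lambda action: action(), step)))
            else:
                # Plain sequential step.
                log.append(step())
    return log


# Hypothetical jobs standing in for Hadoop ecosystem applications.
def ingest():
    return "ingest"


def hive_report():
    return "hive-report"


def pig_cleanup():
    return "pig-cleanup"


def notify():
    return "notify"


result = run_workflow([ingest, [hive_report, pig_cleanup], notify])
print(result)
```

The two middle jobs start only after `ingest` finishes, and `notify` starts only after both of them complete, which is the ordering guarantee a workflow scheduler provides.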

Tools / Hadoop Ecosystem / Query

Hive Query data stored in Hadoop using SQL
Pig Scripting language that looks like SQL to query Hadoop data

Tools / Hadoop Ecosystem / Processing

Tez Solves a similar problem to Spark and MapReduce; it is more efficient than MapReduce because it models a job as a directed acyclic graph (DAG) of tasks and plans execution across the whole graph instead of running separate map and reduce passes
Map Reduce MapReduce, as the name implies, maps data and reduces the results
Spark Powerful data processing engine that not only processes data like Tez (and MapReduce), but can also process streams of data in real time, apply ML algorithms such as regression analysis, and much more
Apex Retired project; a YARN-native platform that unifies stream and batch processing
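
As an illustration of the map-then-reduce idea behind the entries above, here is a minimal single-process word count. It is only a sketch of the programming model, not Hadoop's API; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are assumptions made for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Map every document, then group by word, then reduce each group.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["the quick fox", "the lazy dog", "The end"])
print(counts)
```

In a real cluster the map calls run on many machines near the data, and the shuffle moves each key's values to the machine that reduces it; the sketch keeps everything in one process to show the data flow only.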

Tools / Hadoop Ecosystem / DB

HBase [3.6k ⭐] - Modeled after Google's Bigtable and written in Java. Developed as part of the Apache Hadoop project

Tools / Hadoop Ecosystem / Resource Management

YARN 'Yet Another Resource Negotiator', works like a kernel to manage compute resources across the cluster
Mesos Works like a Linux kernel by managing CPU, memory, storage and other resources across the cluster
