awesome-system-design
System design resource
A curated collection of resources and articles on designing distributed systems, microservices, and related topics for software development.
A curated list of awesome System Design (A.K.A. Distributed Systems) resources.
distributed-systems, hadoop-ecosystem, interview, message-broker, microservices, microservices-architecture, nosql, relational-database, stream-processing
Videos / Advanced | |||
| The evolution of Reddit Architecture | Overview of how Reddit's system design scaled | ||
| 6.824 Distributed Systems by MIT | Graduate level course on distributed systems from MIT (2020) | ||
| CSE138 Distributed Systems by UCSC | Undergraduate course on distributed systems from UCSC (2020) | ||
Tools / Resource Management | |||
| Kubernetes | Highly popular way to deploy, manage and automatically scale a cluster of containers on bare-metal or virtual servers | ||
Tools / Hadoop Ecosystem / Dashboard | |||
| Ambari | Dashboard that integrates most Hadoop-related technologies for easy management and execution | ||
Tools / Hadoop Ecosystem / Workflow Scheduler | |||
| Oozie | Define workflows in XML to run jobs (from other Hadoop-ecosystem applications) in steps; also supports parallel execution | ||
Tools / Hadoop Ecosystem / Query | |||
| Hive | Query data stored in Hadoop using SQL | ||
| Pig | Scripting language that looks like SQL, used to query Hadoop data | ||
Tools / Hadoop Ecosystem / Processing | |||
| Tez | Solves a similar problem to Spark and MapReduce; it is more efficient than MapReduce because it models a job as a directed acyclic graph (DAG) of tasks and plans the execution as a whole | ||
| MapReduce | As the name implies, maps input data to key-value pairs and reduces the grouped results | ||
| Spark | Powerful data processing engine: besides batch processing like Tez (and MapReduce), it can process streams of data in near real time, run ML algorithms such as regression analysis, and much more | ||
| Apex | *Retired project; a YARN-native platform that unifies stream and batch processing | ||
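
As a rough illustration of the map and reduce steps named in the MapReduce entry above, here is a minimal single-process sketch in plain Python. It does not use Hadoop or any of the listed tools; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and exist only for this example. A real cluster would spread each phase across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the list of values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence (illustrative only)."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["the quick fox", "the lazy dog"]))
```

The shuffle step is what lets each reducer work independently: every value for a given key ends up in one group, so groups can be reduced in parallel.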
Tools / Hadoop Ecosystem / DB | |||
| HBase | Modeled after Google's Bigtable and written in Java. Developed as part of the Apache Hadoop project | ||
Tools / Hadoop Ecosystem / Resource Management | |||
| YARN | 'Yet Another Resource Negotiator'; works like a kernel to manage compute resources across the cluster | ||
| Mesos | Works like a Linux kernel, managing CPU, memory, storage and other resources across the cluster | ||