SparkInternals

Spark Analysis

An in-depth analysis of Apache Spark's design and implementation

Notes on the design and implementation of Apache Spark

GitHub

5k stars
618 watching
2k forks
last commit: 9 months ago
Linked from 1 awesome list



Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| spark-notebook/spark-notebook | An interactive web-based editor for exploring and analyzing large datasets using Scala, Apache Spark, and other data science tools | 3,155 |
| spark-jobserver/spark-jobserver | Provides a RESTful interface for managing Apache Spark jobs and contexts | 2,839 |
| dotnet/spark | Provides high-performance APIs for using Apache Spark with .NET | 2,032 |
| kubeflow/spark-operator | An operator that automates the lifecycle of Apache Spark applications on Kubernetes | 2,816 |
| johnsnowlabs/spark-nlp | Provides a set of pre-trained models and libraries for natural language processing tasks on top of Apache Spark | 3,889 |
| databricks/koalas | A Python package that allows users to work with pandas DataFrames on top of Apache Spark | 3,343 |
| apache/spark | An analytics engine designed to handle large-scale data processing and analysis | 40,170 |
| ankurchavda/sparklearning | A comprehensive resource for learning Apache Spark, covering its core concepts, components, and advanced topics | 655 |
| perwendel/spark | A lightweight Java web framework with a Kotlin DSL for building simple web applications | 9,655 |
| microsoft/mobius | Provides a C# API for interacting with Apache Spark | 941 |
| databricks/learning-spark | Examples and tutorials for learning Spark using Java and Scala | 3,892 |
| kotlin/kotlin-spark-api | Provides compatibility and extensions between Kotlin and Apache Spark for big data processing | 463 |
| mrpowers-io/spark-fast-tests | A testing helper library for Apache Spark applications | 437 |
| sw1sh/frege-spark | An effort to integrate Apache Spark with the Frege programming language | 5 |
| nchammas/flintrock | A command-line tool for launching and managing Apache Spark clusters on AWS | 637 |