SparkLearning
Spark guide
A comprehensive resource for learning Apache Spark, covering its core concepts, components, and advanced topics.
A comprehensive Spark guide, collated from multiple sources, that can be used to learn more about Spark or as an interview refresher.
649 stars
19 watching
73 forks
last commit: over 2 years ago
Linked from 1 awesome list
big-data, pyspark, spark
Related projects:
Repository | Description | Stars |
---|---|---|
sparklyr/sparklyr | An R interface to Apache Spark for distributed data analysis and machine learning | 957 |
apache/spark | An analytics engine designed to handle large-scale data processing and analysis | 40,002 |
dotnet/spark | Provides high-performance APIs for using Apache Spark with .NET | 2,026 |
tweag/sparkle | A tool for creating resilient, scalable analytics applications with Haskell on top of Apache Spark | 447 |
tubular/sparkly | A set of Python libraries and tools to simplify interactions with various data sources using Apache Spark | 60 |
amplab-extras/sparkr-pkg | Provides a lightweight R interface to Apache Spark for data processing | 641 |
kotlin/kotlin-spark-api | Provides compatibility and extensions between Kotlin and Apache Spark for big data processing | 463 |
gorillalabs/sparkling | A Clojure API for interacting with Apache Spark | 448 |
lensacom/sparkit-learn | A Python library that integrates PySpark and scikit-learn for distributed machine learning | 1,155 |
sw1sh/frege-spark | An effort to integrate Apache Spark with the Frege programming language | 5 |
tkych/cl-spark | A utility for generating simple, visually appealing data visualizations from numeric data sets | 96 |
dmmiller612/sparktorch | A PyTorch implementation on Apache Spark for distributed deep learning model training and inference | 339 |
nchammas/flintrock | A command-line tool for launching and managing Apache Spark clusters on AWS | 638 |
ondra-m/ruby-spark | A Ruby wrapper around Apache Spark's functionality for large-scale data processing | 227 |
sorenmacbeth/flambo | A Clojure-based interface to Apache Spark, enabling efficient data processing and manipulation in cluster computing environments | 606 |