joblib-spark

Task parallelizer

Provides an Apache Spark backend for the joblib library, so that joblib-parallelized machine learning tasks can run across a distributed Spark cluster.

Joblib Apache Spark Backend
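
Example usage (a minimal sketch, not taken from this listing): the snippet below assumes the package is importable as `joblibspark`, exposes `register_spark()`, and registers a joblib backend named "spark". The helper names `square`, `pick_backend` and `run_squares` are hypothetical and exist only for illustration. When the package is not installed, the sketch falls back to joblib's built-in "threading" backend so it still runs locally.

```python
from joblib import Parallel, delayed, parallel_backend


def square(x):
    """Toy task standing in for a real unit of ML work (hypothetical)."""
    return x * x


def pick_backend():
    """Return "spark" if joblib-spark can be imported, else "threading".

    Assumption: the package exposes register_spark(), which registers a
    joblib backend under the name "spark".
    """
    try:
        from joblibspark import register_spark
        register_spark()
        return "spark"
    except ImportError:
        # joblib-spark (or pyspark) is not available; use a local backend.
        return "threading"


def run_squares(values, backend="threading", n_jobs=2):
    """Apply square() to each value on the named joblib backend."""
    with parallel_backend(backend, n_jobs=n_jobs):
        # Parallel() picks up the backend and n_jobs from the context manager.
        return Parallel()(delayed(square)(v) for v in values)


if __name__ == "__main__":
    # With the "spark" backend, this needs a working Spark installation.
    print(run_squares(range(5), backend=pick_backend()))
```

Because the backend is selected by name, code that already uses joblib (for example scikit-learn estimators that accept `n_jobs`) can in principle be switched to Spark by wrapping the call in the same `parallel_backend` context.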

243 stars

9 watching

26 forks

Language: Python

last commit: almost 2 years ago

Linked from 1 awesome list

Backlinks from these awesome lists:

awesome-spark/awesome-spark

Related projects:

Repository	Description	Stars
shomali11/parallelizer	Simplifies creating multiple worker threads to execute tasks in parallel	72
dmmiller612/sparktorch	A PyTorch implementation on Apache Spark for distributed deep learning model training and inference	339
clin99/cpp-taskflow	A library providing a simple and expressive way to write parallel programs with complex task dependencies	8
kcrandall/emr_spark_automation	Automates deployment of an AWS EMR cluster and execution of Spark jobs	8
amplab/sparknet	Distributed neural network framework for Apache Spark	604
apache/spark	An analytics engine designed to handle large-scale data processing and analysis	40,170
tugdualsarazin/spark-clustering	Implementations of clustering algorithms using Spark in Scala	18
lensacom/sparkit-learn	A Python library that integrates PySpark and scikit-learn for distributed machine learning	1,154
instaclustr/sample-sparkjobservercassandra	Demonstrates using Spark Jobserver to run Apache Spark analytics with Cassandra	2
yaooqinn/itachi	A library that brings useful functions from various modern database management systems to Apache Spark	56
stevenjl/parex	An Elixir module that executes multiple processes in parallel to speed up slow computations	63
janeliascicomp/nextflow-spark	Provides a reusable set of Nextflow subworkflows and processes for creating transient Apache Spark clusters on any infrastructure	14
svenkreiss/pysparkling	A lightweight Python implementation of Spark's RDD and DStream interfaces for improved performance on small datasets	262
microsoft/mobius	Provides a C# API for interacting with Apache Spark	941
kotlin/kotlin-spark-api	Provides compatibility and extensions between Kotlin and Apache Spark for big data processing	463