joblib-spark
Task parallelizer
Enables parallelization of machine learning tasks on a distributed Spark cluster using the joblib library.
Joblib Apache Spark Backend
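
As a rough illustration of the description above, the sketch below runs a joblib workload through the Spark backend when it is available. It assumes the `joblibspark` package exposes `register_spark()` and registers a backend named `"spark"`, as the project's README describes. The helpers `pick_backend` and `parallel_sqrt` are hypothetical names introduced only for this example, and the fallback to joblib's built-in `"threading"` backend is a convenience added here so the snippet also runs on a machine without Spark.

```python
from math import sqrt

from joblib import Parallel, delayed, parallel_backend


def pick_backend():
    """Return "spark" if joblibspark can be imported, else "threading".

    Hypothetical helper for this example only; the fallback is not
    part of joblib-spark itself.
    """
    try:
        from joblibspark import register_spark

        register_spark()  # registers the "spark" backend with joblib
        return "spark"
    except ImportError:
        # joblibspark (or pyspark) is not installed: stay local.
        return "threading"


def parallel_sqrt(values, n_jobs=2):
    """Compute square roots in parallel on the selected backend."""
    backend = pick_backend()
    with parallel_backend(backend, n_jobs=n_jobs):
        # Parallel() picks up the backend and n_jobs from the context.
        return Parallel()(delayed(sqrt)(v) for v in values)


if __name__ == "__main__":
    print(parallel_sqrt([0, 1, 4, 9]))
```

The same `with parallel_backend(...)` context can wrap scikit-learn calls that use joblib internally, which is the kind of machine learning task the description refers to.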
243 stars
9 watching
26 forks
Language: Python
last commit: 7 months ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Simplifies creating multiple worker threads to execute tasks in parallel | 72 |
| | A PyTorch implementation on Apache Spark for distributed deep learning model training and inference | 339 |
| | A library providing a simple and expressive way to write parallel programs with complex task dependencies | 8 |
| | Automates deployment of an AWS EMR cluster and execution of Spark jobs | 8 |
| | Distributed neural network framework for Apache Spark | 604 |
| | An analytics engine designed to handle large-scale data processing and analysis | 40,170 |
| | Implementations of clustering algorithms using Spark in Scala | 18 |
| | A Python library that integrates PySpark and scikit-learn for distributed machine learning | 1,154 |
| | Demonstrates using Spark Jobserver to run Apache Spark analytics with Cassandra | 2 |
| | A library that brings useful functions from various modern database management systems to Apache Spark | 56 |
| | An Elixir module that executes multiple processes in parallel to speed up slow computations | 63 |
| | Provides a reusable set of Nextflow subworkflows and processes for creating transient Apache Spark clusters on any infrastructure | 14 |
| | A lightweight Python implementation of Spark's RDD and DStream interfaces for improved performance on small datasets | 262 |
| | Provides a C# API for interacting with Apache Spark | 941 |
| | Provides compatibility and extensions between Kotlin and Apache Spark for big data processing | 463 |