spark-clustering
Clustering algorithms library
Implementations of clustering algorithms using Spark in Scala
Some Spark implementations of clustering algorithms.
18 stars
6 watching
8 forks
Language: Scala
Last commit: about 6 years ago

Related projects:
Repository | Description | Stars |
---|---|---|
spark-clustering-notebook/g-stream | An implementation of data stream clustering algorithms using Spark Streaming. | 3 |
irvingc/dbscan-on-spark | An implementation of the DBSCAN clustering algorithm on top of Apache Spark | 184 |
databricks/spark-xml | A library that parses and queries XML data in Apache Spark | 504 |
databricks/tensorframes | Enables manipulation of Apache Spark DataFrames using TensorFlow programs | 749 |
kotlin/kotlin-spark-api | Provides compatibility and extensions between Kotlin and Apache Spark for big data processing | 463 |
e-xpertsolutions/go-cluster | Implementation of k-modes and k-prototypes clustering algorithms in Go. | 43 |
dutrevis/spark-resources-metrics-plugin | A Spark plugin that registers metrics from operating system resources | 0 |
joblib/joblib-spark | Enables parallelization of machine learning tasks on a distributed Spark cluster using the joblib library. | 243 |
emilbayes/clustering.js | A collection of clustering algorithms implemented in JavaScript | 30 |
sw1sh/frege-spark | An effort to integrate Apache Spark with the Frege programming language | 5 |
xuyxu/clustering | Implementations of clustering and subspace clustering algorithms in MATLAB, including K-means, ISODATA, Mean Shift, DBSCAN, Gaussian Mixture Model, LVQ, and subspace methods such as Subspace K-means and Entropy-Weighting Subspace K-means. | 227 |
iralabdisco/pso-clustering | An implementation of a clustering algorithm based on Particle Swarm Optimization (PSO). | 68 |
youweiliang/multi-view_clustering | Provides implementations of various multi-view spectral clustering algorithms for data analysis and visualization. | 87 |
databricks/spark-corenlp | Wraps Stanford CoreNLP annotators as Spark DataFrame functions for natural language processing tasks | 422 |
twosigma/flint | A highly optimized time series library for Apache Spark | 1,006 |