spark-clustering
Clustering algorithms library
Implementations of clustering algorithms using Spark in Scala
18 stars
6 watching
8 forks
Language: Scala
Last commit: about 6 years ago

Related projects:
Repository | Description | Stars |
---|---|---|
spark-clustering-notebook/g-stream | An implementation of data stream clustering algorithms using Spark Streaming. | 3 |
irvingc/dbscan-on-spark | An implementation of the DBSCAN clustering algorithm on top of Apache Spark | 184 |
databricks/spark-xml | A library that parses and queries XML data in Apache Spark | 505 |
databricks/tensorframes | Enables manipulation of Apache Spark DataFrames using TensorFlow programs | 749 |
kotlin/kotlin-spark-api | Provides compatibility and extensions between Kotlin and Apache Spark for big data processing | 461 |
e-xpertsolutions/go-cluster | Implementation of k-modes and k-prototypes clustering algorithms in Go. | 43 |
dutrevis/spark-resources-metrics-plugin | A Spark plugin that registers metrics from operating system resources | 0 |
joblib/joblib-spark | Enables parallelization of machine learning tasks on a distributed Spark cluster using the joblib library. | 242 |
emilbayes/clustering.js | Provides implementations of clustering algorithms in JavaScript | 30 |
sw1sh/frege-spark | An effort to integrate Apache Spark with the Frege programming language | 5 |
xuyxu/clustering | Implementations of clustering and subspace clustering algorithms in MATLAB, including K-means, ISODATA, Mean Shift, DBSCAN, Gaussian Mixture Model, and LVQ, plus subspace methods such as Subspace K-means and Entropy-Weighting Subspace K-means. | 224 |
iralabdisco/pso-clustering | An algorithm for unsupervised machine learning tasks involving grouping similar data points into clusters. | 68 |
youweiliang/multi-view_clustering | Provides implementations of various multi-view spectral clustering algorithms for data analysis and visualization. | 85 |
databricks/spark-corenlp | Wraps Stanford CoreNLP annotators as Spark DataFrame functions for natural language processing tasks | 422 |
twosigma/flint | A highly optimized time series library for Apache Spark | 1,003 |