G-stream

Data streaming clustering algorithm

An implementation of data stream clustering algorithms using Spark Streaming.

GitHub

3 stars
3 watching
0 forks
Language: Scala
Last commit: about 8 years ago

Related projects:

Repository | Description | Stars
tugdualsarazin/spark-clustering | Implementations of clustering algorithms using Spark in Scala | 18
irvingc/dbscan-on-spark | An implementation of the DBSCAN clustering algorithm on top of Apache Spark | 184
huawei-noah/streamdm | A library that enables efficient big data stream mining using Spark Streaming | 492
apache/spark | An analytics engine designed to handle large-scale data processing and analysis | 39,916
databricks/spark-xml | A library that parses and queries XML data in Apache Spark | 505
instaclustr/sample-kafkasparkcassandra | An introductory Scala app using Apache Spark Streaming to process data from Kafka and write summaries to Cassandra | 23
alexgkendall/optics_clustering | A MATLAB implementation of an unsupervised clustering algorithm that groups data points based on their density and reachability distances | 58
iralabdisco/pso-clustering | An algorithm for unsupervised machine learning tasks involving grouping similar data points into clusters | 68
youweiliang/multi-view_clustering | Implementations of various multi-view spectral clustering algorithms for data analysis and visualization | 85
percyliang/brown-cluster | A C++ implementation of the Brown word clustering algorithm for hierarchical word grouping | 424
xuyxu/clustering | MATLAB implementations of clustering and subspace clustering algorithms, including K-means, ISODATA, Mean Shift, DBSCAN, Gaussian Mixture Model, LVQ, Subspace K-means, and Entropy-Weighting Subspace K-means | 224
khadidjam/dc-dpm | A distributed clustering algorithm based on the Dirichlet Process Mixture Model using Apache Spark | 4
e-xpertsolutions/go-cluster | An implementation of the k-modes and k-prototypes clustering algorithms in Go | 43
databricks/spark-csv | A library for parsing and querying CSV data with Apache Spark | 1,053
mosaicml/streaming | A library for efficient data streaming and training of neural networks on large datasets | 1,141