datasketches-java
Data Processing Library
A software library of stochastic streaming algorithms, a.k.a. sketches, providing efficient data processing and analysis tools.
896 stars
58 watching
209 forks
Language: Java
Last commit: 19 days ago
Linked from 1 awesome list
datasketches
Related projects:
Repository | Description | Stars |
---|---|---|
apache/streampipes | A toolbox for industrial data analytics and stream processing | 607 |
apache/systemds | An end-to-end data science platform that combines data integration, machine learning model training, and deployment | 1,036 |
apache/spark | An analytics engine designed to handle large-scale data processing and analysis | 40,002 |
netflix/staash | A tool to abstract storage details and automate common data access patterns for developers working with relational technologies | 209 |
svenkreiss/pysparkling | A lightweight Python implementation of Spark's RDD and DStream interfaces for improved performance on small datasets | 262 |
apache/samza | A distributed stream processing framework for handling high-volume data streams with fault tolerance and durability guarantees | 819 |
yoshuawuyts/normcore | A JavaScript library that enables the creation of stable, decentralized data streams using hypercore | 28 |
skyhacks/nerds | An API that provides random data from various nerdy franchises | 109 |
datastax/spark-cassandra-connector | A library that enables integration between Apache Spark and Apache Cassandra for fast data processing and analysis | 1,942 |
deepak-malik/data-structures-in-java | A collection of Java implementations of various data structures and algorithms used in computer science | 145 |
evilsoft/crocks | A collection of well-known Algebraic Data Types and their associated helper functions for functional programming in JavaScript | 1,592 |
jason-kerney/peelandslice.java | A Java implementation of a self-contained, serverless, and zero-configuration data processing framework | 1 |
apache/tez | A system that enables flexible data processing pipelines using a low-level engine for higher-level frameworks | 480 |
apache/druid | A high-performance real-time analytics database for fast queries and ingest | 13,523 |
joshsh/ripple | A programming language and runtime environment for creating data-driven programs with a focus on Linked Data and RDF data sources | 101 |