spark
Data processor
An analytics engine designed to handle large-scale data processing and analysis
Apache Spark - A unified analytics engine for large-scale data processing
40k stars
2k watching
28k forks
Language: Scala
Last commit: 2 months ago
Linked from 9 awesome lists
big-data, java, jdbc, python, r, scala, spark, sql
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A library that enables integration between Apache Spark and Apache Cassandra for fast data processing and analysis | 1,944 |
| | A data processing library built on top of Apache Spark to handle temporal web data | 11 |
| | Provides high-performance APIs for using Apache Spark with .NET | 2,032 |
| | A lightweight Python implementation of Spark's RDD and DStream interfaces for improved performance on small datasets | 262 |
| | A research-focused implementation of Apache Spark with homomorphic encryption support | 3 |
| | A library that parses and queries XML data in Apache Spark | 504 |
| | An introductory Scala app using Apache Spark Streaming to process data from Kafka and write summaries to Cassandra | 23 |
| | A tool for creating resilient, scalable analytics applications with Haskell on top of Apache Spark | 447 |
| | A distributed stream processing framework for handling high-volume data streams with fault tolerance and durability guarantees | 817 |
| | Provides compatibility and extensions between Kotlin and Apache Spark for big data processing | 463 |
| | A library for parsing and querying CSV data with Apache Spark | 1,052 |
| | An implementation of the DBSCAN clustering algorithm on top of Apache Spark | 184 |
| | Provides a C# API for interacting with Apache Spark | 941 |
| | A framework for efficient data processing and extraction from archival collections, enabling the transformation of raw data into more accessible formats | 145 |
| | An R interface to Apache Spark for distributed data analysis and machine learning | 955 |