sample-KafkaSparkCassandra
Spark Data Processor
An introductory Scala app using Apache Spark Streaming to process data from Kafka and write summaries to Cassandra.
23 stars
38 watching
23 forks
Language: Scala
last commit: almost 6 years ago
Linked from 1 awesome list
netapp-public
Related projects:
| Repository | Description | Stars |
|---|---|---|
| instaclustr/sample-sparkjobservercassandra | Demonstrates using Spark Jobserver to run Apache Spark analytics with Cassandra | 2 |
| instaclustr/sample-sparkcassandrawithssl | A Spark job demonstrating Cassandra data analytics with SSL encryption | 1 |
| apache/spark | An analytics engine designed to handle large-scale data processing and analysis | 39,916 |
| datastax/spark-cassandra-connector | A library that enables integration between Apache Spark and Apache Cassandra for fast data processing and analysis | 1,943 |
| svenkreiss/pysparkling | A lightweight Python implementation of Spark's RDD and DStream interfaces for improved performance on small datasets | 262 |
| ibmstreams/streamsx.kafka | A toolkit for integrating Apache Kafka with Stream Processor SPL applications | 13 |
| microsoft/mobius | Provides a C# API for interacting with Apache Spark | 942 |
| internetarchive/sparkling | A data processing library built on top of Apache Spark to handle temporal web data | 11 |
| dibbhatt/kafka-spark-consumer | A tool for consuming messages from Kafka topics using Spark Streaming while maintaining reliability and handling failures | 635 |
| spiritlab/spark | A research-focused implementation of Apache Spark with homomorphic encryption support | 3 |
| irvingc/dbscan-on-spark | An implementation of the DBSCAN clustering algorithm on top of Apache Spark | 184 |
| yannael/kafka-sparkstreaming-cassandra | An environment for experimenting with real-time data processing using Kafka, Spark Streaming, and Cassandra | 97 |
| databricks/spark-corenlp | Wraps Stanford CoreNLP annotators as Spark DataFrame functions for natural language processing tasks | 422 |
| databricks/spark-csv | A library for parsing and querying CSV data with Apache Spark | 1,053 |
| janeliascicomp/nextflow-spark | Provides a reusable set of Nextflow subworkflows and processes for creating transient Apache Spark clusters on any infrastructure | 14 |