sample-KafkaSparkCassandra

Spark Data Processor

An introductory Scala app using Apache Spark Streaming to process data from Kafka and write summaries to Cassandra.

GitHub

23 stars
38 watching
23 forks
Language: Scala
Last commit: almost 6 years ago
Linked from 1 awesome list

netapp-public

Related projects:

Repository | Description | Stars
instaclustr/sample-sparkjobservercassandra | Demonstrates using Spark Jobserver to run Apache Spark analytics with Cassandra | 2
instaclustr/sample-sparkcassandrawithssl | A Spark job demonstrating Cassandra data analytics with SSL encryption | 1
apache/spark | An analytics engine designed to handle large-scale data processing and analysis | 39,916
datastax/spark-cassandra-connector | A library that enables integration between Apache Spark and Apache Cassandra for fast data processing and analysis | 1,943
svenkreiss/pysparkling | A lightweight Python implementation of Spark's RDD and DStream interfaces for improved performance on small datasets | 262
ibmstreams/streamsx.kafka | A toolkit for integrating Apache Kafka with Stream Processor SPL applications | 13
microsoft/mobius | Provides a C# API for interacting with Apache Spark | 942
internetarchive/sparkling | A data processing library built on top of Apache Spark to handle temporal web data | 11
dibbhatt/kafka-spark-consumer | A tool for consuming messages from Kafka topics using Spark Streaming while maintaining reliability and handling failures | 635
spiritlab/spark | A research-focused implementation of Apache Spark with homomorphic encryption support | 3
irvingc/dbscan-on-spark | An implementation of the DBSCAN clustering algorithm on top of Apache Spark | 184
yannael/kafka-sparkstreaming-cassandra | An environment for experimenting with real-time data processing using Kafka, Spark Streaming, and Cassandra | 97
databricks/spark-corenlp | Wraps Stanford CoreNLP annotators as Spark DataFrame functions for natural language processing tasks | 422
databricks/spark-csv | A library for parsing and querying CSV data with Apache Spark | 1,053
janeliascicomp/nextflow-spark | Provides a reusable set of Nextflow subworkflows and processes for creating transient Apache Spark clusters on any infrastructure | 14